]> Git Repo - linux.git/log
linux.git
4 years agoperf tools: Rename 'enum dso_kernel_type' to 'enum dso_space_type'
Jiri Olsa [Sat, 8 Aug 2020 12:21:54 +0000 (14:21 +0200)]
perf tools: Rename 'enum dso_kernel_type' to 'enum dso_space_type'

Rename enum dso_kernel_type to enum dso_space_type, which seems like
better fit.

Committer notes:

This is used with 'struct dso'->kernel, which once was a boolean, so
DSO_SPACE__USER is zero, !zero means some sort of kernel space, be it
the host kernel space or a guest kernel space.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agoMAINTAINERS: Add missing tools/lib/perf/ path to perf maintainers
Rob Herring [Fri, 7 Aug 2020 19:32:25 +0000 (13:32 -0600)]
MAINTAINERS: Add missing tools/lib/perf/ path to perf maintainers

Commit 3ce311afb558 ("libperf: Move to tools/lib/perf") moved libperf
out of tools/perf/, but failed to update MAINTAINERS.

Signed-off-by: Rob Herring <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agolibperf: Fix man page typos
Rob Herring [Fri, 7 Aug 2020 19:32:41 +0000 (13:32 -0600)]
libperf: Fix man page typos

Fix various typos and inconsistent capitalization of CPU in the libperf
man pages.

Signed-off-by: Rob Herring <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agoperf test: Allow multiple probes in record+script_probe_vfs_getname.sh
Michael Petlan [Wed, 22 Jul 2020 13:58:45 +0000 (15:58 +0200)]
perf test: Allow multiple probes in record+script_probe_vfs_getname.sh

Sometimes when adding a kprobe by perf, it results in multiple probe
points, such as the following:

  # ./perf probe -l
    probe:vfs_getname    (on getname_flags:73@fs/namei.c with pathname)
    probe:vfs_getname_1  (on getname_flags:73@fs/namei.c with pathname)
    probe:vfs_getname_2  (on getname_flags:73@fs/namei.c with pathname)
  # cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/vfs_getname _text+5501804 pathname=+0(+0(%gpr31)):string
  p:probe/vfs_getname_1 _text+5505388 pathname=+0(+0(%gpr31)):string
  p:probe/vfs_getname_2 _text+5508396 pathname=+0(+0(%gpr31)):string

In this test, we need to record all of them and expect any of them in
the perf-script output, since it's not clear which one will be used for
the desired syscall:

  # perf stat -e probe:vfs_getname\* -- touch /tmp/nic

   Performance counter stats for 'touch /tmp/nic':

                31      probe:vfs_getname_2
                 0      probe:vfs_getname_1
                 1      probe:vfs_getname
       0.001421826 seconds time elapsed

       0.001506000 seconds user
       0.000000000 seconds sys

If the test relies only on probe:vfs_getname, it might easily miss the
relevant data.

Signed-off-by: Michael Petlan <[email protected]>
Cc: Jiri Olsa <[email protected]>
LPU-Reference: 20200722135845[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agoperf bench mem: Always memset source before memcpy
Vincent Whitchurch [Mon, 10 Aug 2020 13:34:04 +0000 (15:34 +0200)]
perf bench mem: Always memset source before memcpy

For memcpy, the source pages are memset to zero only when --cycles is
used.  This leads to wildly different results with or without --cycles,
since all sources pages are likely to be mapped to the same zero page
without explicit writes.

Before this fix:

$ export cmd="./perf stat -e LLC-loads -- ./perf bench \
  mem memcpy -s 1024MB -l 100 -f default"
$ $cmd

         2,935,826      LLC-loads
       3.821677452 seconds time elapsed

$ $cmd --cycles

       217,533,436      LLC-loads
       8.616725985 seconds time elapsed

After this fix:

$ $cmd

       214,459,686      LLC-loads
       8.674301124 seconds time elapsed

$ $cmd --cycles

       214,758,651      LLC-loads
       8.644480006 seconds time elapsed

Fixes: 47b5757bac03c338 ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
Signed-off-by: Vincent Whitchurch <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agoperf sched: Prefer sched_waking event when it exists
David Ahern [Fri, 7 Aug 2020 16:48:44 +0000 (10:48 -0600)]
perf sched: Prefer sched_waking event when it exists

Commit fbd705a0c618 ("sched: Introduce the 'trace_sched_waking'
tracepoint") added sched_waking tracepoint which should be preferred
over sched_wakeup when analyzing scheduling delays.

Update 'perf sched record' to collect sched_waking events if it exists
and fallback to sched_wakeup if it does not. Similarly, update timehist
command to skip sched_wakeup events if the session includes sched_waking
(ie., sched_waking is preferred over sched_wakeup).

Signed-off-by: David Ahern <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
4 years agox86/alternatives: Acquire pte lock with interrupts enabled
Sebastian Andrzej Siewior [Thu, 13 Aug 2020 10:50:26 +0000 (12:50 +0200)]
x86/alternatives: Acquire pte lock with interrupts enabled

pte lock is never acquired in-IRQ context so it does not require interrupts
to be disabled. The lock is a regular spinlock which cannot be acquired
with interrupts disabled on RT.

RT complains about pte_lock() in __text_poke() because it's invoked after
disabling interrupts.

__text_poke() has to disable interrupts as use_temporary_mm() expects
interrupts to be off because it invokes switch_mm_irqs_off() and uses
per-CPU (current active mm) data.

Move the PTE lock handling outside the interrupt disabled region.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by; Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agoxen: Sync up with the canonical protocol definition in Xen
Oleksandr Andrushchenko [Thu, 13 Aug 2020 06:21:12 +0000 (09:21 +0300)]
xen: Sync up with the canonical protocol definition in Xen

This is the sync up with the canonical definition of the
display protocol in Xen.

1. Add protocol version as an integer

Version string, which is in fact an integer, is hard to handle in the
code that supports different protocol versions. To simplify that
also add the version as an integer.

2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE

There are cases when display data buffer is created with non-zero
offset to the data start. Handle such cases and provide that offset
while creating a display buffer.

3. Add XENDISPL_OP_GET_EDID command

Add an optional request for reading Extended Display Identification
Data (EDID) structure which allows better configuration of the
display connectors over the configuration set in XenStore.
With this change connectors may have multiple resolutions defined
with respect to detailed timing definitions and additional properties
normally provided by displays.

If this request is not supported by the backend then visible area
is defined by the relevant XenStore's "resolution" property.

If backend provides extended display identification data (EDID) with
XENDISPL_OP_GET_EDID request then EDID values must take precedence
over the resolutions defined in XenStore.

4. Bump protocol version to 2.

Signed-off-by: Oleksandr Andrushchenko <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
4 years agodrm/xen-front: Add YUYV to supported formats
Oleksandr Andrushchenko [Thu, 13 Aug 2020 06:21:11 +0000 (09:21 +0300)]
drm/xen-front: Add YUYV to supported formats

Add YUYV to supported formats, so the frontend can work with the
formats used by cameras and other HW.

Signed-off-by: Oleksandr Andrushchenko <[email protected]>
Acked-by: Noralf Trønnes <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
4 years agodrm/xen-front: Fix misused IS_ERR_OR_NULL checks
Oleksandr Andrushchenko [Thu, 13 Aug 2020 06:21:10 +0000 (09:21 +0300)]
drm/xen-front: Fix misused IS_ERR_OR_NULL checks

The patch c575b7eeb89f: "drm/xen-front: Add support for Xen PV
display frontend" from Apr 3, 2018, leads to the following static
checker warning:

drivers/gpu/drm/xen/xen_drm_front_gem.c:140 xen_drm_front_gem_create()
warn: passing zero to 'ERR_CAST'

drivers/gpu/drm/xen/xen_drm_front_gem.c
   133  struct drm_gem_object *xen_drm_front_gem_create(struct drm_device *dev,
   134                                                  size_t size)
   135  {
   136          struct xen_gem_object *xen_obj;
   137
   138          xen_obj = gem_create(dev, size);
   139          if (IS_ERR_OR_NULL(xen_obj))
   140                  return ERR_CAST(xen_obj);

Fix this and the rest of misused places with IS_ERR_OR_NULL in the
driver.

Fixes: c575b7eeb89f: "drm/xen-front: Add support for Xen PV display frontend"
Signed-off-by: Oleksandr Andrushchenko <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
4 years agoxen/gntdev: Fix dmabuf import with non-zero sgt offset
Oleksandr Andrushchenko [Thu, 13 Aug 2020 06:21:09 +0000 (09:21 +0300)]
xen/gntdev: Fix dmabuf import with non-zero sgt offset

It is possible that the scatter-gather table during dmabuf import has
non-zero offset of the data, but user-space doesn't expect that.
Fix this by failing the import, so user-space doesn't access wrong data.

Fixes: bf8dc55b1358 ("xen/gntdev: Implement dma-buf import functionality")
Signed-off-by: Oleksandr Andrushchenko <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
4 years agoALSA: echoaudio: Fix potential Oops in snd_echo_resume()
Dinghao Liu [Thu, 13 Aug 2020 07:46:30 +0000 (15:46 +0800)]
ALSA: echoaudio: Fix potential Oops in snd_echo_resume()

Freeing chip on error may lead to an Oops at the next time
the system goes to resume. Fix this by removing all
snd_echo_free() calls on error.

Fixes: 47b5d028fdce8 ("ALSA: Echoaudio - Add suspend support #2")
Signed-off-by: Dinghao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agogenirq: Unlock irq descriptor after errors
Guenter Roeck [Tue, 11 Aug 2020 18:00:12 +0000 (11:00 -0700)]
genirq: Unlock irq descriptor after errors

In irq_set_irqchip_state(), the irq descriptor is not unlocked after an
error is encountered. While that should never happen in practice, a buggy
driver may trigger it. This would result in a lockup, so fix it.

Fixes: 1d0326f352bb ("genirq: Check irq_data_get_irq_chip() return value before use")
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
4 years agocrypto: algif_aead - fix uninitialized ctx->init
Ondrej Mosnacek [Wed, 12 Aug 2020 12:58:25 +0000 (14:58 +0200)]
crypto: algif_aead - fix uninitialized ctx->init

In skcipher_accept_parent_nokey() the whole af_alg_ctx structure is
cleared by memset() after allocation, so add such memset() also to
aead_accept_parent_nokey() so that the new "init" field is also
initialized to zero. Without that the initial ctx->init checks might
randomly return true and cause errors.

While there, also remove the redundant zero assignments in both
functions.

Found via libkcapi testsuite.

Cc: Stephan Mueller <[email protected]>
Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when ctx->more is zero")
Suggested-by: Herbert Xu <[email protected]>
Signed-off-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
4 years agomfd: syscon: Use a unique name with regmap_config
Suman Anna [Mon, 27 Jul 2020 21:10:08 +0000 (16:10 -0500)]
mfd: syscon: Use a unique name with regmap_config

The DT node full name is currently being used in regmap_config
which in turn is used to create the regmap debugfs directories.
This name however is not guaranteed to be unique and the regmap
debugfs registration can fail in the cases where the syscon nodes
have the same unit-address but are present in different DT node
hierarchies. Replace this logic using the syscon reg resource
address instead (inspired from logic used while creating platform
devices) to ensure a unique name is given for each syscon.

Signed-off-by: Suman Anna <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: Replace HTTP links with HTTPS ones
Alexander A. Klimov [Wed, 22 Jul 2020 19:24:54 +0000 (21:24 +0200)]
mfd: Replace HTTP links with HTTPS ones

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: dln2: Run event handler loop under spinlock
Andy Shevchenko [Thu, 23 Jul 2020 13:02:46 +0000 (16:02 +0300)]
mfd: dln2: Run event handler loop under spinlock

The event handler loop must be run with interrupts disabled.
Otherwise we will have a warning:

[ 1970.785649] irq 31 handler lineevent_irq_handler+0x0/0x20 enabled interrupts
[ 1970.792739] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:159 __handle_irq_event_percpu+0x162/0x170
[ 1970.860732] RIP: 0010:__handle_irq_event_percpu+0x162/0x170
...
[ 1970.946994] Call Trace:
[ 1970.949446]  <IRQ>
[ 1970.951471]  handle_irq_event_percpu+0x2c/0x80
[ 1970.955921]  handle_irq_event+0x23/0x43
[ 1970.959766]  handle_simple_irq+0x57/0x70
[ 1970.963695]  generic_handle_irq+0x42/0x50
[ 1970.967717]  dln2_rx+0xc1/0x210 [dln2]
[ 1970.971479]  ? usb_hcd_unmap_urb_for_dma+0xa6/0x1c0
[ 1970.976362]  __usb_hcd_giveback_urb+0x77/0xe0
[ 1970.980727]  usb_giveback_urb_bh+0x8e/0xe0
[ 1970.984837]  tasklet_action_common.isra.0+0x4a/0xe0
...

Recently xHCI driver switched to tasklets in the commit 36dc01657b49
("usb: host: xhci: Support running urb giveback in tasklet context").

The handle_irq_event_* functions are expected to be called with interrupts
disabled and they rightfully complain here because we run in tasklet context
with interrupts enabled.

Use a event spinlock to protect event handler from being interrupted.

Note, that there are only two users of this GPIO and ADC drivers and both of
them are using generic_handle_irq() which makes above happen.

Fixes: 338a12814297 ("mfd: Add support for Diolan DLN-2 devices")
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: madera: Improve handling of regulator unbinding
Charles Keepax [Thu, 23 Jul 2020 10:54:59 +0000 (11:54 +0100)]
mfd: madera: Improve handling of regulator unbinding

The current unbinding process for Madera has some issues. The trouble
is runtime PM is disabled as the first step of the process, but
some of the drivers release IRQs causing regmap IRQ to issue a
runtime get which fails. To allow runtime PM to remain enabled during
mfd_remove_devices, the DCVDD regulator must remain available. In
the case of external DCVDD's this is simple, the regulator can simply
be disabled/put after the call to mfd_remove_devices. However, in
the case of an internally supplied DCVDD the regulator needs to be
released after the other MFD children depending on it.

Use the new MFD mfd_remove_devices_late functionality to split
the DCVDD regulator off from the other drivers.

Signed-off-by: Charles Keepax <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: mfd-core: Add mechanism for removal of a subset of children
Charles Keepax [Thu, 23 Jul 2020 10:54:58 +0000 (11:54 +0100)]
mfd: mfd-core: Add mechanism for removal of a subset of children

Currently, the only way to remove MFD children is with a call to
mfd_remove_devices, which will remove all the children. Under
some circumstances it is useful to remove only a subset of the
child devices. For example if some additional clean up is required
between removal of certain child devices.

To accomplish this a level field is added to mfd_cell, the normal
mfd_remove_devices is modified to not remove devices that are set
to a higher level and a corresponding mfd_remove_devices_late
function is added to remove those children.

See further discussion at:
https://lore.kernel.org/lkml/20200616075834.GF2608702@dell/

Suggested-by: Lee Jones <[email protected]>
Signed-off-by: Charles Keepax <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: intel_soc_pmic_mrfld: Simplify the return expression of intel_scu_ipc_dev_iowrite8()
Xu Wang [Mon, 27 Jul 2020 03:04:07 +0000 (03:04 +0000)]
mfd: intel_soc_pmic_mrfld: Simplify the return expression of intel_scu_ipc_dev_iowrite8()

Simplify the return expression.

Signed-off-by: Xu Wang <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: max14577: Remove redundant initialization of variable current_bits
Colin Ian King [Thu, 23 Jul 2020 16:15:41 +0000 (17:15 +0100)]
mfd: max14577: Remove redundant initialization of variable current_bits

The variable current_bits is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: rn5t618: Fix caching of battery related registers
Andreas Kemnade [Fri, 17 Jul 2020 21:00:02 +0000 (23:00 +0200)]
mfd: rn5t618: Fix caching of battery related registers

Battery status changes dynamically, so the corresponding registers
need to be considered volatile. Affected registers are:

 - fuel gauge
 - battery status
 - battery interrupt

Signed-off-by: Andreas Kemnade <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: max77693-private: Drop a duplicated word
Randy Dunlap [Sun, 19 Jul 2020 00:29:31 +0000 (17:29 -0700)]
mfd: max77693-private: Drop a duplicated word

Drop the repeated word "in" in a comment.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: da9055: pdata.h: Drop a duplicated word
Randy Dunlap [Sun, 19 Jul 2020 00:29:17 +0000 (17:29 -0700)]
mfd: da9055: pdata.h: Drop a duplicated word

Drop the repeated word "that" in a comment.

Signed-off-by: Randy Dunlap <[email protected]>
Acked-by: Adam Thomson <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: rn5t618: Make restart handler atomic safe
Andreas Kemnade [Tue, 21 Jul 2020 07:34:02 +0000 (09:34 +0200)]
mfd: rn5t618: Make restart handler atomic safe

The restart handler is executed during the shutdown phase which is
atomic/irq-less. The i2c framework supports atomic transfers since
commit 63b96983a5dd ("i2c: core: introduce callbacks for atomic
transfers") to address this use case. Using i2c regmap in that
situation is not allowed:

[  165.177465] [ BUG: Invalid wait context ]
[  165.181479] 5.8.0-rc3-00003-g0e9088558027-dirty #11 Not tainted
[  165.187400] -----------------------------
[  165.191410] systemd-shutdow/1 is trying to lock:
[  165.196030] d85b4438 (rn5t618:170:(&rn5t618_regmap_config)->lock){+.+.}-{3:3}, at: regmap_update_bits_base+0x30/0x70
[  165.206573] other info that might help us debug this:
[  165.211625] context-{4:4}
[  165.214248] 2 locks held by systemd-shutdow/1:
[  165.218691]  #0: c131c47c (system_transition_mutex){+.+.}-{3:3}, at: __do_sys_reboot+0x90/0x204
[  165.227405]  #1: c131efb4 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x0/0x118
[  165.236288] stack backtrace:
[  165.239174] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.8.0-rc3-00003-g0e9088558027-dirty #11
[  165.248220] Hardware name: Freescale i.MX6 SoloLite (Device Tree)
[  165.254330] [<c0112110>] (unwind_backtrace) from [<c010bfa0>] (show_stack+0x10/0x14)
[  165.262084] [<c010bfa0>] (show_stack) from [<c058093c>] (dump_stack+0xe8/0x120)
[  165.269407] [<c058093c>] (dump_stack) from [<c01835a4>] (__lock_acquire+0x81c/0x2ca0)
[  165.277246] [<c01835a4>] (__lock_acquire) from [<c0186344>] (lock_acquire+0xe4/0x490)
[  165.285090] [<c0186344>] (lock_acquire) from [<c0c98638>] (__mutex_lock+0x74/0x954)
[  165.292756] [<c0c98638>] (__mutex_lock) from [<c0c98f34>] (mutex_lock_nested+0x1c/0x24)
[  165.300769] [<c0c98f34>] (mutex_lock_nested) from [<c07593ec>] (regmap_update_bits_base+0x30/0x70)
[  165.309741] [<c07593ec>] (regmap_update_bits_base) from [<c076b838>] (rn5t618_trigger_poweroff_sequence+0x34/0x64)
[  165.320097] [<c076b838>] (rn5t618_trigger_poweroff_sequence) from [<c076b874>] (rn5t618_restart+0xc/0x2c)
[  165.329669] [<c076b874>] (rn5t618_restart) from [<c01514f8>] (notifier_call_chain+0x48/0x80)
[  165.338113] [<c01514f8>] (notifier_call_chain) from [<c01516a8>] (__atomic_notifier_call_chain+0x70/0x118)
[  165.347770] [<c01516a8>] (__atomic_notifier_call_chain) from [<c0151768>] (atomic_notifier_call_chain+0x18/0x20)
[  165.357949] [<c0151768>] (atomic_notifier_call_chain) from [<c010a828>] (machine_restart+0x68/0x80)
[  165.367001] [<c010a828>] (machine_restart) from [<c0153224>] (__do_sys_reboot+0x11c/0x204)
[  165.375272] [<c0153224>] (__do_sys_reboot) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
[  165.383364] Exception stack(0xd80a5fa8 to 0xd80a5ff0)
[  165.388420] 5fa0:                   00406948 00000000 fee1dead 28121969 01234567 73299b00
[  165.396602] 5fc0: 00406948 00000000 00000000 00000058 be91abc8 00000000 be91ab60 004056f8
[  165.404781] 5fe0: 00000058 be91aabc b6ed4d45 b6e56746

Signed-off-by: Andreas Kemnade <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: kempld-core: Fix 'assignment of read-only location' error
Stephen Rothwell [Fri, 17 Jul 2020 03:41:54 +0000 (13:41 +1000)]
mfd: kempld-core: Fix 'assignment of read-only location' error

 drivers/mfd/kempld-core.c: In function 'kempld_register_cells_generic':
 drivers/mfd/kempld-core.c:105:13: error: assignment of read-only location 'devs[i++]'
  105 |   devs[i++] = kempld_devs[KEMPLD_I2C];
      |             ^
 drivers/mfd/kempld-core.c:108:13: error: assignment of read-only location 'devs[i++]'
  108 |   devs[i++] = kempld_devs[KEMPLD_WDT];
      |             ^
 drivers/mfd/kempld-core.c:111:13: error: assignment of read-only location 'devs[i++]'
  111 |   devs[i++] = kempld_devs[KEMPLD_GPIO];
      |             ^
 drivers/mfd/kempld-core.c:114:13: error: assignment of read-only location 'devs[i++]'
  114 |   devs[i++] = kempld_devs[KEMPLD_UART];
      |             ^

Fixes: e49aa9a9bd22 ("mfd: core: Make a best effort attempt to match devices with the correct of_nodes")
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: axp20x: Allow the AXP803 to be probed by I2C
Frank Lee [Wed, 8 Jul 2020 07:19:36 +0000 (15:19 +0800)]
mfd: axp20x: Allow the AXP803 to be probed by I2C

The AXP803 can be used both using the RSB proprietary bus, or a more
traditional I2C bus.

Let's add that possibility.

Signed-off-by: Frank Lee <[email protected]>
Acked-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: da9063: Add support for latest DA silicon revision
Adam Thomson [Mon, 13 Jul 2020 09:38:59 +0000 (10:38 +0100)]
mfd: da9063: Add support for latest DA silicon revision

This update adds new regmap tables to support the latest DA silicon
which will automatically be selected based on the chip and variant
information read from the device.

Signed-off-by: Adam Thomson <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: da9063: Fix revision handling to correctly select reg tables
Adam Thomson [Mon, 13 Jul 2020 09:38:57 +0000 (10:38 +0100)]
mfd: da9063: Fix revision handling to correctly select reg tables

The current implementation performs checking in the i2c_probe()
function of the variant_code but does this immediately after the
containing struct has been initialised as all zero. This means the
check for variant code will always default to using the BB tables
and will never select AD. The variant code is subsequently set
by device_init() and later used by the RTC so really it's a little
fortunate this mismatch works.

This update adds raw I2C read access functionality to read the chip
and variant/revision information (common to all revisions) so that
it can subsequently correctly choose the proper regmap tables for
real initialisation.

Signed-off-by: Adam Thomson <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agodt-bindings: mfd: st,stmfx: Remove I2C unit name
Fabio Estevam [Thu, 2 Jul 2020 11:32:33 +0000 (08:32 -0300)]
dt-bindings: mfd: st,stmfx: Remove I2C unit name

Remove the I2C unit name to fix the following build warning with
'make dt_binding_check':

Warning (unit_address_vs_reg): /example-0/i2c@0: node has a unit name, but no reg or ranges property

Signed-off-by: Fabio Estevam <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agodt-bindings: mfd: ti,j721e-system-controller.yaml: Add J721e system controller
Roger Quadros [Mon, 29 Jun 2020 12:52:49 +0000 (15:52 +0300)]
dt-bindings: mfd: ti,j721e-system-controller.yaml: Add J721e system controller

Add DT binding schema for J721e system controller.

Signed-off-by: Roger Quadros <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: motorola-cpcap: Disable interrupt for suspend
Tony Lindgren [Mon, 6 Jul 2020 18:39:34 +0000 (11:39 -0700)]
mfd: motorola-cpcap: Disable interrupt for suspend

Otherwise we get spammed with errors on resume after rtcwake:

 cpcap-core spi0.0: Failed to read IRQ status: -108

Note that rtcwake is still capable of waking up the system with
this patch.

Cc: Merlijn Wajer <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Sebastian Reichel <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: smsc-ece1099: Remove driver
Michael Walle [Wed, 1 Jul 2020 21:23:48 +0000 (23:23 +0200)]
mfd: smsc-ece1099: Remove driver

This MFD driver has no user. The keypad driver of this device never made
it into the kernel. Therefore, this driver is useless. Remove it.

Signed-off-by: Michael Walle <[email protected]>
Cc: Sourav Poddar <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: core: Add OF_MFD_CELL_REG() helper
Lee Jones [Thu, 11 Jun 2020 09:12:05 +0000 (10:12 +0100)]
mfd: core: Add OF_MFD_CELL_REG() helper

Extend current list of helpers to provide support for parent drivers
wishing to match specific child devices to particular OF nodes.

Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: core: Fix formatting of MFD helpers
Lee Jones [Thu, 11 Jun 2020 09:18:43 +0000 (10:18 +0100)]
mfd: core: Fix formatting of MFD helpers

Remove unnecessary '\'s and leading tabs.

This will help to clean-up future diffs when subsequent changes are
made.

Hint: The aforementioned changes follow this patch.

Signed-off-by: Lee Jones <[email protected]>
4 years agomfd: core: Make a best effort attempt to match devices with the correct of_nodes
Lee Jones [Thu, 11 Jun 2020 06:35:33 +0000 (07:35 +0100)]
mfd: core: Make a best effort attempt to match devices with the correct of_nodes

Currently, when a child platform device (sometimes referred to as a
sub-device) is registered via the Multi-Functional Device (MFD) API,
the framework attempts to match the newly registered platform device
with its associated Device Tree (OF) node.  Until now, the device has
been allocated the first node found with an identical OF compatible
string.  Unfortunately, if there are, say for example '3' devices
which are to be handled by the same driver and therefore have the same
compatible string, each of them will be allocated a pointer to the
*first* node.

An example Device Tree entry might look like this:

  mfd_of_test {
          compatible = "mfd,of-test-parent";
          #address-cells = <0x02>;
          #size-cells = <0x02>;

          child@aaaaaaaaaaaaaaaa {
                  compatible = "mfd,of-test-child";
                  reg = <0xaaaaaaaa 0xaaaaaaaa 0 0x11>,
                        <0xbbbbbbbb 0xbbbbbbbb 0 0x22>;
          };

          child@cccccccc {
                  compatible = "mfd,of-test-child";
                  reg = <0x00000000 0xcccccccc 0 0x33>;
          };

          child@dddddddd00000000 {
                  compatible = "mfd,of-test-child";
                  reg = <0xdddddddd 0x00000000 0 0x44>;
          };
  };

When used with example sub-device registration like this:

  static const struct mfd_cell mfd_of_test_cell[] = {
        OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 0, "mfd,of-test-child"),
        OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child"),
        OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child")
  };

... the current implementation will result in all devices being allocated
the first OF node found containing a matching compatible string:

  [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0
  [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@aaaaaaaaaaaaaaaa
  [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1
  [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa
  [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2
  [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@aaaaaaaaaaaaaaaa

After this patch each device will be allocated a unique OF node:

  [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0
  [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@aaaaaaaaaaaaaaaa
  [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1
  [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@cccccccc
  [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2
  [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@dddddddd00000000

Which is fine if all OF nodes are identical.  However if we wish to
apply an attribute to particular device, we really need to ensure the
correct OF node will be associated with the device containing the
correct address.  We accomplish this by matching the device's address
expressed in DT with one provided during sub-device registration.
Like this:

  static const struct mfd_cell mfd_of_test_cell[] = {
        OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child", 0xdddddddd00000000),
        OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child", 0xaaaaaaaaaaaaaaaa),
        OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 3, "mfd,of-test-child", 0x00000000cccccccc)
  };

This will ensure a specific device (designated here using the
platform_ids; 1, 2 and 3) is matched with a particular OF node:

  [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0
  [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@dddddddd00000000
  [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1
  [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa
  [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2
  [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@cccccccc

This implementation is still not infallible, hence the mention of
"best effort" in the commit subject.  Since we have not *insisted* on
the existence of 'reg' properties (in some scenarios they just do not
make sense) and no device currently uses the new 'of_reg' attribute,
we have to make an on-the-fly judgement call whether to associate the
OF node anyway.  Which we do in cases where parent drivers haven't
specified a particular OF node to match to.  So there is a *slight*
possibility of the following result (note: the implementation here is
convoluted, but it shows you one means by which this process can
still break):

  /*
   * First entry will match to the first OF node with matching compatible
   * Second will fail, since the first took its OF node and is no longer available
   * Third will succeed
   */
  static const struct mfd_cell mfd_of_test_cell[] = {
        OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child"),
OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child", 0xaaaaaaaaaaaaaaaa),
        OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 3, "mfd,of-test-child", 0x00000000cccccccc)
  };

The result:

  [0.753869] mfd-of-test-parent mfd_of_test: Registering 3 devices
  [0.756597] mfd-of-test-child: Failed to locate of_node [id: 2]
  [0.759999] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1
  [0.760314] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa
  [0.760908] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2
  [0.761183] mfd-of-test-child mfd-of-test-child.2: No OF node associated with this device
  [0.761621] mfd-of-test-child mfd-of-test-child.3: Probing platform device: 3
  [0.761899] mfd-of-test-child mfd-of-test-child.3: Using OF node: child@cccccccc

We could code around this with some pre-parsing semantics, but the
added complexity required to cover each and every corner-case is not
justified.  Merely patching the current failing (via this patch) is
already working with some pretty small corner-cases.  Other issues
should be patched in the parent drivers which can be achieved simply
by implementing OF_MFD_CELL_REG().

Signed-off-by: Lee Jones <[email protected]>
4 years agoMerge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds [Thu, 13 Aug 2020 00:17:00 +0000 (17:17 -0700)]
Merge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "Not much this cycle - mostly non urgent driver fixes:

   - ds1374: use watchdog core

   - pcf2127: add alarm and pcf2129 support"

* tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: pcf2127: fix alarm handling
  rtc: pcf2127: add alarm support
  rtc: pcf2127: add pca2129 device id
  rtc: max77686: Fix wake-ups for max77620
  rtc: ds1307: provide an indication that the watchdog has fired
  rtc: ds1374: remove unused define
  rtc: ds1374: fix RTC_DRV_DS1374_WDT dependencies
  rtc: cleanup obsolete comment about struct rtc_class_ops
  rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable
  rtc: ds1374: wdt: Use watchdog core for watchdog part
  rtc: Replace HTTP links with HTTPS ones
  rtc: goldfish: Enable interrupt in set_alarm() when necessary
  rtc: max77686: Do not allow interrupt to fire before system resume
  rtc: imxdi: fix trivial typos
  rtc: cpcap: fix range

4 years agoio_uring: enable lookup of links holding inflight files
Jens Axboe [Wed, 12 Aug 2020 23:33:30 +0000 (17:33 -0600)]
io_uring: enable lookup of links holding inflight files

When a process exits, we cancel whatever requests it has pending that
are referencing the file table. However, if a link is holding a
reference, then we cannot find it by simply looking at the inflight
list.

Enable checking of the poll and timeout list to find the link, and
cancel it appropriately.

Cc: [email protected]
Reported-by: Josef <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoRevert "ipv4: tunnel: fix compilation on ARCH=um"
David S. Miller [Wed, 12 Aug 2020 20:26:37 +0000 (13:26 -0700)]
Revert "ipv4: tunnel: fix compilation on ARCH=um"

This reverts commit 06a7a37be55e29961c9ba2abec4d07c8e0e21861.

The bug was already fixed, this added a dup include.

Reported-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
Eric Dumazet [Wed, 12 Aug 2020 01:34:40 +0000 (18:34 -0700)]
net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus

We must accept an empty mask in store_rps_map(), or we are not able
to disable RPS on a queue.

Fixes: 07bbecb34106 ("net: Restrict receive packets queuing to housekeeping CPUs")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Maciej Å»enczykowski <[email protected]>
Cc: Alex Belits <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Maciej Å»enczykowski <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Nitesh Narayan Lal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'net-stmmac-Fix-multicast-filter-on-IPQ806x'
David S. Miller [Wed, 12 Aug 2020 20:12:52 +0000 (13:12 -0700)]
Merge branch 'net-stmmac-Fix-multicast-filter-on-IPQ806x'

Jonathan McDowell says:

====================
net: stmmac: Fix multicast filter on IPQ806x

This pair of patches are the result of discovering a failure to
correctly receive IPv6 multicast packets on such a device (in particular
DHCPv6 requests and RA solicitations). Putting the device into
promiscuous mode, or allmulti, both resulted in such packets correctly
being received. Examination of the vendor driver (nss-gmac from the
qsdk) shows that it does not enable the multicast filter and instead
falls back to allmulti.

Extend the base dwmac1000 driver to fall back when there's no suitable
hardware filter, and update the ipq806x platform to request this.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years agonet: ethernet: stmmac: Disable hardware multicast filter
Jonathan McDowell [Wed, 12 Aug 2020 19:37:23 +0000 (20:37 +0100)]
net: ethernet: stmmac: Disable hardware multicast filter

The IPQ806x does not appear to have a functional multicast ethernet
address filter. This was observed as a failure to correctly receive IPv6
packets on a LAN to the all stations address. Checking the vendor driver
shows that it does not attempt to enable the multicast filter and
instead falls back to receiving all multicast packets, internally
setting ALLMULTI.

Use the new fallback support in the dwmac1000 driver to correctly
achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6
functionality on an RB3011 router.

Cc: [email protected]
Signed-off-by: Jonathan McDowell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: stmmac: dwmac1000: provide multicast filter fallback
Jonathan McDowell [Wed, 12 Aug 2020 19:37:01 +0000 (20:37 +0100)]
net: stmmac: dwmac1000: provide multicast filter fallback

If we don't have a hardware multicast filter available then instead of
silently failing to listen for the requested ethernet broadcast
addresses fall back to receiving all multicast packets, in a similar
fashion to other drivers with no multicast filter.

Cc: [email protected]
Signed-off-by: Jonathan McDowell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoi2c: iproc: fix race between client unreg and isr
Dhananjay Phadke [Tue, 11 Aug 2020 00:42:40 +0000 (17:42 -0700)]
i2c: iproc: fix race between client unreg and isr

When i2c client unregisters, synchronize irq before setting
iproc_i2c->slave to NULL.

(1) disable_irq()
(2) Mask event enable bits in control reg
(3) Erase slave address (avoid further writes to rx fifo)
(4) Flush tx and rx FIFOs
(5) Clear pending event (interrupt) bits in status reg
(6) enable_irq()
(7) Set client pointer to NULL

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000318

[  371.020421] pc : bcm_iproc_i2c_isr+0x530/0x11f0
[  371.025098] lr : __handle_irq_event_percpu+0x6c/0x170
[  371.030309] sp : ffff800010003e40
[  371.033727] x29: ffff800010003e40 x28: 0000000000000060
[  371.039206] x27: ffff800010ca9de0 x26: ffff800010f895df
[  371.044686] x25: ffff800010f18888 x24: ffff0008f7ff3600
[  371.050165] x23: 0000000000000003 x22: 0000000001600000
[  371.055645] x21: ffff800010f18888 x20: 0000000001600000
[  371.061124] x19: ffff0008f726f080 x18: 0000000000000000
[  371.066603] x17: 0000000000000000 x16: 0000000000000000
[  371.072082] x15: 0000000000000000 x14: 0000000000000000
[  371.077561] x13: 0000000000000000 x12: 0000000000000001
[  371.083040] x11: 0000000000000000 x10: 0000000000000040
[  371.088519] x9 : ffff800010f317c8 x8 : ffff800010f317c0
[  371.093999] x7 : ffff0008f805b3b0 x6 : 0000000000000000
[  371.099478] x5 : ffff0008f7ff36a4 x4 : ffff8008ee43d000
[  371.104957] x3 : 0000000000000000 x2 : ffff8000107d64c0
[  371.110436] x1 : 00000000c00000af x0 : 0000000000000000

[  371.115916] Call trace:
[  371.118439]  bcm_iproc_i2c_isr+0x530/0x11f0
[  371.122754]  __handle_irq_event_percpu+0x6c/0x170
[  371.127606]  handle_irq_event_percpu+0x34/0x88
[  371.132189]  handle_irq_event+0x40/0x120
[  371.136234]  handle_fasteoi_irq+0xcc/0x1a0
[  371.140459]  generic_handle_irq+0x24/0x38
[  371.144594]  __handle_domain_irq+0x60/0xb8
[  371.148820]  gic_handle_irq+0xc0/0x158
[  371.152687]  el1_irq+0xb8/0x140
[  371.155927]  arch_cpu_idle+0x10/0x18
[  371.159615]  do_idle+0x204/0x290
[  371.162943]  cpu_startup_entry+0x24/0x60
[  371.166990]  rest_init+0xb0/0xbc
[  371.170322]  arch_call_rest_init+0xc/0x14
[  371.174458]  start_kernel+0x404/0x430

Fixes: c245d94ed106 ("i2c: iproc: Add multi byte read-write support for slave mode")
Signed-off-by: Dhananjay Phadke <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Acked-by: Ray Jui <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
4 years agoipv4: tunnel: fix compilation on ARCH=um
Johannes Berg [Wed, 12 Aug 2020 19:08:53 +0000 (21:08 +0200)]
ipv4: tunnel: fix compilation on ARCH=um

With certain configurations, a 64-bit ARCH=um errors
out here with an unknown csum_ipv6_magic() function.
Include the right header file to always have it.

Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agovsock: fix potential null pointer dereference in vsock_poll()
Stefano Garzarella [Wed, 12 Aug 2020 12:56:02 +0000 (14:56 +0200)]
vsock: fix potential null pointer dereference in vsock_poll()

syzbot reported this issue where in the vsock_poll() we find the
socket state at TCP_ESTABLISHED, but 'transport' is null:
  general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
  CPU: 0 PID: 8227 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:vsock_poll+0x75a/0x8e0 net/vmw_vsock/af_vsock.c:1038
  Call Trace:
   sock_poll+0x159/0x460 net/socket.c:1266
   vfs_poll include/linux/poll.h:90 [inline]
   do_pollfd fs/select.c:869 [inline]
   do_poll fs/select.c:917 [inline]
   do_sys_poll+0x607/0xd40 fs/select.c:1011
   __do_sys_poll fs/select.c:1069 [inline]
   __se_sys_poll fs/select.c:1057 [inline]
   __x64_sys_poll+0x18c/0x440 fs/select.c:1057
   do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This issue can happen if the TCP_ESTABLISHED state is set after we read
the vsk->transport in the vsock_poll().

We could put barriers to synchronize, but this can only happen during
connection setup, so we can simply check that 'transport' is valid.

Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Reported-and-tested-by: [email protected]
Signed-off-by: Stefano Garzarella <[email protected]>
Reviewed-by: Jorgen Hansen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agosfc: fix ef100 design-param checking
Edward Cree [Wed, 12 Aug 2020 09:32:49 +0000 (10:32 +0100)]
sfc: fix ef100 design-param checking

The handling of the RXQ/TXQ size granularity design-params had two
 problems: it had a 64-bit divide that didn't build on 32-bit platforms,
 and it could divide by zero if the NIC supplied 0 as the value of the
 design-param.  Fix both by checking for 0 and for a granularity bigger
 than our min-size; if the granularity <= EFX_MIN_DMAQ_SIZE then it fits
 in 32 bits, so we can cast it to u32 for the divide.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client
Linus Torvalds [Wed, 12 Aug 2020 19:51:31 +0000 (12:51 -0700)]
Merge tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Xiubo has completed his work on filesystem client metrics, they are
  sent to all available MDSes once per second now.

  Other than that, we have a lot of fixes and cleanups all around the
  filesystem, including a tweak to cut down on MDS request resends in
  multi-MDS setups from Yanhu and fixups for SELinux symlink labeling
  and MClientSession message decoding from Jeff"

* tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client: (22 commits)
  ceph: handle zero-length feature mask in session messages
  ceph: use frag's MDS in either mode
  ceph: move sb->wb_pagevec_pool to be a global mempool
  ceph: set sec_context xattr on symlink creation
  ceph: remove redundant initialization of variable mds
  ceph: fix use-after-free for fsc->mdsc
  ceph: remove unused variables in ceph_mdsmap_decode()
  ceph: delete repeated words in fs/ceph/
  ceph: send client provided metric flags in client metadata
  ceph: periodically send perf metrics to MDSes
  ceph: check the sesion state and return false in case it is closed
  libceph: replace HTTP links with HTTPS ones
  ceph: remove unnecessary cast in kfree()
  libceph: just have osd_req_op_init() return a pointer
  ceph: do not access the kiocb after aio requests
  ceph: clean up and optimize ceph_check_delayed_caps()
  ceph: fix potential mdsc use-after-free crash
  ceph: switch to WARN_ON_ONCE in encode_supported_features()
  ceph: add global total_caps to count the mdsc's total caps number
  ceph: add check_session_state() helper and make it global
  ...

4 years agoMerge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Wed, 12 Aug 2020 19:41:15 +0000 (12:41 -0700)]
Merge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull more parisc updates from Helge Deller:

 - Oscar Carter contributed a patch which fixes parisc's usage of
   dereference_function_descriptor() and thus will allow using the
   -Wcast-function-type compiler option in the top-level Makefile

 - Sven Schnelle fixed a bug in the SBA code to prevent crashes during
   kexec

 - John David Anglin provided implementations for __smp_store_release()
   and __smp_load_acquire barriers() which avoids using the sync
   assembler instruction and thus speeds up barrier paths

 - Some whitespace cleanups in parisc's atomic.h header file

* 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Implement __smp_store_release and __smp_load_acquire barriers
  parisc: mask out enable and reserved bits from sba imask
  parisc: Whitespace cleanups in atomic.h
  parisc/kernel/ftrace: Remove function callback casts
  sections.h: dereference_function_descriptor() returns void pointer

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Wed, 12 Aug 2020 19:25:06 +0000 (12:25 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "PPC:
   - Improvements and bugfixes for secure VM support, giving reduced
     startup time and memory hotplug support.

   - Locking fixes in nested KVM code

   - Increase number of guests supported by HV KVM to 4094

   - Preliminary POWER10 support

  ARM:
   - Split the VHE and nVHE hypervisor code bases, build the EL2 code
     separately, allowing for the VHE code to now be built with
     instrumentation

   - Level-based TLB invalidation support

   - Restructure of the vcpu register storage to accomodate the NV code

   - Pointer Authentication available for guests on nVHE hosts

   - Simplification of the system register table parsing

   - MMU cleanups and fixes

   - A number of post-32bit cleanups and other fixes

  MIPS:
   - compilation fixes

  x86:
   - bugfixes

   - support for the SERIALIZE instruction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
  KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup
  x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled
  MIPS: KVM: Convert a fallthrough comment to fallthrough
  MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64
  x86: Expose SERIALIZE for supported cpuid
  KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled
  KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
  KVM: arm64: Don't skip cache maintenance for read-only memslots
  KVM: arm64: Handle data and instruction external aborts the same way
  KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
  KVM: arm: Add trace name for ARM_NISV
  KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text
  KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
  KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE
  KVM: PPC: Book3S HV: Rework secure mem slot dropping
  KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up
  KVM: PPC: Book3S HV: Migrate hot plugged memory
  KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs
  KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START
  ...

4 years agoMerge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Linus Torvalds [Wed, 12 Aug 2020 19:19:49 +0000 (12:19 -0700)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull more clk updates from Stephen Boyd:
 "Here's some more updates that missed the last pull request because I
  happened to tag the tree at an earlier point in the history of
  clk-next. I must have fat fingered it and checked out an older version
  of clk-next on this second computer I'm using.

  This time it actually includes more code for Qualcomm SoCs, the AT91
  major updates, and some Rockchip SoC clk driver updates as well. I've
  corrected this flow so this shouldn't happen again"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (83 commits)
  clk: bcm2835: Do not use prediv with bcm2711's PLLs
  clk: drop unused function __clk_get_flags
  clk: hsdk: Fix bad dependency on IOMEM
  dt-bindings: clock: Fix YAML schemas for LPASS clocks on SC7180
  clk: mmp: avoid missing prototype warning
  clk: sparx5: Add Sparx5 SoC DPLL clock driver
  dt-bindings: clock: sparx5: Add bindings include file
  clk: qoriq: add LS1021A core pll mux options
  clk: clk-atlas6: fix return value check in atlas6_clk_init()
  clk: tegra: pll: Improve PLLM enable-state detection
  clk: X1000: Add support for calculat REFCLK of USB PHY.
  clk: JZ4780: Reformat the code to align it.
  clk: JZ4780: Add functions for enable and disable USB PHY.
  clk: Ingenic: Add RTC related clocks for Ingenic SoCs.
  dt-bindings: clock: Add tabs to align code.
  dt-bindings: clock: Add RTC related clocks for Ingenic SoCs.
  clk: davinci: Use fallthrough pseudo-keyword
  clk: imx: Use fallthrough pseudo-keyword
  clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk
  clk: qcom: gcc-sdm660: Add missing modem reset
  ...

4 years agoMerge tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Wed, 12 Aug 2020 19:13:44 +0000 (12:13 -0700)]
Merge tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - f71808e_wdt imporvements

 - dw_wdt improvements

 - mlx-wdt: support new watchdog type with longer timeout period

 - fallthrough pseudo-keyword replacements

 - overall small fixes and improvements

* tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits)
  watchdog: rti-wdt: balance pm runtime enable calls
  watchdog: rti-wdt: attach to running watchdog during probe
  watchdog: add support for adjusting last known HW keepalive time
  watchdog: use __watchdog_ping in startup
  watchdog: softdog: Add options 'soft_reboot_cmd' and 'soft_active_on_boot'
  watchdog: pcwd_usb: remove needless check before usb_free_coherent()
  watchdog: Replace HTTP links with HTTPS ones
  dt-bindings: watchdog: renesas,wdt: Document r8a774e1 support
  watchdog: initialize device before misc_register
  watchdog: booke_wdt: Add common nowayout parameter driver
  watchdog: scx200_wdt: Use fallthrough pseudo-keyword
  watchdog: Use fallthrough pseudo-keyword
  watchdog: f71808e_wdt: do stricter parameter validation
  watchdog: f71808e_wdt: clear watchdog timeout occurred flag
  watchdog: f71808e_wdt: remove use of wrong watchdog_info option
  watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
  docs: watchdog: codify ident.options as superset of possible status flags
  dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150
  dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAML
  watchdog: dw_wdt: Add DebugFS files
  ...

4 years agoMerge tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio
Linus Torvalds [Wed, 12 Aug 2020 19:09:36 +0000 (12:09 -0700)]
Merge tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Inclusive naming updates (Alex Williamson)

 - Intel X550 INTx quirk (Alex Williamson)

 - Error path resched between unmaps (Xiang Zheng)

 - SPAPR IOMMU pin_user_pages() conversion (John Hubbard)

 - Trivial mutex simplification (Alex Williamson)

 - QAT device denylist (Giovanni Cabiddu)

 - type1 IOMMU ioctl refactor (Liu Yi L)

* tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Refactor vfio_iommu_type1_ioctl()
  vfio/pci: Add QAT devices to denylist
  vfio/pci: Add device denylist
  PCI: Add Intel QuickAssist device IDs
  vfio/pci: Hold igate across releasing eventfd contexts
  vfio/spapr_tce: convert get_user_pages() --> pin_user_pages()
  vfio/type1: Add conditional rescheduling after iommu map failed
  vfio/pci: Add Intel X550 to hidden INTx devices
  vfio: Cleanup allowed driver naming

4 years agoMerge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Wed, 12 Aug 2020 18:53:01 +0000 (11:53 -0700)]
Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "This has a few vmwgfx regression fixes we hit from the merge window
  (one in TTM), it also has a bunch of amdgpu fixes along with a
  scattering everywhere else.

  core:
   - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
   - Remove null check for kfree in drm_dev_release.
   - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
   - re-added docs for drm_gem_flink_ioctl()
   - add orientation quirk for ASUS T103HAF

  ttm:
   - ttm: fix page-offset calculation within TTM
   - revert patch causing vmwgfx regressions

  fbcon:
   - Fix a fbcon OOB read in fbdev, found by syzbot.

  vga:
   - Mark vga_tryget static as it's not used elsewhere.

  amdgpu:
   - Re-add spelling typo fix
   - Sienna Cichlid fixes
   - Navy Flounder fixes
   - DC fixes
   - SMU i2c fix
   - Power fixes

  vmwgfx:
   - regression fixes for modesetting crashes
   - misc fixes

  xlnx:
   - Small fixes to xlnx.

  omap:
   - Fix mode initialization in omap_connector_mode_valid().
   - force runtime PM suspend on system suspend

  tidss:
   - fix modeset init for DPI panels"

* tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits)
  drm/ttm: revert "drm/ttm: make TT creation purely optional v3"
  drm/vmwgfx: fix spelling mistake "Cant" -> "Can't"
  drm/vmwgfx: fix spelling mistake "Cound" -> "Could"
  drm/vmwgfx/ldu: Use drm_mode_config_reset
  drm/vmwgfx/sou: Use drm_mode_config_reset
  drm/vmwgfx/stdu: Use drm_mode_config_reset
  drm/vmwgfx: Fix two list_for_each loop exit tests
  drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
  drm/vmwgfx: Use struct_size() helper
  drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
  drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)
  drm/amd/powerplay: update swSMU VCN/JPEG PG logics
  drm/amdgpu: use mode1 reset by default for sienna_cichlid
  drm/amdgpu/smu: rework i2c adpater registration
  drm/amd/display: Display goes blank after inst
  drm/amd/display: Change null plane state swizzle mode to 4kb_s
  drm/amd/display: Use helper function to check for HDMI signal
  drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink
  drm/amd/display: Fix logger context
  drm/amd/display: populate new dml variable
  ...

4 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Wed, 12 Aug 2020 18:24:12 +0000 (11:24 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <[email protected]>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...

4 years agomm/gup: remove task_struct pointer for all gup code
Peter Xu [Wed, 12 Aug 2020 01:39:01 +0000 (18:39 -0700)]
mm/gup: remove task_struct pointer for all gup code

After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: clean up the last pieces of page fault accountings
Peter Xu [Wed, 12 Aug 2020 01:38:57 +0000 (18:38 -0700)]
mm: clean up the last pieces of page fault accountings

Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():

arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
arch/sparc/mm/fault_32.c:        force_user_fault
arch/um/kernel/trap.c:           handle_page_fault
mm/gup.c:                        faultin_page
                                 fixup_user_fault
mm/hmm.c:                        hmm_vma_fault
mm/ksm.c:                        break_ksm

Some of them has the issue of duplicated accounting for page fault
retries.  Some of them didn't do the accounting at all.

This patch cleans all these up by letting handle_mm_fault() to do per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accountings).  With that, we can safely remove all the outliers now.

There's another functional change in that now we account the page faults
to the caller of gup, rather than the task_struct that passed into the gup
code.  More information of this can be found at [1].

After this patch, below things should never be touched again outside
handle_mm_fault():

  - task_struct.[maj|min]_flt
  - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]

[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/xtensa: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:53 +0000 (18:38 -0700)]
mm/xtensa: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because it's
now also done in handle_mm_fault().

Move the PERF_COUNT_SW_PAGE_FAULTS event higher before taking mmap_sem for
the fault, then it'll match with the rest of the archs.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Max Filippov <[email protected]>
Cc: Chris Zankel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/x86: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:49 +0000 (18:38 -0700)]
mm/x86: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/sparc64: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:46 +0000 (18:38 -0700)]
mm/sparc64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: David S. Miller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/sparc32: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:43 +0000 (18:38 -0700)]
mm/sparc32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: David S. Miller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/sh: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:40 +0000 (18:38 -0700)]
mm/sh: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/s390: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:37 +0000 (18:38 -0700)]
mm/s390: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Gerald Schaefer <[email protected]>
Acked-by: Gerald Schaefer <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/riscv: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:34 +0000 (18:38 -0700)]
mm/riscv: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Pekka Enberg <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Albert Ou <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/powerpc: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:31 +0000 (18:38 -0700)]
mm/powerpc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/parisc: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:28 +0000 (18:38 -0700)]
mm/parisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: Helge Deller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/openrisc: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:25 +0000 (18:38 -0700)]
mm/openrisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Stafford Horne <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/nios2: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:22 +0000 (18:38 -0700)]
mm/nios2: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/nds32: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:19 +0000 (18:38 -0700)]
mm/nds32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Greentime Hu <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Vincent Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/mips: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:15 +0000 (18:38 -0700)]
mm/mips: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Thomas Bogendoerfer <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/microblaze: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:12 +0000 (18:38 -0700)]
mm/microblaze: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Michal Simek <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/m68k: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:09 +0000 (18:38 -0700)]
mm/m68k: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/ia64: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:06 +0000 (18:38 -0700)]
mm/ia64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/hexagon: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:03 +0000 (18:38 -0700)]
mm/hexagon: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Brian Cain <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/csky: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:38:00 +0000 (18:38 -0700)]
mm/csky: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Guo Ren <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/arm64: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:37:57 +0000 (18:37 -0700)]
mm/arm64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.  To do this, we pass pt_regs
pointer into __do_page_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/arm: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:37:54 +0000 (18:37 -0700)]
mm/arm: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.  To do this, we need to pass
the pt_regs pointer into __do_page_fault().

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/arc: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:37:52 +0000 (18:37 -0700)]
mm/arc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vineet Gupta <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/alpha: use general page fault accounting
Peter Xu [Wed, 12 Aug 2020 01:37:49 +0000 (18:37 -0700)]
mm/alpha: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too.  Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: do page fault accounting in handle_mm_fault
Peter Xu [Wed, 12 Aug 2020 01:37:44 +0000 (18:37 -0700)]
mm: do page fault accounting in handle_mm_fault

Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup: use a standard migration target allocation callback
Joonsoo Kim [Wed, 12 Aug 2020 01:37:41 +0000 (18:37 -0700)]
mm/gup: use a standard migration target allocation callback

There is a well-defined migration target allocation callback. Use it.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/hugetlb: make hugetlb migration callback CMA aware
Joonsoo Kim [Wed, 12 Aug 2020 01:37:38 +0000 (18:37 -0700)]
mm/hugetlb: make hugetlb migration callback CMA aware

new_non_cma_page() in gup.c requires to allocate the new page that is not
on the CMA area.  new_non_cma_page() implements it by using allocation
scope APIs.

However, there is a work-around for hugetlb.  Normal hugetlb page
allocation API for migration is alloc_huge_page_nodemask().  It consists
of two steps.  First is dequeing from the pool.  Second is, if there is no
available page on the queue, allocating by using the page allocator.

new_non_cma_page() can't use this API since first step (deque) isn't aware
of scope API to exclude CMA area.  So, new_non_cma_page() exports hugetlb
internal function for the second step, alloc_migrate_huge_page(), to
global scope and uses it directly.  This is suboptimal since hugetlb pages
on the queue cannot be utilized.

This patch tries to fix this situation by making the deque function on
hugetlb CMA aware.  In the deque function, CMA memory is skipped if
PF_MEMALLOC_NOCMA flag is found.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup: restrict CMA region by using allocation scope API
Joonsoo Kim [Wed, 12 Aug 2020 01:37:34 +0000 (18:37 -0700)]
mm/gup: restrict CMA region by using allocation scope API

We have well defined scope API to exclude CMA region.  Use it rather than
manipulating gfp_mask manually.  With this change, we can now restore
__GFP_MOVABLE for gfp_mask like as usual migration target allocation.  It
would result in that the ZONE_MOVABLE is also searched by page allocator.
For hugetlb, gfp_mask is redefined since it has a regular allocation mask
filter for migration target.  __GPF_NOWARN is added to hugetlb gfp_mask
filter since a new user for gfp_mask filter, gup, want to be silent when
allocation fails.

Note that this can be considered as a fix for the commit 9a4e9f3b2d73
("mm: update get_user_pages_longterm to migrate pages allocated from CMA
region").  However, "Fixes" tag isn't added here since it is just
suboptimal but it doesn't cause any problem.

Suggested-by: Michal Hocko <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/page_alloc: remove a wrapper for alloc_migration_target()
Joonsoo Kim [Wed, 12 Aug 2020 01:37:31 +0000 (18:37 -0700)]
mm/page_alloc: remove a wrapper for alloc_migration_target()

There is a well-defined standard migration target callback.  Use it
directly.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/mempolicy: use a standard migration target allocation callback
Joonsoo Kim [Wed, 12 Aug 2020 01:37:28 +0000 (18:37 -0700)]
mm/mempolicy: use a standard migration target allocation callback

There is a well-defined migration target allocation callback.  Use it.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/migrate: introduce a standard migration target allocation function
Joonsoo Kim [Wed, 12 Aug 2020 01:37:25 +0000 (18:37 -0700)]
mm/migrate: introduce a standard migration target allocation function

There are some similar functions for migration target allocation.  Since
there is no fundamental difference, it's better to keep just one rather
than keeping all variants.  This patch implements base migration target
allocation function.  In the following patches, variants will be converted
to use this function.

Changes should be mechanical, but, unfortunately, there are some
differences.  First, some callers' nodemask is assgined to NULL since NULL
nodemask will be considered as all available nodes, that is,
&node_states[N_MEMORY].  Second, for hugetlb page allocation, gfp_mask is
redefined as regular hugetlb allocation gfp_mask plus __GFP_THISNODE if
user provided gfp_mask has it.  This is because future caller of this
function requires to set this node constaint.  Lastly, if provided nodeid
is NUMA_NO_NODE, nodeid is set up to the node where migration source
lives.  It helps to remove simple wrappers for setting up the nodeid.

Note that PageHighmem() call in previous function is changed to open-code
"is_highmem_idx()" since it provides more readability.

[[email protected]: tweak patch title, per Vlastimil]
[[email protected]: fix typo in comment]

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regula...
Joonsoo Kim [Wed, 12 Aug 2020 01:37:20 +0000 (18:37 -0700)]
mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations

new_page_nodemask is a migration callback and it tries to use a common gfp
flags for the target page allocation whether it is a base page or a THP.
The later only adds GFP_TRANSHUGE to the given mask.  This results in the
allocation being slightly more aggressive than necessary because the
resulting gfp mask will contain also __GFP_RECLAIM_KSWAPD.  THP
allocations usually exclude this flag to reduce over eager background
reclaim during a high THP allocation load which has been seen during large
mmaps initialization.  There is no indication that this is a problem for
migration as well but theoretically the same might happen when migrating
large mappings to a different node.  Make the migration callback
consistent with regular THP allocations.

[[email protected]: fix comment typo, per Vlastimil]

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/hugetlb: unify migration callbacks
Joonsoo Kim [Wed, 12 Aug 2020 01:37:17 +0000 (18:37 -0700)]
mm/hugetlb: unify migration callbacks

There is no difference between two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except
__GFP_THISNODE handling.  It's redundant to have two almost similar
functions in order to handle this flag.  So, this patch tries to remove
one by introducing a new argument, gfp_mask, to
alloc_huge_page_nodemask().

After introducing gfp_mask argument, it's caller's job to provide correct
gfp_mask.  So, every callsites for alloc_huge_page_nodemask() are changed
to provide gfp_mask.

Note that it's safe to remove a node id check in alloc_huge_page_node()
since there is no caller passing NUMA_NO_NODE as a node id.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/migrate: move migration helper from .h to .c
Joonsoo Kim [Wed, 12 Aug 2020 01:37:14 +0000 (18:37 -0700)]
mm/migrate: move migration helper from .h to .c

It's not performance sensitive function.  Move it to .c.  This is a
preparation step for future change.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/page_isolation: prefer the node of the source page
Joonsoo Kim [Wed, 12 Aug 2020 01:37:11 +0000 (18:37 -0700)]
mm/page_isolation: prefer the node of the source page

Patch series "clean-up the migration target allocation functions", v5.

This patch (of 9):

For locality, it's better to migrate the page to the same node rather than
the node of the current caller's cpu.

Signed-off-by: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoipc/shm.c: remove the superfluous break
Liao Pingfang [Wed, 12 Aug 2020 01:37:08 +0000 (18:37 -0700)]
ipc/shm.c: remove the superfluous break

Remove the superfuous break, as there is a 'return' before it.

Signed-off-by: Liao Pingfang <[email protected]>
Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoipc: uninline functions
Alexey Dobriyan [Wed, 12 Aug 2020 01:37:05 +0000 (18:37 -0700)]
ipc: uninline functions

Two functions are only called via function pointers, don't bother
inlining them.

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoscripts/gdb: fix python 3.8 SyntaxWarning
Nick Desaulniers [Wed, 12 Aug 2020 01:37:02 +0000 (18:37 -0700)]
scripts/gdb: fix python 3.8 SyntaxWarning

Fixes the observed warnings:
scripts/gdb/linux/rbtree.py:20: SyntaxWarning: "is" with a literal. Did
you mean "=="?
  if node is 0:
scripts/gdb/linux/rbtree.py:36: SyntaxWarning: "is" with a literal. Did
you mean "=="?
  if node is 0:

It looks like this is a new warning added in Python 3.8. I've only seen
this once after adding the add-auto-load-safe-path rule to my ~/.gdbinit
for a new tree.

Fixes: commit 449ca0c95ea2 ("scripts/gdb: add rb tree iterating utilities")
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Cc: Jan Kiszka <[email protected]>
Cc: Kieran Bingham <[email protected]>
Cc: Aymeric Agon-Rambosson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: https://adamj.eu/tech/2020/01/21/why-does-python-3-8-syntaxwarning-for-is-literal/
Signed-off-by: Linus Torvalds <[email protected]>
4 years agokcov: make some symbols static
Wei Yongjun [Wed, 12 Aug 2020 01:36:59 +0000 (18:36 -0700)]
kcov: make some symbols static

Fix sparse build warnings:

kernel/kcov.c:99:1: warning:
 symbol '__pcpu_scope_kcov_percpu_data' was not declared. Should it be static?
kernel/kcov.c:778:6: warning:
 symbol 'kcov_remote_softirq_start' was not declared. Should it be static?
kernel/kcov.c:795:6: warning:
 symbol 'kcov_remote_softirq_stop' was not declared. Should it be static?

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agokcov: unconditionally add -fno-stack-protector to compiler options
Marco Elver [Wed, 12 Aug 2020 01:36:56 +0000 (18:36 -0700)]
kcov: unconditionally add -fno-stack-protector to compiler options

Unconditionally add -fno-stack-protector to KCOV's compiler options, as
all supported compilers support the option.  This saves a compiler
invocation to determine if the option is supported.

Because Clang does not support -fno-conserve-stack, and
-fno-stack-protector was wrapped in the same cc-option, we were missing
-fno-stack-protector with Clang. Unconditionally adding this option
fixes this for Clang.

Suggested-by: Nick Desaulniers <[email protected]>
Signed-off-by: Marco Elver <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agopanic: make print_oops_end_marker() static
Yue Hu [Wed, 12 Aug 2020 01:36:53 +0000 (18:36 -0700)]
panic: make print_oops_end_marker() static

Since print_oops_end_marker() is not used externally, also remove it in
kernel.h at the same time.

Signed-off-by: Yue Hu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Kees Cook <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agolib/Kconfig.debug: fix typo in the help text of CONFIG_PANIC_TIMEOUT
Tiezhu Yang [Wed, 12 Aug 2020 01:36:49 +0000 (18:36 -0700)]
lib/Kconfig.debug: fix typo in the help text of CONFIG_PANIC_TIMEOUT

There exists duplicated "the" in the help text of CONFIG_PANIC_TIMEOUT,
Remove it.

Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Xuefeng Li <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agokernel/panic.c: make oops_may_print() return bool
Tiezhu Yang [Wed, 12 Aug 2020 01:36:46 +0000 (18:36 -0700)]
kernel/panic.c: make oops_may_print() return bool

The return value of oops_may_print() is true or false, so change its type
to reflect that.

Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Xuefeng Li <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agorapidio/rio_mport_cdev: use array_size() helper in copy_{from,to}_user()
Gustavo A. R. Silva [Wed, 12 Aug 2020 01:36:43 +0000 (18:36 -0700)]
rapidio/rio_mport_cdev: use array_size() helper in copy_{from,to}_user()

Use array_size() helper instead of the open-coded version in
copy_{from,to}_user().  These sorts of multiplication factors need to be
wrapped in array_size().

This issue was found with the help of Coccinelle and, audited and fixed
manually.

Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Alexandre Bounine <[email protected]>
Link: http://lkml.kernel.org/r/20200616183050.GA31840@embeddedor
Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodrivers/rapidio/rio-scan.c: use struct_size() helper
Gustavo A. R. Silva [Wed, 12 Aug 2020 01:36:40 +0000 (18:36 -0700)]
drivers/rapidio/rio-scan.c: use struct_size() helper

Make use of the struct_size() helper instead of an open-coded version in
order to avoid any potential type mistakes.

Also, while there, use the preferred form for passing a size of a struct.
The alternative form where struct name is spelled out hurts readability
and introduces an opportunity for a bug when the pointer variable type is
changed but the corresponding sizeof that is passed as argument is not.

This issue was found with the help of Coccinelle and, audited and fixed
manually.

Addresses KSPP ID: https://github.com/KSPP/linux/issues/83

Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Alexandre Bounine <[email protected]>
Link: http://lkml.kernel.org/r/20200619170445.GA22641@embeddedor
Signed-off-by: Linus Torvalds <[email protected]>
This page took 0.142579 seconds and 4 git commands to generate.