Git Repo - linux.git/log
5 years ago  perf annotate: Prefer cmdline option over default config
Ravi Bangoria [Thu, 13 Feb 2020 06:43:04 +0000 (12:13 +0530)]
perf annotate: Prefer cmdline option over default config

For every perf-config option that can also be set from the command line,
the command line value takes precedence in case of a conflict. perf
annotate does the opposite: it prefers the config default over the
command line option. Fix it.

Before:

  $ ./perf config
  annotate.show_nr_samples=false

  $ ./perf annotate shash --show-nr-samples
  Percent│
         │24:   mov    -0xc(%rbp),%eax
   49.19 │      imul   $0x1003f,%eax,%ecx
         │      mov    -0x18(%rbp),%rax

After:

  Samples│
         │24:   mov    -0xc(%rbp),%eax
       1 │      imul   $0x1003f,%eax,%ecx
         │      mov    -0x18(%rbp),%rax
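
The precedence rule described above can be sketched roughly as follows. This
is only an illustration, not the perf code: the enum cli_state type and the
resolve_option() helper are hypothetical names introduced here.

```c
#include <stdbool.h>

/* Hypothetical tri-state for a boolean option given on the command line. */
enum cli_state { CLI_UNSET, CLI_OFF, CLI_ON };

/*
 * Resolve the effective value of an option: an explicit command line
 * setting wins, otherwise fall back to the value read from the config file.
 */
static bool resolve_option(enum cli_state cli, bool config_value)
{
	if (cli != CLI_UNSET)
		return cli == CLI_ON;
	return config_value;
}
```

With annotate.show_nr_samples=false in the config and --show-nr-samples on
the command line, resolve_option(CLI_ON, false) yields true, which matches
the "After" output.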

Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  perf annotate: Make perf config effective
Ravi Bangoria [Thu, 13 Feb 2020 06:43:03 +0000 (12:13 +0530)]
perf annotate: Make perf config effective

The perf default config set by the user in the [annotate] section is
totally ignored by the annotate code. Fix it.

Before:

  $ ./perf config
  annotate.hide_src_code=true
  annotate.show_nr_jumps=true
  annotate.show_nr_samples=true

  $ ./perf annotate shash
         │    unsigned h = 0;
         │      movl   $0x0,-0xc(%rbp)
         │    while (*s)
         │    ↓ jmp    44
         │    h = 65599 * h + *s++;
   11.33 │24:   mov    -0xc(%rbp),%eax
   43.50 │      imul   $0x1003f,%eax,%ecx
         │      mov    -0x18(%rbp),%rax

After:

         │        movl   $0x0,-0xc(%rbp)
         │      ↓ jmp    44
       1 │1 24:   mov    -0xc(%rbp),%eax
       4 │        imul   $0x1003f,%eax,%ecx
         │        mov    -0x18(%rbp),%rax

Note that we have removed show_nr_samples and show_total_period from
annotation_options because they are not used. Instead of them we use
symbol_conf.show_nr_samples and symbol_conf.show_total_period.

Committer testing:

Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:

  # perf config
  #
  # perf config annotate.hide_src_code=true
  # perf config
  annotate.hide_src_code=true
  #
  # perf config annotate.show_nr_jumps=true
  # perf config annotate.show_nr_samples=true
  # perf config
  annotate.hide_src_code=true
  annotate.show_nr_jumps=true
  annotate.show_nr_samples=true
  #
  #

Before:

  # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
  Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
  ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
  Percent
              00000000000609f0 <ObjectInstance::weak_pointer_was_finalized()@@Base>:
                endbr64
                cmpq    $0x0,0x20(%rdi)
              ↓ je      10
                xor     %eax,%eax
              ← retq
                xchg    %ax,%ax
  100.00  10:   push    %rbp
                cmpq    $0x0,0x18(%rdi)
                mov     %rdi,%rbp
              ↓ jne     20
          1b:   xor     %eax,%eax
                pop     %rbp
              ← retq
                nop
          20:   lea     0x18(%rdi),%rdi
              → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                cmpq    $0x0,0x18(%rbp)
              ↑ jne     1b
                mov     %rbp,%rdi
              → callq   ObjectBase::jsobj_addr() const@plt
                mov     $0x1,%eax
                pop     %rbp
              ← retq
  #

After:

  # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
  Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
  ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
  Samples       endbr64
                cmpq    $0x0,0x20(%rdi)
              ↓ je      10
                xor     %eax,%eax
              ← retq
                xchg    %ax,%ax
     1  1 10:   push    %rbp
                cmpq    $0x0,0x18(%rdi)
                mov     %rdi,%rbp
              ↓ jne     20
        1 1b:   xor     %eax,%eax
                pop     %rbp
              ← retq
                nop
        1 20:   lea     0x18(%rdi),%rdi
              → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                cmpq    $0x0,0x18(%rbp)
              ↑ jne     1b
                mov     %rbp,%rdi
              → callq   ObjectBase::jsobj_addr() const@plt
                mov     $0x1,%eax
                pop     %rbp
              ← retq
  #
  # perf config annotate.show_nr_jumps
  annotate.show_nr_jumps=true
  # perf config annotate.show_nr_jumps=false
  # perf config annotate.show_nr_jumps
  annotate.show_nr_jumps=false
  #
  # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
  Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
  ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
  Samples       endbr64
                cmpq    $0x0,0x20(%rdi)
              ↓ je      10
                xor     %eax,%eax
              ← retq
                xchg    %ax,%ax
       1  10:   push    %rbp
                cmpq    $0x0,0x18(%rdi)
                mov     %rdi,%rbp
              ↓ jne     20
          1b:   xor     %eax,%eax
                pop     %rbp
              ← retq
                nop
          20:   lea     0x18(%rdi),%rdi
              → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                cmpq    $0x0,0x18(%rbp)
              ↑ jne     1b
                mov     %rbp,%rdi
              → callq   ObjectBase::jsobj_addr() const@plt
                mov     $0x1,%eax
                pop     %rbp
              ← retq
  #

Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  perf config: Introduce perf_config_u8()
Ravi Bangoria [Thu, 13 Feb 2020 06:43:02 +0000 (12:13 +0530)]
perf config: Introduce perf_config_u8()

Introduce the perf_config_u8() utility function to convert a char * input
into a u8 destination. We will use it in a follow-up patch.
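
For readers unfamiliar with this kind of helper, a string to u8 conversion
could look roughly like the sketch below. It is not the perf_config_u8()
implementation: parse_u8() is a hypothetical name, and the choice of
strtol() with range checking is an assumption made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the perf implementation: parse a numeric string
 * (decimal, 0x-prefixed hex or 0-prefixed octal, as accepted by strtol()
 * with base 0) into an 8-bit destination. Returns 0 on success and -1 if
 * the value is missing, malformed, or out of range.
 */
static int parse_u8(uint8_t *dest, const char *value)
{
	char *end;
	long v;

	if (!value || *value == '\0')
		return -1;

	errno = 0;
	v = strtol(value, &end, 0);
	if (errno || *end != '\0' || v < 0 || v > UINT8_MAX)
		return -1;

	*dest = (uint8_t)v;
	return 0;
}
```

The destination is only written on success, so a caller can keep its
default when the config value is rejected.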

Signed-off-by: Ravi Bangoria <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  perf annotate: Fix --show-nr-samples for tui/stdio2
Ravi Bangoria [Thu, 13 Feb 2020 06:43:01 +0000 (12:13 +0530)]
perf annotate: Fix --show-nr-samples for tui/stdio2

perf annotate --show-nr-samples does not really show number of samples.

The reason is we have two separate variables for the same purpose.

One is in symbol_conf.show_nr_samples and another is
annotation_options.show_nr_samples.

We save the command line option in symbol_conf.show_nr_samples but use
annotation_options.show_nr_samples while rendering the tui/stdio2 browser.

We do copy symbol_conf.show_nr_samples to
annotation__default_options.show_nr_samples, but that is not really
effective, as we don't use annotation__default_options once we copy the
default options to the dynamic variable annotate.opts in cmd_annotate().

Instead of all this complication, keep only one variable and use it
everywhere. symbol_conf.show_nr_samples is used by perf report/top as
well, so let's kill annotation_options.show_nr_samples.

On a side note, I've kept the annotation_options.show_nr_samples
definition because it's still used by the perf-config code. A follow-up
patch to fix perf-config for annotate will remove
annotation_options.show_nr_samples.
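
The effect of keeping two variables for one purpose can be shown with a
small sketch. The struct and function names below are hypothetical
stand-ins, not the perf data structures; only the field name
show_nr_samples comes from the text above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two places the flag used to live. */
struct global_conf { bool show_nr_samples; };
struct annotate_opts { bool show_nr_samples; };	/* duplicate copy */

/* Buggy: the command line wrote to `conf`, but rendering reads `opts`. */
static const char *column_title_buggy(const struct global_conf *conf,
				      const struct annotate_opts *opts)
{
	(void)conf;
	return opts->show_nr_samples ? "Samples" : "Percent";
}

/* Fixed: one variable only, so rendering sees what the parser stored. */
static const char *column_title_fixed(const struct global_conf *conf)
{
	return conf->show_nr_samples ? "Samples" : "Percent";
}
```

With the flag set only in the global structure, the buggy variant still
picks the "Percent" title while the fixed one picks "Samples".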

Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  perf annotate: Fix --show-total-period for tui/stdio2
Ravi Bangoria [Thu, 13 Feb 2020 06:43:00 +0000 (12:13 +0530)]
perf annotate: Fix --show-total-period for tui/stdio2

perf annotate --show-total-period does not really show total period.

The reason is we have two separate variables for the same purpose.

One is in symbol_conf.show_total_period and another is
annotation_options.show_total_period.

We save the command line option in symbol_conf.show_total_period but use
annotation_options.show_total_period while rendering the tui/stdio2
browser.

We do copy symbol_conf.show_total_period to
annotation__default_options.show_total_period, but that is not really
effective, as we don't use annotation__default_options once we copy the
default options to the dynamic variable annotate.opts in cmd_annotate().

Instead of all this complication, keep only one variable and use it
everywhere. symbol_conf.show_total_period is used by perf report/top as
well, so let's kill annotation_options.show_total_period.

On a side note, I've kept the annotation_options.show_total_period
definition because it's still used by the perf-config code. A follow-up
patch to fix perf-config for annotate will remove
annotation_options.show_total_period.

Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  perf annotate/tui: Re-render title bar after switching back from script browser
Ravi Bangoria [Thu, 13 Feb 2020 06:42:59 +0000 (12:12 +0530)]
perf annotate/tui: Re-render title bar after switching back from script browser

The 'perf annotate' TUI browser provides a 'r' hot key to switch to a
script browser. But the annotate browser title bar becomes hidden while
switching back from script browser. Fix it.

Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Taeung Song <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yisheng Xie <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  tools headers UAPI: Update tools's copy of kvm.h headers
Arnaldo Carvalho de Melo [Thu, 27 Feb 2020 12:51:30 +0000 (09:51 -0300)]
tools headers UAPI: Update tools's copy of kvm.h headers

Picking the changes from:

  5ef8acbdd687 ("KVM: nVMX: Emulate MTF when performing instruction emulation")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

No change in tooling ensues, just the x86 kvm tooling gets rebuilt as
those headers are included in its build:

  $ cp arch/x86/include/uapi/asm/kvm.h tools/arch/x86/include/uapi/asm/kvm.h
  $ make -C tools/perf
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j12' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  <SNIP>
  ...        disassembler-four-args: [ on  ]

    DESCEND  plugins
    CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
  <SNIP>
    LD       /tmp/build/perf/arch/x86/util/perf-in.o
    LD       /tmp/build/perf/arch/x86/perf-in.o
    LD       /tmp/build/perf/arch/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>
  $

As it doesn't seem to be used there:

  $ grep STATE tools/perf/arch/x86/util/kvm-stat.c
  $

And the 'perf trace' beautifier table generator isn't interested in
these things:

  $ grep regex= tools/perf/trace/beauty/kvm_ioctl.sh
  regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
  $

Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  tools arch x86: Sync the msr-index.h copy with the kernel sources
Arnaldo Carvalho de Melo [Thu, 27 Feb 2020 12:23:35 +0000 (09:23 -0300)]
tools arch x86: Sync the msr-index.h copy with the kernel sources

To pick up the changes from these csets:

  21b5ee59ef18 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF")

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ git diff
  diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
  index ebe1685e92dd..d5e517d1c3dd 100644
  --- a/tools/arch/x86/include/asm/msr-index.h
  +++ b/tools/arch/x86/include/asm/msr-index.h
  @@ -512,6 +512,8 @@
   #define MSR_K7_HWCR                    0xc0010015
   #define MSR_K7_HWCR_SMMLOCK_BIT                0
   #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
  +#define MSR_K7_HWCR_IRPERF_EN_BIT      30
  +#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
   #define MSR_K7_FID_VID_CTL             0xc0010041
   #define MSR_K7_FID_VID_STATUS          0xc0010042
  $

That doesn't result in any change in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

To silence this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago  Merge tag 'devfreq-fixes-for-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Rafael J. Wysocki [Thu, 27 Feb 2020 10:21:23 +0000 (11:21 +0100)]
Merge tag 'devfreq-fixes-for-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull a devfreq fix for 5.6-rc4 from Chanwoo Choi:

"Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
 - This changes as devfreq(X) cause break some user space applications
 such as Android HAL from Unisoc and Hikey. In result, decide to revert it
 for preventing the HAL layer problem."

* tag 'devfreq-fixes-for-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"

5 years ago  sched/fair: Fix statistics for find_idlest_group()
Vincent Guittot [Tue, 18 Feb 2020 14:45:34 +0000 (15:45 +0100)]
sched/fair: Fix statistics for find_idlest_group()

sgs->group_weight is not set while gathering statistics in
update_sg_wakeup_stats(). This means that a group can be classified as
fully busy with 0 running tasks if utilization is high enough.

This path is mainly used for fork and exec.
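
Why an unset group_weight leads to this misclassification can be seen in a
heavily simplified sketch. This is not the scheduler code: the struct
layout, the has_spare_capacity() helper and the 25% headroom threshold are
assumptions made for illustration only.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified group statistics. */
struct group_stats {
	unsigned int sum_nr_running;	/* runnable tasks in the group */
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned long util;		/* summed utilization */
	unsigned long capacity;		/* summed capacity */
};

/* A group has spare capacity if it has idle CPUs or utilization headroom. */
static bool has_spare_capacity(const struct group_stats *s)
{
	if (s->sum_nr_running < s->group_weight)
		return true;
	/* assumed 25% headroom threshold, for illustration only */
	return s->capacity * 100 > s->util * 125;
}
```

With group_weight left at 0 the first test is 0 < 0, which never holds, so
a group with no running tasks but high utilization is reported as having no
spare capacity. Once group_weight is filled in, the same group is
recognized as having idle CPUs.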

Fixes: 57abff067a08 ("sched/fair: Rework find_idlest_group()")
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
5 years ago  cpufreq: Fix policy initialization for internal governor drivers
Rafael J. Wysocki [Wed, 26 Feb 2020 21:39:27 +0000 (22:39 +0100)]
cpufreq: Fix policy initialization for internal governor drivers

Before commit 1e4f63aecb53 ("cpufreq: Avoid creating excessively
large stack frames") the initial value of the policy field in struct
cpufreq_policy set by the driver's ->init() callback was implicitly
passed from cpufreq_init_policy() to cpufreq_set_policy() if the
default governor was neither "performance" nor "powersave".  After
that commit, however, cpufreq_init_policy() must take that case into
consideration explicitly and handle it as appropriate, so make that
happen.

Fixes: 1e4f63aecb53 ("cpufreq: Avoid creating excessively large stack frames")
Link: https://lore.kernel.org/linux-pm/[email protected]/
Reported-by: Artem Bityutskiy <[email protected]>
Cc: 5.4+ <[email protected]> # 5.4+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
5 years ago  net/smc: check for valid ib_client_data
Karsten Graul [Wed, 26 Feb 2020 16:52:46 +0000 (17:52 +0100)]
net/smc: check for valid ib_client_data

In smc_ib_remove_dev() check if the provided ib device was actually
initialized for SMC before.

Reported-by: [email protected]
Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  net: stmmac: fix notifier registration
Aaro Koskinen [Wed, 26 Feb 2020 16:49:01 +0000 (18:49 +0200)]
net: stmmac: fix notifier registration

We cannot register the same netdev notifier multiple times when probing
stmmac devices. Register the notifier only once in module init, and also
make debugfs creation/deletion safe against simultaneous notifier call.

Fixes: 481a7d154cbb ("stmmac: debugfs entry name is not be changed when udev rename device name.")
Signed-off-by: Aaro Koskinen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  net: phy: mscc: fix firmware paths
Antoine Tenart [Wed, 26 Feb 2020 15:26:50 +0000 (16:26 +0100)]
net: phy: mscc: fix firmware paths

The firmware paths for the VSC8584 PHYs do not contain the leading
'microchip/' directory, as used in linux-firmware, resulting in an
error when probing the driver. This patch fixes it.

Fixes: a5afc1678044 ("net: phy: mscc: add support for VSC8584 PHY")
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  mptcp: add dummy icsk_sync_mss()
Paolo Abeni [Wed, 26 Feb 2020 11:19:03 +0000 (12:19 +0100)]
mptcp: add dummy icsk_sync_mss()

syzbot noted that the master MPTCP socket lacks the icsk_sync_mss
callback, and was able to trigger a null pointer dereference:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 8e171067 P4D 8e171067 PUD 93fa2067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 8984 Comm: syz-executor066 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc900020b7b80 EFLAGS: 00010246
RAX: 1ffff110124ba600 RBX: 0000000000000000 RCX: ffff88809fefa600
RDX: ffff8880994cdb18 RSI: 0000000000000000 RDI: ffff8880925d3140
RBP: ffffc900020b7bd8 R08: ffffffff870225be R09: fffffbfff140652a
R10: fffffbfff140652a R11: 0000000000000000 R12: ffff8880925d35d0
R13: ffff8880925d3140 R14: dffffc0000000000 R15: 1ffff110124ba6ba
FS:  0000000001a0b880(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a6d6f000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 cipso_v4_sock_setattr+0x34b/0x470 net/ipv4/cipso_ipv4.c:1888
 netlbl_sock_setattr+0x2a7/0x310 net/netlabel/netlabel_kapi.c:989
 smack_netlabel security/smack/smack_lsm.c:2425 [inline]
 smack_inode_setsecurity+0x3da/0x4a0 security/smack/smack_lsm.c:2716
 security_inode_setsecurity+0xb2/0x140 security/security.c:1364
 __vfs_setxattr_noperm+0x16f/0x3e0 fs/xattr.c:197
 vfs_setxattr fs/xattr.c:224 [inline]
 setxattr+0x335/0x430 fs/xattr.c:451
 __do_sys_fsetxattr fs/xattr.c:506 [inline]
 __se_sys_fsetxattr+0x130/0x1b0 fs/xattr.c:495
 __x64_sys_fsetxattr+0xbf/0xd0 fs/xattr.c:495
 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440199
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffcadc19e48 EFLAGS: 00000246 ORIG_RAX: 00000000000000be
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440199
RDX: 0000000020000200 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000003 R09: 00000000004002c8
R10: 0000000000000009 R11: 0000000000000246 R12: 0000000000401a20
R13: 0000000000401ab0 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
CR2: 0000000000000000

Address the issue by adding a dummy icsk_sync_mss callback.
To properly sync the subflows' mss and options list we need some
additional infrastructure, which will land in net-next.
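
The general shape of such a fix is sketched below. The conn_ops table, the
generic_setattr() caller and the callback signature are hypothetical and
much simpler than the kernel structures; only the idea of installing a
do-nothing callback so that an unconditional indirect call is safe is taken
from the text above.

```c
#include <stddef.h>

/* Hypothetical, minimal ops table; not the kernel's structure layout. */
struct conn_ops {
	unsigned int (*sync_mss)(void *sk, unsigned int pmtu);
};

/* Dummy callback: does no real work, only makes the pointer safe to call. */
static unsigned int dummy_sync_mss(void *sk, unsigned int pmtu)
{
	(void)sk;
	(void)pmtu;
	return 0;
}

/* Generic caller that invokes the hook without checking it for NULL. */
static unsigned int generic_setattr(const struct conn_ops *ops, void *sk)
{
	return ops->sync_mss(sk, 1500);
}
```

If the sync_mss member were left NULL, generic_setattr() would jump to
address 0, which is consistent with the "RIP: 0010:0x0" line in the oops.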

Reported-by: [email protected]
Fixes: 2303f994b3e1 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  net: phy: corrected the return value for genphy_check_and_restart_aneg and genphy_c45...
Sudheesh Mavila [Wed, 26 Feb 2020 07:10:45 +0000 (12:40 +0530)]
net: phy: corrected the return value for genphy_check_and_restart_aneg and genphy_c45_check_and_restart_aneg

When auto-negotiation is not required, return value should be zero.

Changes v1->v2:
- improved comments and code as suggested by Andrew Lunn and Heiner
  Kallweit
- fixed an issue in genphy_c45_check_and_restart_aneg as suggested by
  Russell King.
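
The return value convention can be sketched as follows. This is not the
phylib code: check_and_restart_aneg(), restart_aneg() and the ANEG_ENABLE
bit value are hypothetical, and the register read is replaced by a plain
parameter so the example is self-contained.

```c
#include <stdbool.h>

#define ANEG_ENABLE 0x1000	/* assumed control bit, for illustration */

static int restart_count;	/* counts how often a restart was issued */

static int restart_aneg(void)
{
	restart_count++;
	return 0;
}

/*
 * `ctl` stands for the result of a control register read: negative on
 * error, otherwise the register contents.
 */
static int check_and_restart_aneg(int ctl, bool restart)
{
	if (!restart) {
		if (ctl < 0)
			return ctl;
		if (!(ctl & ANEG_ENABLE))
			restart = true;
	}

	if (restart)
		return restart_aneg();

	return 0;	/* no restart needed: success, not the register value */
}
```

The last return statement is the point: returning the register contents
there would hand callers a positive value where they expect zero.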

Fixes: 2a10ab043ac5 ("net: phy: add genphy_check_and_restart_aneg()")
Fixes: 1af9f16840e9 ("net: phy: add genphy_c45_check_and_restart_aneg()")
Signed-off-by: Sudheesh Mavila <[email protected]>
Reviewed-by: Heiner Kallweit <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  slip: not call free_netdev before rtnl_unlock in slip_open
yangerkun [Wed, 26 Feb 2020 03:54:35 +0000 (11:54 +0800)]
slip: not call free_netdev before rtnl_unlock in slip_open

As described in the comment before netdev_run_todo(), we cannot call
free_netdev() before rtnl_unlock(). Fix it by reordering the code.

Signed-off-by: yangerkun <[email protected]>
Reviewed-by: Oliver Hartkopp <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  ipv6: restrict IPV6_ADDRFORM operation
Eric Dumazet [Tue, 25 Feb 2020 19:52:29 +0000 (11:52 -0800)]
ipv6: restrict IPV6_ADDRFORM operation

IPV6_ADDRFORM is able to transform an IPv6 socket into an IPv4 one.
While this operation sounds illogical, we have to support it.

One of the things it does for a TCP socket is to switch sk->sk_prot
to tcp_prot.

We now have other layers playing with sk->sk_prot, so we should make
sure not to interfere with them.

This patch makes sure sk_prot is the default pointer for a TCP IPv6 socket.

syzbot reported:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD a0113067 P4D a0113067 PUD a8771067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427
 __sock_release net/socket.c:605 [inline]
 sock_close+0xe1/0x260 net/socket.c:1283
 __fput+0x2e4/0x740 fs/file_table.c:280
 ____fput+0x15/0x20 fs/file_table.c:313
 task_work_run+0x176/0x1b0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:164 [inline]
 prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195
 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278
 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c429
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429
RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004
RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000
R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c
Modules linked in:
CR2: 0000000000000000
---[ end trace 82567b5207e87bae ]---
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS:  00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Cc: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  net/smc: fix cleanup for linkgroup setup failures
Ursula Braun [Tue, 25 Feb 2020 15:34:36 +0000 (16:34 +0100)]
net/smc: fix cleanup for linkgroup setup failures

If an SMC connection to a certain peer is set up for the first time,
a new linkgroup is created. In case of setup failures, such a
linkgroup is unusable and should disappear. As a first step the
linkgroup is removed from the linkgroup list in smc_lgr_forget().

There are 2 problems:
* smc_listen_decline() might be called before linkgroup creation,
  resulting in a crash due to calling smc_lgr_forget() with
  parameter NULL.
* If a setup failure occurs after linkgroup creation, the connection
  is never unregistered from the linkgroup, preventing linkgroup
  freeing.

This patch introduces an enhanced smc_lgr_cleanup_early() function
which
* contains a linkgroup check for early smc_listen_decline()
  invocations
* invokes smc_conn_free() to guarantee unregistering of the
  connection.
* schedules fast linkgroup removal of the unusable linkgroup

And the unused function smcd_conn_free() is removed from smc_core.h.
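
The three steps listed above can be sketched in miniature. The types and
the cleanup_early() function below are hypothetical and far simpler than
the SMC structures; they only mirror the order of operations described in
this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature types; the real SMC structures are far richer. */
struct link_group {
	int conn_count;			/* connections registered here */
	bool removal_scheduled;
};

struct connection {
	struct link_group *lgr;		/* NULL until a linkgroup exists */
};

static void cleanup_early(struct connection *conn)
{
	struct link_group *lgr = conn->lgr;

	if (!lgr)	/* declined before a linkgroup was created */
		return;

	lgr->conn_count--;		/* unregister the connection */
	conn->lgr = NULL;
	lgr->removal_scheduled = true;	/* schedule fast removal */
}
```

The NULL check covers the early decline case, and the unregister step is
what lets the unusable linkgroup be freed later.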

Fixes: 3b2dec2603d5b ("net/smc: restructure client and server code in af_smc")
Fixes: 2a0674fffb6bc ("net/smc: improve abnormal termination of link groups")
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed
Nicolas Saenz Julienne [Tue, 25 Feb 2020 13:11:59 +0000 (14:11 +0100)]
net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed

Outdated Raspberry Pi 4 firmware might configure the external PHY as
rgmii although the kernel currently sets it as rgmii-rxid. This makes
connections unreliable as ID_MODE_DIS is left enabled. To avoid this,
explicitly clear that bit whenever we don't need it.
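
The "explicitly clear, then set only if needed" pattern can be sketched as
below. This is not the driver code: update_oob_ctrl() is a hypothetical
helper operating on a plain value instead of a hardware register, and the
bit position used for ID_MODE_DIS_BIT is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_MODE_DIS_BIT (1u << 16)	/* bit position is an assumption */

/*
 * Compute the new register value without trusting whatever state the
 * firmware left behind in the ID_MODE_DIS bit.
 */
static uint32_t update_oob_ctrl(uint32_t reg, bool need_id_mode_dis)
{
	reg &= ~ID_MODE_DIS_BIT;	/* always start from a known state */
	if (need_id_mode_dis)
		reg |= ID_MODE_DIS_BIT;
	return reg;
}
```

Only the one bit is touched, so other settings in the register survive the
update.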

Fixes: da38802211cc ("net: bcmgenet: Add RGMII_RXID support")
Signed-off-by: Nicolas Saenz Julienne <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  sched: act: count in the size of action flags bitfield
Jiri Pirko [Tue, 25 Feb 2020 12:54:12 +0000 (13:54 +0100)]
sched: act: count in the size of action flags bitfield

The put of the flags was added by the commit referenced in the Fixes tag;
however, the size of the message was not extended accordingly.

Fix this by adding the size of the flags bitfield to the message size.
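
The accounting rule behind this fix can be sketched as follows. The
helpers and the attribute framing (4-byte header, payload padded to a
4-byte boundary) are hypothetical simplifications introduced here, not the
netlink API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute framing: 4-byte header, padded to 4 bytes. */
#define ATTR_HDRLEN 4u
#define ATTR_ALIGN(len) (((len) + 3u) & ~(size_t)3u)

static size_t attr_total_size(size_t payload)
{
	return ATTR_ALIGN(ATTR_HDRLEN + payload);
}

/* Hypothetical flags bitfield: a value word plus a selector word. */
struct flags_bitfield {
	uint32_t value;
	uint32_t selector;
};

/* Every attribute that is put into the message must be counted here. */
static size_t action_msg_size(size_t base)
{
	return base + attr_total_size(sizeof(struct flags_bitfield));
}
```

If the size estimate omits an attribute that the fill path later puts, the
buffer allocated from that estimate can turn out too small.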

Fixes: e38226786022 ("net: sched: update action implementations to support flags")
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago  kbuild: get rid of trailing slash from subdir- example
Masahiro Yamada [Wed, 26 Feb 2020 17:44:58 +0000 (02:44 +0900)]
kbuild: get rid of trailing slash from subdir- example

obj-* needs a trailing slash for a directory, but subdir-* does not.

Signed-off-by: Masahiro Yamada <[email protected]>
5 years ago  net: core: devlink.c: Use built-in RCU list checking
Madhuparna Bhowmik [Tue, 25 Feb 2020 12:27:45 +0000 (17:57 +0530)]
net: core: devlink.c: Use built-in RCU list checking

list_for_each_entry_rcu() has built-in RCU and lock checking.

Pass the cond argument to list_for_each_entry_rcu() to silence
a false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled.

The devlink->lock is held when devlink_dpipe_table_find()
is called outside an RCU read-side critical section. Therefore, pass
struct devlink to devlink_dpipe_table_find() for lockdep checking.

Signed-off-by: Madhuparna Bhowmik <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agonet: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec
Florian Fainelli [Mon, 24 Feb 2020 23:56:32 +0000 (15:56 -0800)]
net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec

We are still experiencing some packet loss with the existing advanced
congestion buffering (ACB) settings with the IMP port configured for
2Gb/sec, so revert to conservative link speeds that do not produce
packet loss until this is resolved.

Fixes: 8f1880cbe8d0 ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec")
Fixes: de34d7084edd ("net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller [Thu, 27 Feb 2020 00:30:17 +0000 (16:30 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes:

1) Perform garbage collection from workqueue to fix rcu detected
   stall in ipset hash set types, from Jozsef Kadlecsik.

2) Fix the forceadd evaluation path, also from Jozsef.

3) Fix nft_set_pipapo selftest, from Stefano Brivio.

4) Crash when add-flush-add element in pipapo set, also from Stefano.
   Add test to cover this crash.

5) Remove sysctl entry under mutex in hashlimit, from Cong Wang.
====================

Signed-off-by: David S. Miller <[email protected]>
5 years agoMerge tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm...
Linus Torvalds [Wed, 26 Feb 2020 23:54:52 +0000 (15:54 -0800)]
Merge tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform fix from Benson Leung:
 "Fix a build warning"

* tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path

5 years agobnxt_en: add newline to netdev_*() format strings
Jonathan Lemon [Mon, 24 Feb 2020 23:29:09 +0000 (15:29 -0800)]
bnxt_en: add newline to netdev_*() format strings

Add missing newlines to netdev_* format strings so the lines
aren't buffered by the printk subsystem.

Nitpicked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jonathan Lemon <[email protected]>
Acked-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agonetfilter: xt_hashlimit: unregister proc file before releasing mutex
Cong Wang [Thu, 13 Feb 2020 06:53:52 +0000 (22:53 -0800)]
netfilter: xt_hashlimit: unregister proc file before releasing mutex

Before releasing the global mutex, we only unlink the hashtable
from the hash list; its proc file is still not unregistered at
this point. So syzbot could trigger a race condition where a
parallel htable_create() could register the same file immediately
after the mutex is released.

Move htable_remove_proc_entry() back to mutex protection to
fix this. And, fold htable_destroy() into htable_put() to make
the code slightly easier to understand.

Reported-and-tested-by: [email protected]
Fixes: c4a3922d2d20 ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()")
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agoMerge tag 'gvt-fixes-2020-02-26' of https://github.com/intel/gvt-linux into drm-intel...
Jani Nikula [Wed, 26 Feb 2020 20:58:24 +0000 (22:58 +0200)]
Merge tag 'gvt-fixes-2020-02-26' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2020-02-26

- Fix virtual display reset (Tina)
- Fix one use-after-free for dmabuf (Tina)

Signed-off-by: Jani Nikula <[email protected]>
From: Zhenyu Wang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
5 years agoethtool: limit bitset size
Michal Kubecek [Mon, 24 Feb 2020 19:42:12 +0000 (20:42 +0100)]
ethtool: limit bitset size

Syzbot reported that ethnl_compact_sanity_checks() can be tricked into
reading past the end of ETHTOOL_A_BITSET_VALUE and ETHTOOL_A_BITSET_MASK
attributes and even the message by passing a value between (u32)(-31)
and (u32)(-1) as ETHTOOL_A_BITSET_SIZE.

The problem is that DIV_ROUND_UP(attr_nbits, 32) is 0 for such values so
that zero length ETHTOOL_A_BITSET_VALUE will pass the length check but
ethnl_bitmap32_not_zero() check would try to access up to 512 MB of
attribute "payload".

Prevent this overflow by limiting the bitset size. Technically, compact
bitset format would allow bitset sizes up to almost 2^18 (so that the
nest size does not exceed U16_MAX) but bitsets used by ethtool are much
shorter. S16_MAX, the largest value which can be directly used as an
upper limit in policy, should be a reasonable compromise.
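
As a rough userspace illustration of the wraparound described above (a
sketch, not the kernel code; div_round_up_u32() and bitset_size_ok() are
names made up for this example, and the arithmetic is assumed to be done
in 32-bit unsigned integers):

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as DIV_ROUND_UP(n, d), forced into 32-bit unsigned math. */
static uint32_t div_round_up_u32(uint32_t n, uint32_t d)
{
	/* n + d - 1 wraps modulo 2^32 when n is within d - 1 of UINT32_MAX. */
	return (uint32_t)(n + d - 1) / d;
}

/* Hypothetical check mirroring the S16_MAX upper limit chosen here. */
static bool bitset_size_ok(uint32_t nbits)
{
	return nbits <= INT16_MAX;
}
```

With d = 32, every n from (u32)(-31) to (u32)(-1) makes the sum wrap to a
value below 32, so the computed word count is 0 even though the claimed
bit count is close to 2^32 (2^32 bits is 512 MB of payload).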

Fixes: 10b518d4e6dd ("ethtool: netlink bitset handling")
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agonet: Fix Tx hash bound checking
Amritha Nambiar [Mon, 24 Feb 2020 18:56:00 +0000 (10:56 -0800)]
net: Fix Tx hash bound checking

Fixes the lower and upper bounds when there are multiple TCs and
traffic is on the same TC on the same device.

The lower bound is represented by 'qoffset' and the upper limit for
hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue
mapping when there are multiple TCs, as the queue indices for upper TCs
will be offset by 'qoffset'.
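
A minimal sketch of the intended mapping, assuming the hash is scaled
into the range [0, qcount) and then shifted by qoffset. scale_u32()
follows the arithmetic of the kernel's reciprocal_scale();
pick_tx_queue() is a hypothetical wrapper used only for illustration:

```c
#include <stdint.h>

/* Maps a 32-bit value into [0, ep_ro) without a division. */
static uint32_t scale_u32(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical helper: queue index inside one traffic class. */
static uint16_t pick_tx_queue(uint32_t hash, uint16_t qoffset, uint16_t qcount)
{
	/* Result lies in [qoffset, qoffset + qcount). */
	return (uint16_t)(scale_u32(hash, qcount) + qoffset);
}
```

For a TC with qoffset = 8 and qcount = 4, any hash then lands on queues
8 through 11 and never on a queue that belongs to another TC.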

v2: Fixed commit description based on comments.

Fixes: 1b837d489e06 ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash")
Fixes: eadec877ce9c ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Signed-off-by: Amritha Nambiar <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agodrm/radeon: Inline drm_get_pci_dev
Daniel Vetter [Sat, 22 Feb 2020 17:54:32 +0000 (18:54 +0100)]
drm/radeon: Inline drm_get_pci_dev

It's the last user, and more importantly, it's the last non-legacy
user of anything in drm_pci.c.

The only tricky bit is the agp initialization. But a close look shows
that radeon does not use the drm_agp midlayer (the main use of that is
drm_bufs for legacy drivers), and instead could use the agp subsystem
directly (like nouveau does already). Hence we can just pull this in
too.

A further step would be to entirely drop the use of drm_device->agp,
but feels like too much churn just for this patch.

Signed-off-by: Daniel Vetter <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: "David (ChunMing) Zhou" <[email protected]>
Cc: [email protected]
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
5 years agodrm/amdgpu: Drop DRIVER_USE_AGP
Daniel Vetter [Sat, 22 Feb 2020 17:54:31 +0000 (18:54 +0100)]
drm/amdgpu: Drop DRIVER_USE_AGP

This doesn't do anything except auto-init drm_agp support when you
call drm_get_pci_dev(). Which amdgpu stopped doing with

commit b58c11314a1706bf094c489ef5cb28f76478c704
Author: Alex Deucher <[email protected]>
Date:   Fri Jun 2 17:16:31 2017 -0400

    drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev

No idea whether this was intentional or accidental breakage, but I
guess anyone who manages to boot a GPU this modern behind an agp
bridge deserves a prize. A prize I never expect anyone to ever collect
:-)

Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Xiaojie Yuan <[email protected]>
Cc: Evan Quan <[email protected]>
Cc: "Tianci.Yin" <[email protected]>
Cc: "Marek Olšák" <[email protected]>
Cc: Hans de Goede <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
5 years agoMerge tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Wed, 26 Feb 2020 18:34:42 +0000 (10:34 -0800)]
Merge tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing and bootconfig updates:
 "Fixes and changes to bootconfig before it goes live in a release.

  Change in API of bootconfig (before it comes live in a release):
  - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig
    exists
  - Set CONFIG_BOOT_CONFIG to 'n' by default
  - Show error if "bootconfig" on cmdline but not compiled in
  - Prevent redefining the same value
  - Have a way to append values
  - Added a SELECT BLK_DEV_INITRD to fix a build failure

  Synthetic event fixes:
  - Switch to raw_smp_processor_id() for recording CPU value in preempt
    section. (No care for what the value actually is)
  - Fix samples always recording u64 values
  - Fix endianness
  - Check number of values matches number of fields
  - Fix a printing bug

  Fix of trace_printk() breaking postponed start up tests

  Make a function static that is only used in a single file"

* tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue
  bootconfig: Add append value operator support
  bootconfig: Prohibit re-defining value on same key
  bootconfig: Print array as multiple commands for legacy command line
  bootconfig: Reject subkey and value on same parent key
  tools/bootconfig: Remove unneeded error message silencer
  bootconfig: Add bootconfig magic word for indicating bootconfig explicitly
  bootconfig: Set CONFIG_BOOT_CONFIG=n by default
  tracing: Clear trace_state when starting trace
  bootconfig: Mark boot_config_checksum() static
  tracing: Disable trace_printk() on post poned tests
  tracing: Have synthetic event test use raw_smp_processor_id()
  tracing: Fix number printing bug in print_synth_event()
  tracing: Check that number of vals matches number of synth event fields
  tracing: Make synth_event trace functions endian-correct
  tracing: Make sure synth_event_trace() example always uses u64

5 years agoMerge tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Wed, 26 Feb 2020 18:28:59 +0000 (10:28 -0800)]
Merge tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kunit fixes from Shuah Khan:
 "This Kselftest kunit update consists of fixes to documentation and
  the run-time tool from Brendan Higgins and Heidi Fahim"

* tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: run kunit_tool from any directory
  kunit: test: Improve error messages for kunit_tool when kunitconfig is invalid
  Documentation: kunit: fixed sphinx error in code block

5 years agoMerge tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 26 Feb 2020 18:06:56 +0000 (10:06 -0800)]
Merge tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - fixes to TIMEOUT failures and out-of-tree compilation errors from
   Michael Ellerman.

 - declutter git status fix from Christophe Leroy

* tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/rseq: Fix out-of-tree compilation
  selftests: Install settings files to fix TIMEOUT failures
  selftest/lkdtm: Don't pollute 'git status'

5 years agoRevert "KVM: x86: enable -Werror"
Christoph Hellwig [Wed, 26 Feb 2020 15:39:29 +0000 (07:39 -0800)]
Revert "KVM: x86: enable -Werror"

This reverts commit ead68df94d248c80fdbae220ae5425eb5af2e753.

Using the -Werror flag breaks the build for me due to mostly harmless
KASAN or similar warnings:

  arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
  arch/x86/kvm/x86.c:7209:1: error: the frame size of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

Feel free to add a CONFIG_WERROR if you care strongly enough, but don't
break people's builds for absolutely no good reason.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years agosignal: avoid double atomic counter increments for user accounting
Linus Torvalds [Mon, 24 Feb 2020 20:47:14 +0000 (12:47 -0800)]
signal: avoid double atomic counter increments for user accounting

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.
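
The counting scheme can be modelled in userspace roughly as follows.
This is a sketch using C11 atomics, not the kernel implementation:
struct user_model, queue_signal() and dequeue_signal() are hypothetical
stand-ins, and real-world concerns such as RLIMIT_SIGPENDING checks are
left out:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the per-user accounting structure. */
struct user_model {
	atomic_int refcount;   /* lifetime references to the user */
	atomic_int sigpending; /* number of pending signals */
};

static void queue_signal(struct user_model *u)
{
	/* atomic_fetch_add() returns the old value: 0 means first signal. */
	if (atomic_fetch_add(&u->sigpending, 1) == 0)
		atomic_fetch_add(&u->refcount, 1); /* pin the user once */
}

static void dequeue_signal(struct user_model *u)
{
	/* Old value 1 means the last pending signal just went away. */
	if (atomic_fetch_sub(&u->sigpending, 1) == 1)
		atomic_fetch_sub(&u->refcount, 1); /* unpin the user */
}
```

In this model, queueing N signals back to back touches 'refcount' once
and 'sigpending' N times, instead of touching both counters N times.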

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, it for some reason seems to be much worse on that machine
when the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Philip Li <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years agoio_uring: drop file set ref put/get on switch
Jens Axboe [Wed, 26 Feb 2020 17:23:43 +0000 (10:23 -0700)]
io_uring: drop file set ref put/get on switch

Dan reports that he triggered a warning on ring exit doing some testing:

percpu ref (io_file_data_ref_zero) <= 0 (0) after switching to atomic
WARNING: CPU: 3 PID: 0 at lib/percpu-refcount.c:160 percpu_ref_switch_to_atomic_rcu+0xe8/0xf0
Modules linked in:
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-rc3+ #5648
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0xe8/0xf0
Code: e7 ff 55 e8 eb d2 80 3d bd 02 d2 00 00 75 8b 48 8b 55 d8 48 c7 c7 e8 70 e6 81 c6 05 a9 02 d2 00 01 48 8b 75 e8 e8 3a d0 c5 ff <0f> 0b e9 69 ff ff ff 90 55 48 89 fd 53 48 89 f3 48 83 ec 28 48 83
RSP: 0018:ffffc90000110ef8 EFLAGS: 00010292
RAX: 0000000000000045 RBX: 7fffffffffffffff RCX: 0000000000000000
RDX: 0000000000000045 RSI: ffffffff825be7a5 RDI: ffffffff825bc32c
RBP: ffff8881b75eac38 R08: 000000042364b941 R09: 0000000000000045
R10: ffffffff825beb40 R11: ffffffff825be78a R12: 0000607e46005aa0
R13: ffff888107dcdd00 R14: 0000000000000000 R15: 0000000000000009
FS:  0000000000000000(0000) GS:ffff8881b9d80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f49e6a5ea20 CR3: 00000001b747c004 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 rcu_core+0x1e4/0x4d0
 __do_softirq+0xdb/0x2f1
 irq_exit+0xa0/0xb0
 smp_apic_timer_interrupt+0x60/0x140
 apic_timer_interrupt+0xf/0x20
 </IRQ>
RIP: 0010:default_idle+0x23/0x170
Code: ff eb ab cc cc cc cc 0f 1f 44 00 00 41 54 55 53 65 8b 2d 10 96 92 7e 0f 1f 44 00 00 e9 07 00 00 00 0f 00 2d 21 d0 51 00 fb f4 <65> 8b 2d f6 95 92 7e 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 e5 95

Turns out that this is due to percpu_ref_switch_to_atomic() only
grabbing a reference to the percpu refcount if it's not already in
atomic mode. io_uring drops a ref and re-gets it when switching back to
percpu mode. We attempt to protect against this with the FFD_F_ATOMIC
bit, but that isn't reliable.

We don't actually need to juggle these refcounts between atomic and
percpu switch, we can just do them when we've switched to atomic mode.
This removes the need for FFD_F_ATOMIC, which wasn't reliable.

Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Reported-by: Dan Melnic <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoblk-mq: Remove some unused function arguments
John Garry [Wed, 26 Feb 2020 12:10:15 +0000 (20:10 +0800)]
blk-mq: Remove some unused function arguments

The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(),
blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove
it.

Overall obj code size shows a minor reduction, before:
   text    data     bss     dec     hex filename
  27306    1312       0   28618    6fca block/blk-mq.o
   4303     272       0    4575    11df block/blk-mq-tag.o

after:
  27282    1312       0   28594    6fb2 block/blk-mq.o
   4311     272       0    4583    11e7 block/blk-mq-tag.o

Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: John Garry <[email protected]>
--
This minor patch had been carried as part of the blk-mq shared tags RFC,
I'd rather not carry it anymore as it required rebasing, so now or never..
Signed-off-by: Jens Axboe <[email protected]>
5 years agokbuild: add dt_binding_check to PHONY in a correct place
Masahiro Yamada [Sat, 22 Feb 2020 19:04:34 +0000 (04:04 +0900)]
kbuild: add dt_binding_check to PHONY in a correct place

The dt_binding_check is added to PHONY, but it is invisible when
$(dtstree) is empty. So, it is not specified as phony for
ARCH=x86 etc.

Add it to PHONY outside the ifneq ... endif block.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
5 years agokbuild: add dtbs_check to PHONY
Masahiro Yamada [Sat, 22 Feb 2020 19:04:33 +0000 (04:04 +0900)]
kbuild: add dtbs_check to PHONY

The dtbs_check should be a phony target, but currently it is not
specified so.

'make dtbs_check' works even if a file named 'dtbs_check' exists
because it depends on another phony target, scripts_dtc, but we
should not rely on it.

Add dtbs_check to PHONY.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
5 years agokbuild: remove unneeded semicolon at the end of cmd_dtb_check
Masahiro Yamada [Sat, 22 Feb 2020 19:04:32 +0000 (04:04 +0900)]
kbuild: remove unneeded semicolon at the end of cmd_dtb_check

This trailing semicolon is unneeded.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
5 years agokbuild: fix DT binding schema rule to detect command line changes
Masahiro Yamada [Sat, 22 Feb 2020 19:04:31 +0000 (04:04 +0900)]
kbuild: fix DT binding schema rule to detect command line changes

This if_change_rule is not working properly; it cannot detect any
command line change.

The reason is that cmd-check in scripts/Kbuild.include compares
$(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here.

For if_change_rule to work properly, the stem part of cmd_* and rule_*
must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must
be named rule_dtc.

Fixes: 4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks")
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
5 years agokbuild: remove wrong documentation about mandatory-y
Masahiro Yamada [Wed, 19 Feb 2020 01:15:19 +0000 (10:15 +0900)]
kbuild: remove wrong documentation about mandatory-y

This sentence does not make sense in the section about mandatory-y.

This seems to be a copy-paste mistake of commit fcc8487d477a ("uapi:
export all headers under uapi directories").

The correct description would be "The convention is to list one
mandatory-y per line ...".

I just removed it instead of fixing it. If such information is needed,
it could be commented in include/asm-generic/Kbuild and
include/uapi/asm-generic/Kbuild.

Signed-off-by: Masahiro Yamada <[email protected]>
5 years agokbuild: add comment for V=2 mode
Randy Dunlap [Thu, 13 Feb 2020 04:40:57 +0000 (20:40 -0800)]
kbuild: add comment for V=2 mode

Complete the comments for valid values of KBUILD_VERBOSE,
specifically for KBUILD_VERBOSE=2.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
5 years agoefi: READ_ONCE rng seed size before munmap
Jason A. Donenfeld [Fri, 21 Feb 2020 08:48:49 +0000 (09:48 +0100)]
efi: READ_ONCE rng seed size before munmap

This function is consistent with using size instead of seed->size
(except for one place that this patch fixes), but it reads seed->size
without using READ_ONCE, which means the compiler might still do
something unwanted. So, this commit simply adds the READ_ONCE
wrapper.
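
Roughly, the pattern is to load the size exactly once into a local and
use only that local afterwards. The sketch below is a simplified
userspace stand-in: READ_ONCE_U32() approximates the kernel's
READ_ONCE() with a volatile access, and struct seed_model and
snapshot_seed_size() are hypothetical names:

```c
#include <stdint.h>

/* Simplified stand-in for READ_ONCE() on a 32-bit field. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Hypothetical layout: only the size field matters for this sketch. */
struct seed_model {
	uint32_t size;
};

static uint32_t snapshot_seed_size(const struct seed_model *seed)
{
	/* One load through a volatile lvalue; callers keep using the copy. */
	uint32_t size = READ_ONCE_U32(seed->size);

	return size;
}
```

The volatile access keeps the compiler from re-reading seed->size later,
for instance after the mapping it lives in has gone away.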

Fixes: 636259880a7e ("efi: Add support for seeding the RNG from a UEFI ...")
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
5 years agoefi/x86: Handle by-ref arguments covering multiple pages in mixed mode
Ard Biesheuvel [Fri, 21 Feb 2020 08:48:48 +0000 (09:48 +0100)]
efi/x86: Handle by-ref arguments covering multiple pages in mixed mode

The mixed mode runtime wrappers are fragile when it comes to how the
memory referred to by their pointer arguments is laid out, because
they translate these addresses to physical addresses that
the runtime services can dereference when running in 1:1 mode. Since
vmalloc'ed pages (including the vmap'ed stack) are not contiguous in the
physical address space, this scheme only works if the referenced memory
objects do not cross page boundaries.

Currently, the mixed mode runtime service wrappers require that all by-ref
arguments that live in the vmalloc space have a size that is a power of 2,
and are aligned to that same value. While this is a sensible way to
construct an object that is guaranteed not to cross a page boundary, it is
overly strict when it comes to checking whether a given object violates
this requirement, as we can simply take the physical address of the first
and the last byte, and verify that they point into the same physical page.
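
To illustrate the difference between the two rules, here is a small
sketch assuming 4 KiB pages. strict_ok() and same_phys_page() are
hypothetical helpers that take already-translated physical addresses;
they are not the functions touched by this patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Old rule: size is a power of two and the address is aligned to it. */
static bool strict_ok(uint64_t pa, uint64_t size)
{
	return size != 0 && (size & (size - 1)) == 0 && (pa & (size - 1)) == 0;
}

/* Relaxed rule: first and last byte land in the same physical page. */
static bool same_phys_page(uint64_t pa_first, uint64_t pa_last)
{
	uint64_t page_mask = ~(uint64_t)(SKETCH_PAGE_SIZE - 1);

	/* XOR leaves page-number bits set only if the two pages differ. */
	return ((pa_first ^ pa_last) & page_mask) == 0;
}
```

A 12-byte object at physical address 0x1004 fails the old rule but
passes the relaxed one, while a 16-byte object at 0x1ff8 straddles two
pages and is rejected by both.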

When this check fails, we emit a WARN(), but then simply proceed with the
call, which could cause data corruption if the next physical page belongs
to a mapping that is entirely unrelated.

Given that with vmap'ed stacks, this condition is much more likely to
trigger, let's relax the condition a bit, but fail the runtime service
call if it does trigger.

Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
5 years agoefi/x86: Remove support for EFI time and counter services in mixed mode
Ard Biesheuvel [Fri, 21 Feb 2020 08:48:47 +0000 (09:48 +0100)]
efi/x86: Remove support for EFI time and counter services in mixed mode

Mixed mode calls at runtime are rather tricky with vmap'ed stacks,
as we can no longer assume that data passed in by the callers of the
EFI runtime wrapper routines is contiguous in physical memory.

We need to fix this, but before we do, let's drop the implementations
of routines that we know are never used on x86, i.e., the RTC related
ones. Given that UEFI rev2.8 permits any runtime service to return
EFI_UNSUPPORTED at runtime, let's return that instead.

As get_next_high_mono_count() is never used at all, even on other
architectures, let's make that return EFI_UNSUPPORTED too.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
5 years agoefi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
Ard Biesheuvel [Fri, 21 Feb 2020 08:48:46 +0000 (09:48 +0100)]
efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper

Hans reports that his mixed mode systems running v5.6-rc1 kernels hit
the WARN_ON() in virt_to_phys_or_null_size(), caused by the fact that
efi_guid_t objects on the vmap'ed stack happen to be misaligned with
respect to their sizes. As a quick (i.e., backportable) fix, copy GUID
pointer arguments to the local stack into a buffer that is naturally
aligned to its size, so that it is guaranteed to cover only one
physical page.

Note that on x86, we cannot rely on the stack pointer being aligned
the way the compiler expects, so we need to allocate an 8-byte aligned
buffer of sufficient size, and copy the GUID into that buffer at an
offset that is aligned to 16 bytes.
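
The buffer trick can be sketched in plain C as below. This is an
illustration under stated assumptions, not the code from this patch:
copy_guid_aligned() is a hypothetical helper, and the caller is assumed
to pass a buffer that is 8-byte aligned and at least 24 bytes long:

```c
#include <stdint.h>
#include <string.h>

#define GUID_SIZE 16u

/* buf: 8-byte aligned, >= 24 bytes. Returns the 16-byte aligned copy. */
static unsigned char *copy_guid_aligned(unsigned char *buf,
					const unsigned char *guid)
{
	uintptr_t start = (uintptr_t)buf;
	/* Round up to the next multiple of 16. */
	uintptr_t aligned = (start + GUID_SIZE - 1) & ~(uintptr_t)(GUID_SIZE - 1);
	unsigned char *dst = buf + (aligned - start); /* offset is 0 or 8 */

	memcpy(dst, guid, GUID_SIZE);
	return dst;
}
```

Because the start is 8-byte aligned, the rounded-up offset is either 0
or 8, so the 16-byte copy always fits in 24 bytes, and a 16-byte object
aligned to 16 bytes cannot straddle a page boundary.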

Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Reported-by: Hans de Goede <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Hans de Goede <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
5 years agoMerge tag 'perf-urgent-for-mingo-5.6-20200220' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Wed, 26 Feb 2020 14:18:05 +0000 (15:18 +0100)]
Merge tag 'perf-urgent-for-mingo-5.6-20200220' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

auxtrace:

  Adrian Hunter:

  - Fix endless record after being terminated on arm-spe.

  Wei Li:

  - Fix endless record after being terminated on Intel PT and BTS and
    on ARM's cs-etm.

perf test:

  Thomas Richter

  - Fix test trace+probe_vfs_getname.sh on s390

PowerPC:

  Arnaldo Carvalho de Melo:

  - Sync powerpc syscall.tbl with the kernel sources.

BPF:

  Arnaldo Carvalho de Melo:

  - Remove extraneous bpf/ subdir from bpf.h headers used to build bpf events.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
5 years agoio_uring: import_single_range() returns 0/-ERROR
Jens Axboe [Wed, 26 Feb 2020 00:48:55 +0000 (17:48 -0700)]
io_uring: import_single_range() returns 0/-ERROR

Unlike the other core import helpers, import_single_range() returns 0 on
success, not the length imported. This means that links that depend on
the result of non-vec based IORING_OP_{READ,WRITE} that were added for
5.5 get errored when they should not be.

Fixes: 3a6820f2bb8a ("io_uring: add non-vectored read/write commands")
Signed-off-by: Jens Axboe <[email protected]>
5 years agoio_uring: pick up link work on submit reference drop
Jens Axboe [Tue, 25 Feb 2020 20:25:41 +0000 (13:25 -0700)]
io_uring: pick up link work on submit reference drop

If work completes inline, then we should pick up a dependent link item
in __io_queue_sqe() as well. If we don't do so, we're forced to go async
with that item, which is suboptimal.

This also fixes an issue with io_put_req_find_next(), which always looks
up the next work item. That should only be done if we're dropping the
last reference to the request, to prevent multiple lookups of the same
work item.

Outside of being a fix, this also enables a good cleanup series for 5.7,
where we never have to pass 'nxt' around or into the work handlers.

Reviewed-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoselftests: nft_concat_range: Add test for reported add/flush/add issue
Stefano Brivio [Fri, 21 Feb 2020 02:04:22 +0000 (03:04 +0100)]
selftests: nft_concat_range: Add test for reported add/flush/add issue

Add a specific test for the crash reported by Phil Sutter and addressed
in the previous patch. The test cases that, in my intention, should
have covered these cases, that is, the ones from the 'concurrency'
section, don't run these sequences tightly enough and spectacularly
failed to catch this.

While at it, define a convenient way to add this kind of test, by
adding a "reported issues" test section.

It's more convenient, for this particular test, to execute the set
setup in its own function. However, future test cases like this one
might need to call setup functions, and will typically need no tools
other than nft, so allow for this in check_tools().

The original form of the reproducer used here was provided by Phil.

Reported-by: Phil Sutter <[email protected]>
Signed-off-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agonft_set_pipapo: Actually fetch key data in nft_pipapo_remove()
Stefano Brivio [Fri, 21 Feb 2020 02:04:21 +0000 (03:04 +0100)]
nft_set_pipapo: Actually fetch key data in nft_pipapo_remove()

Phil reports that adding elements, flushing and re-adding them
right away:

  nft add table t '{ set s { type ipv4_addr . inet_service; flags interval; }; }'
  nft add element t s '{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }'
  nft flush set t s
  nft add element t s '{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }'

triggers, almost reliably, a crash like this one:

  [   71.319848] general protection fault, probably for non-canonical address 0x6f6b6e696c2e756e: 0000 [#1] PREEMPT SMP PTI
  [   71.321540] CPU: 3 PID: 1201 Comm: kworker/3:2 Not tainted 5.6.0-rc1-00377-g2bb07f4e1d861 #192
  [   71.322746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014
  [   71.324430] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
  [   71.325387] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
  [   71.326164] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
  [   71.328423] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
  [   71.329225] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
  [   71.330365] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
  [   71.331473] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
  [   71.332627] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
  [   71.333615] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
  [   71.334596] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
  [   71.335780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   71.336577] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
  [   71.337533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   71.338557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   71.339718] Call Trace:
  [   71.340093]  nft_pipapo_destroy+0x7a/0x170 [nf_tables_set]
  [   71.340973]  nft_set_destroy+0x20/0x50 [nf_tables]
  [   71.341879]  nf_tables_trans_destroy_work+0x246/0x260 [nf_tables]
  [   71.342916]  process_one_work+0x1d5/0x3c0
  [   71.343601]  worker_thread+0x4a/0x3c0
  [   71.344229]  kthread+0xfb/0x130
  [   71.344780]  ? process_one_work+0x3c0/0x3c0
  [   71.345477]  ? kthread_park+0x90/0x90
  [   71.346129]  ret_from_fork+0x35/0x40
  [   71.346748] Modules linked in: nf_tables_set nf_tables nfnetlink 8021q [last unloaded: nfnetlink]
  [   71.348153] ---[ end trace 2eaa8149ca759bcc ]---
  [   71.349066] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables]
  [   71.350016] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b
  [   71.350017] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282
  [   71.350019] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0
  [   71.350019] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a
  [   71.350020] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000
  [   71.350021] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2
  [   71.350022] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0
  [   71.350025] FS:  0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
  [   71.350026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   71.350027] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0
  [   71.350028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   71.350028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   71.350030] Kernel panic - not syncing: Fatal exception
  [   71.350412] Kernel Offset: disabled
  [   71.365922] ---[ end Kernel panic - not syncing: Fatal exception ]---

which is caused by dangling elements that have been deactivated, but
never removed.

On a flush operation, nft_pipapo_walk() walks through all the elements
in the mapping table, which are then deactivated by nft_flush_set(),
one by one, and added to the commit list for removal. Element data is
then freed.

On transaction commit, nft_pipapo_remove() is called, and fails to
remove these elements, leading to the stale references in the mapping.
The first symptom of this, revealed by KASan, is a one-byte
use-after-free in subsequent calls to nft_pipapo_walk(), which is
usually not enough to trigger a panic. When stale elements are used
more heavily, though, such as in a double-free via nft_pipapo_destroy()
as in Phil's case, the problem becomes more noticeable.

The issue comes from the fact that, on a flush operation,
nft_pipapo_remove() won't get the actual key data via elem->key, so
elements to be deleted upon commit won't be found by the lookup via
pipapo_get(), and removal will be skipped. Key data should be fetched
via nft_set_ext_key(), instead.

Reported-by: Phil Sutter <[email protected]>
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agoMerge branch 'master' of git://blackhole.kfki.hu/nf
Pablo Neira Ayuso [Wed, 26 Feb 2020 12:55:15 +0000 (13:55 +0100)]
Merge branch 'master' of git://blackhole.kfki.hu/nf

Jozsef Kadlecsik says:

====================
ipset patches for nf

The first one is larger than usual, but the issue could not be solved simpler.
Also, it's a resend of the patch I submitted a few days ago, with a one line
fix on top of that: the size of the comment extensions was not taken into
account at reporting the full size of the set.

- Fix "INFO: rcu detected stall in hash_xxx" reports of syzbot
  by introducing region locking and using workqueue instead of timer based
  gc of timed out entries in hash types of sets in ipset.
- Fix the forceadd evaluation path - the bug was also uncovered by the syzbot.
====================

Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agodrm/i915: Avoid recursing onto active vma from the shrinker
Chris Wilson [Fri, 21 Feb 2020 22:18:18 +0000 (22:18 +0000)]
drm/i915: Avoid recursing onto active vma from the shrinker

We mark the vma as active while binding it in order to protect ourselves
from being shrunk under mempressure. This only works if we are strict in
not attempting to shrink active objects.

<6> [472.618968] Workqueue: events_unbound fence_work [i915]
<4> [472.618970] Call Trace:
<4> [472.618974]  ? __schedule+0x2e5/0x810
<4> [472.618978]  schedule+0x37/0xe0
<4> [472.618982]  schedule_preempt_disabled+0xf/0x20
<4> [472.618984]  __mutex_lock+0x281/0x9c0
<4> [472.618987]  ? mark_held_locks+0x49/0x70
<4> [472.618989]  ? _raw_spin_unlock_irqrestore+0x47/0x60
<4> [472.619038]  ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619084]  ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619122]  i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619165]  i915_gem_object_unbind+0x1dc/0x400 [i915]
<4> [472.619208]  i915_gem_shrink+0x328/0x660 [i915]
<4> [472.619250]  ? i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619282]  i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619325]  vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
<4> [472.619330]  ? rcu_read_lock_sched_held+0x4d/0x80
<4> [472.619363]  ? __alloc_pd+0xb/0x30 [i915]
<4> [472.619366]  ? module_assert_mutex_or_preempt+0xf/0x30
<4> [472.619368]  ? __module_address+0x23/0xe0
<4> [472.619371]  ? is_module_address+0x26/0x40
<4> [472.619374]  ? static_obj+0x34/0x50
<4> [472.619376]  ? lockdep_init_map+0x4d/0x1e0
<4> [472.619407]  setup_page_dma+0xd/0x90 [i915]
<4> [472.619437]  alloc_pd+0x29/0x50 [i915]
<4> [472.619470]  __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
<4> [472.619503]  gen8_ppgtt_alloc+0xd7/0x300 [i915]
<4> [472.619535]  ppgtt_bind_vma+0x2a/0xe0 [i915]
<4> [472.619577]  __vma_bind+0x26/0x40 [i915]
<4> [472.619611]  fence_work+0x1c/0x90 [i915]
<4> [472.619617]  process_one_work+0x26a/0x620

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 6f24e41022f28061368776ea1514db0a6e67a9b1)
Signed-off-by: Jani Nikula <[email protected]>
5 years agodrm/i915/pmu: Avoid using globals for PMU events
Michał Winiarski [Wed, 19 Feb 2020 16:18:22 +0000 (17:18 +0100)]
drm/i915/pmu: Avoid using globals for PMU events

Attempting to bind / unbind module from devices where we have both
integrated and discrete GPU handled by i915, will cause us to try and
double free the global state, hitting null ptr deref in free_event_attributes.

Let's move it to i915_pmu.

Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Michal Wajdeczko <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 46129dc10f47c5c2b51c93a82b7b2aca46574ae0)
Signed-off-by: Jani Nikula <[email protected]>
5 years agodrm/i915/pmu: Avoid using globals for CPU hotplug state
Michał Winiarski [Wed, 19 Feb 2020 16:18:21 +0000 (17:18 +0100)]
drm/i915/pmu: Avoid using globals for CPU hotplug state

Attempting to bind / unbind module from devices where we have both
integrated and discrete GPU handled by i915 can lead to leaks and
warnings from cpuhp:
Error: Removing state XXX which has instances left.

Let's move the state to i915_pmu.

Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Michal Wajdeczko <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit f5a179d4687d4e7bfadd7cbda7ee5d0bad76761f)
Signed-off-by: Jani Nikula <[email protected]>
5 years agodrm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt
Chris Wilson [Mon, 24 Feb 2020 10:11:20 +0000 (10:11 +0000)]
drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt

Full-ppgtt on gen7 is proving to be highly unstable and not robust.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/694
Fixes: 3cd6e8860ecd ("drm/i915/gen7: Re-enable full-ppgtt for ivb & hsw")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Dave Airlie <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 4fbe112a569526e46fa2accb5763c069f78cb431)
Signed-off-by: Jani Nikula <[email protected]>
5 years agodrm/i915: fix header test with GCOV
Jani Nikula [Fri, 21 Feb 2020 10:54:14 +0000 (12:54 +0200)]
drm/i915: fix header test with GCOV

$(CC) with $(CFLAGS_GCOV) assumes the output filename with .gcno suffix
appended is writable. This is not the case when the output filename is
/dev/null:

  HDRTEST drivers/gpu/drm/i915/display/intel_frontbuffer.h
/dev/null:1:0: error: cannot open /dev/null.gcno
  HDRTEST drivers/gpu/drm/i915/display/intel_ddi.h
/dev/null:1:0: error: cannot open /dev/null.gcno
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_ddi.hdrtest] Error 1
make[5]: *** Waiting for unfinished jobs....
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_frontbuffer.hdrtest] Error 1

Filter out $(CFLAGS_GCOV) from the header test $(c_flags) as they don't
make sense here anyway.

References: http://lore.kernel.org/r/d8112767-4089-4c58-d7d3-2ce03139858a@infradead.org
Reported-by: Randy Dunlap <[email protected]>
Fixes: c6d4a099a240 ("drm/i915: reimplement header test feature")
Cc: Masahiro Yamada <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 408c1b3253dab93da175690dc0e21dd8bccf3371)
Signed-off-by: Jani Nikula <[email protected]>
5 years agozonefs: select FS_IOMAP
Johannes Thumshirn [Tue, 25 Feb 2020 21:03:33 +0000 (22:03 +0100)]
zonefs: select FS_IOMAP

Zonefs makes use of iomap internally, so it should also select iomap in
Kconfig.

Signed-off-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
5 years agozonefs: fix IOCB_NOWAIT handling
Christoph Hellwig [Fri, 21 Feb 2020 14:37:23 +0000 (06:37 -0800)]
zonefs: fix IOCB_NOWAIT handling

IOCB_NOWAIT can't just be ignored as it breaks applications expecting
it not to block.  Just refuse the operation as applications must handle
that (e.g. by falling back to a thread pool).

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
5 years agobootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue
Masami Hiramatsu [Tue, 25 Feb 2020 14:36:41 +0000 (23:36 +0900)]
bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue

Commit d8a953ddde5e ("bootconfig: Set CONFIG_BOOT_CONFIG=n by
default") also changed CONFIG_BOOTTIME_TRACING to select
CONFIG_BOOT_CONFIG to show the boot-time tracing on the menu,
which introduced wrong dependencies with BLK_DEV_INITRD as below.

WARNING: unmet direct dependencies detected for BOOT_CONFIG
  Depends on [n]: BLK_DEV_INITRD [=n]
  Selected by [y]:
  - BOOTTIME_TRACING [=y] && TRACING_SUPPORT [=y] && FTRACE [=y] && TRACING [=y]

This makes CONFIG_BOOT_CONFIG select CONFIG_BLK_DEV_INITRD to
fix this error, and makes CONFIG_BOOTTIME_TRACING=n by default, so
that both boot-time tracing and boot configuration are off by default
but still appear on the menu list.

Link: http://lkml.kernel.org/r/158264140162.23842.11237423518607465535.stgit@devnote2
Fixes: d8a953ddde5e ("bootconfig: Set CONFIG_BOOT_CONFIG=n by default")
Reported-by: Randy Dunlap <[email protected]>
Compiled-tested-by: Randy Dunlap <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
5 years agoio-wq: ensure work->task_pid is cleared on init
Jens Axboe [Tue, 25 Feb 2020 18:52:56 +0000 (11:52 -0700)]
io-wq: ensure work->task_pid is cleared on init

We use ->task_pid for exit cancellation, but we need to ensure it's
cleared to zero for io_req_work_grab_env() to do the right thing. Take
a suggestion from Bart and clear the whole thing, just setting the
function passed in. This makes it more future proof as well.
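
The "clear the whole thing, then set only the function" idea can be sketched in plain C as below. This is an illustration under assumptions, not the io-wq code: struct demo_work, its fields, demo_init_work() and demo_noop() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <string.h>

struct demo_work;
typedef void (*demo_work_fn)(struct demo_work *);

/* Hypothetical work item; the fields are illustrative, not io-wq's layout. */
struct demo_work {
	demo_work_fn func;
	void *files;
	unsigned int flags;
	int task_pid;
};

/*
 * Zero the whole structure, then set only the callback. Any field added
 * later starts out as zero without this function needing to change.
 */
static void demo_init_work(struct demo_work *work, demo_work_fn func)
{
	assert(work != NULL);
	memset(work, 0, sizeof(*work));
	work->func = func;
}

/* Hypothetical callback that does nothing. */
static void demo_noop(struct demo_work *work)
{
	(void)work;
}
```

Zeroing everything up front avoids having to remember each individual field that later code expects to start at zero.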

Fixes: 36282881a795 ("io-wq: add io_wq_cancel_pid() to cancel based on a specific pid")
Signed-off-by: Jens Axboe <[email protected]>
5 years agoicmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n
Jason A. Donenfeld [Tue, 25 Feb 2020 10:05:35 +0000 (18:05 +0800)]
icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n

The icmpv6_send function has long had a static inline implementation
with an empty body for CONFIG_IPV6=n, so that code calling it doesn't
need to be ifdef'd. The new icmpv6_ndo_send function, which is intended
for drivers as a drop-in replacement with an identical function
signature, should follow the same pattern. Without this patch, drivers
that used to work with CONFIG_IPV6=n now result in a linker error.
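
The stub pattern described above can be sketched as follows. This is a simplified illustration, not the kernel's declarations: DEMO_CONFIG_IPV6, demo_ndo_send() and demo_driver_report_error() are hypothetical names, and the argument list is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a Kconfig symbol. It is left undefined here,
 * so the empty inline stub below is the one that gets compiled.
 */
/* #define DEMO_CONFIG_IPV6 1 */

#ifdef DEMO_CONFIG_IPV6
/* Feature enabled: the real implementation is built in another object file. */
void demo_ndo_send(const void *pkt, size_t len, int type, int code);
#else
/* Feature disabled: same signature, empty body, so callers need no #ifdef. */
static inline void demo_ndo_send(const void *pkt, size_t len, int type, int code)
{
	(void)pkt;
	(void)len;
	(void)type;
	(void)code;
}
#endif

/* Hypothetical driver path that calls the helper unconditionally. */
static int demo_driver_report_error(const void *pkt, size_t len)
{
	assert(pkt != NULL);
	demo_ndo_send(pkt, len, 1, 0);
	return 0;
}
```

With the stub in place the caller links in both configurations; without it, the call would reference a symbol that is never built when the feature is disabled.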

Cc: Chen Zhou <[email protected]>
Reported-by: Hulk Robot <[email protected]>
Fixes: 0b41713b6066 ("icmp: introduce helper for nat'd source address in network device context")
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agoMerge tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 25 Feb 2020 18:14:39 +0000 (10:14 -0800)]
Merge tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This contains a handful of RISC-V related fixes that I've collected
  and would like to target for 5.6-rc4:

   - A fix to set up the PMPs on boot, which allows the kernel to access
     memory on systems that don't set up permissive PMPs before getting
     to Linux. This only affects machine-mode kernels, which currently
     means only NOMMU kernels.

   - A fix to avoid enabling supervisor-mode interrupts when running in
     machine-mode, also only for NOMMU kernels.

   - A pair of fixes to our KASAN support to avoid corrupting memory.

   - A gitignore fix.

  This boots on QEMU's virt board for me"

* tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: adjust the indent
  riscv: allocate a complete page size for each page table
  riscv: Fix gitignore
  RISC-V: Don't enable all interrupts in trap_init()
  riscv: set pmp configuration if kernel is running in M-mode

5 years agoMerge branch 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Linus Torvalds [Tue, 25 Feb 2020 18:09:41 +0000 (10:09 -0800)]
Merge branch 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:
 "Here are a few MIPS fixes, and a MAINTAINERS update to hand over MIPS
  maintenance to Thomas Bogendoerfer - this will be my final pull
  request as MIPS maintainer.

  Thanks for your helpful comments, useful corrections & responsiveness
  during the time I've fulfilled the role, and I'm sure I'll pop up
  elsewhere in the tree somewhere down the line"

* 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MAINTAINERS: Hand MIPS over to Thomas
  MIPS: ingenic: DTS: Fix watchdog nodes
  MIPS: X1000: Fix clock of watchdog node.
  MIPS: vdso: Wrap -mexplicit-relocs in cc-option
  MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'
  MIPS: cavium_octeon: Fix syncw generation.
  mips: vdso: add build time check that no 'jalr t9' calls left
  MIPS: Disable VDSO time functionality on microMIPS
  mips: vdso: fix 'jalr t9' crash in vdso code

5 years agonull_blk: remove unused fields in 'nullb_cmd'
Dongli Zhang [Mon, 24 Feb 2020 18:39:11 +0000 (10:39 -0800)]
null_blk: remove unused fields in 'nullb_cmd'

'list', 'll_list' and 'csd' are no longer used.

The 'list' is not used since it was introduced by commit f2298c0403b0
("null_blk: multi queue aware block test driver").

The 'll_list' is no longer used since commit 3c395a969acc ("null_blk: set a
separate timer for each command").

The 'csd' is no longer used since commit ce2c350b2cfe ("null_blk: use
blk_complete_request and blk_mq_complete_request").

Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoamdgpu/gmc_v9: save/restore sdpif regs during S3
Shirish S [Mon, 27 Jan 2020 11:05:24 +0000 (16:35 +0530)]
amdgpu/gmc_v9: save/restore sdpif regs during S3

fixes S3 issue with IOMMU + S/G  enabled @ 64M VRAM.

Suggested-by: Alex Deucher <[email protected]>
Signed-off-by: Shirish S <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
5 years agodrm/amdgpu: fix memory leak during TDR test(v2)
Monk Liu [Sat, 8 Feb 2020 11:01:21 +0000 (19:01 +0800)]
drm/amdgpu: fix memory leak during TDR test(v2)

fix system memory leak

v2:
fix coding style

Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
5 years agoio-wq: remove spin-for-work optimization
Jens Axboe [Tue, 25 Feb 2020 15:47:30 +0000 (08:47 -0700)]
io-wq: remove spin-for-work optimization

Andres reports that buffered IO seems to suck up more cycles than we
would like, and he narrowed it down to the fact that the io-wq workers
will briefly spin for more work on completion of a work item. This was
a win on the networking side, but apparently some other cases take a
hit because of it. Remove the optimization to avoid burning more CPU
than we have to for disk IO.

Reported-by: Andres Freund <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoio_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL
Xiaoguang Wang [Tue, 25 Feb 2020 14:12:08 +0000 (22:12 +0800)]
io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL

After making ext4 support iopoll method:
  let ext4_file_operations's iopoll method be iomap_dio_iopoll(),
we found fio can easily hang in fio_ioring_getevents() with below fio
job:
    rm -f testfile; sync;
    sudo fio -name=fiotest -filename=testfile -iodepth=128 -thread
-rw=write -ioengine=io_uring  -hipri=1 -sqthread_poll=1 -direct=1
-bs=4k -size=10G -numjobs=8 -runtime=2000 -group_reporting
with IORING_SETUP_SQPOLL and IORING_SETUP_IOPOLL enabled.

There are two issues that result in this hang. One reason is that
when IORING_SETUP_SQPOLL and IORING_SETUP_IOPOLL are enabled, fio
does not use io_uring_enter to get completed events, it relies on
kernel io_sq_thread to poll for completed events.

Another reason is that there is a race: when io_submit_sqes() in
io_sq_thread() submits a batch of sqes, variable 'inflight' will
record the number of submitted reqs, then io_sq_thread will poll for
reqs which have been added to poll_list. But note, if some previous
reqs have been punted to io worker, these reqs won't be in
poll_list in time. io_sq_thread() will only poll for a part of the
previously submitted reqs, and then find poll_list is empty and reset
variable 'inflight' to zero. If the app just waits for these deferred
reqs and does not wake up io_sq_thread again, then a hang happens.

For an app that relies entirely on io_sq_thread to poll completed
requests, let io_iopoll_req_issued() wake up io_sq_thread properly when
adding a new element to poll_list, and when io_sq_thread prepares to
sleep, check whether poll_list is empty again; if not empty, continue
to poll.

Signed-off-by: Xiaoguang Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoblktrace: Protect q->blk_trace with RCU
Jan Kara [Thu, 6 Feb 2020 14:28:12 +0000 (15:28 +0100)]
blktrace: Protect q->blk_trace with RCU

KASAN is reporting that __blk_add_trace() has a use-after-free issue
when accessing q->blk_trace. Indeed the switching of block tracing (and
thus eventual freeing of q->blk_trace) is completely unsynchronized with
the currently running tracing and thus it can happen that the blk_trace
structure is being freed just while __blk_add_trace() works on it.
Protect accesses to q->blk_trace by RCU during tracing and make sure we
wait for the end of RCU grace period when shutting down tracing. Luckily
that is a rare enough event that we can afford that. Note that postponing
the freeing of blk_trace to an RCU callback should better be avoided as
it could have unexpected user visible side-effects, as debugfs files
would still exist for a short while after block tracing has been shut
down.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711
CC: [email protected]
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Tested-by: Ming Lei <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Reported-by: Tristan Madani <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoselftests: nft_concat_range: Move option for 'list ruleset' before command
Stefano Brivio [Fri, 21 Feb 2020 02:11:56 +0000 (03:11 +0100)]
selftests: nft_concat_range: Move option for 'list ruleset' before command

Before nftables commit fb9cea50e8b3 ("main: enforce options before
commands"), 'nft list ruleset -a' happened to work, but it's wrong
and won't work anymore. Replace it by 'nft -a list ruleset'.

Reported-by: Chen Yi <[email protected]>
Fixes: 611973c1e06f ("selftests: netfilter: Introduce tests for sets with range concatenation")
Signed-off-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agodocs: Fix empty parallelism argument
Kees Cook [Sat, 22 Feb 2020 00:02:39 +0000 (16:02 -0800)]
docs: Fix empty parallelism argument

When there was no parallelism (no top-level -j arg and a pre-1.7
sphinx-build), the argument passed would be empty ("") instead of just
being missing, which would (understandably) badly confuse sphinx-build.
Fix this by removing the quotes.

Reported-by: Rafael J. Wysocki <[email protected]>
Fixes: 51e46c7a4007 ("docs, parallelism: Rearrange how jobserver reservations are made")
Cc: [email protected] # v5.5 only
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>
5 years agodocs: remove MPX from the x86 toc
Stephen Kitt [Fri, 21 Feb 2020 20:57:33 +0000 (21:57 +0100)]
docs: remove MPX from the x86 toc

MPX was removed in commit 45fc24e89b7c ("x86/mpx: remove MPX from
arch/x86"), this removes the corresponding entry in the x86 toc.

This was suggested by a Sphinx warning.

Signed-off-by: Stephen Kitt <[email protected]>
Fixes: 45fc24e89b7cc ("x86/mpx: remove MPX from arch/x86")
Acked-by: Dave Hansen <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>
5 years agodrm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime
Tina Zhang [Tue, 25 Feb 2020 05:35:27 +0000 (13:35 +0800)]
drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime

Deleting dmabuf item's list head after releasing its container can lead
to KASAN-reported issue:

  BUG: KASAN: use-after-free in __list_del_entry_valid+0x15/0xf0
  Read of size 8 at addr ffff88818a4598a8 by task kworker/u8:3/13119

So fix this issue by putting the deletion of dmabuf_objs ahead of
releasing its container.
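
The ordering constraint can be sketched in plain C as below. This is a minimal illustration, not the gvt code: the demo_* list helpers and struct demo_dmabuf_obj are hypothetical and only mimic the shape of a list head embedded in its container.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, in the style of an embedded list head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

static void demo_list_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void demo_list_add(struct demo_list_head *n, struct demo_list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

static void demo_list_del(struct demo_list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;
}

static int demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

/* Hypothetical container; the list head lives inside the allocation. */
struct demo_dmabuf_obj {
	int id;
	struct demo_list_head list;
};

static void demo_release(struct demo_dmabuf_obj *obj)
{
	assert(obj != NULL);
	/* Unlink first: demo_list_del() reads and writes memory inside obj. */
	demo_list_del(&obj->list);
	/* Only now is it safe to free the container. */
	free(obj);
}
```

Swapping the two statements in demo_release() would make the unlink step touch freed memory, which is the class of bug the KASAN report above points at.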

Fixes: dfb6ae4e14bd6 ("drm/i915/gvt: Handle orphan dmabuf_objs")
Signed-off-by: Tina Zhang <[email protected]>
Reviewed-by: Zhenyu Wang <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
5 years agoMAINTAINERS: Hand MIPS over to Thomas
Paul Burton [Sat, 22 Feb 2020 17:04:17 +0000 (09:04 -0800)]
MAINTAINERS: Hand MIPS over to Thomas

My time with MIPS the company has reached its end, and so at best I'll
have little time to spend on maintaining arch/mips/.

Ralf last authored a patch over 2 years ago, the last time he committed
one is even further back & activity was sporadic for a while before
that. The reality is that he isn't active.

Having a new maintainer with time to do things properly will be
beneficial all round. Thomas Bogendoerfer has been involved in MIPS
development for a long time & has offered to step up as maintainer, so
add Thomas and remove myself & Ralf from the MIPS entry.

Ralf already has an entry in CREDITS to honor his contributions, so this
just adds one for me.

Signed-off-by: Paul Burton <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Acked-by: Thomas Bogendoerfer <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: [email protected]
Cc: [email protected]
5 years agoblk-mq: insert passthrough request into hctx->dispatch directly
Ming Lei [Tue, 25 Feb 2020 01:04:32 +0000 (09:04 +0800)]
blk-mq: insert passthrough request into hctx->dispatch directly

For some reason, a device may be in a situation in which it can't handle
FS requests, so STS_RESOURCE is always returned and the FS request
will be added to hctx->dispatch. However, a passthrough request may
be required at that time for fixing the problem. If the passthrough
request is added to the scheduler queue, there isn't any chance for
blk-mq to dispatch it, given we prioritize requests in hctx->dispatch.
Then the FS IO request may never be completed, and an IO hang is caused.

So passthrough request has to be added to hctx->dispatch directly
for fixing the IO hang.

Fix this issue by inserting passthrough request into hctx->dispatch
directly, together with adding FS request to the tail of
hctx->dispatch in blk_mq_dispatch_rq_list(). Actually we add FS request
to tail of hctx->dispatch by default, see blk_mq_request_bypass_insert().

Then it becomes consistent with original legacy IO request
path, in which passthrough request is always added to q->queue_head.

Cc: Dongli Zhang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Ewan D. Milne <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
5 years agoMerge tag 'mac80211-for-net-2020-02-24' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Mon, 24 Feb 2020 23:43:38 +0000 (15:43 -0800)]
Merge tag 'mac80211-for-net-2020-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg

====================
A few fixes:
 * remove a double mutex-unlock
 * fix a leak in an error path
 * NULL pointer check
 * include if_vlan.h where needed
 * avoid RCU list traversal when not under RCU
====================

Signed-off-by: David S. Miller <[email protected]>
5 years agoaudit: always check the netlink payload length in audit_receive_msg()
Paul Moore [Mon, 24 Feb 2020 21:38:57 +0000 (16:38 -0500)]
audit: always check the netlink payload length in audit_receive_msg()

This patch ensures that we always check the netlink payload length
in audit_receive_msg() before we take any action on the payload
itself.
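
The general "validate the length before touching the payload" rule can be sketched as below. This is an illustration under assumptions and not the audit code: struct demo_msg_hdr, DEMO_EINVAL and demo_receive_msg() are hypothetical, and the message layout is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_EINVAL 22 /* hypothetical error code for the example */

/* Hypothetical message header: total length (header + payload) and a type. */
struct demo_msg_hdr {
	uint32_t total_len;
	uint16_t type;
};

/* Hypothetical handler that expects a 32-bit value as its payload. */
static int demo_receive_msg(const unsigned char *buf, size_t buf_len,
			    uint32_t *out)
{
	struct demo_msg_hdr hdr;
	size_t payload_len;

	assert(buf != NULL && out != NULL);

	if (buf_len < sizeof(hdr))
		return -DEMO_EINVAL;
	memcpy(&hdr, buf, sizeof(hdr));

	/* The claimed length must cover the header and fit in the buffer. */
	if (hdr.total_len < sizeof(hdr) || hdr.total_len > buf_len)
		return -DEMO_EINVAL;
	payload_len = hdr.total_len - sizeof(hdr);

	/* Check the payload length before any action on the payload. */
	if (payload_len < sizeof(*out))
		return -DEMO_EINVAL;

	memcpy(out, buf + sizeof(hdr), sizeof(*out));
	return 0;
}
```

Doing the check once, up front and for every message type, means no later branch can read past what the sender actually supplied.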

Cc: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Paul Moore <[email protected]>
5 years agoriscv: adjust the indent
Zong Li [Fri, 7 Feb 2020 09:52:45 +0000 (17:52 +0800)]
riscv: adjust the indent

Adjust the indent to match Linux coding style.

Signed-off-by: Zong Li <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
5 years agoriscv: allocate a complete page size for each page table
Zong Li [Fri, 7 Feb 2020 09:52:44 +0000 (17:52 +0800)]
riscv: allocate a complete page size for each page table

Each page table should be created by allocating a complete page size
for it. Otherwise, the content of the page table would be corrupted
somewhere through memory allocation which allocates the memory at the
middle of the page table for other use.

Signed-off-by: Zong Li <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
5 years agocifs: Use #define in cifs_dbg
Joe Perches [Fri, 21 Feb 2020 13:20:45 +0000 (05:20 -0800)]
cifs: Use #define in cifs_dbg

All other uses of cifs_dbg use defines so change this one.

Signed-off-by: Joe Perches <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
5 years agocifs: fix rename() by ensuring source handle opened with DELETE bit
Aurelien Aptel [Fri, 21 Feb 2020 10:19:06 +0000 (11:19 +0100)]
cifs: fix rename() by ensuring source handle opened with DELETE bit

To rename a file in SMB2 we open it with the DELETE access and do a
special SetInfo on it. If the handle is missing the DELETE bit the
server will fail the SetInfo with STATUS_ACCESS_DENIED.

We currently try to reuse any existing opened handle we have with
cifs_get_writable_path(). That function looks for handles with WRITE
access but doesn't check for DELETE, making rename() fail if it finds
a handle to reuse. Simple reproducer below.

To select handles with the DELETE bit, this patch adds a flag argument
to cifs_get_writable_path() and find_writable_file() and the existing
'bool fsuid_only' argument is converted to a flag.

The cifsFileInfo struct only stores the UNIX open mode but not the
original SMB access flags. Since the DELETE bit is not mapped in that
mode, this patch stores the access mask in cifs_fid on file open,
which is accessible from cifsFileInfo.
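
The flag-based lookup described above can be sketched as follows. This is a simplified model, not the cifs implementation: the DEMO_* constants, struct demo_handle and demo_find_writable() are hypothetical names, and the access-mask values are only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative access-mask bits recorded when a handle is opened. */
#define DEMO_ACCESS_WRITE	0x00000002u
#define DEMO_ACCESS_DELETE	0x00010000u

/* Lookup flags replacing what used to be a single bool argument. */
#define DEMO_FIND_ANY		0x0
#define DEMO_FIND_FSUID_ONLY	0x1
#define DEMO_FIND_WITH_DELETE	0x2

/* Hypothetical open handle: keeps the original access mask, not just a mode. */
struct demo_handle {
	unsigned int access;
	unsigned int uid;
};

/* Return the first writable handle that satisfies every requested flag. */
static const struct demo_handle *
demo_find_writable(const struct demo_handle *h, size_t n,
		   unsigned int uid, int flags)
{
	size_t i;

	assert(h != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!(h[i].access & DEMO_ACCESS_WRITE))
			continue;
		if ((flags & DEMO_FIND_FSUID_ONLY) && h[i].uid != uid)
			continue;
		if ((flags & DEMO_FIND_WITH_DELETE) &&
		    !(h[i].access & DEMO_ACCESS_DELETE))
			continue;
		return &h[i];
	}
	return NULL;
}
```

A rename path would pass DEMO_FIND_WITH_DELETE and fall back to opening a fresh handle when the lookup returns NULL, rather than reusing a write-only handle.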

Simple reproducer:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define E(s) perror(s), exit(1)

int main(int argc, char *argv[])
{
	int fd, ret;
	if (argc != 3) {
		fprintf(stderr, "Usage: %s A B\n"
			"create&open A in write mode, "
			"rename A to B, close A\n", argv[0]);
		return 0;
	}

	fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666);
	if (fd == -1) E("openat()");

	ret = rename(argv[1], argv[2]);
	if (ret) E("rename()");

	ret = close(fd);
	if (ret) E("close()");

	return ret;
}

$ gcc -o bugrename bugrename.c
$ ./bugrename /mnt/a /mnt/b
rename(): Permission denied

Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name")
CC: Stable <[email protected]>
Signed-off-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
5 years agocifs: add missing mount option to /proc/mounts
Steve French [Thu, 20 Feb 2020 05:59:32 +0000 (23:59 -0600)]
cifs: add missing mount option to /proc/mounts

We were not displaying the mount option "signloosely" in /proc/mounts
for cifs mounts, which some users found confusing recently.

Signed-off-by: Steve French <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
5 years ago cifs: fix potential mismatch of UNC paths
Paulo Alcantara (SUSE) [Thu, 20 Feb 2020 22:49:35 +0000 (19:49 -0300)]
cifs: fix potential mismatch of UNC paths

Ensure that full_path is a UNC path that contains '\\' as delimiter,
which is required by cifs_build_devname().

The build_path_from_dentry_optional_prefix() function may return a
path with '/' as delimiter when using SMB1 UNIX extensions, for
example.
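
As a rough illustration of the normalization this implies, the sketch below rewrites '/' delimiters to '\\' in place. The helper name to_unc_delims() is hypothetical and is not claimed to be the function the patch uses.

```c
#include <assert.h>
#include <stddef.h>

/* Replace every '/' in path with '\\', in place. */
static void to_unc_delims(char *path)
{
	assert(path != NULL);
	for (char *p = path; *p != '\0'; p++) {
		if (*p == '/')
			*p = '\\';
	}
}
```

Calling it on a buffer holding //srv/a leaves \\srv\a in the same buffer; a path that already uses '\\' is unchanged.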

Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
Acked-by: Ronnie Sahlberg <[email protected]>
5 years ago cifs: don't leak -EAGAIN for stat() during reconnect
Ronnie Sahlberg [Tue, 18 Feb 2020 20:01:03 +0000 (06:01 +1000)]
cifs: don't leak -EAGAIN for stat() during reconnect

If the SMB2/QUERY_INFO call issued from cifs_revalidate_dentry_attr() fails
with an error such as STATUS_SESSION_EXPIRED, causing the session to be
reconnected, it is possible that we leak -EAGAIN back to the application,
even for system calls such as stat() where this is not a valid error.

Fix this by re-trying the operation from within cifs_revalidate_dentry_attr()
if cifs_get_inode_info*() returns -EAGAIN.

This fixes stat() and possibly also other system calls that use
cifs_revalidate_dentry*().

Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
CC: Stable <[email protected]>
5 years ago scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate...
Adam Williamson [Wed, 19 Feb 2020 16:50:07 +0000 (17:50 +0100)]
scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places

Arnd Bergmann inadvertently typoed these in d320a9551e394 and 64cbfa96551a;
they seem to be the cause of
https://bugzilla.redhat.com/show_bug.cgi?id=1801353 , invalid SCSI commands
when udev tries to query a DVD drive.

[arnd] Found another instance of the same bug, also introduced in my
compat_ioctl series.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1801353
Link: https://lore.kernel.org/r/[email protected]
Fixes: c103d6ee69f9 ("compat_ioctl: ide: floppy: add handler")
Fixes: 64cbfa96551a ("compat_ioctl: move cdrom commands into cdrom.c")
Fixes: d320a9551e39 ("compat_ioctl: scsi: move ioctl handling into drivers")
Bisected-by: Chris Murphy <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Adam Williamson <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
5 years ago Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Mon, 24 Feb 2020 19:48:17 +0000 (11:48 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Bugfixes, including the fix for CVE-2020-2732 and a few issues found
  by 'make W=1'"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: rstify new ioctls in api.rst
  KVM: nVMX: Check IO instruction VM-exit conditions
  KVM: nVMX: Refactor IO bitmap checks into helper function
  KVM: nVMX: Don't emulate instructions in guest mode
  KVM: nVMX: Emulate MTF when performing instruction emulation
  KVM: fix error handling in svm_hardware_setup
  KVM: SVM: Fix potential memory leak in svm_cpu_init()
  KVM: apic: avoid calculating pending eoi from an uninitialized val
  KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled
  KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
  kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled
  KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE
  KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow
  KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
  kvm/emulate: fix a -Werror=cast-function-type
  KVM: x86: fix incorrect comparison in trace event
  KVM: nVMX: Fix some obsolete comments and grammar error
  KVM: x86: fix missing prototypes
  KVM: x86: enable -Werror

5 years ago Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Mon, 24 Feb 2020 19:40:23 +0000 (11:40 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes a Kconfig-related build error and an integer overflow in
  chacha20poly1305"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: chacha20poly1305 - prevent integer overflow on large input
  tee: amdtee: amdtee depends on CRYPTO_DEV_CCP_DD

5 years ago Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Mon, 24 Feb 2020 19:32:15 +0000 (11:32 -0800)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull tmpfs fix from Al Viro:
 "Regression from fs_parse series this cycle..."

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  tmpfs: deny and force are not huge mount options

5 years ago floppy: check FDC index for errors before assigning it
Linus Torvalds [Fri, 21 Feb 2020 20:43:35 +0000 (12:43 -0800)]
floppy: check FDC index for errors before assigning it

Jordy Zomer reported a KASAN out-of-bounds read in the floppy driver in
wait_til_ready().

Which on the face of it can't happen, since as Willy Tarreau points out,
the function does no particular memory access.  Except through the FDCS
macro, which just indexes a static allocation through the current fdc,
which is always checked against N_FDC.

Except the checking happens after we've already assigned the value.

The floppy driver is a disgrace (a lot of it going back to my original
horrid "design"), and has no real maintainer.  Nobody has the hardware,
and nobody really cares.  But it still gets used in virtual environments
because it's one of those things that everybody supports.

The whole thing should be re-written, or at least parts of it should be
seriously cleaned up.  The 'current fdc' index, which is used by the
FDCS macro, and which is often shadowed by a local 'fdc' variable, is a
prime example of how not to write code.

But because nobody has the hardware or the motivation, let's just fix up
the immediate problem with a nasty band-aid: test the fdc index before
actually assigning it to the static 'fdc' variable.
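
The "test before assigning" pattern can be sketched like this. It is a stand-alone illustration, not the floppy driver: select_fdc(), current_state(), fdc_state[] and the value chosen for N_FDC are assumptions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define N_FDC 2			/* number of controller slots in this sketch */

static int current_fdc;		/* index used to access fdc_state[] */
static int fdc_state[N_FDC];	/* stand-in for per-controller state */

/* Select controller idx; reject out-of-range values before storing them. */
static bool select_fdc(int idx)
{
	if (idx < 0 || idx >= N_FDC)
		return false;	/* current_fdc is left untouched */
	current_fdc = idx;
	return true;
}

/* Every later access goes through the already validated index. */
static int *current_state(void)
{
	assert(current_fdc >= 0 && current_fdc < N_FDC);
	return &fdc_state[current_fdc];
}
```

Because the range check comes first, a bad index never reaches the static variable, so code that later indexes through it cannot read out of bounds.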

Reported-by: Jordy Zomer <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Dan Carpenter <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago net: bridge: fix stale eth hdr pointer in br_dev_xmit
Nikolay Aleksandrov [Mon, 24 Feb 2020 16:46:22 +0000 (18:46 +0200)]
net: bridge: fix stale eth hdr pointer in br_dev_xmit

In br_dev_xmit() we perform vlan filtering in br_allowed_ingress() but
if the packet has the vlan header inside (e.g. bridge with disabled
tx-vlan-offload) then the vlan filtering code will use skb_vlan_untag()
to extract the vid before filtering which in turn calls pskb_may_pull()
and we may end up with a stale eth pointer. Moreover the cached eth header
pointer will generally be wrong after that operation. Remove the eth header
caching and just use eth_hdr() directly, the compiler does the right thing
and calculates it only once so we don't lose anything.
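
Why a cached header pointer goes stale can be shown with a small user-space analogy. Everything here is hypothetical (struct pkt, pkt_hdr(), pkt_relocate()); it is not the kernel's sk_buff API, only a sketch of "derive the pointer from the buffer each time instead of caching it".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal packet buffer; not the kernel's sk_buff. */
struct pkt {
	unsigned char *data;
	size_t len;
};

/* Accessor that derives the header pointer from the buffer every time. */
static unsigned char *pkt_hdr(const struct pkt *p)
{
	return p->data;
}

/*
 * Stand-in for an operation that may reallocate the data area.
 * Here it always moves the data to a fresh allocation, so any
 * previously cached pointer into the old area becomes invalid.
 * Returns 0 on success, -1 on allocation failure.
 */
static int pkt_relocate(struct pkt *p)
{
	unsigned char *fresh;

	assert(p != NULL && p->data != NULL && p->len > 0);
	fresh = malloc(p->len);
	if (fresh == NULL)
		return -1;
	memcpy(fresh, p->data, p->len);
	free(p->data);
	p->data = fresh;
	return 0;
}
```

A pointer saved before pkt_relocate() would refer to freed memory afterwards, while pkt_hdr() called after it still returns the live header.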

Fixes: 057658cb33fb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago Merge branch 'net-ll_temac-Bugfixes'
David S. Miller [Mon, 24 Feb 2020 18:58:57 +0000 (10:58 -0800)]
Merge branch 'net-ll_temac-Bugfixes'

Esben Haabendal says:

====================
net: ll_temac: Bugfixes

Fix a number of bugs which have been present since the first commit.

The bugs fixed in patches 1, 2 and 4 have all been observed in real systems,
and were relatively easy to reproduce given an appropriate stress setup.

Changes since v1:

- Changed error handling of dma_map_single() in temac_start_xmit() to drop
  packet instead of returning NETDEV_TX_BUSY.
====================

Signed-off-by: David S. Miller <[email protected]>
5 years ago net: ll_temac: Handle DMA halt condition caused by buffer underrun
Esben Haabendal [Fri, 21 Feb 2020 06:47:58 +0000 (07:47 +0100)]
net: ll_temac: Handle DMA halt condition caused by buffer underrun

The SDMA engine used by TEMAC halts operation when it has finished
processing of the last buffer descriptor in the buffer ring.
Unfortunately, no interrupt event is generated when this happens,
so we need to set up another mechanism to make sure DMA operation is
restarted when enough buffers have been added to the ring.

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago net: ll_temac: Fix RX buffer descriptor handling on GFP_ATOMIC pressure
Esben Haabendal [Fri, 21 Feb 2020 06:47:45 +0000 (07:47 +0100)]
net: ll_temac: Fix RX buffer descriptor handling on GFP_ATOMIC pressure

Failures caused by GFP_ATOMIC memory pressure have been observed, and
due to the missing error handling, result in kernel crashes such as

[1876998.350133] kernel BUG at mm/slub.c:3952!
[1876998.350141] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[1876998.350147] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.3.0-scnxt #1
[1876998.350150] Hardware name: N/A N/A/COMe-bIP2, BIOS CCR2R920 03/01/2017
[1876998.350160] RIP: 0010:kfree+0x1ca/0x220
[1876998.350164] Code: 85 db 74 49 48 8b 95 68 01 00 00 48 31 c2 48 89 10 e9 d7 fe ff ff 49 8b 04 24 a9 00 00 01 00 75 0b 49 8b 44 24 08 a8 01 75 02 <0f> 0b 49 8b 04 24 31 f6 a9 00 00 01 00 74 06 41 0f b6 74 24 5b
[1876998.350172] RSP: 0018:ffffc900000f0df0 EFLAGS: 00010246
[1876998.350177] RAX: ffffea00027f0708 RBX: ffff888008d78000 RCX: 0000000000391372
[1876998.350181] RDX: 0000000000000000 RSI: ffffe8ffffd01400 RDI: ffff888008d78000
[1876998.350185] RBP: ffff8881185a5d00 R08: ffffc90000087dd8 R09: 000000000000280a
[1876998.350189] R10: 0000000000000002 R11: 0000000000000000 R12: ffffea0000235e00
[1876998.350193] R13: ffff8881185438a0 R14: 0000000000000000 R15: ffff888118543870
[1876998.350198] FS:  0000000000000000(0000) GS:ffff88811f300000(0000) knlGS:0000000000000000
[1876998.350203] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1876998.350206] CR2: 00007f8dac7b09f0 CR3: 000000011e20a006 CR4: 00000000001606e0
[1876998.350210] Call Trace:
[1876998.350215]  <IRQ>
[1876998.350224]  ? __netif_receive_skb_core+0x70a/0x920
[1876998.350229]  kfree_skb+0x32/0xb0
[1876998.350234]  __netif_receive_skb_core+0x70a/0x920
[1876998.350240]  __netif_receive_skb_one_core+0x36/0x80
[1876998.350245]  process_backlog+0x8b/0x150
[1876998.350250]  net_rx_action+0xf7/0x340
[1876998.350255]  __do_softirq+0x10f/0x353
[1876998.350262]  irq_exit+0xb2/0xc0
[1876998.350265]  do_IRQ+0x77/0xd0
[1876998.350271]  common_interrupt+0xf/0xf
[1876998.350274]  </IRQ>

In order to handle such failures more gracefully, this change splits the
receive loop into one for consuming the received buffers, and one for
allocating new buffers.

When GFP_ATOMIC allocations fail, the receive will continue with the
buffers that are still there, and with the expectation that the allocations
will succeed in a later call to receive.
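
The split into a consume loop and a refill loop can be sketched in user-space C. This is only an outline of the idea: struct rx_ring, rx_consume(), rx_refill(), the ring size and the 64-byte buffers are all hypothetical and do not mirror the driver's descriptor layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4

/* Hypothetical RX ring: a NULL slot has no buffer attached. */
struct rx_ring {
	void *buf[RING_SIZE];
	bool filled[RING_SIZE];	/* true once data was received into buf[i] */
};

/* Loop 1: hand completed buffers to the stack and release their slots. */
static int rx_consume(struct rx_ring *r)
{
	int done = 0;

	for (size_t i = 0; i < RING_SIZE; i++) {
		if (r->buf[i] != NULL && r->filled[i]) {
			free(r->buf[i]);	/* stands in for passing the packet up */
			r->buf[i] = NULL;
			r->filled[i] = false;
			done++;
		}
	}
	return done;
}

/* Loop 2: refill empty slots; on allocation failure, stop and retry later. */
static int rx_refill(struct rx_ring *r, void *(*alloc)(size_t))
{
	int added = 0;

	assert(alloc != NULL);
	for (size_t i = 0; i < RING_SIZE; i++) {
		if (r->buf[i] != NULL)
			continue;
		r->buf[i] = alloc(64);
		if (r->buf[i] == NULL)
			break;		/* leave the slot empty, no crash */
		added++;
	}
	return added;
}

/* Demo allocator that always fails, to model GFP_ATOMIC pressure. */
static void *failing_alloc(size_t n)
{
	(void)n;
	return NULL;
}
```

When the allocator fails, the ring simply runs with fewer buffers until a later rx_refill() call succeeds, which is the behaviour the commit message describes.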

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago net: ll_temac: Add more error handling of dma_map_single() calls
Esben Haabendal [Fri, 21 Feb 2020 06:47:33 +0000 (07:47 +0100)]
net: ll_temac: Add more error handling of dma_map_single() calls

This adds error handling to the remaining dma_map_single() calls, so that
behavior is well defined if/when we run out of DMA memory.

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years ago net: ll_temac: Fix race condition causing TX hang
Esben Haabendal [Fri, 21 Feb 2020 06:47:21 +0000 (07:47 +0100)]
net: ll_temac: Fix race condition causing TX hang

It is possible that the interrupt handler fires and frees up space in
the TX ring in between checking for sufficient TX ring space and
stopping the TX queue in temac_start_xmit. If this happens, the
queue wake from the interrupt handler will occur before the queue is
stopped, causing a lost wakeup and the adapter's transmit hanging.

To avoid this, after stopping the queue, check again whether there is
sufficient space in the TX ring. If so, wake up the queue again.
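
The stop, re-check, wake sequence can be sketched as below. It is a single-threaded model with hypothetical names (struct txq, tx_reserve(), irq_hook); the hook exists only so the sketch can simulate the interrupt handler running between the first check and the stop. A real driver would also need appropriate memory ordering between the stop and the re-check, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical TX state; a real driver uses the netif_* queue helpers. */
struct txq {
	int free_slots;
	bool stopped;
	void (*irq_hook)(struct txq *q);	/* test hook: simulates the IRQ */
};

/*
 * Called when a packet needs `needed` slots. Returns true if the packet
 * can be queued now, false if the caller must back off.
 */
static bool tx_reserve(struct txq *q, int needed)
{
	assert(q != NULL && needed > 0);
	if (q->free_slots >= needed)
		return true;

	q->stopped = true;		/* stop the queue first ... */
	if (q->irq_hook != NULL)
		q->irq_hook(q);		/* completion may run right here */

	/* ... then look again: slots may have been freed in between. */
	if (q->free_slots < needed)
		return false;

	q->stopped = false;		/* space appeared: wake the queue */
	return true;
}

/* Demo hook: the simulated completion frees two slots. */
static void free_two(struct txq *q)
{
	q->free_slots += 2;
}
```

Without the second check, the queue in the simulated-interrupt case would stay stopped even though space is available, which is the lost wakeup the commit describes.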

This is a port of the similar fix in the axienet driver,
commit 7de44285c1f6 ("net: axienet: Fix race condition causing TX hang").

Fixes: 23ecc4bde21f ("net: ll_temac: fix checksum offload logic")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>