Git Repo - linux.git/log

qlcnic: Fix corruption while copying

Use proper typecasting while performing byte-by-byte copy

Signed-off-by: Shahed Shaikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

act_bpf: fix memory leaks when replacing bpf programs

We currently trigger multiple memory leaks when replacing bpf
actions, besides others:

  comm "tc", pid 1909, jiffies 4294851310 (age 1602.796s)
  hex dump (first 32 bytes):
    01 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00  ................
    18 b0 98 6d 00 88 ff ff 00 00 00 00 00 00 00 00  ...m............
  backtrace:
    [<ffffffff817e623e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8120a22d>] __vmalloc_node_range+0x1bd/0x2c0
    [<ffffffff8120a37a>] __vmalloc+0x4a/0x50
    [<ffffffff811a8d0a>] bpf_prog_alloc+0x3a/0xa0
    [<ffffffff816c0684>] bpf_prog_create+0x44/0xa0
    [<ffffffffa09ba4eb>] tcf_bpf_init+0x28b/0x3c0 [act_bpf]
    [<ffffffff816d7001>] tcf_action_init_1+0x191/0x1b0
    [<ffffffff816d70a2>] tcf_action_init+0x82/0xf0
    [<ffffffff816d4d12>] tcf_exts_validate+0xb2/0xc0
    [<ffffffffa09b5838>] cls_bpf_modify_existing+0x98/0x340 [cls_bpf]
    [<ffffffffa09b5cd6>] cls_bpf_change+0x1a6/0x274 [cls_bpf]
    [<ffffffff816d56e5>] tc_ctl_tfilter+0x335/0x910
    [<ffffffff816b9145>] rtnetlink_rcv_msg+0x95/0x240
    [<ffffffff816df34f>] netlink_rcv_skb+0xaf/0xc0
    [<ffffffff816b909e>] rtnetlink_rcv+0x2e/0x40
    [<ffffffff816deaaf>] netlink_unicast+0xef/0x1b0

Issue is that the old content from tcf_bpf is allocated and needs
to be released when we replace it. We seem to do that since the
beginning of act_bpf on the filter and insns, later on the name as
well.

Example test case, after patch:

  # FOO="1,6 0 0 4294967295,"
  # BAR="1,6 0 0 4294967294,"
  # tc actions add action bpf bytecode "$FOO" index 2
  # tc actions show action bpf
   action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
   index 2 ref 1 bind 0
  # tc actions replace action bpf bytecode "$BAR" index 2
  # tc actions show action bpf
   action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
   index 2 ref 1 bind 0
  # tc actions replace action bpf bytecode "$FOO" index 2
  # tc actions show action bpf
   action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
   index 2 ref 1 bind 0
  # tc actions del action bpf index 2
  [...]
  # echo "scan" > /sys/kernel/debug/kmemleak
  # cat /sys/kernel/debug/kmemleak | grep "comm \"tc\"" | wc -l
  0

Fixes: d23b8ad8ab23 ("tc: add BPF based action")
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'thunderx-fixes'

Aleksey Makarov says:

====================
net: thunderx: Misc fixes

Miscellaneous fixes for the ThunderX VNIC driver

All the patches can be applied individually.
It's ok to drop some if the maintainer feels uncomfortable
with applying for 4.2.
====================

Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix for crash while BGX teardown

Cortina phy does not have kernel driver and we don't attach
device with phy layer for intefaces like XFI, XLAUI etc,
Hence check for interface type before calling disconnect.

Signed-off-by: Thanneeru Srinivasulu <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Add PCI driver shutdown routine

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix crash when changing rss with mutliple traffic flows

This fixes a crash when changing rss with multiple traffic flows.

While interface teardown, disable tx queues after all NAPI threads
are done. If done otherwise tx queues might be woken up inside NAPI
if any CQE_TX are processed.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Set watchdog timeout value

If a txq (SQ) remains in stopped state after this timeout its
considered as stuck and interface is reinited.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Wakeup TXQ only if CQE_TX are processed

Previously TXQ is wakedup whenever napi is executed
and irrespective of if any CQE_TX are processed or not.
Added 'txq_stop' and 'txq_wake' counters to aid in debugging
if there are any future issues.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Suppress alloc_pages() failure warnings

Suppressing standard alloc_pages() warnings. Some kernel configs limit
alloc size and the network driver may fail. Do not drop a kernel
warning in this case, instead just drop a oneliner that the network
driver could not be loaded since the buffer could not be allocated.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix TSO packet statistic

Fixing TSO packages not being counted.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix memory leak when changing queue count

Fix for memory leak when changing queue/channel count via ethtool

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix RQ_DROP miscalculation

With earlier configured value sufficient number of CQEs are not
being reserved for transmitted packets. Hence under heavy incoming
traffic load, receive notifications will take away most of the CQ
thus transmit notifications will be lost resulting in tx skbs not
being freed.

Finally SQ will be full and it will be stopped, watchdog timer
will kick in. After this fix receive notifications will not take
morethan half of CQ reserving the rest for transmit notifications.

Also changed CQ & SQ sizes from 16k to 4k.
This is also due to the receive notifications taking first half of
CQ under heavy load and time taken by NAPI to clear transmit notifications
will increase with higher queue sizes. Again results in SQ being stopped.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix memory leak while tearing down interface

Fixed 'tso_hdrs' memory not being freed properly.
Also fixed SQ skbuff maintenance issues.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix data integrity issues with LDWB

Switching back to LDD transactions from LDWB.

While transmitting packets out with LDWB transactions
data integrity issues are seen very frequently.
hence switching back to LDD.

Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Robert Richter <[email protected]>
Signed-off-by: Aleksey Makarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: flush nd cache on IFF_NOARP change

This patch is the IPv6 equivalent of commit
6c8b4e3ff81b ("arp: flush arp cache on IFF_NOARP change")

Without it, we keep buggy neighbours in the cache, with destination
MAC address equal to our own MAC address.

Tested:
tcpdump -i eth0 -s 0 ip6 -n -e &
ip link set dev eth0 arp off
ping6 remote // sends buggy frames
ip link set dev eth0 arp on
ping6 remote // should work once kernel is patched

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Mario Fanelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hwmon: (nct7802) Fix integer overflow seen when writing voltage limits

Writing a large value into a voltage limit attribute can result
in an overflow due to an auto-conversion from unsigned long to
unsigned int.

Cc: Constantine Shulyupin <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Cc: [email protected] # v4.1+
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (nct7904) Rename pwm attributes to match hwmon ABI

pwm attributes have well defined names, which should be used.

Cc: Vadim V. Vlasov <[email protected]>
Cc: [email protected] #v4.1+
Signed-off-by: Guenter Roeck <[email protected]>

Merge branch 'msm-fixes-4.2' of git://people.freedesktop.org/~robclark/linux into drm-fixes

Fix for nasty crash on mdp4 in disable path, fix for dma-buf export,
smb leak on mdp5 which could result in intermittent modeset fails, and
don't let interrupted system call disturb atomic commit once we are
past the point of no return.

* 'msm-fixes-4.2' of git://people.freedesktop.org/~robclark/linux:
  drm/msm/mdp5: release SMB (shared memory blocks) in various cases
  drm/msm: change to uninterruptible wait in atomic commit
  drm/msm: mdp4: Fix drm_framebuffer dereference crash
  drm/msm: fix msm_gem_prime_get_sg_table()

Merge branch 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Radeon and amdgpu fixes for 4.2.  The audio fix ended up being more
invasive than I would have liked, but this should finally fix up the
last of the regressions since DP audio support was added.

* 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: add new parameter to seperate map and unmap
  drm/amdgpu: hdp_flush is not needed for inside IB
  drm/amdgpu: different emit_ib for gfx and compute
  drm/amdgpu: information leak in amdgpu_info_ioctl()
  drm/amdgpu: clean up init sequence for failures
  drm/radeon/combios: add some validation of lvds values
  drm/radeon: rework audio modeset to handle non-audio hdmi features
  drm/radeon: rework audio detect (v4)
  drm/amdgpu: Drop drm/ prefix for including drm.h in amdgpu_drm.h
  drm/radeon: Drop drm/ prefix for including drm.h in radeon_drm.h

Merge branch 'netcp-fixes'

Murali Karicheri says:

====================
net: netcp: bug fixes for dynamic module support

This series fixes few bugs to allow keystone netcp modules to be
dynamically loaded and removed. Currently it allows following
sequence multiple times

insmod cpsw_ale.ko
insmod davinci_mdio.ko
insmod keystone_netcp.ko
insmod keystone_netcp_ethss.ko
ifup eth0
ifup eth1
ping <hosts on eth0>
ping <hosts on eth1>
ifdown eth1
ifdown eth0
rmmod keystone_netcp_ethss.ko
rmmod keystone_netcp.ko
rmmod davinci_mdio.ko
rmmod cpsw_ale.ko
====================

Signed-off-by: David S. Miller <[email protected]>

net: netcp: ethss: cleanup gbe_probe() and gbe_remove() functions

This patch clean up error handle code to use goto label properly. In some
cases, the code unnecessarily use goto instead of just returning the error
code. Code also make explicit calls to devm_* APIs on error which is
not necessary. In the gbe_remove() also it makes similar calls which is
also unnecessary.

Also fix few checkpatch warnings

Signed-off-by: Murali Karicheri <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: netcp: ethss: fix up incorrect use of list api

The code seems to assume a null is returned when the list is empty
from first_sec_slave() to break the loop which is incorrect. Fix the
code by using list_empty().

Signed-off-by: Murali Karicheri <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: netcp: fix cleanup interface list in netcp_remove()

Currently if user do rmmod keystone_netcp.ko following warning is
seen :-

[ 59.035891] ------------[ cut here ]------------
[ 59.040535] WARNING: CPU: 2 PID: 1619 at drivers/net/ethernet/ti/
netcp_core.c:2127 netcp_remove)

This is because the interface list is not cleaned up in netcp_remove.
This patch fixes this. Also fix some checkpatch related warnings.

Signed-off-by: Murali Karicheri <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'pm+acpi-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"These fix three regressions, two recent ones (cpufreq core and ACPI
  device power management) and one introduced during the 4.1 cycle
  (intel_pstate).

  Specifics:

   - Fix a recently introduced issue in the cpufreq core causing it to
     attempt to create duplicate symbolic links to the policy directory
     in sysfs for CPUs that are offline when the cpufreq driver is being
     registered (Rafael J Wysocki)

   - Fix a recently introduced problem in the ACPI device power
     management core code causing it to store an incorrect value in the
     device object's power.state field in some cases which in turn leads
     to attempts to turn power resources off while they should still be
     on going forward (Mika Westerberg)

   - Fix an intel_pstate driver issue introduced during the 4.1 cycle
     which leads to kernel panics on boot on Knights Landing chips due
     to incomplete support for them in that driver (Lukasz Anaczkowski)"

* tag 'pm+acpi-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Avoid attempts to create duplicate symbolic links
  ACPI / PM: Use target_state to set the device power state
  intel_pstate: Add get_scaling cpu_defaults param to Knights Landing

Merge tag 'dm-4.2-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

- fix DM thinp to consistently return -ENOSPC when out of data space

- fix a logic bug in the DM cache smq policy's creation error path

- revert a DM cache 4.2-rc3 change that reduced writeback efficiency

- fix a hang on DM cache device destruction due to improper
   prealloc_used accounting introduced in 4.2-rc3

- update URL for dm-crypt wiki page

* tag 'dm-4.2-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix device destroy hang due to improper prealloc_used accounting
  Revert "dm cache: do not wake_worker() in free_migration()"
  dm crypt: update wiki page URL
  dm cache policy smq: fix alloc_bitset check that always evaluates as false
  dm thin: return -ENOSPC when erroring retry list due to out of data space

ebpf, x86: fix general protection fault when tail call is invoked

With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger
the following general protection fault out of an eBPF program with a simple
tail call, f.e. tracex5 (or a stripped down version of it):

  [  927.097918] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
  [...]
  [  927.100870] task: ffff8801f228b780 ti: ffff880016a64000 task.ti: ffff880016a64000
  [  927.102096] RIP: 0010:[<ffffffffa002440d>]  [<ffffffffa002440d>] 0xffffffffa002440d
  [  927.103390] RSP: 0018:ffff880016a67a68  EFLAGS: 00010006
  [  927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000001
  [  927.105921] RDX: 0000000000000000 RSI: ffff88014e438000 RDI: ffff880016a67e00
  [  927.107137] RBP: ffff880016a67c90 R08: 0000000000000000 R09: 0000000000000001
  [  927.108351] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880016a67e00
  [  927.109567] R13: 0000000000000000 R14: ffff88026500e460 R15: ffff880220a81520
  [  927.110787] FS:  00007fe7d5c1f740(0000) GS:ffff880265000000(0000) knlGS:0000000000000000
  [  927.112021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  927.113255] CR2: 0000003e7bbb91a0 CR3: 000000006e04b000 CR4: 00000000001407e0
  [  927.114500] Stack:
  [  927.115737]  ffffc90008cdb000 ffff880016a67e00 ffff88026500e460 ffff880220a81520
  [  927.117005]  0000000100000000 000000000000001b ffff880016a67aa8 ffffffff8106c548
  [  927.118276]  00007ffcdaf22e58 0000000000000000 0000000000000000 ffff880016a67ff0
  [  927.119543] Call Trace:
  [  927.120797]  [<ffffffff8106c548>] ? lookup_address+0x28/0x30
  [  927.122058]  [<ffffffff8113d176>] ? __module_text_address+0x16/0x70
  [  927.123314]  [<ffffffff8117bf0e>] ? is_ftrace_trampoline+0x3e/0x70
  [  927.124562]  [<ffffffff810c1a0f>] ? __kernel_text_address+0x5f/0x80
  [  927.125806]  [<ffffffff8102086f>] ? print_context_stack+0x7f/0xf0
  [  927.127033]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
  [  927.128254]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
  [  927.129461]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
  [  927.130654]  [<ffffffff8119ee4a>] trace_call_bpf+0x8a/0x140
  [  927.131837]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
  [  927.133015]  [<ffffffff8119f008>] kprobe_perf_func+0x28/0x220
  [  927.134195]  [<ffffffff811a1668>] kprobe_dispatcher+0x38/0x60
  [  927.135367]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
  [  927.136523]  [<ffffffff81061400>] kprobe_ftrace_handler+0xf0/0x150
  [  927.137666]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
  [  927.138802]  [<ffffffff8117950c>] ftrace_ops_recurs_func+0x5c/0xb0
  [  927.139934]  [<ffffffffa022b0d5>] 0xffffffffa022b0d5
  [  927.141066]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
  [  927.142199]  [<ffffffff81174b95>] seccomp_phase1+0x5/0x230
  [  927.143323]  [<ffffffff8102c0a4>] syscall_trace_enter_phase1+0xc4/0x150
  [  927.144450]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
  [  927.145572]  [<ffffffff8102c0a4>] ? syscall_trace_enter_phase1+0xc4/0x150
  [  927.146666]  [<ffffffff817f9a9f>] tracesys+0xd/0x44
  [  927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83
                       c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74
                       0a <48> 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff
                       ff 4c
  [  927.150046] RIP  [<ffffffffa002440d>] 0xffffffffa002440d

The code section with the instructions that traps points into the eBPF JIT
image of the root program (the one invoking the tail call instruction).

Using bpf_jit_disasm -o on the eBPF root program image:

  [...]
  4e:   mov    -0x204(%rbp),%eax
        8b 85 fc fd ff ff
  54:   cmp    $0x20,%eax               <--- if (tail_call_cnt > MAX_TAIL_CALL_CNT)
        83 f8 20
  57:   ja     0x000000000000007a
        77 21
  59:   add    $0x1,%eax                <--- tail_call_cnt++
        83 c0 01
  5c:   mov    %eax,-0x204(%rbp)
        89 85 fc fd ff ff
  62:   lea    -0x80(%rsi,%rdx,8),%rax  <--- prog = array->prog[index]
        48 8d 44 d6 80
  67:   mov    (%rax),%rax
        48 8b 00
  6a:   cmp    $0x0,%rax                <--- check for NULL
        48 83 f8 00
  6e:   je     0x000000000000007a
        74 0a
  70:   mov    0x20(%rax),%rax          <--- GPF triggered here! fetch of bpf_func
        48 8b 40 20                              [ matches <48> 8b 40 20 ... from above ]
  74:   add    $0x33,%rax               <--- prologue skip of new prog
        48 83 c0 33
  78:   jmpq   *%rax                    <--- jump to new prog insns
        ff e0
  [...]

The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call
jump to map slot 0 is pointing to a poisoned page. The issue is the following:

lea instruction has a wrong offset, i.e. it should be ...

  lea    0x80(%rsi,%rdx,8),%rax

... but it actually seems to be ...

  lea   -0x80(%rsi,%rdx,8),%rax

... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs
to be positive instead of negative. Disassembling the interpreter, we btw
similarly do:

  [...]
  c88:  lea     0x80(%rax,%rdx,8),%rax  <--- prog = array->prog[index]
        48 8d 84 d0 80 00 00 00
  c90:  add     $0x1,%r13d
        41 83 c5 01
  c94:  mov     (%rax),%rax
        48 8b 00
  [...]

Now the other interesting fact is that this panic triggers only when things
like CONFIG_LOCKDEP are being used. In that case offsetof(struct bpf_array,
prog) starts at offset 0x80 and in non-CONFIG_LOCKDEP case at offset 0x50.
Reason is that the work_struct inside struct bpf_map grows by 48 bytes in my
case due to the lockdep_map member (which also has CONFIG_LOCK_STAT enabled
members).

Changing the emitter to always use the 4 byte displacement in the lea
instruction fixes the panic on my side. It increases the tail call instruction
emission by 3 more byte, but it should cover us from various combinations
(and perhaps other future increases on related structures).

After patch, disassembly:

  [...]
  9e:   lea    0x80(%rsi,%rdx,8),%rax   <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT
        48 8d 84 d6 80 00 00 00
  a6:   mov    (%rax),%rax
        48 8b 00
  [...]

  [...]
  9e:   lea    0x50(%rsi,%rdx,8),%rax   <--- No CONFIG_LOCKDEP
        48 8d 84 d6 50 00 00 00
  a6:   mov    (%rax),%rax
        48 8b 00
  [...]

Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: mdb: fix delmdb state in the notification

Since mdb states were introduced when deleting an entry the state was
left as it was set in the delete request from the user which leads to
the following output when doing a monitor (for example):
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 temp
^^^
Note the "temp" state in the delete notification which is wrong since
the entry was permanent, the state in a delete is always reported as
"temp" regardless of the real state of the entry.

After this patch:
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent

There's one important note to make here that the state is actually not
matched when doing a delete, so one can delete a permanent entry by
stating "temp" in the end of the command, I've chosen this fix in order
not to break user-space tools which rely on this (incorrect) behaviour.

So to give an example after this patch and using the wrong state:
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent

Note the state of the entry that got deleted is correct in the
notification.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
Signed-off-by: David S. Miller <[email protected]>

bridge: mcast: give fast leave precedence over multicast router and querier

When fast leave is configured on a bridge port and an IGMP leave is
received for a group, the group is not deleted immediately if there is
a router detected or if multicast querier is configured.
Ideally the group should be deleted immediately when fast leave is
configured.

Signed-off-by: Satish Ashok <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/msm/mdp5: release SMB (shared memory blocks) in various cases

Release all blocks after the pipe is disabled, even when vsync
didn't happen in some error cases. Allow requesting SMB multiple
times before configuring to hardware, by releasing blocks not
programmed to hardware yet for shrinking case.

This fixes a potential leak of shared memory pool blocks.

Signed-off-by: Wentao Xu <[email protected]>
Tested-by: Archit Taneja <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: change to uninterruptible wait in atomic commit

The atomic commit cannot easily undo and return an error once the
state is swapped. Change to uninterruptible wait, and ignore the
timeout error.

Signed-off-by: Wentao Xu <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: mdp4: Fix drm_framebuffer dereference crash

mdp4_get_frame_format() can dereference a drm_framebuffer when it's NULL.
Call it in mdp4_plane_mode_set only when we know fb is non-NULL.

Signed-off-by: Archit Taneja <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: fix msm_gem_prime_get_sg_table()

We need to return a new sgt, since the caller takes ownership of it.

Reported-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/amdgpu: add new parameter to seperate map and unmap

Signed-off-by: monk.liu <[email protected]>
Reviewed-by: Christian König <[email protected]>

drm/amdgpu: hdp_flush is not needed for inside IB

hdp flush is not needed for IBs that dispatched from kernel inside
because there is no video memory host access

Signed-off-by: monk.liu <[email protected]>
Reviewed-by: Christian König <[email protected]>

drm/amdgpu: different emit_ib for gfx and compute

compute ring didn't use const engine byfar, so ignore CE things in
compute routine

Signed-off-by: monk.liu <[email protected]>
Reviewed-by: Christian König <[email protected]>

drm/amdgpu: information leak in amdgpu_info_ioctl()

We recently changed the drm_amdgpu_info_device struct so now there is
a 4 byte hole at the end. We need to initialize it so we don't disclose
secret information from the stack.

Fixes: fa92754e9c47 ('drm/amdgpu: add VCE harvesting instance query')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: clean up init sequence for failures

If we fail during device init, record what state each
block is in so that we can tear down clearly.

Fixes various problems on device init failure.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/combios: add some validation of lvds values

Fixes a broken hsync start value uncovered by:
abc0b1447d4974963548777a5ba4a4457c82c426
(drm: Perform basic sanity checks on probed modes)

The driver handled the bad hsync start elsewhere, but
the above commit prevented it from getting added.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=91401

Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/radeon: rework audio modeset to handle non-audio hdmi features

Need to setup the deep color and avi packets regardless of
audio setup.

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: rework audio detect (v4)

1. Always assign audio function pointers even if the display does
not support audio.  We need to properly disable the audio stream
when when using a non-audio capable monitor.  Fixes purple line
on some hdmi monitors.

2. Check if a pin is in use by another encoder before disabling
it.

v2: make sure we've fetched the edid before checking audio and
    look up the encoder before calling audio_detect since
    connector->encoder may not be assigned yet.  Separate
    pin and afmt.  They are allocated at different times and
    have no dependency on eachother.
v3: fix connector fetching in encoder functions
v4: fix missed dig->pin check in dce6_afmt_write_latency_fields

bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=93701
https://bugzilla.redhat.com/show_bug.cgi?id=1236337
https://bugs.freedesktop.org/show_bug.cgi?id=91041

Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amdgpu: Drop drm/ prefix for including drm.h in amdgpu_drm.h

This allows amdgpu_drm.h to be reused verbatim in libdrm.

Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Michel Dänzer <[email protected]>

drm/radeon: Drop drm/ prefix for including drm.h in radeon_drm.h

This allows radeon_drm.h to be reused verbatim in libdrm.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Michel Dänzer <[email protected]>

bridge: Fix network header pointer for vlan tagged packets

There are several devices that can receive vlan tagged packets with
CHECKSUM_PARTIAL like tap, possibly veth and xennet.
When (multiple) vlan tagged packets with CHECKSUM_PARTIAL are forwarded
by bridge to a device with the IP_CSUM feature, they end up with checksum
error because before entering bridge, the network header is set to
ETH_HLEN (not including vlan header length) in __netif_receive_skb_core(),
get_rps_cpu(), or drivers' rx functions, and nobody fixes the pointer later.

Since the network header is exepected to be ETH_HLEN in flow-dissection
and hash-calculation in RPS in rx path, and since the header pointer fix
is needed only in tx path, set the appropriate network header on forwarding
packets.

Signed-off-by: Toshiaki Makita <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dm cache: fix device destroy hang due to improper prealloc_used accounting

Commit 665022d72f9 ("dm cache: avoid calls to prealloc_free_structs() if
possible") introduced a regression that caused the removal of a DM cache
device to hang in cache_postsuspend()'s call to wait_for_migrations()
with the following stack trace:

  [<ffffffff81651457>] schedule+0x37/0x80
  [<ffffffffa041e21b>] cache_postsuspend+0xbb/0x470 [dm_cache]
  [<ffffffff810ba970>] ? prepare_to_wait_event+0xf0/0xf0
  [<ffffffffa0006f77>] dm_table_postsuspend_targets+0x47/0x60 [dm_mod]
  [<ffffffffa0001eb5>] __dm_destroy+0x215/0x250 [dm_mod]
  [<ffffffffa0004113>] dm_destroy+0x13/0x20 [dm_mod]
  [<ffffffffa00098cd>] dev_remove+0x10d/0x170 [dm_mod]
  [<ffffffffa00097c0>] ? dev_suspend+0x240/0x240 [dm_mod]
  [<ffffffffa0009f85>] ctl_ioctl+0x255/0x4d0 [dm_mod]
  [<ffffffff8127ac00>] ? SYSC_semtimedop+0x280/0xe10
  [<ffffffffa000a213>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
  [<ffffffff811fd432>] do_vfs_ioctl+0x2d2/0x4b0
  [<ffffffff81117d5f>] ? __audit_syscall_entry+0xaf/0x100
  [<ffffffff81022636>] ? do_audit_syscall_entry+0x66/0x70
  [<ffffffff811fd689>] SyS_ioctl+0x79/0x90
  [<ffffffff81023e58>] ? syscall_trace_leave+0xb8/0x110
  [<ffffffff81654f6e>] entry_SYSCALL_64_fastpath+0x12/0x71

Fix this by accounting for the call to prealloc_data_structs()
immediately _before_ the call as opposed to after.  This is needed
because it is possible to break out of the control loop after the call
to prealloc_data_structs() but before prealloc_used was set to true.

Signed-off-by: Mike Snitzer <[email protected]>

Revert "dm cache: do not wake_worker() in free_migration()"

This reverts commit 386cb7cdeeef97e0bf082a8d6bbfc07a2ccce07b.

Taking the wake_worker() out of free_migration() will slow writeback
dramatically, and hence adaptability.

Say we have 10k blocks that need writing back, but are only able to
issue 5 concurrently due to the migration bandwidth: it's imperative
that we wake_worker() immediately after migration completion; waiting
for the next 1 second wake up (via do_waker) means it'll take a long
time to write that all back.

Reported-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

ALSA: hda - Fix race between PM ops and HDA init/probe

PM ops could be triggered before HDA is done initializing
and cause PM to set HDA controller to D3Hot.  This can result
in "CORB reset timeout#2, CORBRP = 65535" and "no codecs
initialized".  Additionally, PM ops can be triggered before
azx_probe_continue finishes (async probe).  This can result
in a NULL deref kernel crash.

To fix this, avoid PM ops if !chip->running.

Signed-off-by: U. Artie Eoff <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
"Two bug fixes:

   - fix a crash on pre-z10 hardware due to cache-info

   - fix an issue with classic BPF programs in the eBPF JIT"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/cachinfo: add missing facility check to init_cache_level()
  s390/bpf: clear correct BPF accumulator register

Merge tag 'vfio-v4.2-rc5' of git://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:
"Fix a lockdep reported deadlock in device open error path"

* tag 'vfio-v4.2-rc5' of git://github.com/awilliam/linux-vfio:
vfio: Fix lockdep issue

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"This series is larger than what I'd normally be conformable with
  sending for a -rc5 PULL request..

  However, the bulk of the series is localized to qla2xxx target
  specific fixes that address a number of real-world correctness issues,
  that have been outstanding on the list for ~6 weeks now.  They where
  submitted + verified + acked by the HW LLD vendor, contributed by a
  major production customer of the code, and are marked for v3.18.y
  stable code.

  That said, I don't see a good reason to wait another month to get
  these fixes into mainline.

  Beyond the qla2xx specific fixes, this series also includes:

   - bugfix for a long standing use-after-free in iscsi-target during
     TPG shutdown + demo-mode sessions.

   - bugfix for a >= v4.0 regression OOPs in iscsi-target during a
     iscsi_start_kthreads() failure.

   - bugfix for a >= v4.0 regression hang in iscsi-target for iser
     explicit session/connection logout.

   - bugfix for a iser-target bug where a early CMA REJECTED status
     during login triggers a NULL pointer dereference OOPs.

   - bugfixes for a handful of v4.2-rc1 specific regressions related to
     the larger set of recent backend configfs attribute changes.

  A big thanks to QLogic + Pure Storage for the qla2xxx target bugfixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
  Documentation/target: Fix tcm_mod_builder.py build breakage
  iser-target: Fix REJECT CM event use-after-free OOPs
  iscsi-target: Fix iser explicit logout TX kthread leak
  iscsi-target: Fix iscsit_start_kthreads failure OOPs
  iscsi-target: Fix use-after-free during TPG session shutdown
  qla2xxx: terminate exchange when command is aborted by LIO
  qla2xxx: drop cmds/tmrs arrived while session is being deleted
  qla2xxx: disable scsi_transport_fc registration in target mode
  qla2xxx: added sess generations to detect RSCN update races
  qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives
  qla2xxx: delay plogi/prli ack until existing sessions are deleted
  qla2xxx: cleanup cmd in qla workqueue before processing TMR
  qla2xxx: kill sessions/log out initiator on RSCN and port down events
  qla2xxx: fix command initialization in target mode.
  qla2xxx: Remove msleep in qlt_send_term_exchange
  qla2xxx: adjust debug flags
  qla2xxx: release request queue reservation.
  qla2xxx: Add flush after updating ATIOQ consumer index.
  qla2xxx: Enable target mode for ISP27XX
  qla2xxx: Fix hardware lock/unlock issue causing kernel panic.
  ...

btrfs: add missing discards when unpinning extents with -o discard

When we clear the dirty bits in btrfs_delete_unused_bgs for extents
in the empty block group, it results in btrfs_finish_extent_commit being
unable to discard the freed extents.

The block group removal patch added an alternate path to forget extents
other than btrfs_finish_extent_commit. As a result, any extents that
would be freed when the block group is removed aren't discarded. In my
test run, with a large copy of mixed sized files followed by removal, it
left nearly 2/3 of extents undiscarded.

To clean up the block groups, we add the removed block group onto a list
that will be discarded after transaction commit.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

btrfs: explictly delete unused block groups in close_ctree and ro-remount

The cleaner thread may already be sleeping by the time we enter
close_ctree. If that's the case, we'll skip removing any unused
block groups queued for removal, even during a normal umount.
They'll be cleaned up automatically at next mount, but users
expect a umount to be a clean synchronization point, especially
when used on thin-provisioned storage with -odiscard. We also
explicitly remove unused block groups in the ro-remount path
for the same reason.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

btrfs: iterate over unused chunk space in FITRIM

Since we now clean up block groups automatically as they become
empty, iterating over block groups is no longer sufficient to discard
unused space.

This patch iterates over the unused chunk space and discards any regions
that are unallocated, regardless of whether they were ever used.  This is
a change for btrfs but is consistent with other file systems.

We do this in a transactionless manner since the discard process can take
a substantial amount of time and a transaction would need to be started
before the acquisition of the device list lock.  That would mean a
transaction would be held open across /all/ of the discards collectively.
In order to prevent other threads from allocating or freeing chunks, we
hold the chunks lock across the search and discard calls.  We release it
between searches to allow the file system to perform more-or-less
normally.  Since the running transaction can commit and disappear while
we're using the transaction pointer, we take a reference to it and
release it after the search.  This is safe since it would happen normally
at the end of the transaction commit after any locks are released anyway.
We also take the commit_root_sem to protect against a transaction starting
and committing while we're running.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

btrfs: skip superblocks during discard

Btrfs doesn't track superblocks with extent records so there is nothing
persistent on-disk to indicate that those blocks are in use.  We track
the superblocks in memory to ensure they don't get used by removing them
from the free space cache when we load a block group from disk.  Prior
to 47ab2a6c6a (Btrfs: remove empty block groups automatically), that
was fine since the block group would never be reclaimed so the superblock
was always safe.  Once we started removing the empty block groups, we
were protected by the fact that discards weren't being properly issued
for unused space either via FITRIM or -odiscard.  The block groups were
still being released, but the blocks remained on disk.

In order to properly discard unused block groups, we need to filter out
the superblocks from the discard range.  Superblocks are located at fixed
locations on each device, so it makes sense to filter them out in
btrfs_issue_discard, which is used by both -odiscard and FITRIM.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

btrfs: btrfs_issue_discard ensure offset/length are aligned to sector boundaries

It's possible, though unexpected, to pass unaligned offsets and lengths
to btrfs_issue_discard.  We then shift the offset/length values to sector
units.  If an unaligned offset has been passed, it will result in the
entire sector being discarded, possibly losing data.  An unaligned
length is safe but we'll end up returning an inaccurate number of
discarded bytes.

This patch aligns the offset to the 512B boundary, adjusts the length,
and warns, since we shouldn't be discarding on an offset that isn't
aligned with our sector size.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

btrfs: make btrfs_issue_discard return bytes discarded

Initially this will just be the length argument passed to it,
but the following patches will adjust that to reflect re-alignment
and skipped blocks.

Signed-off-by: Jeff Mahoney <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Tested-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

Merge branches 'pm-cpufreq' and 'acpi-pm'

* pm-cpufreq:
  cpufreq: Avoid attempts to create duplicate symbolic links
  intel_pstate: Add get_scaling cpu_defaults param to Knights Landing

* acpi-pm:
  ACPI / PM: Use target_state to set the device power state

drm/i915: Replace WARN inside I915_READ64_2x32 with retry loop

Since we may conceivably encounter situations where the upper part of the
64bit register changes between reads, for example when a timestamp
counter overflows, change the WARN into a retry loop.

Signed-off-by: Chris Wilson <[email protected]>
Cc: Michał Winiarski <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>

ALSA: usb-audio: add dB range mapping for some devices

Add the correct dB ranges of Bose Companion 5 and Drangonfly DAC 1.2.

Signed-off-by: Yao-Wen Mao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

Merge branch 'linux-4.2' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Two more nouveau fixes.

* 'linux-4.2' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/nouveau/ttm: fix tiled system memory with Maxwell
drm/nouveau/kms/nv50-: guard against enabling cursor on disabled heads

packet: tpacket_snd(): fix signed/unsigned comparison

tpacket_fill_skb() can return a negative value (-errno) which
is stored in tp_len variable. In that case the following
condition will be (but shouldn't be) true:

tp_len > dev->mtu + dev->hard_header_len

as dev->mtu and dev->hard_header_len are both unsigned.

That may lead to just returning an incorrect EMSGSIZE errno
to the user.

Fixes: 52f1454f629fa ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
Signed-off-by: Alexander Drozdov <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ALSA: hda - Apply a fixup to Dell Vostro 5480

Dell Vostro 5480 (1028:069a) needs the very same quirk used for Vostro
5470 model to make bass speakers properly working.

Reported-and-tested-by: Paulo Roberto de Oliveira Castro <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

arp: filter NOARP neighbours for SIOCGARP

When arp is off on a device, and ioctl(SIOCGARP) is queried,
a buggy answer is given with MAC address of the device, instead
of the mac address of the destination/gateway.

We filter out NUD_NOARP neighbours for /proc/net/arp,
we must do the same for SIOCGARP ioctl.

Tested:

lpaa23:~# ./arp 10.246.7.190
MAC=00:01:e8:22:cb:1d      // correct answer

lpaa23:~# ip link set dev eth0 arp off
lpaa23:~# cat /proc/net/arp   # check arp table is now 'empty'
IP address       HW type     Flags       HW address    Mask     Device
lpaa23:~# ./arp 10.246.7.190
MAC=00:1a:11:c3:0d:7f   // buggy answer before patch (this is eth0 mac)

After patch :

lpaa23:~# ip link set dev eth0 arp off
lpaa23:~# ./arp 10.246.7.190
ioctl(SIOCGARP) failed: No such device or address

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Vytautas Valancius <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv4: suppress NETDEV_UP notification on address lifetime update

This notification causes the FIB to be updated, which is not needed
because the address already exists, and more importantly it may undo
intentional changes that were made to the FIB after the address was
originally added. (As a point of comparison, when an address becomes
deprecated because its preferred lifetime expired, a notification on
this chain is not generated.)

The motivation for this commit is fixing an incompatibility between
DHCP clients which set and update the address lifetime according to
the lease, and a commercial VPN client which replaces kernel routes
in a way that outbound traffic is sent only through the tunnel (and
disconnects if any further route changes are detected via netlink).

Signed-off-by: David Ward <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: stp: when using userspace stp stop kernel hello and hold timers

These should be handled only by the respective STP which is in control.
They become problematic for devices with limited resources with many
ports because the hold_timer is per port and fires each second and the
hello timer fires each 2 seconds even though it's global. While in
user-space STP mode these timers are completely unnecessary so it's better
to keep them off.
Also ensure that when the bridge is up these timers are started only when
running with kernel STP.

Signed-off-by: Satish Ashok <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xfs: remote attributes need to be considered data

We don't log remote attribute contents, and instead write them
synchronously before we commit the block allocation and attribute
tree update transaction. As a result we are writing to the allocated
space before the allcoation has been made permanent.

As a result, we cannot consider this allocation to be a metadata
allocation. Metadata allocation can take blocks from the free list
and so reuse them before the transaction that freed the block is
committed to disk. This behaviour is perfectly fine for journalled
metadata changes as log recovery will ensure the free operation is
replayed before the overwrite, but for remote attribute writes this
is not the case.

Hence we have to consider the remote attribute blocks to contain
data and allocate accordingly. We do this by dropping the
XFS_BMAPI_METADATA flag from the block allocation. This means the
allocation will not use blocks that are on the busy list without
first ensuring that the freeing transaction has been committed to
disk and the blocks removed from the busy list. This ensures we will
never overwrite a freed block without first ensuring that it is
really free.

cc: <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: remote attribute headers contain an invalid LSN

In recent testing, a system that crashed failed log recovery on
restart with a bad symlink buffer magic number:

XFS (vda): Starting recovery (logdev: internal)
XFS (vda): Bad symlink block magic!
XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060

On examination of the log via xfs_logprint, none of the symlink
buffers in the log had a bad magic number, nor were any other types
of buffer log format headers mis-identified as symlink buffers.
Tracing was used to find the buffer the kernel was tripping over,
and xfs_db identified it's contents as:

000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e
020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d
040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
.....

This is a remote attribute buffer, which are notable in that they
are not logged but are instead written synchronously by the remote
attribute code so that they exist on disk before the attribute
transactions are committed to the journal.

The above remote attribute block has an invalid LSN in it - cycle
0xd002000, block 0 - which means when log recovery comes along to
determine if the transaction that writes to the underlying block
should be replayed, it sees a block that has a future LSN and so
does not replay the buffer data in the transaction. Instead, it
validates the buffer magic number and attaches the buffer verifier
to it. It is this buffer magic number check that is failing in the
above assert, indicating that we skipped replay due to the LSN of
the underlying buffer.

The problem here is that the remote attribute buffers cannot have a
valid LSN placed into them, because the transaction that contains
the attribute tree pointer changes and the block allocation that the
attribute data is being written to hasn't yet been committed. Hence
the LSN field in the attribute block is completely unwritten,
thereby leaving the underlying contents of the block in the LSN
field. It could have any value, and hence a future overwrite of the
block by log recovery may or may not work correctly.

Fix this by always writing an invalid LSN to the remote attribute
block, as any buffer in log recovery that needs to write over the
remote attribute should occur. We are protected from having old data
written over the attribute by the fact that freeing the block before
the remote attribute is written will result in the buffer being
marked stale in the log and so all changes prior to the buffer stale
transaction will be cancelled by log recovery.

Hence it is safe to ignore the LSN in the case or synchronously
written, unlogged metadata such as remote attribute blocks, and to
ensure we do that correctly, we need to write an invalid LSN to all
remote attribute blocks to trigger immediate recovery of metadata
that is written over the top.

As a further protection for filesystems that may already have remote
attribute blocks with bad LSNs on disk, change the log recovery code
to always trigger immediate recovery of metadata over remote
attribute blocks.

cc: <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: call dax_fault on read page faults for DAX

When modifying the patch series to handle the XFS MMAP_LOCK nesting
of page faults, I botched the conversion of the read page fault
path, and so it is only every calling through the page cache. Re-add
the necessary __dax_fault() call for such files.

Because the get_blocks callback on read faults may not set up the
mapping buffer correctly to allow unwritten extent completion to be
run, we need to allow callers of __dax_fault() to pass a null
complete_unwritten() callback. The DAX code always zeros the
unwritten page when it is read faulted so there are no stale data
exposure issues with not doing the conversion. The only downside
will be the potential for increased CPU overhead on repeated read
faults of the same page. If this proves to be a problem, then the
filesystem needs to fix it's get_block callback and provide a
convert_unwritten() callback to the read fault path.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:

- two minor bug fixes

- relicense ocrdma driver to dual license, GPL or BSD

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/ocrdma: update ocrdma module license string
  RDMA/ocrdma: update ocrdma license to dual-license
  IB/ipoib: Fix CONFIG_INFINIBAND_IPOIB_CM
  RDMA/cxgb3: fail get_dma_mr on 64 bit arches

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull key fix from James Morris.

Fix memory leak.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: ensure we free the assoc array edit if edit is valid

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
"Fix buffer overflow when UTF-16 UEFI vendor string is copied from the
system table into a char array with a size of 100 bytes"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/efi: map the entire UEFI vendor string before reading it

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32

Pull AVR32 fix from Hans-Christian Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: handle NULL as a valid clock object

Revert "Input: zforce - don't overwrite the stack"

This reverts commit 7d01cd261c76f95913c81554a751968a1d282d3a because
with given FRAME_MAXSIZE of 257 the check will never trigger and it
causes warnings from GCC (with -Wtype-limits). Also the check was
incorrect as it was not accounting for the already read 2 bytes of data
stored in the buffer.

Merge tag 'devicetree-fixes-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:
"A handful of DT related fixes for 4.2-rc"

* tag 'devicetree-fixes-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: Drop owner assignment from platform and i2c driver
  DEVICETREE: Misc fix for the AR7100 SPI controller binding
  of: constify drv arg of of_driver_match_device stub
  of: add HAS_IOMEM depends to OF_ADDRESS

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull vhost fixes from Michael Tsirkin:
"Two bugfixes only here"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: fix error handling for memory region alloc
vhost: actually track log eventfd file

Merge tag 'linux-kselftest-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan.

* tag 'linux-kselftest-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/futex: Fix futex_cmp_requeue_pi() error handling

Merge tag 'nfs-for-4.2-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:

  Stable patches:
   - Fix a situation where the client uses the wrong (zero) stateid.
   - Fix a memory leak in nfs_do_recoalesce

  Bugfixes:
   - Plug a memory leak when ->prepare_layoutcommit fails
   - Fix an Oops in the NFSv4 open code
   - Fix a backchannel deadlock
   - Fix a livelock in sunrpc when sendmsg fails due to low memory
     availability
   - Don't revalidate the mapping if both size and change attr are up to
     date
   - Ensure we don't miss a file extension when doing pNFS
   - Several fixes to handle NFSv4.1 sequence operation status bits
     correctly
   - Several pNFS layout return bugfixes"

* tag 'nfs-for-4.2-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits)
  nfs: Fix an oops caused by using other thread's stack space in ASYNC mode
  nfs: plug memory leak when ->prepare_layoutcommit fails
  SUNRPC: Report TCP errors to the caller
  sunrpc: translate -EAGAIN to -ENOBUFS when socket is writable.
  NFSv4.2: handle NFS-specific llseek errors
  NFS: Don't clear desc->pg_moreio in nfs_do_recoalesce()
  NFS: Fix a memory leak in nfs_do_recoalesce
  NFS: nfs_mark_for_revalidate should always set NFS_INO_REVAL_PAGECACHE
  NFS: Remove the "NFS_CAP_CHANGE_ATTR" capability
  NFS: Set NFS_INO_REVAL_PAGECACHE if the change attribute is uninitialised
  NFS: Don't revalidate the mapping if both size and change attr are up to date
  NFSv4/pnfs: Ensure we don't miss a file extension
  NFSv4: We must set NFS_OPEN_STATE flag in nfs_resync_open_stateid_locked
  SUNRPC: xprt_complete_bc_request must also decrement the free slot count
  SUNRPC: Fix a backchannel deadlock
  pNFS: Don't throw out valid layout segments
  pNFS: pnfs_roc_drain() fix a race with open
  pNFS: Fix races between return-on-close and layoutreturn.
  pNFS: pnfs_roc_drain should return 'true' when sleeping
  pNFS: Layoutreturn must invalidate all existing layout segments.
  ...

Merge tag 'for-f2fs-v4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim.

* tag 'for-f2fs-v4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: call set_page_dirty to attach i_wb for cgroup
f2fs: handle error cases in move_encrypted_block

cpufreq: Avoid attempts to create duplicate symbolic links

After commit 87549141d516 (cpufreq: Stop migrating sysfs files on
hotplug) there is a problem with CPUs that share cpufreq policy
objects with other CPUs and are initially offline.

Say CPU1 shares a policy with CPU0 which is online and is registered
first.  As part of the registration process, cpufreq_add_dev() is
called for it.  It creates the policy object and a symbolic link
to it from the CPU1's sysfs directory.  If CPU1 is registered
subsequently and it is offline at that time, cpufreq_add_dev() will
attempt to create a symbolic link to the policy object for it, but
that link is present already, so a warning about that will be
triggered.

To avoid that warning, make cpufreq use an additional CPU mask
containing related CPUs that are actually present for each policy
object.  That mask is initialized when the policy object is populated
after its creation (for the first online CPU using it) and it includes
CPUs from the "policy CPUs" mask returned by the cpufreq driver's
->init() callback that are physically present at that time.  Symbolic
links to the policy are created only for the CPUs in that mask.

If cpufreq_add_dev() is invoked for an offline CPU, it checks the
new mask and only creates the symlink if the CPU was not in it (the
CPU is added to the mask at the same time).

In turn, cpufreq_remove_dev() drops the given CPU from the new mask,
removes its symlink to the policy object and returns, unless it is
the CPU owning the policy object.  In that case, the policy object
is moved to a new CPU's sysfs directory or deleted if the CPU being
removed was the last user of the policy.

While at it, notice that cpufreq_remove_dev() can't fail, because
its return value is ignored, so make it ignore return values from
__cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish()
and prevent these functions from aborting on errors returned by
__cpufreq_governor().  Also drop the now unused sif argument from
them.

Fixes: 87549141d516 (cpufreq: Stop migrating sysfs files on hotplug)
Signed-off-by: Rafael J. Wysocki <[email protected]>
Reported-and-tested-by: Russell King <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

ACPI / PM: Use target_state to set the device power state

Commit 20dacb71ad28 ("ACPI / PM: Rework device power management to follow
ACPI 6") changed the device power management to use D3hot if the device
in question does not have _PR3 method even if D3cold was requested by the
caller.

However, if the device has _PR3 device->power.state is also set to D3hot
instead of D3Cold after power resources have been turned off because
device->power.state will be assigned from "state" instead of
"target_state".

Next time the device is transitioned to D0, acpi_power_transition() will
find that the current power state of the device is D3hot instead of D3cold
which causes it to power down all resources required for the current
(wrong) state D3hot.

Below is a simplified ASL example of a real touch panel device which
triggers the problem:

  Scope (TPL1)
  {
      Name (_PR0, Package (1) { \_SB.PCI0.I2C1.PXTC })
      Name (_PR3, Package (1) { \_SB.PCI0.I2C1.PXTC })
      ...
  }

In both D0 and D3hot the same power resource is required. However, when
acpi_power_transition() turns off power resources required for D3hot (as
the device is transitioned to D0) it powers down PXTC which then makes the
device to lose its power.

Fix this by assigning "target_state" to the device power state instead of
"state" that is always D3hot even for devices with valid _PR3.

Fixes: 20dacb71ad28 (ACPI / PM: Rework device power management to follow ACPI 6)
Signed-off-by: Mika Westerberg <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

nfs: Fix an oops caused by using other thread's stack space in ASYNC mode

An oops caused by using other thread's stack space in sunrpc ASYNC sending thread.

[ 9839.007187] ------------[ cut here ]------------
[ 9839.007923] kernel BUG at fs/nfs/nfs4xdr.c:910!
[ 9839.008069] invalid opcode: 0000 [#1] SMP
[ 9839.008069] Modules linked in: blocklayoutdriver rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm joydev iosf_mbi crct10dif_pclmul snd_timer crc32_pclmul crc32c_intel ghash_clmulni_intel snd soundcore ppdev pvpanic parport_pc i2c_piix4 serio_raw virtio_balloon parport acpi_cpufreq nfsd nfs_acl lockd grace auth_rpcgss sunrpc qxl drm_kms_helper virtio_net virtio_console virtio_blk ttm drm virtio_pci virtio_ring virtio ata_generic pata_acpi
[ 9839.008069] CPU: 0 PID: 308 Comm: kworker/0:1H Not tainted 4.0.0-0.rc4.git1.3.fc23.x86_64 #1
[ 9839.008069] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 9839.008069] Workqueue: rpciod rpc_async_schedule [sunrpc]
[ 9839.008069] task: ffff8800d8b4d8e0 ti: ffff880036678000 task.ti: ffff880036678000
[ 9839.008069] RIP: 0010:[<ffffffffa0339cc9>]  [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4]
[ 9839.008069] RSP: 0018:ffff88003667ba58  EFLAGS: 00010246
[ 9839.008069] RAX: 0000000000000000 RBX: 000000001fc15e18 RCX: ffff8800c0193800
[ 9839.008069] RDX: ffff8800e4ae3f24 RSI: 000000001fc15e2c RDI: ffff88003667bcd0
[ 9839.008069] RBP: ffff88003667ba58 R08: ffff8800d9173008 R09: 0000000000000003
[ 9839.008069] R10: ffff88003667bcd0 R11: 000000000000000c R12: 0000000000010000
[ 9839.008069] R13: ffff8800d9173350 R14: 0000000000000000 R15: ffff8800c0067b98
[ 9839.008069] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 9839.008069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9839.008069] CR2: 00007f988c9c8bb0 CR3: 00000000d99b6000 CR4: 00000000000407f0
[ 9839.008069] Stack:
[ 9839.008069]  ffff88003667bbc8 ffffffffa03412c5 00000000c6c55680 ffff880000000003
[ 9839.008069]  0000000000000088 00000010c6c55680 0001000000000002 ffffffff816e87e9
[ 9839.008069]  0000000000000000 00000000477290e2 ffff88003667bab8 ffffffff81327ba3
[ 9839.008069] Call Trace:
[ 9839.008069]  [<ffffffffa03412c5>] encode_attrs+0x435/0x530 [nfsv4]
[ 9839.008069]  [<ffffffff816e87e9>] ? inet_sendmsg+0x69/0xb0
[ 9839.008069]  [<ffffffff81327ba3>] ? selinux_socket_sendmsg+0x23/0x30
[ 9839.008069]  [<ffffffff8164c1df>] ? do_sock_sendmsg+0x9f/0xc0
[ 9839.008069]  [<ffffffff8164c278>] ? kernel_sendmsg+0x58/0x70
[ 9839.008069]  [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc]
[ 9839.008069]  [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc]
[ 9839.008069]  [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4]
[ 9839.008069]  [<ffffffffa03419a5>] encode_open+0x2d5/0x340 [nfsv4]
[ 9839.008069]  [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4]
[ 9839.008069]  [<ffffffffa011ab89>] ? xdr_encode_opaque+0x19/0x20 [sunrpc]
[ 9839.008069]  [<ffffffffa0339cfb>] ? encode_string+0x2b/0x40 [nfsv4]
[ 9839.008069]  [<ffffffffa0341bf3>] nfs4_xdr_enc_open+0xb3/0x140 [nfsv4]
[ 9839.008069]  [<ffffffffa0110a4c>] rpcauth_wrap_req+0xac/0xf0 [sunrpc]
[ 9839.008069]  [<ffffffffa01017db>] call_transmit+0x18b/0x2d0 [sunrpc]
[ 9839.008069]  [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc]
[ 9839.008069]  [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc]
[ 9839.008069]  [<ffffffffa010caa0>] __rpc_execute+0x90/0x460 [sunrpc]
[ 9839.008069]  [<ffffffffa010ce85>] rpc_async_schedule+0x15/0x20 [sunrpc]
[ 9839.008069]  [<ffffffff810b452b>] process_one_work+0x1bb/0x410
[ 9839.008069]  [<ffffffff810b47d3>] worker_thread+0x53/0x470
[ 9839.008069]  [<ffffffff810b4780>] ? process_one_work+0x410/0x410
[ 9839.008069]  [<ffffffff810b4780>] ? process_one_work+0x410/0x410
[ 9839.008069]  [<ffffffff810ba7b8>] kthread+0xd8/0xf0
[ 9839.008069]  [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180
[ 9839.008069]  [<ffffffff81786418>] ret_from_fork+0x58/0x90
[ 9839.008069]  [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180
[ 9839.008069] Code: 00 00 48 c7 c7 21 fa 37 a0 e8 94 1c d6 e0 c6 05 d2 17 05 00 01 8b 03 eb d7 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 89 f3
[ 9839.008069] RIP  [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4]
[ 9839.008069]  RSP <ffff88003667ba58>
[ 9839.071114] ---[ end trace cc14c03adb522e94 ]---

Signed-off-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

nfs: plug memory leak when ->prepare_layoutcommit fails

"data" is currently leaked when the prepare_layoutcommit operation
returns an error. Put the cred before taking the spinlock in that
case, take the lock and then goto out_unlock which will drop the
lock and then free "data".

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

Merge tag 'imx-fixes-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.2, 2nd round:
- Add the required second clock for i.MX35 FlexCAN in device tree,
so that the device can be probed by kernel successfully.

* tag 'imx-fixes-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: i.MX35: Fix can support.

Signed-off-by: Olof Johansson <[email protected]>

drm/nouveau/nouveau/ttm: fix tiled system memory with Maxwell

Add Maxwell to the switch statement that sets node->memtype, otherwise
all tiling information is ignored for buffers in system memory.

While we are at it, make that switch statement explicitly complain the
next time we meet a non-handled card family.

Signed-off-by: Alexandre Courbot <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>

drm/nouveau/kms/nv50-: guard against enabling cursor on disabled heads

Userspace has started doing this, which upsets the display class hw
error checking in various unpleasant ways.

Signed-off-by: Ben Skeggs <[email protected]>

s390/cachinfo: add missing facility check to init_cache_level()

Stephen Powell reported the following crash on a z890 machine:

Kernel BUG at 00000000001219d0 [verbose debug info unavailable]
illegal operation: 0001 ilc:3 [#1] SMP
Krnl PSW : 0704e00180000000 00000000001219d0 (init_cache_level+0x38/0xe0)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl Code: 00000000001219c2: a7840056 brc 8,121a6e
   00000000001219c6: a7190000 lghi %r1,0
  #00000000001219ca: eb101000004c ecag %r1,%r0,0(%r1)
  >00000000001219d0: a7390000 lghi %r3,0
   00000000001219d4: e310f0a00024 stg %r1,160(%r15)
   00000000001219da: a7080000 lhi %r0,0
   00000000001219de: a7b9f000 lghi %r11,-4096
   00000000001219e2: c0a0002899d9 larl %r10,634d94
Call Trace:
[<0000000000478ee2>] detect_cache_attributes+0x2a/0x2b8
[<000000000097c9b0>] cacheinfo_sysfs_init+0x60/0xc8
[<00000000001001c0>] do_one_initcall+0x98/0x1c8
[<000000000094fdc2>] kernel_init_freeable+0x212/0x2d8
[<000000000062352e>] kernel_init+0x26/0x118
[<000000000062fd2e>] kernel_thread_starter+0x6/0xc

The illegal operation was executed because of a missing facility check,
which should have made sure that the ECAG execution would only be executed
on machines which have the general-instructions-extension facility
installed.

Reported-and-tested-by: Stephen Powell <[email protected]>
Cc: [email protected] # v4.0+
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

KEYS: ensure we free the assoc array edit if edit is valid

__key_link_end is not freeing the associated array edit structure
and this leads to a 512 byte memory leak each time an identical
existing key is added with add_key().

The reason the add_key() system call returns okay is that
key_create_or_update() calls __key_link_begin() before checking to see
whether it can update a key directly rather than adding/replacing - which
it turns out it can. Thus __key_link() is not called through
__key_instantiate_and_link() and __key_link_end() must cancel the edit.

CVE-2015-1333

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: James Morris <[email protected]>

Merge branch 'linux-4.2' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Various minor fixes all over the place, nothing too scary.

* 'linux-4.2' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau/fbcon/g80: reduce PUSH_SPACE alloc, fire ring on accel init
  drm/nouveau/fbcon/gf100-: reduce RING_SPACE allocation
  drm/nouveau/fbcon/nv11-: correctly account for ring space usage
  drm/nouveau/bios: add proper support for opcode 0x59
  drm/nouveau/bios: add 0x59 and 0x5a opcodes
  drm/nouveau/disp: Use NULL for pointers
  drm/nouveau/pm: fix a potential race condition when creating an engine context
  drm/nouveau/pm: prevent freeing the wrong engine context
  drm/nouveau/gr/gf100: wait for GR idle after GO_IDLE bundle
  drm/nouveau/gr/gf100: wait on bottom half of FE's pipeline
  drm/nouveau/fifo/gk104: kick channels when deactivating them
  drm/nouveau/ibus/gk20a: increase SM wait timeout
  drm/nouveau/platform: fix compile error if !CONFIG_IOMMU
  drm/nouveau: Do not leak client objects
  drm/nouveau/clk/gt215: u32->s32 for difference in req. and set clock
  drm/nouveau/drm/nv04-nv40/instmem: protect access to priv->heap by mutex
  drm/nouveau: hold mutex when calling nouveau_abi16_fini()

Input: bcm5974 - add support for the 2015 Macbook Pro

Add support for the MacBookPro12,1 model. This patch needs to be
applied together with the accompanied HID patch, as usual.

Tested-by: John Horan <[email protected]>
Tested-by: Jochen Radmacher <[email protected]>
Tested-by: Yang Hongyang <[email protected]>
Tested-by: Yen-Chin, Lee <[email protected]>
Tested-by: George Hilios <[email protected]>
Tested-by: Janez Urevc <[email protected]>
Signed-off-by: Henrik Rydberg <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

HID: apple: Add support for the 2015 Macbook Pro

This patch adds keyboard support for MacbookPro12,1 as WELLSPRING9
(0x0272, 0x0273, 0x0274). The touchpad is handled in a separate
bcm5974 patch, as usual.

Tested-by: John Horan <[email protected]>
Tested-by: Jochen Radmacher <[email protected]>
Tested-by: Yang Hongyang <[email protected]>
Tested-by: Yen-Chin, Lee <[email protected]>
Tested-by: George Hilios <[email protected]>
Tested-by: Janez Urevc <[email protected]>
Signed-off-by: Henrik Rydberg <[email protected]>
Acked-by: Jiri Kosina <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

Input: bcm5974 - prepare for a new trackpad generation

With the advent of the Macbook Pro 12, we see a new generation of
trackpads, capable of force sensoring and haptic feedback.

This patch prepares for the new device by adding configuration data
for the code paths that would otherwise look different.

Tested-by: John Horan <[email protected]>
Tested-by: Jochen Radmacher <[email protected]>
Tested-by: Yang Hongyang <[email protected]>
Tested-by: Yen-Chin, Lee <[email protected]>
Tested-by: George Hilios <[email protected]>
Tested-by: Janez Urevc <[email protected]>
Signed-off-by: Henrik Rydberg <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

Input: synaptics - dump ext10 capabilities as well

Make extended capabilities obtained through $10 query also available in
touchpad identification.

Signed-off-by: Jiri Kosina <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

packet: missing dev_put() in packet_do_bind()

When binding a PF_PACKET socket, the use count of the bound interface is
always increased with dev_hold in dev_get_by_{index,name}. However,
when rebound with the same protocol and device as in the previous bind
the use count of the interface was not decreased. Ultimately, this
caused the deletion of the interface to fail with the following message:

unregister_netdevice: waiting for dummy0 to become free. Usage count = 1

This patch moves the dev_put out of the conditional part that was only
executed when either the protocol or device changed on a bind.

Fixes: 902fefb82ef7 ('packet: improve socket create/bind latency in some cases')
Signed-off-by: Lars Westerhoff <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

SUNRPC: Report TCP errors to the caller

Signed-off-by: Trond Myklebust <[email protected]>

macvtap: fix network header pointer for VLAN tagged pkts

Network header is set with offset ETH_HLEN but it is not true for VLAN
(multiple-)tagged and results in checksum issues in lower devices.

v2: leave skb->protocol untouched (thx Vlad), comment added
v3: moved after skb_probe_transport_header() call (thx Toshiaki)

Signed-off-by: Ivan Vecera <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

fib_trie: Drop unnecessary calls to leaf_pull_suffix

It was reported that update_suffix was taking a long time on systems where
a large number of leaves were attached to a single node. As it turns out
fib_table_flush was calling update_suffix for each leaf that didn't have all
of the aliases stripped from it. As a result, on this large node removing
one leaf would result in us calling update_suffix for every other leaf on
the node.

The fix is to just remove the calls to leaf_pull_suffix since they are
redundant as we already have a call in resize that will go through and
update the suffix length for the node before we exit out of
fib_table_flush or fib_table_flush_external.

Reported-by: David Ahern <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macb: Fix build with macro'ized readl/writel.

If an architecture defines readl/writel using CPP macros, we
get the following kinds of build failure:

> > > drivers/net/ethernet/cadence/macb.c:164:1: error: macro "writel"
> > > passed 3 arguments, but takes just 2
> macb_or_gem_writel(bp, SA1B, bottom);
> ^

Rename the methods so that this doesn't happen.

Signed-off-by: David S. Miller <[email protected]>

arm64/efi: map the entire UEFI vendor string before reading it

At boot, the UTF-16 UEFI vendor string is copied from the system
table into a char array with a size of 100 bytes. However, this
size of 100 bytes is also used for memremapping() the source,
which may not be sufficient if the vendor string exceeds 50
UTF-16 characters, and the placement of the vendor string inside
a 4 KB page happens to leave the end unmapped.

So use the correct '100 * sizeof(efi_char16_t)' for the size of
the mapping.

Signed-off-by: Ard Biesheuvel <[email protected]>
Fixes: f84d02755f5a ("arm64: add EFI runtime services")
Cc: <[email protected]> # 3.16+
Signed-off-by: Catalin Marinas <[email protected]>

sunrpc: translate -EAGAIN to -ENOBUFS when socket is writable.

The networking layer does not reliably report the distinction between
a non-block write failing because:
1/ the queue is too full already and
2/ a memory allocation attempt failed.

The distinction is important because in the first case it is
appropriate to retry as soon as the socket reports that it is
writable, and in the second case a small delay is required as the
socket will most likely report as writable but kmalloc could still
fail.

sk_stream_wait_memory() exhibits this distinction nicely, setting
'vm_wait' if a small wait is needed.  However in the non-blocking case
it always returns -EAGAIN no matter the cause of the failure.  This
-EAGAIN call get all the way to sunrpc.

The sunrpc layer expects EAGAIN to indicate the first cause, and
ENOBUFS to indicate the second.  Various documentation suggests that
this is not unreasonable, but does not guarantee the desired error
codes.

The result of getting -EAGAIN when -ENOBUFS is expected is that the
send is tried again in a tight loop and soft lockups are reported.

so: add tests after calls to xs_sendpages() to translate -EAGAIN into
-ENOBUFS if the socket is writable.  This cannot happen inside
xs_sendpages() as the test for "is socket writable" is different
between TCP and UDP.

With this change, the tight loop retrying xs_sendpages() becomes a
loop which only retries every 250ms, and so will not trigger a
soft-lockup warning.

It is possible that the write did fail because the queue was too full
and by the time xs_sendpages() completed, the queue was writable
again.  In this case an extra 250ms delay is inserted that isn't
really needed.  This circumstance suggests a degree of congestion so a
delay is not necessarily a bad thing, and it can only cause a single
250ms delay, not a series of them.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4.2: handle NFS-specific llseek errors

Handle NFS-specific llseek errors instead of letting them leak out to
userspace.

Reported-by: Benjamin Coddington <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

vhost: fix error handling for memory region alloc

callers of vhost_kvzalloc() expect the same behaviour on
allocation error as from kmalloc/vmalloc i.e. NULL return
value. So just return vzmalloc() returned value instead of
returning ERR_PTR(-ENOMEM)

Fixes: 4de7255f7d2be5 ("vhost: extend memory regions allocation to vmalloc")
Spotted-by: Dan Carpenter <[email protected]>
Suggested-by: Julia Lawall <[email protected]>
Signed-off-by: Igor Mammedov <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>