Git Repo - linux.git/log

pNFS: The client must not do I/O to the DS if it's lease has expired

Ensure that the client conforms to the normative behaviour described in
RFC5661 Section 12.7.2: "If a client believes its lease has expired,
it MUST NOT send I/O to the storage device until it has validated its
lease."

So ensure that we wait for the lease to be validated before using
the layout.

Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v3.20+

vhost/scsi: fix reuse of &vq->iov[out] in response

The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi
command's response iovec throughout the lifetime of the command. Rather, it
is more likely to contain an iovec from an immediately following command
after looping back around to vhost_get_vq_desc(). Pass along the iovec
entirely instead.

Fixes: 79c14141a487 ("vhost/scsi: Convert completion path to use copy_to_iter")
Cc: [email protected]
Signed-off-by: Benjamin Coddington <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ASoC: omap-mcpdm: Fix irq resource handling

Fixes: ddd17531ad908 ("ASoC: omap-mcpdm: Clean up with devm_* function")
Managed irq request will not doing any good in ASoC probe level as it is
not going to free up the irq when the driver is unbound from the sound
card.

Signed-off-by: Peter Ujfalusi <[email protected]>
Reported-by: Russell King <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

net sched: fix encoding to use real length

Encoding of the metadata was using the padded length as opposed to
the real length of the data which is a bug per specification.
This has not been an issue todate because all metadatum specified
so far has been 32 bit where aligned and data length are the same width.
This also includes a bug fix for validating the length of a u16 field.
But since there is no metadata of size u16 yes we are fine to include it
here.

While at it get rid of magic numbers.

Fixes: ef6980b6becb ("net sched: introduce IFE action")
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

usercopy: fix overlap check for kernel text

When running with a local patch which moves the '_stext' symbol to the
very beginning of the kernel text area, I got the following panic with
CONFIG_HARDENED_USERCOPY:

  usercopy: kernel memory exposure attempt detected from ffff88103dfff000 (<linear kernel text>) (4096 bytes)
  ------------[ cut here ]------------
  kernel BUG at mm/usercopy.c:79!
  invalid opcode: 0000 [#1] SMP
  ...
  CPU: 0 PID: 4800 Comm: cp Not tainted 4.8.0-rc3.after+ #1
  Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.5.4 01/22/2016
  task: ffff880817444140 task.stack: ffff880816274000
  RIP: 0010:[<ffffffff8121c796>] __check_object_size+0x76/0x413
  RSP: 0018:ffff880816277c40 EFLAGS: 00010246
  RAX: 000000000000006b RBX: ffff88103dfff000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88081f80dfa8 RDI: ffff88081f80dfa8
  RBP: ffff880816277c90 R08: 000000000000054c R09: 0000000000000000
  R10: 0000000000000005 R11: 0000000000000006 R12: 0000000000001000
  R13: ffff88103e000000 R14: ffff88103dffffff R15: 0000000000000001
  FS:  00007fb9d1750800(0000) GS:ffff88081f800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000021d2000 CR3: 000000081a08f000 CR4: 00000000001406f0
  Stack:
   ffff880816277cc8 0000000000010000 000000043de07000 0000000000000000
   0000000000001000 ffff880816277e60 0000000000001000 ffff880816277e28
   000000000000c000 0000000000001000 ffff880816277ce8 ffffffff8136c3a6
  Call Trace:
   [<ffffffff8136c3a6>] copy_page_to_iter_iovec+0xa6/0x1c0
   [<ffffffff8136e766>] copy_page_to_iter+0x16/0x90
   [<ffffffff811970e3>] generic_file_read_iter+0x3e3/0x7c0
   [<ffffffffa06a738d>] ? xfs_file_buffered_aio_write+0xad/0x260 [xfs]
   [<ffffffff816e6262>] ? down_read+0x12/0x40
   [<ffffffffa06a61b1>] xfs_file_buffered_aio_read+0x51/0xc0 [xfs]
   [<ffffffffa06a6692>] xfs_file_read_iter+0x62/0xb0 [xfs]
   [<ffffffff812224cf>] __vfs_read+0xdf/0x130
   [<ffffffff81222c9e>] vfs_read+0x8e/0x140
   [<ffffffff81224195>] SyS_read+0x55/0xc0
   [<ffffffff81003a47>] do_syscall_64+0x67/0x160
   [<ffffffff816e8421>] entry_SYSCALL64_slow_path+0x25/0x25
  RIP: 0033:[<00007fb9d0c33c00>] 0x7fb9d0c33c00
  RSP: 002b:00007ffc9c262f28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  RAX: ffffffffffffffda RBX: fffffffffff8ffff RCX: 00007fb9d0c33c00
  RDX: 0000000000010000 RSI: 00000000021c3000 RDI: 0000000000000004
  RBP: 00000000021c3000 R08: 0000000000000000 R09: 00007ffc9c264d6c
  R10: 00007ffc9c262c50 R11: 0000000000000246 R12: 0000000000010000
  R13: 00007ffc9c2630b0 R14: 0000000000000004 R15: 0000000000010000
  Code: 81 48 0f 44 d0 48 c7 c6 90 4d a3 81 48 c7 c0 bb b3 a2 81 48 0f 44 f0 4d 89 e1 48 89 d9 48 c7 c7 68 16 a3 81 31 c0 e8 f4 57 f7 ff <0f> 0b 48 8d 90 00 40 00 00 48 39 d3 0f 83 22 01 00 00 48 39 c3
  RIP  [<ffffffff8121c796>] __check_object_size+0x76/0x413
   RSP <ffff880816277c40>

The checked object's range [ffff88103dfff000, ffff88103e000000) is
valid, so there shouldn't have been a BUG.  The hardened usercopy code
got confused because the range's ending address is the same as the
kernel's text starting address at 0xffff88103e000000.  The overlap check
is slightly off.

Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Kees Cook <[email protected]>

usercopy: avoid potentially undefined behavior in pointer math

check_bogus_address() checked for pointer overflow using this expression,
where 'ptr' has type 'const void *':

ptr + n < ptr

Since pointer wraparound is undefined behavior, gcc at -O2 by default
treats it like the following, which would not behave as intended:

(long)n < 0

Fortunately, this doesn't currently happen for kernel code because kernel
code is compiled with -fno-strict-overflow. But the expression should be
fixed anyway to use well-defined integer arithmetic, since it could be
treated differently by different compilers in the future or could be
reported by tools checking for undefined behavior.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Kees Cook <[email protected]>

qed: FLR of active VFs might lead to FW assert

Driver never bothered marking the VF's vport with the VF's sw_fid.
As a result, FLR flows are not going to clean those vports.

If the vport was active when FLRed, re-activating it would lead
to a FW assertion.

Fixes: dacd88d6f6851 ("qed: IOV l2 functionality")
Signed-off-by: Yuval Mintz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ip_finish_output_gso: Allow fragmenting segments of tunneled skbs if their DF is unset

In b8247f095e,

"net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs"

gso skbs arriving from an ingress interface that go through UDP
tunneling, are allowed to be fragmented if the resulting encapulated
segments exceed the dst mtu of the egress interface.

This aligned the behavior of gso skbs to non-gso skbs going through udp
encapsulation path.

However the non-gso vs gso anomaly is present also in the following
cases of a GRE tunnel:
- ip_gre in collect_md mode, where TUNNEL_DONT_FRAGMENT is not set
(e.g. OvS vport-gre with df_default=false)
- ip_gre in nopmtudisc mode, where IFLA_GRE_IGNORE_DF is set

In both of the above cases, the non-gso skbs get fragmented, whereas the
gso skbs (having skb_gso_network_seglen that exceeds dst mtu) get dropped,
as they don't go through the segment+fragment code path.

Fix: Setting IPSKB_FRAG_SEGS if the tunnel specified IP_DF bit is NOT set.

Tunnels that do set IP_DF, will not go to fragmentation of segments.
This preserves behavior of ip_gre in (the default) pmtudisc mode.

Fixes: b8247f095e ("net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs")
Reported-by: wenxu <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Signed-off-by: Shmulik Ladkani <[email protected]>
Tested-by: wenxu <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv6: Remove addresses for failures with strict DAD

If DAD fails with accept_dad set to 2, global addresses and host routes
are incorrectly left in place. Even though disable_ipv6 is set,
contrary to documentation, the addresses are not dynamically deleted
from the interface. It is only on a subsequent link down/up that these
are removed. The fix is not only to set the disable_ipv6 flag, but
also to call addrconf_ifdown(), which is the action to carry out when
disabling IPv6. This results in the addresses and routes being deleted
immediately. The DAD failure for the LL addr is determined as before
via netlink, or by the absence of the LL addr (which also previously
would have had to be checked for in case of an intervening link down
and up). As the call to addrconf_ifdown() requires an rtnl lock, the
logic to disable IPv6 when DAD fails is moved to addrconf_dad_work().

Previous behavior:

root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
net.ipv6.conf.eth3.accept_dad = 2
root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2000::10/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe43:dd5a/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
root@vm1:/# ip -6 route show dev eth3
2000::/64  proto kernel  metric 256
fe80::/64  proto kernel  metric 256
root@vm1:/# ip link set down eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
root@vm1:/# ip -6 route show dev eth3
root@vm1:/#

New behavior:

root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
net.ipv6.conf.eth3.accept_dad = 2
root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
root@vm1:/# ip link set up eth3
root@vm1:/# ip -6 addr show dev eth3
root@vm1:/# ip -6 route show dev eth3
root@vm1:/#

Signed-off-by: Mike Manning <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/ipx.h: fix conflicting defitions with glibc netipx/ipx.h

Fixes these compiler warnings via libc-compat.h when glibc netipx/ipx.h is
included before linux/ipx.h:

./linux/ipx.h:9:8: error: redefinition of ‘struct sockaddr_ipx’
./linux/ipx.h:26:8: error: redefinition of ‘struct ipx_route_definition’
./linux/ipx.h:32:8: error: redefinition of ‘struct ipx_interface_definition’
./linux/ipx.h:49:8: error: redefinition of ‘struct ipx_config_data’
./linux/ipx.h:58:8: error: redefinition of ‘struct ipx_route_def’

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/openvswitch.h: use __u32 from linux/types.h

Kernel uapi header are supposed to use them. Fixes userspace compile error:

linux/openvswitch.h:583:2: error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/atm_zatm.h: include linux/time.h

Fixes userspace compile error:

error: field ‘real’ has incomplete type
struct timeval real; /* real (wall-clock) time */

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/openvswitch.h: use __u32 from linux/types.h

Fixes userspace compiler error:

error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/if_pppox.h: include linux/in.h and linux/in6.h

Fixes userspace compilation errors:

error: field ‘addr’ has incomplete type
struct sockaddr_in addr; /* IP address and port to send to */

error: field ‘addr’ has incomplete type
struct sockaddr_in6 addr; /* IP address and port to send to */

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/if_pppol2tp.h: include linux/in.h and linux/in6.h

Fixes userspace compilation errors like:

error: field ‘addr’ has incomplete type
struct sockaddr_in addr; /* IP address and port to send to */
^
error: field ‘addr’ has incomplete type
struct sockaddr_in6 addr; /* IP address and port to send to */

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/if_tunnel.h: include linux/if.h, linux/ip.h and linux/in6.h

Fixes userspace compilation errors like:

error: field ‘iph’ has incomplete type
error: field ‘prefix’ has incomplete type

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

include/uapi/linux/if_pppox.h: include linux/if.h

Fixes userspace compilation error:

error: ‘IFNAMSIZ’ undeclared here (not in a function)

Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'arc-4.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- support for Syscall ABI v4 with upstream gcc 6.x

- lockdep fix (Daniel Mentz)

- gdb register clobber (Liav Rehana)

- couple of missing exports for modules

- other fixes here and there

* tag 'arc-4.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: export __udivdi3 for modules
  ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
  ARC: export kmap
  ARC: Support syscall ABI v4
  ARC: use correct offset in pt_regs for saving/restoring user mode r25
  ARC: Elide redundant setup of DMA callbacks
  ARC: Call trace_hardirqs_on() before enabling irqs

Merge tag 'gpio-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Here are a few GPIO fixes for v4.8.

  I was expecting some fallout from the new chardev rework but nothing
  like that turned up att all.  Instead a Kconfig confusion that I think
  I have finally nailed, then some ordinary driver noise and trivia.

  This fixes a Kconfig issue with UM: when I made GPIOLIB available to
  all archs, that included UM, but the OF part of GPIOLIB requires
  HAS_IOMEM, so we add HAS_IOMEM as a dependency to OF_GPIO.

  This in turn exposed the fact that a few GPIO drivers were implicitly
  assuming OF_GPIO as their dependency but instead depended on OF alone
  (the typical problem being a pointer inside gpio_chip not existing
  unless OF_GPIO is selected) and then UM would fail to compile with
  these drivers instead.  Then I lost patience and made any GPIO driver
  depending on just OF depend on OF_GPIO instead, that is certainly what
  they meant and the only thing that makes sense anyway.  GPIO with just
  OF but !OF_GPIO does not make sense.

  Also a fix for the max730x driver data pointer, and a minor comment
  fix for the GPIO tools"

* tag 'gpio-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: make any OF dependent driver depend on OF_GPIO
  gpio: Fix OF build problem on UM
  gpio: max730x: set gpiochip data pointer before using it
  tools/gpio: fix gpio-event-mon header comment

Input: ads7846 - remove redundant regulator_disable call

ADS7846 regulator is disabled twice in a row in ads7846_remove(). Valid
one is in ads7846_disable().

Removing the ads7846 module causes warning about unbalanced disables.

  ...
  WARNING: CPU: 0 PID: 29269 at drivers/regulator/core.c:2251 _regulator_disable+0xf8/0x130
  unbalanced disables for vads7846
  CPU: 0 PID: 29269 Comm: rmmod Tainted: G      D W       4.7.0+ #3
  Hardware name: HTC Magician
  ...
    show_stack+0x10/0x14
    __warn+0xd8/0x100
    warn_slowpath_fmt+0x38/0x48
    _regulator_disable+0xf8/0x130
    regulator_disable+0x34/0x60
    ads7846_remove+0x58/0xd4 [ads7846]
    spi_drv_remove+0x1c/0x34
    __device_release_driver+0x84/0x114
    driver_detach+0x8c/0x90
    bus_remove_driver+0x5c/0xc8
    SyS_delete_module+0x1a0/0x238
    ret_fast_syscall+0x0/0x38

Signed-off-by: Petr Cvek <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

Input: synaptics-rmi4 - fix register descriptor subpacket map construction

The map_offset variable is specific to the register and needs to be reset
in the loop. Otherwise, subsequent register's subpacket maps will have
their bits set at the wrong index.

Signed-off-by: Andrew Duggan <[email protected]>
Tested-by: Nitin Chaudhary <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>

Input: tegra-kbc - fix inverted reset logic

Commit fe6b0dfaba68 ("Input: tegra-kbc - use reset framework")
accidentally converted _deassert to _assert, so there is no code
to wake up this hardware.

Fixes: fe6b0dfaba68 ("Input: tegra-kbc - use reset framework")
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Thierry Reding <[email protected]>
Acked-by: Laxman Dewangan <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>

Input: silead - use devm_gpiod_get

The silead code is using devm_foo for everything (and does not free
any resources). Except that it is using gpiod_get instead of
devm_gpiod_get (but is not freeing the gpio_desc), change this
to use devm_gpiod_get so that the gpio will be properly released.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>

iw_cxgb4: use the MPA initiator's IRD if < our ORD

The i40iw initiator sends an MPA-request with ird=16 and ord=16. The cxgb4
responder sends an MPA-reply with ord = 32 causing i40iw to terminate
due to insufficient resources.

The logic to reduce the ORD to <= peer's IRD was wrong.

Reported-by: Shiraz Saleem <[email protected]>
Signed-off-by: Steve Wise <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

iw_cxgb4: limit IRD/ORD advertised to ULP by device max.

The i40iw initiator sends an MPA-request with ird = 63, ord = 63. The
cxgb4 responder sends a RST. Since the inbound ord=63 and it exceeds
the max_ird/c4iw_max_read_depth (=32 default), chelsio decides to abort.

Instead, cxgb4 should adjust the ord/ird down before presenting it to
the ULP.

Reported-by: Shiraz Saleem <[email protected]>
Signed-off-by: Steve Wise <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Fix mm_struct use after free

Testing with CONFIG_SLUB_DEBUG_ON=y resulted in the kernel panic below.

This is the result of the mm_struct sometimes being free'd prior to
hfi1_file_close being called.

This was due to the combination of 2 reasons:

1) hfi1_file_close is deferred in process exit and it therefore may not
   be called synchronously with process exit.
2) exit_mm is called prior to exit_files in do_exit.  Normally this is ok
   however, our kernel bypass code requires us to have access to the
   mm_struct for house keeping both at "normal" close time as well as at
   process exit.

Therefore, the fix is to simply keep a reference to the mm_struct until
we are done with it.

[ 3006.340150] general protection fault: 0000 [#1] SMP
[ 3006.346469] Modules linked in: hfi1 rdmavt rpcrdma ib_isert iscsi_target_mod
ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod
ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod snd_hda_code
c_realtek iTCO_wdt snd_hda_codec_generic iTCO_vendor_support sb_edac edac_core
x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass c
rct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw snd_hda_intel
gf128mul snd_hda_codec glue_helper snd_hda_core ablk_helper sn
d_hwdep cryptd snd_seq snd_seq_device snd_pcm snd_timer snd soundcore pcspkr
shpchp mei_me sg lpc_ich mei i2c_i801 mfd_core ioatdma ipmi_devi
ntf wmi ipmi_si ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd
grace sunrpc ip_tables ext4 jbd2 mbcache mlx4_en ib_core sr_mod s
d_mod cdrom crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect igb
sysimgblt fb_sys_fops ptp mlx4_core ttm isci pps_core ahci drm li
bsas libahci dca firewire_ohci i2c_algo_bit scsi_transport_sas firewire_core
crc_itu_t i2c_core libata [last unloaded: mlx4_ib]
[ 3006.461759] CPU: 16 PID: 11624 Comm: mpi_stress Not tainted 4.7.0-rc5+ #1
[ 3006.469915] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 3006.483027] task: ffff8804102f0040 ti: ffff8804102f8000 task.ti: ffff8804102f8000
[ 3006.491971] RIP: 0010:[<ffffffff810f0383>]  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
[ 3006.501905] RSP: 0018:ffff8804102fb908  EFLAGS: 00010002
[ 3006.508447] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
[ 3006.517012] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880410b56a40
[ 3006.525569] RBP: ffff8804102fb9b0 R08: 0000000000000001 R09: 0000000000000000
[ 3006.534119] R10: ffff8804102f0040 R11: 0000000000000000 R12: 0000000000000000
[ 3006.542664] R13: ffff880410b56a40 R14: 0000000000000000 R15: 0000000000000000
[ 3006.551203] FS:  00007ff478c08700(0000) GS:ffff88042e200000(0000) knlGS:0000000000000000
[ 3006.560814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3006.567806] CR2: 00007f667f5109e0 CR3: 0000000001c06000 CR4: 00000000000406e0
[ 3006.576352] Stack:
[ 3006.579157]  ffffffff8124b819 ffffffffffffffff 0000000000000000 ffff8804102fb940
[ 3006.588072]  0000000000000002 0000000000000000 ffff8804102f0040 0000000000000007
[ 3006.596971]  0000000000000006 ffff8803cad6f000 0000000000000000 ffff8804102f0040
[ 3006.605878] Call Trace:
[ 3006.609220]  [<ffffffff8124b819>] ? uncharge_batch+0x109/0x250
[ 3006.616382]  [<ffffffff810f2313>] lock_acquire+0xd3/0x220
[ 3006.623056]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.631593]  [<ffffffff81775579>] down_write+0x49/0x80
[ 3006.638022]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.646569]  [<ffffffffa0a30bfc>] hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.654898]  [<ffffffffa0a2efb6>] cacheless_tid_rb_remove+0x106/0x330 [hfi1]
[ 3006.663417]  [<ffffffff810efd36>] ? mark_held_locks+0x66/0x90
[ 3006.670498]  [<ffffffff817771f6>] ? _raw_spin_unlock_irqrestore+0x36/0x60
[ 3006.678741]  [<ffffffffa0a2f1ee>] tid_rb_remove+0xe/0x10 [hfi1]
[ 3006.686010]  [<ffffffffa0a0c5d5>] hfi1_mmu_rb_unregister+0xc5/0x100 [hfi1]
[ 3006.694387]  [<ffffffffa0a2fcb9>] hfi1_user_exp_rcv_free+0x39/0x120 [hfi1]
[ 3006.702732]  [<ffffffffa09fc6ea>] hfi1_file_close+0x17a/0x330 [hfi1]
[ 3006.710489]  [<ffffffff81263e9a>] __fput+0xfa/0x230
[ 3006.716595]  [<ffffffff8126400e>] ____fput+0xe/0x10
[ 3006.722696]  [<ffffffff810b95c6>] task_work_run+0x86/0xc0
[ 3006.729379]  [<ffffffff81099933>] do_exit+0x323/0xc40
[ 3006.735672]  [<ffffffff8109a2dc>] do_group_exit+0x4c/0xc0
[ 3006.742371]  [<ffffffff810a7f55>] get_signal+0x345/0x940
[ 3006.748958]  [<ffffffff810340c7>] do_signal+0x37/0x700
[ 3006.755328]  [<ffffffff8127872a>] ? poll_select_set_timeout+0x5a/0x90
[ 3006.763146]  [<ffffffff811609cb>] ? __audit_syscall_exit+0x1db/0x260
[ 3006.770853]  [<ffffffff8110f3e3>] ? rcu_read_lock_sched_held+0x93/0xa0
[ 3006.778765]  [<ffffffff812347a4>] ? kfree+0x1e4/0x2a0
[ 3006.784986]  [<ffffffff8108e75a>] ? exit_to_usermode_loop+0x33/0xac
[ 3006.792551]  [<ffffffff8108e785>] exit_to_usermode_loop+0x5e/0xac
[ 3006.799907]  [<ffffffff81003dca>] do_syscall_64+0x12a/0x190
[ 3006.806664]  [<ffffffff81777a7f>] entry_SYSCALL64_slow_path+0x25/0x25
[ 3006.814396] Code: 24 08 44 89 44 24 10 89 4c 24 18 e8 a8 d8 ff ff 48 85 c0
8b 4c 24 18 44 8b 44 24 10 44 8b 4c 24 08 4c 8b 14 24 0f 84 30
08 00 00 <f0> ff 80 98 01 00 00 8b 3d 48 ad be 01 45 8b a2 90 0b 00 00 85
[ 3006.837158] RIP  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
[ 3006.844401]  RSP <ffff8804102fb908>
[ 3006.851170] ---[ end trace b7b9f21cf06c27df ]---
[ 3006.927420] Kernel panic - not syncing: Fatal exception
[ 3006.933954] Kernel Offset: disabled
[ 3006.940961] ---[ end Kernel panic - not syncing: Fatal exception
[ 3006.948249] ------------[ cut here ]------------

Fixes: 3faa3d9a308e ("IB/hfi1: Make use of mm consistent")
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/rdmvat: Fix double vfree() in rvt_create_qp() error path

The unwind logic for creating a user QP has a double vfree
of the non-shared receive queue when handling a "too many qps"
failure.

The code unwinds the mmmap info by decrementing a reference
count which will call rvt_release_mmap_info() which in turn
does the vfree() of the r_rq.wq. The unwind code then does
the same free.

Fix by guarding the vfree() with the same test that is done
in close and only do the vfree() if qp->ip is NULL.

Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Improve J_KEY generation

Previously, J_KEY generation was based on the lower 16 bits
of the user's UID. While this works, it was not good enough
as a non-root user could collide with a root user given a
sufficiently large UID.

This patch attempt to improve the J_KEY generation by using
the following algorithm:

The 16 bit J_KEY space is partitioned into 3 separate spaces
reserved for different user classes:
   * all users with administtor privileges (including 'root')
     will use J_KEYs in the range of 0 to 31,
   * all kernel protocols, which use KDETH packets will use
     J_KEYs in the range of 32 to 63, and
   * all other users will use J_KEYs in the range of 64 to
     65535.

The above separation is aimed at preventing different user levels
from sending packets to each other and, additionally, separate
kernel protocols from all other types of users. The later is meant
to prevent the potential corruption of kernel memory by any other
type of user.

Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Return invalid field for non-QSFP CableInfo queries

The driver does not check if the CableInfo query is supported for the
port type. Return early if CableInfo is not supported for the port type,
making compliance with the specification explicit and preventing lower
level code from potentially doing the wrong thing if the query is not
supported for the hardware implementation.

Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Easwar Hariharan <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

MAINTAINERS: Fix Soft RoCE location

The Soft RoCE (rxe) is located in drivers/inifiniband/sw
and not in drivers/infiniband/hw/.

This patch fixes it.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/usnic: Fix error return code

If 'pci_register_driver' fails, we return 'err' which is known to be 0.
Return the error instead.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Add missing error code assignment before test

It is likely that checking the result of 'setup_ctxt' is expected here.

Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Using kfree_rcu() to simplify the code

The callback function of call_rcu() just calls a kfree(), so we
can use kfree_rcu() instead of call_rcu() + callback function.

Signed-off-by: Wei Yongjun <[email protected]>
Tested-by: Mike Marciniszyn <[email protected]>
Acked-by: Mike Marciniszyn <[email protected]>
Tested-by: Mike Marciniszyn <[email protected]>
Acked-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Validate header in set_armed_active

Validate the etype to insure that the header is correct.

Reviewed-by: Don Hiatt <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Pass packet ptr to set_armed_active

The "packet" parameter was being passed on the stack,
change it to a pointer.

Reviewed-by: Don Hiatt <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Fetch monitor values on-demand for CableInfo query

The monitor values from bytes 22 through 81 of the QSFP memory space
(SFF 8636) are dynamic and serving them out of the QSFP memory cache
maintained by the driver provides stale data to the CableInfo SMA query.
This patch refreshes the dynamic values from the QSFP memory on request
and overwrites the stale data from the cache for the overlap between the
requested range and the monitor range.

Reviewed-by: Jubin John <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Easwar Hariharan <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1,IB/qib: Fix qp_stats sleep with rcu read lock held

The qp init function does a kzalloc() while holding the RCU
lock that encounters the following warning with a debug kernel
when a cat of the qp_stats is done:

[  231.723948] rcu_scheduler_active = 1, debug_locks = 0
[  231.731939] 3 locks held by cat/11355:
[  231.736492]  #0:  (debugfs_srcu){......}, at: [<ffffffff813001a5>] debugfs_use_file_start+0x5/0x90
[  231.746955]  #1:  (&p->lock){+.+.+.}, at: [<ffffffff81289a6c>] seq_read+0x4c/0x3c0
[  231.755873]  #2:  (rcu_read_lock){......}, at: [<ffffffffa0a0c535>] _qp_stats_seq_start+0x5/0xd0 [hfi1]
[  231.766862]

The init functions do an implicit next which requires the rcu read lock
before the kzalloc().

Fix for both drivers is to change the scope of the init function to only
do the allocation and the initialization of the just allocated iter.

The implict next is moved back into the respective start functions to fix
the issue.

Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
CC: <[email protected]> # 4.6.x-
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Remove duplicated include from affinity.c

Remove duplicated include.

Signed-off-by: Wei Yongjun <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/isert: fix error return code in isert_alloc_login_buf()

Fix to return error code -ENOMEM from the alloc error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Sagi Grimberg <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/core: Fix possible memory leak in cma_resolve_iboe_route()

'work' and 'route->path_rec' are malloced in cma_resolve_iboe_route()
and should be freed before leaving from the error handling cases,
otherwise it will cause memory leak.

Fixes: 200298326b27 ('IB/core: Validate route when we init ah')
Signed-off-by: Wei Yongjun <[email protected]>
Reviewed-by: Haggai Eran <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/hfi1: Allocate cpu mask on the heap to silence warning

If CONFIG_FRAME_WARN is small (1K) and CONFIG_NR_CPUS big
then a frame size warning is triggered during build.
Allocate the cpu mask dynamically to silence the warning.

Reviewed-by: Sebastian Sanchez <[email protected]>
Signed-off-by: Tadeusz Struk <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/mlx4: Return EAGAIN for any error in mlx4_ib_poll_one

Error code EAGAIN should be used when errors are temporary and next call
might succeeds.
When error code other than EAGAIN is returned, the caller (mlx4_ib_poll)
will assume all CQE in the same bunch are error too and will drop them all.

Signed-off-by: Yuval Shaia <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

IB/mlx4: Make function use_tunnel_data return void

No need to return int if function always returns 0

Signed-off-by: Yuval Shaia <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>

irqchip/mips-gic: Implement activate op for device domain

If an IRQ is setup using __setup_irq(), which is used by the
request_irq() family of functions, and we are using an SMP kernel then
the affinity of the IRQ will be set via setup_affinity() immediately
after the IRQ is enabled. This call to gic_set_affinity() will lead to
the interrupt being mapped to a VPE. However there are other ways to use
IRQs which don't cause affinity to be set, for example if it is used to
chain to another IRQ controller with irq_set_chained_handler_and_data().
The irq_set_chained_handler_and_data() code path will enable the IRQ,
but will not trigger a call to gic_set_affinity() and in this case
nothing will map the interrupt to a VPE, meaning that the interrupt is
never received.

Fix this by implementing the activate operation for the GIC device IRQ
domain, using gic_shared_irq_domain_map() to map the interrupt to the
correct pin of cpu 0.

Fixes: c98c1822ee13 ("irqchip/mips-gic: Add device hierarchy domain")
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/mips-gic: Cleanup chip and handler setup

gic_shared_irq_domain_map() is called from gic_irq_domain_alloc() where
the wrong chip has been set, and is then overwritten. Tidy this up by
setting the correct chip the first time, and setting the
handle_level_irq handler from gic_irq_domain_alloc() too.

gic_shared_irq_domain_map() is also called from gic_irq_domain_map(),
which now calls irq_set_chip_and_handler() to retain its previous
behaviour.

This patch prepares for a follow-on which will call
gic_shared_irq_domain_map() from a callback where the lock on the struct
irq_desc is held, which without this change would cause the call to
irq_set_chip_and_handler() to lead to a deadlock.

Fixes: c98c1822ee13 ("irqchip/mips-gic: Add device hierarchy domain")
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

ASoC: max98371: Add terminate entry for i2c_device_id tables

Make sure i2c_device_id tables are NULL terminated.

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

bdev: fix NULL pointer dereference

I got this:

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    Dumping ftrace buffer:
       (ftrace buffer empty)
    CPU: 0 PID: 5505 Comm: syz-executor Not tainted 4.8.0-rc2+ #161
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff880113415940 task.stack: ffff880118350000
    RIP: 0010:[<ffffffff8172cb32>]  [<ffffffff8172cb32>] bd_mount+0x52/0xa0
    RSP: 0018:ffff880118357ca0  EFLAGS: 00010207
    RAX: dffffc0000000000 RBX: ffffffffffffffff RCX: ffffc90000bb6000
    RDX: 0000000000000018 RSI: ffffffff846d6b20 RDI: 00000000000000c7
    RBP: ffff880118357cb0 R08: ffff880115967c68 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801188211e8
    R13: ffffffff847baa20 R14: ffff8801139cb000 R15: 0000000000000080
    FS:  00007fa3ff6c0700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fc1d8cc7e78 CR3: 0000000109f20000 CR4: 00000000000006f0
    DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
    Stack:
     ffff880112cfd6c0 ffff8801188211e8 ffff880118357cf0 ffffffff8167f207
     ffffffff816d7a1e ffff880112a413c0 ffffffff847baa20 ffff8801188211e8
     0000000000000080 ffff880112cfd6c0 ffff880118357d38 ffffffff816dce0a
    Call Trace:
     [<ffffffff8167f207>] mount_fs+0x97/0x2e0
     [<ffffffff816d7a1e>] ? alloc_vfsmnt+0x55e/0x760
     [<ffffffff816dce0a>] vfs_kern_mount+0x7a/0x300
     [<ffffffff83c3247c>] ? _raw_read_unlock+0x2c/0x50
     [<ffffffff816dfc87>] do_mount+0x3d7/0x2730
     [<ffffffff81235fd4>] ? trace_do_page_fault+0x1f4/0x3a0
     [<ffffffff816df8b0>] ? copy_mount_string+0x40/0x40
     [<ffffffff8161ea81>] ? memset+0x31/0x40
     [<ffffffff816df73e>] ? copy_mount_options+0x1ee/0x320
     [<ffffffff816e2a02>] SyS_mount+0xb2/0x120
     [<ffffffff816e2950>] ? copy_mnt_ns+0x970/0x970
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c3282a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 83 e8 63 1b fc ff 48 85 c0 48 89 c3 74 4c e8 56 35 d1 ff 48 8d bb c8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 36 4c 8b a3 c8 00 00 00 48 b8 00 00 00 00 00 fc
    RIP  [<ffffffff8172cb32>] bd_mount+0x52/0xa0
     RSP <ffff880118357ca0>
    ---[ end trace 13690ad962168b98 ]---

mount_pseudo() returns ERR_PTR(), not NULL, on error.

Fixes: 3684aa7099e0 ("block-dev: enable writeback cgroup support")
Cc: Shaohua Li <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Signed-off-by: Vegard Nossum <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

drm/i915: Fix botched merge that downgrades CSR versions.

Merge commit 5e580523d9128a4d8 reverts the version bumping parts of
commit 4aa7fb9c3c4fa0. Bump the versions again and request the specific
firmware version.

The currently recommended versions are: SKL 1.26, KBL 1.01 and BXT 1.07.

Cc: Patrik Jakobsson <[email protected]>
Cc: Imre Deak <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97242
Cc: [email protected]
Fixes: 5e580523d912 ("Backmerge tag 'v4.7' into drm-next")
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/1471266567-22443-1-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Imre Deak <[email protected]>
(cherry picked from commit 536ab3ca19ef856e84389a155c5832c68559a28a)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/skl: Ensure pipes with changed wms get added to the state

If we're enabling a pipe, we'll need to modify the watermarks on all
active planes. Since those planes won't be added to the state on
their own, we need to add them ourselves.

Signed-off-by: Lyude <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Cc: [email protected]
Cc: Ville Syrjälä <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Radhakrishna Sripada <[email protected]>
Cc: Hans de Goede <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 05a76d3d6ad1ee9f9814f88949cc9305fc165460)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/gen9: Only copy WM results for changed pipes to skl_hw

When we write watermark values to the hardware, those values are stored
in dev_priv->wm.skl_hw.  However with recent watermark changes, the
results structure we're copying from only contains valid watermark and
DDB values for the pipes that are actually changing; the values for
other pipes remain 0.  Thus a blind copy of the entire skl_wm_values
structure will clobber the values for unchanged pipes...we need to be
more selective and only copy over the values for the changing pipes.

This mistake was hidden until recently due to another bug that caused us
to erroneously re-calculate watermarks for all active pipes rather than
changing pipes.  Only when that bug was fixed was the impact of this bug
discovered (e.g., modesets failing with "Requested display configuration
exceeds system watermark limitations" messages and leaving watermarks
non-functional, even ones initiated by intel_fbdev_restore_mode).

Changes since v1:
- Add a function for copying a pipe's wm values
   (skl_copy_wm_for_pipe()) so we can reuse this later

Fixes: 734fa01f3a17 ("drm/i915/gen9: Calculate watermarks during atomic 'check' (v2)")
Fixes: 9b6130227495 ("drm/i915/gen9: Re-allocate DDB only for changed pipes")
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Lyude <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Cc: [email protected]
Cc: Maarten Lankhorst <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Radhakrishna Sripada <[email protected]>
Cc: Hans de Goede <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 2722efb90b3420dee54b4cb3cdc7917efacc2dce)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/skl: Add support for the SAGV, fix underrun hangs

Since the watermark calculations for Skylake are still broken, we're apt
to hitting underruns very easily under multi-monitor configurations.
While it would be lovely if this was fixed, it's not. Another problem
that's been coming from this however, is the mysterious issue of
underruns causing full system hangs. An easy way to reproduce this with
a skylake system:

- Get a laptop with a skylake GPU, and hook up two external monitors to
  it
- Move the cursor from the built-in LCD to one of the external displays
  as quickly as you can
- You'll get a few pipe underruns, and eventually the entire system will
  just freeze.

After doing a lot of investigation and reading through the bspec, I
found the existence of the SAGV, which is responsible for adjusting the
system agent voltage and clock frequencies depending on how much power
we need. According to the bspec:

"The display engine access to system memory is blocked during the
adjustment time. SAGV defaults to enabled. Software must use the
GT-driver pcode mailbox to disable SAGV when the display engine is not
able to tolerate the blocking time."

The rest of the bspec goes on to explain that software can simply leave
the SAGV enabled, and disable it when we use interlaced pipes/have more
then one pipe active.

Sure enough, with this patchset the system hangs resulting from pipe
underruns on Skylake have completely vanished on my T460s. Additionally,
the bspec mentions turning off the SAGV with more then one pipe enabled
as a workaround for display underruns. While this patch doesn't entirely
fix that, it looks like it does improve the situation a little bit so
it's likely this is going to be required to make watermarks on Skylake
fully functional.

This will still need additional work in the future: we shouldn't be
enabling the SAGV if any of the currently enabled planes can't enable WM
levels that introduce latencies >= 30 µs.

Changes since v11:
- Add skl_can_enable_sagv()
- Make sure we don't enable SAGV when not all planes can enable
   watermarks >= the SAGV engine block time. I was originally going to
   save this for later, but I recently managed to run into a machine
   that was having problems with a single pipe configuration + SAGV.
- Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit
- Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE
- Move printks outside of mutexes
- Don't print error messages twice
Changes since v10:
- Apparently sandybridge_pcode_read actually writes values and reads
   them back, despite it's misleading function name. This means we've
   been doing this mostly wrong and have been writing garbage to the
   SAGV control. Because of this, we no longer attempt to read the SAGV
   status during initialization (since there are no helpers for this).
- mlankhorst noticed that this patch was breaking on some very early
   pre-release Skylake machines, which apparently don't allow you to
   disable the SAGV. To prevent machines from failing tests due to SAGV
   errors, if the first time we try to control the SAGV results in the
   mailbox indicating an invalid command, we just disable future attempts
   to control the SAGV state by setting dev_priv->skl_sagv_status to
   I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg.
- Move mutex_unlock() a little higher in skl_enable_sagv(). This
   doesn't actually fix anything, but lets us release the lock a little
   sooner since we're finished with it.
Changes since v9:
- Only enable/disable sagv on Skylake
Changes since v8:
- Add intel_state->modeset guard to the conditional for
   skl_enable_sagv()
Changes since v7:
- Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's
   all we use it for anyway)
- Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification
- Fix a styling error that snuck past me
Changes since v6:
- Protect skl_enable_sagv() with intel_state->modeset conditional in
   intel_atomic_commit_tail()
Changes since v5:
- Don't use is_power_of_2. Makes things confusing
- Don't use the old state to figure out whether or not to
   enable/disable the sagv, use the new one
- Split the loop in skl_disable_sagv into it's own function
- Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail()
Changes since v4:
- Use is_power_of_2 against active_crtcs to check whether we have > 1
   pipe enabled
- Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0
   enabled
- Call skl_sagv_enable/disable() from pre/post-plane updates
Changes since v3:
- Use time_before() to compare timeout to jiffies
Changes since v2:
- Really apply minor style nitpicks to patch this time
Changes since v1:
- Added comments about this probably being one of the requirements to
   fixing Skylake's watermark issues
- Minor style nitpicks from Matt Roper
- Disable these functions on Broxton, since it doesn't have an SAGV

Signed-off-by: Lyude <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: [email protected]
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
[mlankhorst: ENOSYS -> ENXIO, whitespace fixes]

(cherry picked from commit 656d1b89e5ffb83036ab0e2a24be7558f34365c7)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/gen6+: Interpret mailbox error flags

In order to add proper support for the SAGV, we need to be able to know
what the cause of a failure to change the SAGV through the pcode mailbox
was. The reasoning for this is that some very early pre-release Skylake
machines don't actually allow you to control the SAGV on them, and
indicate an invalid mailbox command was sent.

This also might come in handy in the future for debugging.

Changes since v1:
- Add functions for interpreting gen6 mailbox error codes along with
gen7+ error codes, and actually interpret those codes properly
- Renamed patch to reflect new behavior

Signed-off-by: Lyude <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: [email protected]
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
[mlankhorst: -ENOSYS -> -ENXIO for checkpatch]

(cherry picked from commit 87660502f1a4d51fb043e89a45d30c9917787c22)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Reattach comment, complete type specification

In the recent patch
bc3d674 drm/i915: Allow userspace to request no-error-capture upon ...
the final version moved the flags and the associated #defines around
so they were adjacent; unfortunately, they ended up between a comment
and the thing (hw_id) to which the comment applies :(

So this patch reshuffles the comment and subject back together.

Also, as we're touching 'hw_id', let's change it from just 'unsigned'
to a fully-specified 'unsigned int', because some code checking tools
(including checkpatch) object to plain 'unsigned'.

Fixes: bc3d674462e5 ("drm/i915: Allow userspace to request no-error-capture...")
Signed-off-by: Dave Gordon <[email protected]>
Cc: Chris Wilson <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
(cherry picked from commit 0be81156b3fb4d4e8e2c94177e5222dc21c3ff10)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Unconditionally flush any chipset buffers before execbuf

If userspace is asynchronously streaming into the batch or other
execobjects, we may not flush those writes along with a change in cache
domain (as there is no change). Therefore those writes may end up in
internal chipset buffers and not visible to the GPU upon execution. We
must issue a flush command or otherwise we encounter incoherency in the
batchbuffers and the GPU executing invalid commands (i.e. hanging) quite
regularly.

v2: Throw a paranoid wmb() into the general flush so that we remain
consistent with before.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90841
Fixes: 1816f9236303 ("drm/i915: Support creation of unbound wc user...")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Akash Goel <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Tested-by: Matti Hämäläinen <[email protected]>
Cc: [email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 600f436801deae65e48404847b61c89b4944e355)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/gen9: Drop invalid WARN() during data rate calculation

It's possible to have a non-zero plane mask and still wind up with a
total data rate of zero.  There are two cases where this can happen:

* planes are active (from the KMS point of view), but are
   all fully clipped (positioned offscreen)
* the only active plane on a CRTC is the cursor (which is handled
   independently and not counted into the general data rate computations

These are both valid display setups (although unusual), so we need to
drop the WARN().

Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Maarten Lankhorst <[email protected]>
Testcase: kms_universal_planes.cursor-only-pipe-*
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: [email protected] #v4.7+
(cherry picked from commit 43aa7e87507f519b0b2497b6fac1e894554eaef2)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2)

intel_state->active_crtcs is usually only initialized when doing a
modeset.  During our first atomic commit after boot, we're effectively
faking a modeset to sanitize the DDB/wm setup, so ensure that this field
gets initialized before use.

v2:
- Don't clobber active_crtcs if our first commit really is a modeset
   (Maarten)
- Grab connection_mutex when faking a modeset during sanitization
   (Maarten)

Reported-by: Tvrtko Ursulin <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Tested-by: Tvrtko Ursulin <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: [email protected] #v4.7+
(cherry picked from commit 1b54a880b250acc226b13cea221b90aa1b3e37dd)
Signed-off-by: Jani Nikula <[email protected]>

Merge branch 'for-joerg/arm-smmu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into iommu/fixes

ALSA: line6: Fix POD sysfs attributes segfault

The commit 02fc76f6a changed base of the sysfs attributes from device to card.
The "show" callbacks dereferenced wrong objects because of this.

Fixes: 02fc76f6a7db ('ALSA: line6: Create sysfs via snd_card_add_dev_attr()')
Cc: <[email protected]> # v4.0+
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Andrej Krutak <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: line6: Give up on the lock while URBs are released.

Done, because line6_stream_stop() locks and calls line6_unlink_audio_urbs(),
which in turn invokes audio_out_callback(), which tries to lock 2nd time.

Fixes:

=============================================
[ INFO: possible recursive locking detected ]
4.4.15+ #15 Not tainted
---------------------------------------------
mplayer/3591 is trying to acquire lock:
(&(&line6pcm->out.lock)->rlock){-.-...}, at: [<bfa27655>] audio_out_callback+0x70/0x110 [snd_usb_line6]

but task is already holding lock:
(&(&line6pcm->out.lock)->rlock){-.-...}, at: [<bfa26aad>] line6_stream_stop+0x24/0x5c [snd_usb_line6]

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&line6pcm->out.lock)->rlock);
  lock(&(&line6pcm->out.lock)->rlock);

*** DEADLOCK ***

May be due to missing lock nesting notation

3 locks held by mplayer/3591:
#0:  (snd_pcm_link_rwlock){.-.-..}, at: [<bf8d49a7>] snd_pcm_stream_lock+0x1e/0x40 [snd_pcm]
#1:  (&(&substream->self_group.lock)->rlock){-.-...}, at: [<bf8d49af>] snd_pcm_stream_lock+0x26/0x40 [snd_pcm]
#2:  (&(&line6pcm->out.lock)->rlock){-.-...}, at: [<bfa26aad>] line6_stream_stop+0x24/0x5c [snd_usb_line6]

stack backtrace:
CPU: 0 PID: 3591 Comm: mplayer Not tainted 4.4.15+ #15
Hardware name: Generic AM33XX (Flattened Device Tree)
[<c0015d85>] (unwind_backtrace) from [<c001253d>] (show_stack+0x11/0x14)
[<c001253d>] (show_stack) from [<c02f1bdf>] (dump_stack+0x8b/0xac)
[<c02f1bdf>] (dump_stack) from [<c0076f43>] (__lock_acquire+0xc8b/0x1780)
[<c0076f43>] (__lock_acquire) from [<c007810d>] (lock_acquire+0x99/0x1c0)
[<c007810d>] (lock_acquire) from [<c06171e7>] (_raw_spin_lock_irqsave+0x3f/0x4c)
[<c06171e7>] (_raw_spin_lock_irqsave) from [<bfa27655>] (audio_out_callback+0x70/0x110 [snd_usb_line6])
[<bfa27655>] (audio_out_callback [snd_usb_line6]) from [<c04294db>] (__usb_hcd_giveback_urb+0x53/0xd0)
[<c04294db>] (__usb_hcd_giveback_urb) from [<c046388d>] (musb_giveback+0x3d/0x98)
[<c046388d>] (musb_giveback) from [<c04647f5>] (musb_urb_dequeue+0x6d/0x114)
[<c04647f5>] (musb_urb_dequeue) from [<c042ac11>] (usb_hcd_unlink_urb+0x39/0x98)
[<c042ac11>] (usb_hcd_unlink_urb) from [<bfa26a87>] (line6_unlink_audio_urbs+0x6a/0x6c [snd_usb_line6])
[<bfa26a87>] (line6_unlink_audio_urbs [snd_usb_line6]) from [<bfa26acb>] (line6_stream_stop+0x42/0x5c [snd_usb_line6])
[<bfa26acb>] (line6_stream_stop [snd_usb_line6]) from [<bfa26fe7>] (snd_line6_trigger+0xb6/0xf4 [snd_usb_line6])
[<bfa26fe7>] (snd_line6_trigger [snd_usb_line6]) from [<bf8d47b7>] (snd_pcm_do_stop+0x36/0x38 [snd_pcm])
[<bf8d47b7>] (snd_pcm_do_stop [snd_pcm]) from [<bf8d462f>] (snd_pcm_action_single+0x22/0x40 [snd_pcm])
[<bf8d462f>] (snd_pcm_action_single [snd_pcm]) from [<bf8d46f9>] (snd_pcm_action+0xac/0xb0 [snd_pcm])
[<bf8d46f9>] (snd_pcm_action [snd_pcm]) from [<bf8d4b61>] (snd_pcm_drop+0x38/0x64 [snd_pcm])
[<bf8d4b61>] (snd_pcm_drop [snd_pcm]) from [<bf8d6233>] (snd_pcm_common_ioctl1+0x7fe/0xbe8 [snd_pcm])
[<bf8d6233>] (snd_pcm_common_ioctl1 [snd_pcm]) from [<bf8d6779>] (snd_pcm_playback_ioctl1+0x15c/0x51c [snd_pcm])
[<bf8d6779>] (snd_pcm_playback_ioctl1 [snd_pcm]) from [<bf8d6b59>] (snd_pcm_playback_ioctl+0x20/0x28 [snd_pcm])
[<bf8d6b59>] (snd_pcm_playback_ioctl [snd_pcm]) from [<c016714b>] (do_vfs_ioctl+0x3af/0x5c8)

Fixes: 63e20df1e5b2 ('ALSA: line6: Reorganize PCM stream handling')
Cc: <[email protected]> # v4.0+
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Andrej Krutak <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: line6: Remove double line6_pcm_release() after failed acquire.

If there's an error, pcm is released in line6_pcm_acquire already.

Fixes: 247d95ee6dd2 ('ALSA: line6: Handle error from line6_pcm_acquire()')
Cc: <[email protected]> # v4.0+
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Andrej Krutak <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

genirq/affinity: Use get/put_online_cpus around cpumask operations

Without locking out CPU mask operations we might end up with an inconsistent
view of the cpumask in the function.

Fixes: 5e385a6ef31f: "genirq: Add a helper to spread an affinity mask for MSI/MSI-X vectors"
Signed-off-by: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Fix potential memleak when failing to get irq pm

Obviously we should free action here if irq_chip_pm_get failed.

Fixes: be45beb2df69: "genirq: Add runtime power management support for IRQ chips"
Signed-off-by: Shawn Lin <[email protected]>
Cc: Jon Hunter <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

Merge tag 'irqchip-for-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull genirq/irqchip fixes for 4.8-rc4 from Marc Zygnier

- A critical fix for chained irqchip where we failed to configure
the cascade interrupt trigger
- A GIC fix for self-IPI in SMP-on-UP configurations
- A PM fix for GICv3
- A initialization fix the the GICv3 ITS, triggered by kexec

drm: Reject page_flip for !DRIVER_MODESET

Somehow this one slipped through, which means drivers without modeset
support can be oopsed (since those also don't call
drm_mode_config_init, which means the crtc lookup will chase an
uninitalized idr).

Reported-by: Alexander Potapenko <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

powerpc: move hmi.c to arch/powerpc/kvm/

hmi.c functions are unused unless sibling_subcore_state is nonzero, and
that in turn happens only if KVM is in use. So move the code to
arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
rather than CONFIG_PPC_BOOK3S_64. The sibling_subcore_state is also
included in struct paca_struct only if KVM is supported by the kernel.

Cc: Daniel Axtens <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: sysdev: cpm: fix gpio save_regs functions

of_mm_gpiochip_add_data() calls mm_gc->save_regs() before
setting the data. Therefore ->save_regs() cannot use
gpiochip_get_data()

[    0.275940] Unable to handle kernel paging request for data at address 0x00000130
[    0.283120] Faulting instruction address: 0xc01b44cc
[    0.288175] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.293343] PREEMPT CMPC885
[    0.296141] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-g65124df-dirty #68
[    0.304131] task: c6074000 ti: c6080000 task.ti: c6080000
[    0.309459] NIP: c01b44cc LR: c0011720 CTR: c0011708
[    0.314372] REGS: c6081d90 TRAP: 0300   Not tainted  (4.7.0-g65124df-dirty)
[    0.322267] MSR: 00009032 <EE,ME,IR,DR,RI>  CR: 24000028  XER: 20000000
[    0.328813] DAR: 00000130 DSISR: c0000000
GPR00: c01b6d0c c6081e40 c6074000 c6017000 c9028000 c601d028 c6081dd8 00000000
GPR08: c601d028 00000000 ffffffff 00000001 24000044 00000000 c0002790 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c05643b0 00000083
GPR24: c04a1a6c c0560000 c04a8308 c04c6480 c0012498 c6017000 c7ffcc78 c6017000
[    0.360806] NIP [c01b44cc] gpiochip_get_data+0x4/0xc
[    0.365684] LR [c0011720] cpm1_gpio16_save_regs+0x18/0x44
[    0.370972] Call Trace:
[    0.373451] [c6081e50] [c01b6d0c] of_mm_gpiochip_add_data+0x70/0xdc
[    0.379624] [c6081e70] [c00124c0] cpm_init_par_io+0x28/0x118
[    0.385238] [c6081e80] [c04a8ac0] do_one_initcall+0xb0/0x17c
[    0.390819] [c6081ef0] [c04a8cbc] kernel_init_freeable+0x130/0x1dc
[    0.396924] [c6081f30] [c00027a4] kernel_init+0x14/0x110
[    0.402177] [c6081f40] [c000b424] ret_from_kernel_thread+0x5c/0x64
[    0.408233] Instruction dump:
[    0.411168] 4182fafc 3f80c040 48234c6d 3bc0fff0 3b9c5ed0 4bfffaf4 81290020 712a0004
[    0.418825] 4182fb34 48234c51 4bfffb2c 81230004 <80690130> 4e800020 7c0802a6 9421ffe0
[    0.426763] ---[ end trace fe4113ee21d72ffa ]---

fixes: e65078f1f3490 ("powerpc: sysdev: cpm1: use gpiochip data pointer")
fixes: a14a2d484b386 ("powerpc: cpm_common: use gpiochip data pointer")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/pseries: PACA save area fix for MCE vs MCE

MCE must not enable MSR_RI until PACA_EXMC is no longer being used.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/pseries: PACA save area fix for general exception vs MCE

MCE must not use PACA_EXGEN. When a general exception enables MSR_RI,
that means SPRN_SRR[01] and SPRN_SPRG are no longer used. However the
PACA save area is still in use.
Acked-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support

When booting from an OpenFirmware which supports it, we use the
"ibm,client-architecture-support" firmware call to communicate
our capabilities to firmware.

The format of the structure we pass to firmware is specified in
PAPR (Power Architecture Platform Requirements), or the public version
LoPAPR (Linux on Power Architecture Platform Reference).

Referring to table 244 in LoPAPR v1.1, option vector 5 contains a 4 byte
field at bytes 17-20 for the "Platform Facilities Enable". This is
followed by a 1 byte field at byte 21 for "Sub-Processor Represenation
Level".

Comparing to the code, there we have the Platform Facilities
options (OV5_PFO_*) at byte 17, but we fail to pad that field out to its
full width of 4 bytes. This means the OV5_SUB_PROCESSORS option is
incorrectly placed at byte 18.

Fix it by adding zero bytes for bytes 18, 19, 20, and comment the bytes
to hopefully make it clearer in future.

As far as I'm aware nothing actually consumes this value at this time,
so the effect of this bug is nil in practice.

It does mean we've been incorrectly setting bit 15 of the "Platform
Facilities Enable" option for the past ~3 1/2 years, so we should avoid
allocating that bit to anything else in future.

Fixes: df77c7992029 ("powerpc/pseries: Update ibm,architecture.vec for PAPR 2.7/POWER8")
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc, hotplug: Avoid to touch non-existent cpumasks.

We observed a kernel oops when running a PPC guest with config NR_CPUS=4
and qemu option "-smp cores=1,threads=8":

[   30.634781] Unable to handle kernel paging request for data at
address 0xc00000014192eb17
[   30.636173] Faulting instruction address: 0xc00000000003e5cc
[   30.637069] Oops: Kernel access of bad area, sig: 11 [#1]
[   30.637877] SMP NR_CPUS=4 NUMA pSeries
[   30.638471] Modules linked in:
[   30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
4.7.0-07963-g9714b26 #1
[   30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
[   30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
0000000000000000
[   30.642001] REGS: c00000001e2ab8e0 TRAP: 0300   Not tainted
(4.7.0-07963-g9714b26)
[   30.643139] MSR: 8000000102803033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22004084  XER: 00000000
[   30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
[   30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
[   30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
[   30.655528] Call Trace:
[   30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
[   30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
[   30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
[   30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
[   30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
[   30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
[   30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
[   30.664017] Instruction dump:
[   30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
[   30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 <7d4048a8> 7d4a3878 7d4049ad 40c2fff4
[   30.666854] ---[ end trace 32643b7195717741 ]---

The reason of this is that in __cpu_disable(), when we try to set the
cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
one, we don't check whether the current configuration employs those
sibling CPUs(hw threads). And if a CPU is not employed by a
configuration, the percpu structures cpu_{sibling,core}_mask are not
allocated, therefore accessing those cpumasks will result in problems as
above.

This patch fixes this problem by adding an addition check on whether the
id is no less than nr_cpu_ids in the sibling CPU iteration code.

Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: migrate exception table users off module.h and onto extable.h

These files were only including module.h for exception table
related functions. We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Signed-off-by: Paul Gortmaker <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/powernv/pci: fix iterator signedness

Unsigned type is always non-negative, so the loop could not end in case
condition is never true.

The problem has been detected using semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)

This patch leverages 'struct pci_host_bridge' from the PCI subsystem
in order to free the pci_controller only after the last reference to
its devices is dropped (avoiding an oops in pcibios_release_device()
if the last reference is dropped after pcibios_free_controller()).

The patch relies on pci_host_bridge.release_fn() (and .release_data),
which is called automatically by the PCI subsystem when the root bus
is released (i.e., the last reference is dropped).  Those fields are
set via pci_set_host_bridge_release() (e.g. in the platform-specific
implementation of pcibios_root_bridge_prepare()).

It introduces the 'pcibios_free_controller_deferred()' .release_fn()
and it expects .release_data to hold a pointer to the pci_controller.

The function implictly calls 'pcibios_free_controller()', so an user
must *NOT* explicitly call it if using the new _deferred() callback.

The functionality is enabled for pseries (although it isn't platform
specific, and may be used by cxl).

Details on not-so-elegant design choices:

- Use 'pci_host_bridge.release_data' field as pointer to associated
   'struct pci_controller' so *not* to 'pci_bus_to_host(bridge->bus)'
   in pcibios_free_controller_deferred().

   That's because pci_remove_root_bus() sets 'host_bridge->bus = NULL'
   (so, if the last reference is released after pci_remove_root_bus()
   runs, which eventually reaches pcibios_free_controller_deferred(),
   that would hit a null pointer dereference).

   The cxl/vphb.c code calls pci_remove_root_bus(), and the cxl folks
   are interested in this fix.

Test-case #1 (hold references)

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
  <...> /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/<...>

  # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
  <...> /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/<...>

  # cat >/dev/sdaa & pid1=$!
  # cat >/dev/sdab & pid2=$!

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  594.306719] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  594.306738] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  598.236381] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  611.972077] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  611.972140] rpadlpar_io: slot PHB 33 removed

  # kill -9 $pid1
  # kill -9 $pid2
  [  632.918088] pcibios_free_controller_deferred: domain 33, dynamic 1

Test-case #2 (don't hold references)

  # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
  Validating PHB DLPAR capability...yes.
  [  916.357363] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
  [  916.357386] pci_hp_remove_devices:    Removing 0021:01:00.0...
  ...
  [  920.566527] pci_hp_remove_devices:    Removing 0021:01:00.1...
  ...
  [  933.955873] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  933.955977] pcibios_free_controller_deferred: domain 33, dynamic 1
  [  933.955999] rpadlpar_io: slot PHB 33 removed

Suggested-By: Gavin Shan <[email protected]>
Signed-off-by: Mauricio Faria de Oliveira <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Andrew Donnellan <[email protected]>
Tested-by: Andrew Donnellan <[email protected]> # cxl
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

cxl: use pcibios_free_controller_deferred() when removing vPHBs

When cxl removes a vPHB, it's possible that the pci_controller may be freed
before all references to the devices on the vPHB have been released. This
in turn causes an invalid memory access when the devices are eventually
released, as pcibios_release_device() attempts to call the phb's
release_device hook.

In cxl_pci_vphb_remove(), remove the existing call to
pcibios_free_controller(). Instead, use
pcibios_free_controller_deferred() to free the pci_controller after all
devices have been released. Export pci_set_host_bridge_release() so we can
do this.

Cc: [email protected]
Signed-off-by: Andrew Donnellan <[email protected]>
Reviewed-by: Matthew R. Ochs <[email protected]>
Acked-by: Ian Munsie <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"

The field "owner" is set by the core.
Thus delete an unneeded initialisation.

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/512x: Delete unnecessary assignment for the field "owner"

The field "owner" is set by the core.
Thus delete an unneeded initialisation.

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

drivers/macintosh: Delete owner assignment

The field "owner" is set by core. Thus delete an extra initialisation.

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: cputhreads: Add missing include file

Powerpc builds may fail with the following build error.

Error log:
In file included from ./arch/powerpc/include/asm/mmu_context.h:11:0,
from ./include/linux/mmu_context.h:4,
from mm/mmu_context.c:8:
./arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr':
./arch/powerpc/include/asm/cputhreads.h:101:2: error:
implicit declaration of function 'cpu_has_feature'

The problem can be triggered by configuring ppc64e_defconfig and selecting
CONFIG_TICK_CPU_ACCOUNTING instead of CONFIG_VIRT_CPU_ACCOUNTING_NATIVE.

Fixes: b92a226e5284 ("powerpc: Move cpu_has_feature() to a separate file")
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

Linux 4.8-rc3

net: tehuti: fix typo: "eneble" -> "enable"

trivial typo fix in pr_err message

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull two parisc fixes from Helge Deller:
"The first patch ensures that the high-res cr16 clocksource (which was
  added in kernel 4.7) gets choosen as default clocksource for parisc.

  The second patch moves the #define of EREFUSED down inside errno.h and
  thus unbreaks building the gccgo compiler"

* 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix order of EREFUSED define in errno.h
  parisc: Fix automatic selection of cr16 clocksource

EDAC, skx_edac: Add EDAC driver for Skylake

This is an entirely new driver instead of yet another set of patches
to sb_edac.c because:

1) Mapping from PCI devices to socket/memory controller is significantly
   different. Skylake scatters devices on a socket across a number of
   PCI buses.
2) There is an extra level of interleaving via the "mcroute" register
   that would be a little messy to squeeze into the old driver.
3) Validation is getting too expensive. Changes to sb_edac need to
   be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and
   Knights Landing.

Acked-by: Aristeu Rozanski <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

net: xilinx: emaclite: Fallback to random MAC address.

If the address configured in the device tree is invalid, the
driver will fallback to using a random address from the locally
administered range.

Signed-off-by: Daniel Romell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

parisc: Fix order of EREFUSED define in errno.h

When building gccgo in userspace, errno.h gets parsed and the go include file
sysinfo.go is generated.

Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
is defined later on in errno.h, this leads to go complaining that EREFUSED
isn't defined yet.

Fix this trivial problem by moving the define of EREFUSED down after
ECONNREFUSED in errno.h (and clean up the indenting while touching this line).

Signed-off-by: Helge Deller <[email protected]>
Cc: [email protected]

parisc: Fix automatic selection of cr16 clocksource

Commit 54b66800907 (parisc: Add native high-resolution sched_clock()
implementation) added support to use the CPU-internal cr16 counters as reliable
clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK.

Sadly the commit missed to remove the hack which prevented cr16 to become the
default clocksource even on SMP systems.

Signed-off-by: Helge Deller <[email protected]>
Cc: [email protected] # 4.7+

vmxnet3: fix tx data ring copy for variable size

'Commit 3c8b3efc061a ("vmxnet3: allow variable length transmit data ring
buffer")' changed the size of the buffers in the tx data ring from a
fixed size of 128 bytes to a variable size.

However, while copying data to the data ring, vmxnet3_copy_hdr continues
to carry the old code that assumes fixed buffer size of 128. This patch
fixes it by adding correct offset based on the actual data ring buffer
size.

Signed-off-by: Guolin Yang <[email protected]>
Signed-off-by: Shrikrishna Khare <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ixgbe: Do not clear RAR entry when clearing VMDq for SAN MAC

The RAR entry for the SAN MAC address was being cleared when we were
clearing the VMDq pool bits. In order to prevent this we need to add
an extra check to protect the SAN MAC from being cleared.

Fixes: 6e982aeae ("ixgbe: Clear stale pool mappings")
Signed-off-by: Alexander Duyck <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum_buffers: Fix pool value handling in mlxsw_sp_sb_tc_pool_bind_set

Pool index has to be converted by get_pool helper to work correctly for
egress pool. In mlxsw the egress pool index starts from 0.

Fixes: 0f433fa0ecc ("mlxsw: spectrum_buffers: Implement shared buffer configuration")
Signed-off-by: Jiri Pirko <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

l2tp: Fix the connect status check in pppol2tp_getname

The sk->sk_state is bits flag, so need use bit operation check
instead of value check.

Signed-off-by: Gao Feng <[email protected]>
Tested-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sctp: linearize early if it's not GSO

Because otherwise when crc computation is still needed it's way more
expensive than on a linear buffer to the point that it affects
performance.

It's so expensive that netperf test gives a perf output as below:

Overhead  Command         Shared Object       Symbol
  18,62%  netserver       [kernel.vmlinux]    [k] crc32_generic_shift
   2,57%  netserver       [kernel.vmlinux]    [k] __pskb_pull_tail
   1,94%  netserver       [kernel.vmlinux]    [k] fib_table_lookup
   1,90%  netserver       [kernel.vmlinux]    [k] copy_user_enhanced_fast_string
   1,66%  swapper         [kernel.vmlinux]    [k] intel_idle
   1,63%  netserver       [kernel.vmlinux]    [k] _raw_spin_lock
   1,59%  netserver       [sctp]              [k] sctp_packet_transmit
   1,55%  netserver       [kernel.vmlinux]    [k] memcpy_erms
   1,42%  netserver       [sctp]              [k] sctp_rcv

# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      3016.42   2.88     3.78     1.874   2.462

After patch:
Overhead  Command         Shared Object      Symbol
   2,75%  netserver       [kernel.vmlinux]   [k] memcpy_erms
   2,63%  netserver       [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
   2,39%  netserver       [kernel.vmlinux]   [k] fib_table_lookup
   2,04%  netserver       [kernel.vmlinux]   [k] __pskb_pull_tail
   1,91%  netserver       [kernel.vmlinux]   [k] _raw_spin_lock
   1,91%  netserver       [sctp]             [k] sctp_packet_transmit
   1,72%  netserver       [mlx4_en]          [k] mlx4_en_process_rx_cq
   1,68%  netserver       [sctp]             [k] sctp_rcv

# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      3681.77   3.83     3.46     2.045   1.849

Fixes: 3acb50c18d8d ("sctp: delay as much as possible skb_linearize")
Signed-off-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlx5-fixes'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-08-16

This series includes some bug fixes for mlx5e driver.

From Saeed and Tariq, Optimize MTU change to not reset when it is not required.

From Paul, Command interface message length check to speedup firmware
command preparation.

From Mohamad, Save pci state when pci error is detected.

From Amir, Flow counters "lastuse" update fix.

From Hadar, Use correct flow dissector key on flower offloading.
Plus a small optimization for switchdev hardware id query.

From Or, three patches to address some E-Switch offloads issues.

For -stable of 4.6.y and 4.7.y:
    net/mlx5e: Use correct flow dissector key on flower offloading
    net/mlx5: Fix pci error recovery flow
    net/mlx5: Added missing check of msg length in verifying its signature
====================

Signed-off-by: David S. Miller <[email protected]>

net/mlx5: E-Switch, Avoid ACLs in the offloads mode

When we are in the switchdev/offloads mode, HW matching is done as
dictated by the offloaded rules and hence we don't need to enable
the ACLs mechanism used by the legacy mode.

Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: E-Switch, Set the send-to-vport rules in the correct table

While adding actual offloading support to the new switchdev mode, we didn't
change the setup of the send-to-vport rules to put them in the slow path
table, fix that.

Fixes: 1033665e63b6 ('net/mlx5: E-Switch, Use two priorities for SRIOV offloads mode')
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: E-Switch, Return the correct devlink e-switch mode

Since mlx5 has also the NONE e-switch mode, we must translate from mlx5
mode to devlink mode on the devlink eswitch mode get call, do that.

While here, remove the mlx5_ prefix from the static function helpers
that deal with the mode to comply with the rest of the code.

Fixes: c930a3ad7453 ('net/mlx5e: Add devlink based SRIOV mode change')
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: Retrieve the switchdev id from the firmware only once

Avoid firmware command execution each time the switchdev HW ID attr get
call is made. We do that by reading the ID (PF NIC MAC) only once at
load time and store it on the representor structure.

Signed-off-by: Hadar Hen Zion <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: Use correct flow dissector key on flower offloading

The wrong key is used when extracting the address type field set by
the flower offload code. We have to use the control key and not the
basic key, fix that.

Fixes: e3a2b7ed018e ('net/mlx5e: Support offload cls_flower with drop action')
Signed-off-by: Hadar Hen Zion <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Update last-use statistics for flow rules

Set lastuse statistic, when number of packets is changed compared to
last query. This was wrongly dropped when bulk counter reading was added.

Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters')
Signed-off-by: Amir Vadai <[email protected]>
Reported-by: Paul Blakey <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Added missing check of msg length in verifying its signature

Set and verify signature calculates the signature for each of the
mailbox nodes, even for those that are unused (from cache). Added
a missing length check to set and verify only those which are used.

While here, also moved the setting of msg's nodes token to where we
already go over them. This saves a pass because checksum is disabled,
and the only useful thing remaining that set signature does is setting
the token.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB
adapters')
Signed-off-by: Paul Blakey <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Fix pci error recovery flow

When PCI error is detected we should save the state of the pci prior to
disabling it.

Also when receiving pci slot reset call we need to verify that the
device is responsive.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core
driver')
Signed-off-by: Mohamad Haj Yahia <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: Optimization for MTU change

Avoid unnecessary interface down/up operations upon an MTU change
when it does not affect the rings configuration.

Fixes: 461017cb006a ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>