]> Git Repo - linux.git/log
linux.git
8 years agoMerge branch 'vsock-pkt-cancel'
David S. Miller [Tue, 21 Mar 2017 21:41:47 +0000 (14:41 -0700)]
Merge branch 'vsock-pkt-cancel'

Peng Tao says:

====================
vsock: cancel connect packets when failing to connect

Currently, if a connect call fails on a signal or timeout (e.g., guest is still
in the process of starting up), we'll just return to caller and leave the connect
packet queued and they are sent even though the connection is considered a failure,
which can confuse applications with unwanted false connect attempt.

The patchset enables vsock (both host and guest) to cancel queued packets when
a connect attempt is considered to fail.

v5 changelog:
  - change virtio_vsock_pkt->cancel_token back to virtio_vsock_pkt->vsk
v4 changelog:
  - drop two unnecessary void * cast
  - update new callback comment
v3 changelog:
  - define cancel_pkt callback in struct vsock_transport rather than struct virtio_transport
  - rename virtio_vsock_pkt->vsk to virtio_vsock_pkt->cancel_token
v2 changelog:
  - fix queued_replies counting and resume tx/rx when necessary
====================

Signed-off-by: David S. Miller <[email protected]>
8 years agovsock: cancel packets when failing to connect
Peng Tao [Wed, 15 Mar 2017 01:32:17 +0000 (09:32 +0800)]
vsock: cancel packets when failing to connect

Otherwise we'll leave the packets queued until releasing vsock device.
E.g., if guest is slow to start up, resulting ETIMEDOUT on connect, guest
will get the connect requests from failed host sockets.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Jorgen Hansen <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agovsock: add pkt cancel capability
Peng Tao [Wed, 15 Mar 2017 01:32:16 +0000 (09:32 +0800)]
vsock: add pkt cancel capability

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agovhost-vsock: add pkt cancel capability
Peng Tao [Wed, 15 Mar 2017 01:32:15 +0000 (09:32 +0800)]
vhost-vsock: add pkt cancel capability

To allow canceling all packets of a connection.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Jorgen Hansen <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agovsock: track pkt owner vsock
Peng Tao [Wed, 15 Mar 2017 01:32:14 +0000 (09:32 +0800)]
vsock: track pkt owner vsock

So that we can cancel a queued pkt later if necessary.

Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agocrypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex
Herbert Xu [Tue, 14 Mar 2017 10:25:57 +0000 (18:25 +0800)]
crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote:
>
> Yes, please.
> Disregarding some reports is not a good way long term.

Please try this patch.

---8<---
Subject: netlink: Annotate nlk cb_mutex by protocol

Currently all occurences of nlk->cb_mutex are annotated by lockdep
as a single class.  This causes a false lcokdep cycle involving
genl and crypto_user.

This patch fixes it by dividing cb_mutex into individual classes
based on the netlink protocol.  As genl and crypto_user do not
use the same netlink protocol this breaks the false dependency
loop.

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agor8152: fix the list rx_done may be used without initialization
hayeswang [Tue, 14 Mar 2017 06:15:20 +0000 (14:15 +0800)]
r8152: fix the list rx_done may be used without initialization

The list rx_done would be initialized when the linking on occurs.
Therefore, if a napi is scheduled without any linking on before,
the following kernel panic would happen.

BUG: unable to handle kernel NULL pointer dereference at 000000000000008
IP: [<ffffffffc085efde>] r8152_poll+0xe1e/0x1210 [r8152]
PGD 0
Oops: 0002 [#1] SMP

Signed-off-by: Hayes Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
8 years agocpuidle: Validate cpu_dev in cpuidle_add_sysfs()
Vaidyanathan Srinivasan [Sat, 18 Mar 2017 19:21:59 +0000 (00:51 +0530)]
cpuidle: Validate cpu_dev in cpuidle_add_sysfs()

If a given cpu is not in cpu_present and cpu hotplug
is disabled, arch can skip setting up the cpu_dev.

Arch cpuidle driver should pass correct cpu mask
for registration, but failing to do so by the driver
causes error to propagate and crash like this:

[   30.076045] Unable to handle kernel paging request for data at address 0x00000048
[   30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
    pc: c0000000007b2f30: kobject_get+0x20/0x70
    lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
    sp: c000003feb18b8f0
   msr: 9000000000009033
   dar: 48
 dsisr: 40000000
  current = 0xc000003fd2ed8300
  paca    = 0xc00000000fbab500   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0
20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[   30.163506] Failed to register cpuidle device for cpu136
[   30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <[email protected]>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
8 years agocpufreq: intel_pstate: Fix policy data management in passive mode
Rafael J. Wysocki [Tue, 21 Mar 2017 21:19:07 +0000 (22:19 +0100)]
cpufreq: intel_pstate: Fix policy data management in passive mode

The policy->cpuinfo.max_freq and policy->max updates in
intel_cpufreq_turbo_update() are excessive as they are done for no
good reason and may lead to problems in principle, so they should be
dropped.  However, after dropping them intel_cpufreq_turbo_update()
becomes almost entirely pointless, because the check made by it is
made again down the road in intel_pstate_prepare_request().  The
only thing in it that still needs to be done is the call to
update_turbo_state(), so drop intel_cpufreq_turbo_update() altogether
and make its callers invoke update_turbo_state() directly instead of
it.

In addition to that, fix intel_cpufreq_verify_policy() so that it
checks global.no_turbo in addition to global.turbo_disabled when
updating policy->cpuinfo.max_freq to make it consistent with
intel_pstate_verify_policy().

Fixes: 001c76f05b01 (cpufreq: intel_pstate: Generic governors support)
Signed-off-by: Rafael J. Wysocki <[email protected]>
8 years agomm, swap: Remove WARN_ON_ONCE() in free_swap_slot()
Huang Ying [Mon, 20 Mar 2017 06:26:42 +0000 (14:26 +0800)]
mm, swap: Remove WARN_ON_ONCE() in free_swap_slot()

Before commit 452b94b8c8c7 ("mm/swap: don't BUG_ON() due to
uninitialized swap slot cache"), the following bug is reported,

  ------------[ cut here ]------------
  kernel BUG at mm/swap_slots.c:270!
  invalid opcode: 0000 [#1] SMP
  CPU: 5 PID: 1745 Comm: (sd-pam) Not tainted 4.11.0-rc1-00243-g24c534bb161b #1
  Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016
  RIP: 0010:free_swap_slot+0xba/0xd0
  Call Trace:
   swap_free+0x36/0x40
   do_swap_page+0x360/0x6d0
   __handle_mm_fault+0x880/0x1080
   handle_mm_fault+0xd0/0x240
   __do_page_fault+0x232/0x4d0
   do_page_fault+0x20/0x70
   page_fault+0x22/0x30
  ---[ end trace aefc9ede53e0ab21 ]---

This is raised by the BUG_ON(!swap_slot_cache_initialized) in
free_swap_slot().  This is incorrect, because even if the swap slots
cache fails to be initialized, the swap should operate properly without
the swap slots cache.  And the use_swap_slot_cache check later in the
function will protect the uninitialized swap slots cache case.

In commit 452b94b8c8c7, the BUG_ON() is replaced by WARN_ON_ONCE().  In
the patch, the WARN_ON_ONCE() is removed too.

Reported-by: Linus Torvalds <[email protected]>
Acked-by: Tim Chen <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: "Huang, Ying" <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Tue, 21 Mar 2017 20:10:17 +0000 (13:10 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine small fixes: the biggest is probably finally sorting out Kconfig
  issues with lpfc nvme.  There are some performance fixes for megaraid
  and hpsa and a static checker fix"

[ Johannes Thumshirn points out that there still seems to be more lpfc
  vs nvme config issues.  Oh well.   - Linus ]

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: Finalize Kconfig options for nvme
  scsi: ufs: don't check unsigned type for a negative value
  scsi: hpsa: do not timeout reset operations
  scsi: hpsa: limit outstanding rescans
  scsi: hpsa: update check for logical volume status
  scsi: megaraid_sas: Driver version upgrade
  scsi: megaraid_sas: raid6 also require cpuSel check same as raid5
  scsi: megaraid_sas: add correct return type check for ldio hint logic for raid1
  scsi: megaraid_sas: enable intx only if msix request fails

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Tue, 21 Mar 2017 20:07:18 +0000 (13:07 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:

 - regression fixes for Wacom devices, from Aaron Armstrong Skomra and
   Ping Cheng

 - memory leak in hid-sony driver from Roderick Colenbrander

 - new device IDs support from Oscar Campos and Daniel Drake

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: wacom: generic: Wacom mouse is only provided for opaque tablets
  HID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair
  HID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB
  HID: wacom: don't manually release resources for the EKR
  HID: wacom: Correct Intuos Pro 2 resolution
  HID: sony: Fix input device leak when connecting a DS4 twice using USB/BT
  HID: chicony: Add support for another ASUS Zen AiO keyboard

8 years agoMerge tag 'gpio-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Tue, 21 Mar 2017 20:01:53 +0000 (13:01 -0700)]
Merge tag 'gpio-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here is the first set of GPIO fixes for 4.11. It was delayed a bit
  beacuse I was chicken when linux-next was not rotating last week.

  This hits the ST serial driver in drivers/tty/serial and that has an
  ACK from Greg, he suggested to keep the old GPIO fwnode API around to
  smoothen things in the merge Windod and those have now served their
  purpose so we take them out and convert the last driver to the new
  API.

  Apart from that it's fixes as usual.

  Summary:

   - set the parent on the Altera A10SR driver, also fix high level
     IRQs.

   - fix error path on the mockup driver.

   - compilation noise about unused functions fixed.

   - fix missed interrupts on the MCP23S08 expander, this is also tagged
     for stable.

   - retire the interrim helpers devm_get_gpiod_from_child() used to
     smoothen merging in the merge window"

* tag 'gpio-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio:mcp23s08 Fixed missing interrupts
  serial: st-asc: Use new GPIOD API to obtain RTS pin
  gpio: altera: Use handle_level_irq when configured as a level_high
  gpio: xgene: mark PM functions as __maybe_unused
  gpio: mockup: return -EFAULT if copy_from_user() fails
  gpio: altera-a10sr: Set gpio_chip parent property

8 years agoMerge tag 'rproc-v4.11-fixes' of git://github.com/andersson/remoteproc
Linus Torvalds [Tue, 21 Mar 2017 19:54:12 +0000 (12:54 -0700)]
Merge tag 'rproc-v4.11-fixes' of git://github.com/andersson/remoteproc

Pull remoteproc fix from Bjorn Andersson:
 "This fixes a Kbuild dependency issue related to the Qualcomm
  remoteproc drivers"

* tag 'rproc-v4.11-fixes' of git://github.com/andersson/remoteproc:
  remoteproc: qcom: fix QCOM_SMD dependencies

8 years agoMerge tag 'for-f2fs-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeu...
Linus Torvalds [Tue, 21 Mar 2017 19:27:06 +0000 (12:27 -0700)]
Merge tag 'for-f2fs-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:

 - fix performance regression reported by lkp-rebot

 - fix potential data lost after power-cut due to SSR reallocation

* tag 'for-f2fs-4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: combine nat_bits and free_nid_bitmap cache
  f2fs: skip scanning free nid bitmap of full NAT blocks
  f2fs: use __set{__clear}_bit_le
  f2fs: declare static functions
  f2fs: don't overwrite node block by SSR

8 years agovfio: Rework group release notifier warning
Alex Williamson [Tue, 21 Mar 2017 19:19:09 +0000 (13:19 -0600)]
vfio: Rework group release notifier warning

The intent of the original warning is make sure that the mdev vendor
driver has removed any group notifiers at the point where the group
is closed by the user.  Theoretically this would be through an
orderly shutdown where any devices are release prior to the group
release.  We can't always count on an orderly shutdown, the user can
close the group before the notifier can be removed or the user task
might be killed.  We'd like to add this sanity test when the group is
idle and the only references are from the devices within the group
themselves, but we don't have a good way to do that.  Instead check
both when the group itself is removed and when the group is opened.
A bit later than we'd prefer, but better than the current over
aggressive approach.

Fixes: ccd46dbae77d ("vfio: support notifier chain in vfio_group")
Signed-off-by: Alex Williamson <[email protected]>
Cc: <[email protected]> # v4.10
Cc: Jike Song <[email protected]>
8 years agoarm64: compat: Update compat syscalls
Will Deacon [Tue, 21 Mar 2017 18:04:26 +0000 (18:04 +0000)]
arm64: compat: Update compat syscalls

Hook up three pkey syscalls (which we don't implement) and the new statx
syscall, as has been done for arch/arm/.

Signed-off-by: Will Deacon <[email protected]>
8 years agoreset: fix optional reset_control_get stubs to return NULL
Philipp Zabel [Mon, 20 Mar 2017 10:25:16 +0000 (11:25 +0100)]
reset: fix optional reset_control_get stubs to return NULL

When RESET_CONTROLLER is not enabled, the optional reset_control_get
stubs should now also return NULL.

Since it is now valid for reset_control_assert/deassert/reset/status/put
to be called unconditionally, with NULL as an argument for optional
resets, the stubs are not allowed to warn anymore.

Fixes: bb475230b8e5 ("reset: make optional functions really optional")
Reported-by: Andrzej Hajda <[email protected]>
Tested-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Cc: Ramiro Oliveira <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
8 years agoMerge branch 'nvme-4.11-rc' of git://git.infradead.org/nvme into for-linus
Jens Axboe [Tue, 21 Mar 2017 17:03:53 +0000 (11:03 -0600)]
Merge branch 'nvme-4.11-rc' of git://git.infradead.org/nvme into for-linus

Sagi writes:

This consists of some fixes for issues reported lately:
- loop and rdma host driver cpu hotplug fixes
- fix loop use-after-free
- nvmet percpu_ref confirmation fix to fail ongoing requests
- nvmet-rdma fix a non-initialized commands deref

8 years agonvme-loop: handle cpu unplug when re-establishing the controller
Sagi Grimberg [Mon, 13 Mar 2017 11:27:51 +0000 (13:27 +0200)]
nvme-loop: handle cpu unplug when re-establishing the controller

If a cpu unplug event has occured, we need to take the minimum
of the provided nr_io_queues and the number of online cpus,
otherwise we won't be able to connect them as blk-mq mapping
won't dispatch to those queues.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
8 years agonvme-rdma: handle cpu unplug when re-establishing the controller
Sagi Grimberg [Thu, 9 Mar 2017 11:26:07 +0000 (13:26 +0200)]
nvme-rdma: handle cpu unplug when re-establishing the controller

If a cpu unplug event has occured, we need to take the minimum
of the provided nr_io_queues and the number of online cpus,
otherwise we won't be able to connect them as blk-mq mapping
won't dispatch to those queues.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
8 years agodrm/i915: make context status notifier head be per engine
Changbin Du [Tue, 21 Mar 2017 14:47:20 +0000 (14:47 +0000)]
drm/i915: make context status notifier head be per engine

GVTg has introduced the context status notifier to schedule the GVTg
workload. At that time, the notifier is bound to GVTg context only,
so GVTg is not aware of host workloads.

Now we are going to improve GVTg's guest workload scheduler policy,
and add Guc emulation support for new Gen graphics. Both these two
features require acknowledgment for all contexts running on hardware.
(But will not alter host workload.) So here try to make some change.

The change is simple:
  1. Move the context status notifier head from i915_gem_context to
     intel_engine_cs. Which means there is a notifier head per engine
     instead of per context. Execlist driver still call notifier for
     each context sched-in/out events of current engine.
  2. At GVTg side, it binds a notifier_block for each physical engine
     at GVTg initialization period. Then GVTg can hear all context
     status events.

In this patch, GVTg do nothing for host context event, but later
will add a function there. But in any case, the notifier callback is
a noop if this is no active vGPU.

Since intel_gvt_init() is called at early initialization stage and
require the status notifier head has been initiated, I initiate it in
intel_engine_setup().

v2: remove a redundant newline. (chris)

Fixes: 3c7ba6359d70 ("drm/i915: Introduce execlist context status change notification")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232
Signed-off-by: Changbin Du <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Zhi Wang <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
(cherry picked from commit 3fc03069bc6e6c316f19bb526e3c8ce784677477)
Signed-off-by: Jani Nikula <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
8 years agodrm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)
Chris Wilson [Tue, 21 Mar 2017 14:45:31 +0000 (14:45 +0000)]
drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)

The rcu_barrier() takes the cpu_hotplug mutex which itself is not
reclaim-safe, and so rcu_barrier() is illegal from inside the shrinker.

[  309.661373] =========================================================
[  309.661376] [ INFO: possible irq lock inversion dependency detected ]
[  309.661380] 4.11.0-rc1-CI-CI_DRM_2333+ #1 Tainted: G        W
[  309.661383] ---------------------------------------------------------
[  309.661386] gem_exec_gttfil/6435 just changed the state of lock:
[  309.661389]  (rcu_preempt_state.barrier_mutex){+.+.-.}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160
[  309.661399] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[  309.661402]  (cpu_hotplug.lock){+.+.+.}
[  309.661404]

               and interrupts could create inverse lock ordering between them.

[  309.661410]
               other info that might help us debug this:
[  309.661414]  Possible interrupt unsafe locking scenario:

[  309.661417]        CPU0                    CPU1
[  309.661419]        ----                    ----
[  309.661421]   lock(cpu_hotplug.lock);
[  309.661425]                                local_irq_disable();
[  309.661432]                                lock(rcu_preempt_state.barrier_mutex);
[  309.661441]                                lock(cpu_hotplug.lock);
[  309.661446]   <Interrupt>
[  309.661448]     lock(rcu_preempt_state.barrier_mutex);
[  309.661453]
                *** DEADLOCK ***

[  309.661460] 4 locks held by gem_exec_gttfil/6435:
[  309.661464]  #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff8120d83d>] vfs_write+0x17d/0x1f0
[  309.661475]  #1:  (debugfs_srcu){......}, at: [<ffffffff81320491>] debugfs_use_file_start+0x41/0xa0
[  309.661486]  #2:  (&attr->mutex){+.+.+.}, at: [<ffffffff8123a3e7>] simple_attr_write+0x37/0xe0
[  309.661495]  #3:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0091b4a>] i915_drop_caches_set+0x3a/0x150 [i915]
[  309.661540]
               the shortest dependencies between 2nd lock and 1st lock:
[  309.661547]  -> (cpu_hotplug.lock){+.+.+.} ops: 829 {
[  309.661553]     HARDIRQ-ON-W at:
[  309.661560]                       __lock_acquire+0x5e5/0x1b50
[  309.661565]                       lock_acquire+0xc9/0x220
[  309.661572]                       __mutex_lock+0x6e/0x990
[  309.661576]                       mutex_lock_nested+0x16/0x20
[  309.661583]                       get_online_cpus+0x61/0x80
[  309.661590]                       kmem_cache_create+0x25/0x1d0
[  309.661596]                       debug_objects_mem_init+0x30/0x249
[  309.661602]                       start_kernel+0x341/0x3fe
[  309.661607]                       x86_64_start_reservations+0x2a/0x2c
[  309.661612]                       x86_64_start_kernel+0x173/0x186
[  309.661619]                       verify_cpu+0x0/0xfc
[  309.661622]     SOFTIRQ-ON-W at:
[  309.661627]                       __lock_acquire+0x611/0x1b50
[  309.661632]                       lock_acquire+0xc9/0x220
[  309.661636]                       __mutex_lock+0x6e/0x990
[  309.661641]                       mutex_lock_nested+0x16/0x20
[  309.661646]                       get_online_cpus+0x61/0x80
[  309.661650]                       kmem_cache_create+0x25/0x1d0
[  309.661655]                       debug_objects_mem_init+0x30/0x249
[  309.661660]                       start_kernel+0x341/0x3fe
[  309.661664]                       x86_64_start_reservations+0x2a/0x2c
[  309.661669]                       x86_64_start_kernel+0x173/0x186
[  309.661674]                       verify_cpu+0x0/0xfc
[  309.661677]     RECLAIM_FS-ON-W at:
[  309.661682]                          mark_held_locks+0x6f/0xa0
[  309.661687]                          lockdep_trace_alloc+0xb3/0x100
[  309.661693]                          kmem_cache_alloc_trace+0x31/0x2e0
[  309.661699]                          __smpboot_create_thread.part.1+0x27/0xe0
[  309.661704]                          smpboot_create_threads+0x61/0x90
[  309.661709]                          cpuhp_invoke_callback+0x9c/0x8a0
[  309.661713]                          cpuhp_up_callbacks+0x31/0xb0
[  309.661718]                          _cpu_up+0x7a/0xc0
[  309.661723]                          do_cpu_up+0x5f/0x80
[  309.661727]                          cpu_up+0xe/0x10
[  309.661734]                          smp_init+0x71/0xb3
[  309.661738]                          kernel_init_freeable+0x94/0x19e
[  309.661743]                          kernel_init+0x9/0xf0
[  309.661748]                          ret_from_fork+0x2e/0x40
[  309.661752]     INITIAL USE at:
[  309.661757]                      __lock_acquire+0x234/0x1b50
[  309.661761]                      lock_acquire+0xc9/0x220
[  309.661766]                      __mutex_lock+0x6e/0x990
[  309.661771]                      mutex_lock_nested+0x16/0x20
[  309.661775]                      get_online_cpus+0x61/0x80
[  309.661780]                      __cpuhp_setup_state+0x44/0x170
[  309.661785]                      page_alloc_init+0x23/0x3a
[  309.661790]                      start_kernel+0x124/0x3fe
[  309.661794]                      x86_64_start_reservations+0x2a/0x2c
[  309.661799]                      x86_64_start_kernel+0x173/0x186
[  309.661804]                      verify_cpu+0x0/0xfc
[  309.661807]   }
[  309.661813]   ... key      at: [<ffffffff81e37690>] cpu_hotplug+0xb0/0x100
[  309.661817]   ... acquired at:
[  309.661821]    lock_acquire+0xc9/0x220
[  309.661825]    __mutex_lock+0x6e/0x990
[  309.661829]    mutex_lock_nested+0x16/0x20
[  309.661833]    get_online_cpus+0x61/0x80
[  309.661837]    _rcu_barrier+0x9f/0x160
[  309.661841]    rcu_barrier+0x10/0x20
[  309.661847]    netdev_run_todo+0x5f/0x310
[  309.661852]    rtnl_unlock+0x9/0x10
[  309.661856]    default_device_exit_batch+0x133/0x150
[  309.661862]    ops_exit_list.isra.0+0x4d/0x60
[  309.661866]    cleanup_net+0x1d8/0x2c0
[  309.661872]    process_one_work+0x1f4/0x6d0
[  309.661876]    worker_thread+0x49/0x4a0
[  309.661881]    kthread+0x107/0x140
[  309.661884]    ret_from_fork+0x2e/0x40

[  309.661890] -> (rcu_preempt_state.barrier_mutex){+.+.-.} ops: 179 {
[  309.661896]    HARDIRQ-ON-W at:
[  309.661901]                     __lock_acquire+0x5e5/0x1b50
[  309.661905]                     lock_acquire+0xc9/0x220
[  309.661910]                     __mutex_lock+0x6e/0x990
[  309.661914]                     mutex_lock_nested+0x16/0x20
[  309.661919]                     _rcu_barrier+0x31/0x160
[  309.661923]                     rcu_barrier+0x10/0x20
[  309.661928]                     netdev_run_todo+0x5f/0x310
[  309.661932]                     rtnl_unlock+0x9/0x10
[  309.661936]                     default_device_exit_batch+0x133/0x150
[  309.661941]                     ops_exit_list.isra.0+0x4d/0x60
[  309.661946]                     cleanup_net+0x1d8/0x2c0
[  309.661951]                     process_one_work+0x1f4/0x6d0
[  309.661955]                     worker_thread+0x49/0x4a0
[  309.661960]                     kthread+0x107/0x140
[  309.661964]                     ret_from_fork+0x2e/0x40
[  309.661968]    SOFTIRQ-ON-W at:
[  309.661972]                     __lock_acquire+0x611/0x1b50
[  309.661977]                     lock_acquire+0xc9/0x220
[  309.661981]                     __mutex_lock+0x6e/0x990
[  309.661986]                     mutex_lock_nested+0x16/0x20
[  309.661990]                     _rcu_barrier+0x31/0x160
[  309.661995]                     rcu_barrier+0x10/0x20
[  309.661999]                     netdev_run_todo+0x5f/0x310
[  309.662003]                     rtnl_unlock+0x9/0x10
[  309.662008]                     default_device_exit_batch+0x133/0x150
[  309.662013]                     ops_exit_list.isra.0+0x4d/0x60
[  309.662017]                     cleanup_net+0x1d8/0x2c0
[  309.662022]                     process_one_work+0x1f4/0x6d0
[  309.662027]                     worker_thread+0x49/0x4a0
[  309.662031]                     kthread+0x107/0x140
[  309.662035]                     ret_from_fork+0x2e/0x40
[  309.662039]    IN-RECLAIM_FS-W at:
[  309.662043]                        __lock_acquire+0x638/0x1b50
[  309.662048]                        lock_acquire+0xc9/0x220
[  309.662053]                        __mutex_lock+0x6e/0x990
[  309.662058]                        mutex_lock_nested+0x16/0x20
[  309.662062]                        _rcu_barrier+0x31/0x160
[  309.662067]                        rcu_barrier+0x10/0x20
[  309.662089]                        i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662109]                        i915_drop_caches_set+0x141/0x150 [i915]
[  309.662114]                        simple_attr_write+0xc7/0xe0
[  309.662119]                        full_proxy_write+0x4f/0x70
[  309.662124]                        __vfs_write+0x23/0x120
[  309.662128]                        vfs_write+0xc6/0x1f0
[  309.662133]                        SyS_write+0x44/0xb0
[  309.662138]                        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  309.662142]    INITIAL USE at:
[  309.662147]                    __lock_acquire+0x234/0x1b50
[  309.662151]                    lock_acquire+0xc9/0x220
[  309.662156]                    __mutex_lock+0x6e/0x990
[  309.662160]                    mutex_lock_nested+0x16/0x20
[  309.662165]                    _rcu_barrier+0x31/0x160
[  309.662169]                    rcu_barrier+0x10/0x20
[  309.662174]                    netdev_run_todo+0x5f/0x310
[  309.662178]                    rtnl_unlock+0x9/0x10
[  309.662183]                    default_device_exit_batch+0x133/0x150
[  309.662188]                    ops_exit_list.isra.0+0x4d/0x60
[  309.662192]                    cleanup_net+0x1d8/0x2c0
[  309.662197]                    process_one_work+0x1f4/0x6d0
[  309.662202]                    worker_thread+0x49/0x4a0
[  309.662206]                    kthread+0x107/0x140
[  309.662210]                    ret_from_fork+0x2e/0x40
[  309.662214]  }
[  309.662220]  ... key      at: [<ffffffff81e4e1c8>] rcu_preempt_state+0x508/0x780
[  309.662225]  ... acquired at:
[  309.662229]    check_usage_forwards+0x12b/0x130
[  309.662233]    mark_lock+0x360/0x6f0
[  309.662237]    __lock_acquire+0x638/0x1b50
[  309.662241]    lock_acquire+0xc9/0x220
[  309.662245]    __mutex_lock+0x6e/0x990
[  309.662249]    mutex_lock_nested+0x16/0x20
[  309.662253]    _rcu_barrier+0x31/0x160
[  309.662257]    rcu_barrier+0x10/0x20
[  309.662279]    i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662298]    i915_drop_caches_set+0x141/0x150 [i915]
[  309.662303]    simple_attr_write+0xc7/0xe0
[  309.662307]    full_proxy_write+0x4f/0x70
[  309.662311]    __vfs_write+0x23/0x120
[  309.662315]    vfs_write+0xc6/0x1f0
[  309.662319]    SyS_write+0x44/0xb0
[  309.662323]    entry_SYSCALL_64_fastpath+0x1c/0xb1

[  309.662329]
               stack backtrace:
[  309.662335] CPU: 1 PID: 6435 Comm: gem_exec_gttfil Tainted: G        W       4.11.0-rc1-CI-CI_DRM_2333+ #1
[  309.662342] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011
[  309.662348] Call Trace:
[  309.662354]  dump_stack+0x67/0x92
[  309.662359]  print_irq_inversion_bug.part.19+0x1a4/0x1b0
[  309.662365]  check_usage_forwards+0x12b/0x130
[  309.662369]  mark_lock+0x360/0x6f0
[  309.662374]  ? print_shortest_lock_dependencies+0x1a0/0x1a0
[  309.662379]  __lock_acquire+0x638/0x1b50
[  309.662383]  ? __mutex_unlock_slowpath+0x3e/0x2e0
[  309.662388]  ? trace_hardirqs_on+0xd/0x10
[  309.662392]  ? _rcu_barrier+0x31/0x160
[  309.662396]  lock_acquire+0xc9/0x220
[  309.662400]  ? _rcu_barrier+0x31/0x160
[  309.662404]  ? _rcu_barrier+0x31/0x160
[  309.662409]  __mutex_lock+0x6e/0x990
[  309.662412]  ? _rcu_barrier+0x31/0x160
[  309.662416]  ? _rcu_barrier+0x31/0x160
[  309.662421]  ? synchronize_rcu_expedited+0x35/0xb0
[  309.662426]  ? _raw_spin_unlock_irqrestore+0x52/0x60
[  309.662434]  mutex_lock_nested+0x16/0x20
[  309.662438]  _rcu_barrier+0x31/0x160
[  309.662442]  rcu_barrier+0x10/0x20
[  309.662464]  i915_gem_shrink_all+0x33/0x40 [i915]
[  309.662484]  i915_drop_caches_set+0x141/0x150 [i915]
[  309.662489]  simple_attr_write+0xc7/0xe0
[  309.662494]  full_proxy_write+0x4f/0x70
[  309.662498]  __vfs_write+0x23/0x120
[  309.662503]  ? rcu_read_lock_sched_held+0x75/0x80
[  309.662507]  ? rcu_sync_lockdep_assert+0x2a/0x50
[  309.662512]  ? __sb_start_write+0x102/0x210
[  309.662516]  ? vfs_write+0x17d/0x1f0
[  309.662520]  vfs_write+0xc6/0x1f0
[  309.662524]  ? trace_hardirqs_on_caller+0xe7/0x200
[  309.662529]  SyS_write+0x44/0xb0
[  309.662533]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  309.662537] RIP: 0033:0x7f507eac24a0
[  309.662541] RSP: 002b:00007fffda8720e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  309.662548] RAX: ffffffffffffffda RBX: ffffffff81482bd3 RCX: 00007f507eac24a0
[  309.662552] RDX: 0000000000000005 RSI: 00007fffda8720f0 RDI: 0000000000000005
[  309.662557] RBP: ffffc9000048bf88 R08: 0000000000000000 R09: 000000000000002c
[  309.662561] R10: 0000000000000014 R11: 0000000000000246 R12: 00007fffda872230
[  309.662566] R13: 00007fffda872228 R14: 0000000000000201 R15: 00007fffda8720f0
[  309.662572]  ? __this_cpu_preempt_check+0x13/0x20

Fixes: 0eafec6d3244 ("drm/i915: Enable lockless lookup of request tracking via RCU")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100192
Signed-off-by: Chris Wilson <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: <[email protected]> # v4.9+
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Daniel Vetter <[email protected]>
(cherry picked from commit bd784b7cc41af7a19cfb705fa6d800e511c4ab02)
Signed-off-by: Jani Nikula <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
8 years agoHID: wacom: generic: Wacom mouse is only provided for opaque tablets
Ping Cheng [Wed, 15 Mar 2017 00:08:16 +0000 (17:08 -0700)]
HID: wacom: generic: Wacom mouse is only provided for opaque tablets

Commit f85c9dc ("Support tool ID and additional tool types") introduced mouse
and lens cursor tools to generic codepath, which covers both display (direct)
and opaque tablets (indirect devices). However, mouse and lens cursor tools are
only provided for opaque tablets. This patch ignores mouse and lens cursor tools
if the device is a display tablet.

Signed-off-by: Ping Cheng <[email protected]>
Reviewed-by: Jason Gerecke <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
8 years agoHID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair
Oscar Campos [Mon, 6 Mar 2017 21:02:39 +0000 (21:02 +0000)]
HID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair

This mouse sold by Corsair as Scimitar PRO RGB defines two consecutive
Logical Minimum items in its Application (Consumer.0001) report making
it non parseable. This patch fixes the report descriptor overriding
byte 77 in rdesc from 0x16 (Logical Minimum with 16 bits value) to 0x26
(Logical Maximum with 16 bits value).

Signed-off-by: Oscar Campos <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
8 years agoHID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB
Oscar Campos [Fri, 10 Feb 2017 18:23:00 +0000 (18:23 +0000)]
HID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB

Add quirks for several corsair gaming devices to avoid long delays on
report initialization

Supported devices:

 - Corsair K65RGB Rapidfire Gaming Keyboard
 - Corsair K70RGB Rapidfire Gaming Keyboard
 - Corsair Scimitar Pro RGB Gaming Mouse

Signed-off-by: Oscar Campos <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
8 years agoHID: wacom: don't manually release resources for the EKR
Aaron Armstrong Skomra [Mon, 6 Mar 2017 18:54:58 +0000 (10:54 -0800)]
HID: wacom: don't manually release resources for the EKR

Commit 5b779fc introduces the manual release of resources in wacom_remove() as
an addition to the driver's use of devm.  The EKR resources can only be
released through wacom_remote_destroy_one() so we skip the manual release for
it.

Fixes: 5b779fc ("HID: wacom: release the resources before leaving despite devm")
Signed-off-by: Aaron Armstrong Skomra <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
8 years agoHID: wacom: Correct Intuos Pro 2 resolution
Aaron Armstrong Skomra [Mon, 6 Mar 2017 18:54:57 +0000 (10:54 -0800)]
HID: wacom: Correct Intuos Pro 2 resolution

The features struct for the second gen Intuos Pro uses the wrong constant for
the resolution. This fix is for commit 4922cd2.

Fixes: 4922cd2 ("HID: wacom: Support 2nd-gen Intuos Pro's Bluetooth classic interface")
Signed-off-by: Aaron Armstrong Skomra <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
8 years agoALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
Takashi Iwai [Tue, 21 Mar 2017 12:56:04 +0000 (13:56 +0100)]
ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()

When snd_seq_pool_done() is called, it marks the closing flag to
refuse the further cell insertions.  But snd_seq_pool_done() itself
doesn't clear the cells but just waits until all cells are cleared by
the caller side.  That is, it's racy, and this leads to the endless
stall as syzkaller spotted.

This patch addresses the racy by splitting the setup of pool->closing
flag out of snd_seq_pool_done(), and calling it properly before
snd_seq_pool_done().

BugLink: http://lkml.kernel.org/r/CACT4Y+aqqy8bZA1fFieifNxR2fAfFQQABcBHj801+u5ePV0URw@mail.gmail.com
Reported-and-tested-by: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
8 years agoALSA: x86: Make CONFIG_SND_X86 bool
Takashi Iwai [Tue, 21 Mar 2017 12:26:02 +0000 (13:26 +0100)]
ALSA: x86: Make CONFIG_SND_X86 bool

CONFIG_SND_X86 is a menu config to filter only for x86-specific
drivers in its sub-menu, and this doesn't have to be tristate but
rather it should be a bool.  Also, like other sub-menu configs, it's
more user-friendly to be default=y; it's merely a menu config and the
actual drivers are configured in the sub-menu, after all.

Fixes: 287599cf2d77 ("ALSA: add Intel HDMI LPE audio driver for BYT/CHT-T")
Signed-off-by: Takashi Iwai <[email protected]>
8 years agodrm/exynos/dsi: make te-gpios optional
Andrzej Hajda [Wed, 15 Mar 2017 11:20:42 +0000 (12:20 +0100)]
drm/exynos/dsi: make te-gpios optional

DSI forwards te-gpios interrupts to display controller, but if display
controller works in HW-TRIGGER mode this interrupt is not necessary.
Making te-gpios property optional allows to avoid generating spare
interrupts.
And also if panel device node of command mode panel device doesn't provide
te-gpios property then the panel driver failed to probe. This was a critial
issue.

With this patch we can not only get rid of 60 interrupt callbacks per second
but also fix the critial issues.

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos: Print kernel pointers in a restricted form
Krzysztof Kozlowski [Tue, 14 Mar 2017 18:38:04 +0000 (20:38 +0200)]
drm/exynos: Print kernel pointers in a restricted form

Printing raw kernel pointers might reveal information which sometimes we
try to hide (e.g. with Kernel Address Space Layout Randomization).  Use
the "%pK" format so these pointers will be hidden for unprivileged
users.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos/decon5433: fix software trigger mask
Andrzej Hajda [Tue, 14 Mar 2017 08:28:00 +0000 (09:28 +0100)]
drm/exynos/decon5433: fix software trigger mask

The patch fixes copy/paste bug introduced during code refactoring.

Reported-by: Dan Carpenter <[email protected]>
Fixes: b93c2e8b5d9d ("drm/exynos/decon5433: configure sysreg in case of hardware trigger")Fixes:
Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos/fimd: signal frame done interrupt at front porch
Andrzej Hajda [Tue, 14 Mar 2017 08:27:59 +0000 (09:27 +0100)]
drm/exynos/fimd: signal frame done interrupt at front porch

VBLANK interrupt should be signalled as soon as scanout ends, front porch
is the best moment.

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos/decon5433: signal frame done interrupt at front porch
Andrzej Hajda [Tue, 14 Mar 2017 08:27:58 +0000 (09:27 +0100)]
drm/exynos/decon5433: signal frame done interrupt at front porch

DECON in case of video mode generates interrupt by default at start
of vertical back porch. As this interrupt is used to generate VBLANK
events more optimal point is start of vertical front porch.

Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos/decon5433: fix vblank event handling
Andrzej Hajda [Tue, 14 Mar 2017 08:27:57 +0000 (09:27 +0100)]
drm/exynos/decon5433: fix vblank event handling

Current implementation of event handling assumes that vblank interrupt is
always called at the right time. It is not true, it can be delayed due to
various reasons. As a result different races can happen. The patch fixes
the issue by using hardware frame counter present in DECON to serialize
vblank and commit completion events.

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos: move crtc event handling to drivers callbacks
Andrzej Hajda [Tue, 14 Mar 2017 08:27:56 +0000 (09:27 +0100)]
drm/exynos: move crtc event handling to drivers callbacks

CRTC event is currently send with next vblank, or instantly in case crtc
is being disabled. This approach usually works, but in corner cases it can
result in premature event generation. Only device driver is able to verify
if the event can be sent. This patch is a first step in that direction - it
moves event handling to the drivers.

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos: Remove support for Exynos4415 (SoC not supported anymore)
Krzysztof Kozlowski [Sat, 11 Mar 2017 18:04:16 +0000 (20:04 +0200)]
drm/exynos: Remove support for Exynos4415 (SoC not supported anymore)

Support for Exynos4415 is going away because there are no internal nor
external users.

Since commit 46dcf0ff0de3 ("ARM: dts: exynos: Remove exynos4415.dtsi"),
the platform cannot be instantiated so remove also the drivers.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Kukjin Kim <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agodrm/exynos/decon5433: & vs | typo
Dan Carpenter [Tue, 14 Feb 2017 07:46:20 +0000 (10:46 +0300)]
drm/exynos/decon5433: & vs | typo

"&" was obviously intended instead of "|".  The original condition is
always true.

Fixes: b93c2e8b5d9d ("drm/exynos/decon5433: configure sysreg in case of hardware trigger")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
8 years agocpufreq: schedutil: Fix per-CPU structure initialization in sugov_start()
Rafael J. Wysocki [Sun, 19 Mar 2017 13:30:02 +0000 (14:30 +0100)]
cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start()

sugov_start() only initializes struct sugov_cpu per-CPU structures
for shared policies, but it should do that for single-CPU policies too.

That in particular makes the IO-wait boost mechanism work in the
cases when cpufreq policies correspond to individual CPUs.

Fixes: 21ca6d2c52f8 (cpufreq: schedutil: Add iowait boosting)
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: 4.9+ <[email protected]> # 4.9+
8 years agoremoteproc: qcom: fix QCOM_SMD dependencies
Arnd Bergmann [Mon, 13 Mar 2017 16:36:25 +0000 (17:36 +0100)]
remoteproc: qcom: fix QCOM_SMD dependencies

qcom_smd_register_edge() is provided by either QCOM_SMD or RPMSG_QCOM_SMD,
and if both of them are disabled, it does nothing.

The check for the PIL drivers however only checks for QCOM_SMD, so it breaks
with QCOM_SMD=n && RPMSG_QCOM_SMD=m:

drivers/remoteproc/built-in.o: In function `smd_subdev_remove':
qcom_wcnss_iris.c:(.text+0x231c): undefined reference to `qcom_smd_unregister_edge'
drivers/remoteproc/built-in.o: In function `smd_subdev_probe':
qcom_wcnss_iris.c:(.text+0x2344): undefined reference to `qcom_smd_register_edge'
drivers/remoteproc/built-in.o: In function `smd_subdev_probe':
qcom_q6v5_pil.c:(.text+0x3538): undefined reference to `qcom_smd_register_edge'
qcom_q6v5_pil.c:(.text+0x3538): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `qcom_smd_register_edge'

This clarifies the Kconfig dependency.

Fixes: 4b48921a8f74 ("remoteproc: qcom: Use common SMD edge handler")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
8 years agoath10k: fix incorrect wlan_mac_base in qca6174_regs
Ryan Hsu [Mon, 13 Mar 2017 22:49:03 +0000 (15:49 -0700)]
ath10k: fix incorrect wlan_mac_base in qca6174_regs

In the 'commit ebee76f7fa46 ("ath10k: allow setting coverage class")',
it inherits the design and the address offset from ath9k, but the address
is not applicable to QCA6174, which leads to a random crash while doing the
resume() operation, since the set_coverage_class.ops will be called from
ieee80211_reconfig() when resume() (if the wow is not configured).

Fix the incorrect address offset here to avoid the random crash.

Verified on QCA6174/hw3.0 with firmware WLAN.RM.4.4-00022-QCARMSWPZ-2.

kvalo: this also seems to fix a regression with firmware restart.

Fixes: ebee76f7fa46 ("ath10k: allow setting coverage class")
Cc: <[email protected]> # v4.10
Signed-off-by: Ryan Hsu <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
8 years agof2fs: combine nat_bits and free_nid_bitmap cache
Chao Yu [Wed, 8 Mar 2017 12:07:49 +0000 (20:07 +0800)]
f2fs: combine nat_bits and free_nid_bitmap cache

Both nat_bits cache and free_nid_bitmap cache provide same functionality
as a intermediate cache between free nid cache and disk, but with
different granularity of indicating free nid range, and different
persistence policy. nat_bits cache provides better persistence ability,
and free_nid_bitmap provides better granularity.

In this patch we combine advantage of both caches, so finally policy of
the intermediate cache would be:
- init: load free nid status from nat_bits into free_nid_bitmap
- lookup: scan free_nid_bitmap before load NAT blocks
- update: update free_nid_bitmap in real-time
- persistence: udpate and persist nat_bits in checkpoint

This patch also resolves performance regression reported by lkp-robot.

commit:
  4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
  d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
  1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")

4ac912427c4214d8 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     77863 Â±  0%      +2.1%      79485 Â±  1%     +50.8%     117404 Â±  0%  aim7.jobs-per-min
    231.63 Â±  0%      -2.0%     227.01 Â±  1%     -33.6%     153.80 Â±  0%  aim7.time.elapsed_time
    231.63 Â±  0%      -2.0%     227.01 Â±  1%     -33.6%     153.80 Â±  0%  aim7.time.elapsed_time.max
    896604 Â±  0%      -0.8%     889221 Â±  3%     -20.2%     715260 Â±  1%  aim7.time.involuntary_context_switches
      2394 Â±  1%      +4.6%       2503 Â±  1%      +3.7%       2481 Â±  2%  aim7.time.maximum_resident_set_size
      6240 Â±  0%      -1.5%       6145 Â±  1%     -14.1%       5360 Â±  1%  aim7.time.system_time
   1111357 Â±  3%      +1.9%    1132509 Â±  2%      -6.2%    1041932 Â±  2%  aim7.time.voluntary_context_switches
...

Signed-off-by: Chao Yu <[email protected]>
Tested-by: Xiaolong Ye <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
8 years agof2fs: skip scanning free nid bitmap of full NAT blocks
Chao Yu [Wed, 1 Mar 2017 09:09:07 +0000 (17:09 +0800)]
f2fs: skip scanning free nid bitmap of full NAT blocks

This patch adds to account free nids for each NAT blocks, and while
scanning all free nid bitmap, do check count and skip lookuping in
full NAT block.

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
8 years agof2fs: use __set{__clear}_bit_le
Jaegeuk Kim [Tue, 7 Mar 2017 22:11:06 +0000 (14:11 -0800)]
f2fs: use __set{__clear}_bit_le

This patch uses __set{__clear}_bit_le for highter speed.

Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
8 years agof2fs: declare static functions
Jaegeuk Kim [Fri, 10 Mar 2017 17:39:57 +0000 (09:39 -0800)]
f2fs: declare static functions

This is to avoid build warning reported by kbuild test robot.

Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
8 years agof2fs: don't overwrite node block by SSR
Jaegeuk Kim [Mon, 6 Mar 2017 19:59:56 +0000 (11:59 -0800)]
f2fs: don't overwrite node block by SSR

This patch fixes that SSR can overwrite previous warm node block consisting of
a node chain since the last checkpoint.

Fixes: 5b6c6be2d878 ("f2fs: use SSR for warm node as well")
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
8 years agogeneric syscalls: Wire up statx syscall
Stafford Horne [Mon, 13 Mar 2017 14:45:21 +0000 (23:45 +0900)]
generic syscalls: Wire up statx syscall

The new syscall statx is implemented as generic code, so enable it
for architectures like openrisc which use the generic syscall table.

Fixes: a528d35e8bfcc ("statx: Add a system call to make enhanced file info available")
Cc: Thomas Gleixner <[email protected]>
Cc: Al Viro <[email protected]>
Cc: David Howells <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Signed-off-by: Stafford Horne <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
8 years agoALSA: hda - add support for docking station for HP 840 G3
Jaroslav Kysela [Thu, 9 Mar 2017 12:30:09 +0000 (13:30 +0100)]
ALSA: hda - add support for docking station for HP 840 G3

This tested patch adds missing initialization for Line-In/Out PINs for
the docking station for HP 840 G3.

Signed-off-by: Jaroslav Kysela <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
8 years agoALSA: hda - add support for docking station for HP 820 G2
Jaroslav Kysela [Thu, 9 Mar 2017 12:29:13 +0000 (13:29 +0100)]
ALSA: hda - add support for docking station for HP 820 G2

This tested patch adds missing initialization for Line-In/Out PINs for
the docking station for HP 820 G2.

Signed-off-by: Jaroslav Kysela <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
8 years agoMerge tag 'gvt-fixes-2017-03-17' of https://github.com/01org/gvt-linux into drm-intel...
Jani Nikula [Mon, 20 Mar 2017 10:10:26 +0000 (12:10 +0200)]
Merge tag 'gvt-fixes-2017-03-17' of https://github.com/01org/gvt-linux into drm-intel-fixes

gvt-fixes-2017-03-17

- force_nonpriv reg handling in cmd parser (Yan)
- gvt error message cleanup (Tina)
- i915_wait_request fix from Chris
- KVM srcu warning fix (Changbin)
- ensure shadow ctx pinned (Chuanxiao)
- critical gvt scheduler interval time fix (Zhenyu)
- etc.

Signed-off-by: Jani Nikula <[email protected]>
8 years agoALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
Takashi Iwai [Mon, 20 Mar 2017 09:08:19 +0000 (10:08 +0100)]
ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call

In the commit [15c75b09f8d1: ALSA: ctxfi: Fallback DMA mask to 32bit],
I forgot to put "!" at dam_set_mask() call check in cthw20k1.c (while
cthw20k2.c is OK).  This patch fixes that obvious bug.

(As a side note: although the original commit was completely wrong,
 it's still working for most of machines, as it sets to 32bit DMA mask
 in the end.  So the bug severity is low.)

Fixes: 15c75b09f8d1 ("ALSA: ctxfi: Fallback DMA mask to 32bit")
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
8 years agoARM: sun8i: a23/a33: drop bl_en_pin GPIO pinmux in reference design DTSI
Icenowy Zheng [Fri, 17 Mar 2017 21:23:15 +0000 (05:23 +0800)]
ARM: sun8i: a23/a33: drop bl_en_pin GPIO pinmux in reference design DTSI

The bl_en_pin GPIO pinmux is configured as "gpio_in", which makes it
conflicts with the real GPIO usage (out), and makes the backlight not
usable.

Drop the GPIO pinmux for it, thus this GPIO can be correctly used.

Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
8 years agoARM: dts: sun7i: lamobo-r1: Fix CPU port RGMII settings
Florian Fainelli [Sun, 19 Mar 2017 04:53:20 +0000 (21:53 -0700)]
ARM: dts: sun7i: lamobo-r1: Fix CPU port RGMII settings

The CPU port of the BCM53125 is configured with RGMII (no delays) but
this should actually be RGMII with transmit delay (rgmii-txid) because
STMMAC takes care of inserting the transmitter delay. This fixes
occasional packet loss encountered.

Fixes: d7b9eaff5f0c ("ARM: dts: sun7i: Add BCM53125 switch nodes to the lamobo-r1 board")
Reported-by: Hartmut Knaack <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
8 years agoLinux 4.11-rc3 v4.11-rc3
Linus Torvalds [Mon, 20 Mar 2017 02:09:39 +0000 (19:09 -0700)]
Linux 4.11-rc3

8 years agomm/swap: don't BUG_ON() due to uninitialized swap slot cache
Linus Torvalds [Mon, 20 Mar 2017 02:00:47 +0000 (19:00 -0700)]
mm/swap: don't BUG_ON() due to uninitialized swap slot cache

This BUG_ON() triggered for me once at shutdown, and I don't see a
reason for the check.  The code correctly checks whether the swap slot
cache is usable or not, so an uninitialized swap slot cache is not
actually problematic afaik.

I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
I'm not sure why that seemingly pointless check was there.  I suspect
the real fix is to just remove it entirely, but for now we'll warn about
it but not bring the machine down.

Cc: "Huang, Ying" <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agoMerge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Mon, 20 Mar 2017 01:49:28 +0000 (18:49 -0700)]
Merge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc fixes from Michael Ellerman:
 "A couple of minor powerpc fixes for 4.11:

   - wire up statx() syscall

   - don't print a warning on memory hotplug when HPT resizing isn't
     available

  Thanks to: David Gibson, Chandan Rajendra"

* tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Don't give a warning when HPT resizing isn't available
  powerpc: Wire up statx() syscall

8 years agoMerge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Mon, 20 Mar 2017 01:11:13 +0000 (18:11 -0700)]
Merge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:

 - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
   modules with CONFIG_MODVERSIONS.

 - Dave Anglin optimized the cache flushing for vmap ranges.

 - Arvind Yadav provided a fix for a potential NULL pointer dereference
   in the parisc perf code (and some code cleanups).

 - I wired up the new statx system call, fixed some compiler warnings
   with the access_ok() macro and fixed shutdown code to really halt a
   system at shutdown instead of crashing & rebooting.

* 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix system shutdown halt
  parisc: perf: Fix potential NULL pointer dereference
  parisc: Avoid compiler warnings with access_ok()
  parisc: Wire up statx system call
  parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
  parisc: support R_PARISC_SECREL32 relocation in modules

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds [Mon, 20 Mar 2017 01:06:31 +0000 (18:06 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the changes are in qla2xxx target driver code to address
  various issues found during Cavium/QLogic's internal testing (stable
  CC's included), along with a few other stability and smaller
  miscellaneous improvements.

  There are also a couple of different patch sets from Mike Christie,
  which have been a result of his work to use target-core ALUA logic
  together with tcm-user backend driver.

  Finally, a patch to address some long standing issues with
  pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
  which will make folks using physical (or virtual) magnetic tape happy"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
  qla2xxx: Update driver version to 9.00.00.00-k
  qla2xxx: Fix delayed response to command for loop mode/direct connect.
  qla2xxx: Change scsi host lookup method.
  qla2xxx: Add DebugFS node to display Port Database
  qla2xxx: Use IOCB interface to submit non-critical MBX.
  qla2xxx: Add async new target notification
  qla2xxx: Export DIF stats via debugfs
  qla2xxx: Improve T10-DIF/PI handling in driver.
  qla2xxx: Allow relogin to proceed if remote login did not finish
  qla2xxx: Fix sess_lock & hardware_lock lock order problem.
  qla2xxx: Fix inadequate lock protection for ABTS.
  qla2xxx: Fix request queue corruption.
  qla2xxx: Fix memory leak for abts processing
  qla2xxx: Allow vref count to timeout on vport delete.
  tcmu: Convert cmd_time_out into backend device attribute
  tcmu: make cmd timeout configurable
  tcmu: add helper to check if dev was configured
  target: fix race during implicit transition work flushes
  target: allow userspace to set state to transitioning
  target: fix ALUA transition timeout handling
  ...

8 years agoMerge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Sun, 19 Mar 2017 22:45:02 +0000 (15:45 -0700)]
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull device-dax fixes from Dan Williams:
 "The device-dax driver was not being careful to handle falling back to
  smaller fault-granularity sizes.

  The driver already fails fault attempts that are smaller than the
  device's alignment, but it also needs to handle the cases where a
  larger page mapping could be established. For simplicity of the
  immediate fix the implementation just signals VM_FAULT_FALLBACK until
  fault-size == device-alignment.

  One fix is for -stable to address pmd-to-pte fallback from the
  original implementation, another fix is for the new (introduced in
  4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
  ride.

  These have received a build success notification from the kbuild
  robot"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fix debug output typo
  device-dax: fix pud fault fallback handling
  device-dax: fix pmd/pte fault fallback handling

8 years agoqla2xxx: Update driver version to 9.00.00.00-k
Himanshu Madhani [Wed, 15 Mar 2017 16:48:56 +0000 (09:48 -0700)]
qla2xxx: Update driver version to 9.00.00.00-k

Signed-off-by: Himanshu Madhani <[email protected]>
signed-off-by: Giridhar Malavali <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Fix delayed response to command for loop mode/direct connect.
Quinn Tran [Wed, 15 Mar 2017 16:48:55 +0000 (09:48 -0700)]
qla2xxx: Fix delayed response to command for loop mode/direct connect.

Current driver wait for FW to be in the ready state before
processing in-coming commands. For Arbitrated Loop or
Point-to- Point (not switch), FW Ready state can take a while.
FW will transition to ready state after all Nports have been
logged in. In the mean time, certain initiators have completed
the login and starts IO. Driver needs to start processing all
queues if FW is already started.

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Change scsi host lookup method.
Quinn Tran [Wed, 15 Mar 2017 16:48:54 +0000 (09:48 -0700)]
qla2xxx: Change scsi host lookup method.

For target mode, when new scsi command arrive, driver first performs
a look up of the SCSI Host. The current look up method is based on
the ALPA portion of the NPort ID. For Cisco switch, the ALPA can
not be used as the index. Instead, the new search method is based
on the full value of the Nport_ID via btree lib.

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Add DebugFS node to display Port Database
Himanshu Madhani [Wed, 15 Mar 2017 16:48:53 +0000 (09:48 -0700)]
qla2xxx: Add DebugFS node to display Port Database

Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Giridhar Malavali <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Use IOCB interface to submit non-critical MBX.
Quinn Tran [Wed, 15 Mar 2017 16:48:52 +0000 (09:48 -0700)]
qla2xxx: Use IOCB interface to submit non-critical MBX.

The Mailbox interface is currently over subscribed. We like
to reserve the Mailbox interface for the chip managment and
link initialization. Any non essential Mailbox command will
be routed through the IOCB interface. The IOCB interface is
able to absorb more commands.

Following commands are being routed through IOCB interface

- Get ID List (007Ch)
- Get Port DB (0064h)
- Get Link Priv Stats (006Dh)

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Add async new target notification
Quinn Tran [Wed, 15 Mar 2017 16:48:51 +0000 (09:48 -0700)]
qla2xxx: Add async new target notification

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Export DIF stats via debugfs
Anil Gurumurthy [Wed, 15 Mar 2017 16:48:50 +0000 (09:48 -0700)]
qla2xxx: Export DIF stats via debugfs

Signed-off-by: Anil Gurumurthy <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Improve T10-DIF/PI handling in driver.
Quinn Tran [Wed, 15 Mar 2017 16:48:49 +0000 (09:48 -0700)]
qla2xxx: Improve T10-DIF/PI handling in driver.

Add routines to support T10 DIF tag.

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Anil Gurumurthy <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Allow relogin to proceed if remote login did not finish
Quinn Tran [Wed, 15 Mar 2017 16:48:48 +0000 (09:48 -0700)]
qla2xxx: Allow relogin to proceed if remote login did not finish

If the remote port have started the login process, then the
PLOGI and PRLI should be back to back. Driver will allow
the remote port to complete the process. For the case where
the remote port decide to back off from sending PRLI, this
local port sets an expiration timer for the PRLI. Once the
expiration time passes, the relogin retry logic is allowed
to go through and perform login with the remote port.

Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Fix sess_lock & hardware_lock lock order problem.
Quinn Tran [Wed, 15 Mar 2017 16:48:47 +0000 (09:48 -0700)]
qla2xxx: Fix sess_lock & hardware_lock lock order problem.

The main lock that needs to be held for CMD or TMR submission
to upper layer is the sess_lock. The sess_lock is used to
serialize cmd submission and session deletion. The addition
of hardware_lock being held is not necessary. This patch removes
hardware_lock dependency from CMD/TMR submission.

Use hardware_lock only for error response in this case.

Path1
       CPU0                    CPU1
       ----                    ----
  lock(&(&ha->tgt.sess_lock)->rlock);
                               lock(&(&ha->hardware_lock)->rlock);
                               lock(&(&ha->tgt.sess_lock)->rlock);
  lock(&(&ha->hardware_lock)->rlock);

Path2/deadlock
*** DEADLOCK ***
Call Trace:
dump_stack+0x85/0xc2
print_circular_bug+0x1e3/0x250
__lock_acquire+0x1425/0x1620
lock_acquire+0xbf/0x210
_raw_spin_lock_irqsave+0x53/0x70
qlt_sess_work_fn+0x21d/0x480 [qla2xxx]
process_one_work+0x1f4/0x6e0

Cc: <[email protected]>
Cc: Bart Van Assche <[email protected]>
Reported-by: Bart Van Assche <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Fix inadequate lock protection for ABTS.
Quinn Tran [Wed, 15 Mar 2017 16:48:46 +0000 (09:48 -0700)]
qla2xxx: Fix inadequate lock protection for ABTS.

Normally, ABTS is sent to Target Core as Task MGMT command.
In the case of error, qla2xxx needs to send response, hardware_lock
is required to prevent request queue corruption.

Cc: <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Fix request queue corruption.
Quinn Tran [Wed, 15 Mar 2017 16:48:45 +0000 (09:48 -0700)]
qla2xxx: Fix request queue corruption.

When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.

Cc: <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Fix memory leak for abts processing
Quinn Tran [Wed, 15 Mar 2017 16:48:44 +0000 (09:48 -0700)]
qla2xxx: Fix memory leak for abts processing

Cc: <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoqla2xxx: Allow vref count to timeout on vport delete.
Joe Carnuccio [Wed, 15 Mar 2017 16:48:43 +0000 (09:48 -0700)]
qla2xxx: Allow vref count to timeout on vport delete.

Cc: <[email protected]>
Signed-off-by: Joe Carnuccio <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotcmu: Convert cmd_time_out into backend device attribute
Nicholas Bellinger [Sat, 18 Mar 2017 22:04:13 +0000 (15:04 -0700)]
tcmu: Convert cmd_time_out into backend device attribute

Instead of putting cmd_time_out under ../target/core/user_0/foo/control,
which has historically been used by parameters needed for initial
backend device configuration, go ahead and move cmd_time_out into
a backend device attribute.

In order to do this, tcmu_module_init() has been updated to create
a local struct configfs_attribute **tcmu_attrs, that is based upon
the existing passthrough_attrib_attrs along with the new cmd_time_out
attribute.  Once **tcm_attrs has been setup, go ahead and point
it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core.

Also following MNC's previous change, ->cmd_time_out is stored in
milliseconds but exposed via configfs in seconds.  Also, note this
patch restricts the modification of ->cmd_time_out to before +
after the TCMU device has been configured, but not while it has
active fabric exports.

Cc: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotcmu: make cmd timeout configurable
Mike Christie [Thu, 9 Mar 2017 08:42:09 +0000 (02:42 -0600)]
tcmu: make cmd timeout configurable

A single daemon could implement multiple types of devices
using multuple types of real devices that may not support
restarting from crashes and/or handling tcmu timeouts. This
makes the cmd timeout configurable, so handlers that do not
support it can turn if off for now.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotcmu: add helper to check if dev was configured
Mike Christie [Thu, 9 Mar 2017 08:42:08 +0000 (02:42 -0600)]
tcmu: add helper to check if dev was configured

This adds a helper to check if the dev was configured. It
will be used in the next patch to prevent updates to some
config settings after the device has been setup.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoMerge tag 'openrisc-for-linus' of git://github.com/openrisc/linux
Linus Torvalds [Sat, 18 Mar 2017 22:50:39 +0000 (15:50 -0700)]
Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixes from Stafford Horne:
 "OpenRISC fixes for build issues that were exposed by kbuild robots
  after 4.11 merge. All from allmodconfig builds. This includes:

   - bug in the handling of 8-byte get_user() calls

   - module build failure due to multile missing symbol exports"

* tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
  openrisc: Export symbols needed by modules
  openrisc: fix issue handling 8 byte get_user calls
  openrisc: xchg: fix `computed is not used` warning

8 years agotarget: fix race during implicit transition work flushes
Mike Christie [Thu, 2 Mar 2017 10:59:50 +0000 (04:59 -0600)]
target: fix race during implicit transition work flushes

This fixes the following races:

1. core_alua_do_transition_tg_pt could have read
tg_pt_gp_alua_access_state and gone into this if chunk:

if (!explicit &&
        atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) ==
           ALUA_ACCESS_STATE_TRANSITION) {

and then core_alua_do_transition_tg_pt_work could update the
state. core_alua_do_transition_tg_pt would then only set
tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would
not get updated with the second calls state.

2. core_alua_do_transition_tg_pt could be setting
tg_pt_gp_transition_complete while the tg_pt_gp_transition_work
is already completing. core_alua_do_transition_tg_pt then waits on the
completion that will never be called.

To handle these issues, we just call flush_work which will return when
core_alua_do_transition_tg_pt_work has completed so there is no need
to do the complete/wait. And, if core_alua_do_transition_tg_pt_work
was running, instead of trying to sneak in the state change, we just
schedule up another core_alua_do_transition_tg_pt_work call.

Note that this does not handle a possible race where there are multiple
threads call core_alua_do_transition_tg_pt at the same time. I think
we need a mutex in target_tg_pt_gp_alua_access_state_store.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: allow userspace to set state to transitioning
Mike Christie [Thu, 2 Mar 2017 10:59:49 +0000 (04:59 -0600)]
target: allow userspace to set state to transitioning

Userspace target_core_user handlers like tcmu-runner may want to set the
ALUA state to transitioning while it does implicit transitions. This
patch allows that state when set from configfs.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: fix ALUA transition timeout handling
Mike Christie [Thu, 2 Mar 2017 10:59:48 +0000 (04:59 -0600)]
target: fix ALUA transition timeout handling

The implicit transition time tells initiators the min time
to wait before timing out a transition. We currently schedule
the transition to occur in tg_pt_gp_implicit_trans_secs
seconds so there is no room for delays. If
core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata
needs to write out info to a remote file, then the initiator can
easily time out the operation.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: Use system workqueue for ALUA transitions
Mike Christie [Thu, 2 Mar 2017 05:13:26 +0000 (23:13 -0600)]
target: Use system workqueue for ALUA transitions

If tcmu-runner is processing a STPG and needs to change the kernel's
ALUA state then we cannot use the same work queue for task management
requests and ALUA transitions, because we could deadlock. The problem
occurs when a STPG times out before tcmu-runner is able to
call into target_tg_pt_gp_alua_access_state_store->
core_alua_do_port_transition -> core_alua_do_transition_tg_pt ->
queue_work. In this case, the tmr is on the work queue waiting for
the STPG to complete, but the STPG transition is now queued behind
the waiting tmr.

Note:
This bug will also be fixed by this patch:
http://www.spinics.net/lists/target-devel/msg14560.html
which switches the tmr code to use the system workqueues.

For both, I am not sure if we need a dedicated workqueue since
it is not a performance path and I do not think we need WQ_MEM_RECLAIM
to make forward progress to free up memory like the block layer does.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: fail ALUA transitions for pscsi
Mike Christie [Thu, 2 Mar 2017 05:13:25 +0000 (23:13 -0600)]
target: fail ALUA transitions for pscsi

We do not setup the LU group for pscsi devices, so if you write
a state to alua_access_state that will cause a transition you will
get a NULL pointer dereference.

This patch will fail attempts to try and transition the path
for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA
flag.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: allow ALUA setup for some passthrough backends
Mike Christie [Thu, 2 Mar 2017 05:13:24 +0000 (23:13 -0600)]
target: allow ALUA setup for some passthrough backends

This patch allows passthrough backends to use the core/base LIO
ALUA setup and state checks, but still handle the execution of
commands.

This will allow the target_core_user module to execute STPG and RTPG
in userspace, and not have to duplicate the ALUA state checks, path
information (needed so we can check if command is executable on
specific paths) and setup (rtslib sets/updates the configfs ALUA
interface like it does for iblock or file).

For STPG, the target_core_user userspace daemon, tcmu-runner will
still execute the STPG, and to update the core/base LIO state it
will use the existing configfs interface. For RTPG, tcmu-runner
will loop over configfs and/or cache the state.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotcmu: return on first Opt parse failure
Mike Christie [Thu, 2 Mar 2017 05:14:40 +0000 (23:14 -0600)]
tcmu: return on first Opt parse failure

We only were returing failure if the last opt to be parsed failed.
This has a return failure when we first detect a failure.

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotcmu: allow hw_max_sectors greater than 128
Mike Christie [Thu, 2 Mar 2017 05:14:39 +0000 (23:14 -0600)]
tcmu: allow hw_max_sectors greater than 128

tcmu hard codes the hw_max_sectors to 128 which is a litle small.
Userspace uses the max_sectors to report the optimal IO size and
some initiators perform better with larger IOs (open-iscsi seems
to do better with 256 to 512 depending on the test).

(Fix do not display hw max sectors twice - MNC)

Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agotarget: Drop pointless tfo->check_stop_free check
Nicholas Bellinger [Wed, 8 Mar 2017 08:09:59 +0000 (00:09 -0800)]
target: Drop pointless tfo->check_stop_free check

All in-tree fabric drivers provide a tfo->check_stop_free(),
so there is no need to do the extra check within existing
transport_cmd_check_stop_to_fabric() code.

Just to be sure, add a check in target_fabric_tf_ops_check()
to notify any out-of-tree drivers that might be missing it.

Signed-off-by: Nicholas Bellinger <[email protected]>
8 years agoparisc: Fix system shutdown halt
Helge Deller [Sat, 18 Mar 2017 16:13:27 +0000 (17:13 +0100)]
parisc: Fix system shutdown halt

On those parisc machines which don't provide a software power off
function, the system currently kills the init process at the end of a
shutdown and unexpectedly restarts insteads of halting.
Fix it by adding a loop which will not return.

Signed-off-by: Helge Deller <[email protected]>
Cc: [email protected] # 4.9+
8 years agoparisc: perf: Fix potential NULL pointer dereference
Arvind Yadav [Tue, 14 Mar 2017 09:54:51 +0000 (15:24 +0530)]
parisc: perf: Fix potential NULL pointer dereference

Fix potential NULL pointer dereference and clean up
coding style errors (code indent, trailing whitespaces).

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
8 years agoMerge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 18 Mar 2017 15:33:44 +0000 (08:33 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CPU hotplug fix from Thomas Gleixner:
 "A single fix preventing the concurrent execution of the CPU hotplug
  callback install/invocation machinery. Long standing bug caused by a
  massive brain slip of that Gleixner dude, which went unnoticed for
  almost a year"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Serialize callback invocations proper

8 years agoMerge tag 'pm-4.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Sat, 18 Mar 2017 00:25:14 +0000 (17:25 -0700)]
Merge tag 'pm-4.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a few more intel_pstate issues and one small issue in the
  cpufreq core.

  Specifics:

   - Fix breakage in the intel_pstate's debugfs interface for PID
     controller tuning (Rafael Wysocki)

   - Fix computations related to P-state limits in intel_pstate to avoid
     excessive rounding errors leading to visible inaccuracies (Srinivas
     Pandruvada, Rafael Wysocki)

   - Add a missing newline to a message printed by one function in the
     cpufreq core and clean up that function (Rafael Wysocki)"

* tag 'pm-4.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Fix and clean up show_cpuinfo_cur_freq()
  cpufreq: intel_pstate: Avoid percentages in limits-related computations
  cpufreq: intel_pstate: Correct frequency setting in the HWP mode
  cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()

8 years agocpufreq: intel_pstate: One set of global limits in active mode
Rafael J. Wysocki [Fri, 17 Mar 2017 23:57:39 +0000 (00:57 +0100)]
cpufreq: intel_pstate: One set of global limits in active mode

In the active mode intel_pstate currently uses two sets of global
limits, each associated with one of the possible scaling_governor
settings in that mode: "powersave" or "performance".

The driver switches over from one of those sets to the other
depending on the scaling_governor setting for the last CPU whose
per-policy cpufreq interface in sysfs was last used to change
parameters exposed in there.  That obviously leads to no end of
issues when the scaling_governor settings differ between CPUs.

The most recent issue was introduced by commit a240c4aa5d0f (cpufreq:
intel_pstate: Do not reinit performance limits in ->setpolicy)
that eliminated the reinitialization of "performance" limits in
intel_pstate_set_policy() preventing the max limit from being set
to anything below 100, among other things.

Namely, an undesirable side effect of commit a240c4aa5d0f is that
now, after setting scaling_governor to "performance" in the active
mode, the per-policy limits for the CPU in question go to the highest
level and stay there even when it is switched back to "powersave"
later.

As it turns out, some distributions set scaling_governor to
"performance" temporarily for all CPUs to speed-up system
initialization, so that change causes them to misbehave later.

To fix that, get rid of the performance/powersave global limits
split and use just one set of global limits for everything.

From the user's persepctive, after this modification, when
scaling_governor is switched from "performance" to "powersave"
or the other way around on one CPU, the limits settings (ie. the
global max/min_perf_pct and per-policy scaling_max/min_freq for
any CPUs) will not change.  Still, switching from "performance"
to "powersave" or the other way around changes the way in which
P-states are selected and in particular "performance" causes the
driver to always request the highest P-state it is allowed to ask
for for the given CPU.

Fixes: a240c4aa5d0f (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy)
Signed-off-by: Rafael J. Wysocki <[email protected]>
8 years agoMerge branches 'pm-cpufreq-fixes' and 'intel_pstate-fixes'
Rafael J. Wysocki [Fri, 17 Mar 2017 23:45:09 +0000 (00:45 +0100)]
Merge branches 'pm-cpufreq-fixes' and 'intel_pstate-fixes'

* pm-cpufreq-fixes:
  cpufreq: Fix and clean up show_cpuinfo_cur_freq()

* intel_pstate-fixes:
  cpufreq: intel_pstate: Avoid percentages in limits-related computations
  cpufreq: intel_pstate: Correct frequency setting in the HWP mode
  cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()

8 years agoInput: ALPS - fix trackstick button handling on V8 devices
Masaki Ota [Fri, 17 Mar 2017 21:19:40 +0000 (14:19 -0700)]
Input: ALPS - fix trackstick button handling on V8 devices

Alps stick devices always have physical buttons, so we should not check
ALPS_BUTTONPAD flag to decide whether we should report them.

Fixes: 4777ac220c43 ("Input: ALPS - add touchstick support for SS5 hardware")
Signed-off-by: Masaki Ota <[email protected]>
Acked-by: Pali Rohar <[email protected]>
Tested-by: Paul Donohue <[email protected]>
Tested-by: Nick Fletcher <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
8 years agoInput: ALPS - fix V8+ protocol handling (73 03 28)
Masaki Ota [Fri, 17 Mar 2017 21:10:57 +0000 (14:10 -0700)]
Input: ALPS - fix V8+ protocol handling (73 03 28)

Devices identified as E7="73 03 28" use slightly modified version of V8
protocol, with lower count per electrode, different offsets, and different
feature bits in OTP data.

Fixes: aeaa881f9b17 ("Input: ALPS - set DualPoint flag for 74 03 28 devices")
Signed-off-by: Masaki Ota <[email protected]>
Acked-by: Pali Rohar <[email protected]>
Tested-by: Paul Donohue <[email protected]>
Tested-by: Nick Fletcher <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
8 years agoMerge tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Fri, 17 Mar 2017 21:16:22 +0000 (14:16 -0700)]
Merge tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "We have a handful of stable fixes to fix kernel warnings and other
  bugs that have been around for a while. We've also found a few other
  reference counting bugs and memory leaks since the initial 4.11 pull.

  Stable Bugfixes:
   - Fix decrementing nrequests in NFS v4.2 COPY to fix kernel warnings
   - Prevent a double free in async nfs4_exchange_id()
   - Squelch a kbuild sparse complaint for xprtrdma

  Other Bugfixes:
   - Fix a typo (NFS_ATTR_FATTR_GROUP_NAME) that causes a memory leak
   - Fix a reference leak that causes kernel warnings
   - Make nfs4_cb_sv_ops static to fix a sparse warning
   - Respect a server's max size in CREATE_SESSION
   - Handle errors from nfs4_pnfs_ds_connect
   - Flexfiles layout shouldn't mark devices as unavailable"

* tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  pNFS/flexfiles: never nfs4_mark_deviceid_unavailable
  pNFS: return status from nfs4_pnfs_ds_connect
  NFSv4.1 respect server's max size in CREATE_SESSION
  NFS prevent double free in async nfs4_exchange_id
  nfs: make nfs4_cb_sv_ops static
  xprtrdma: Squelch kbuild sparse complaint
  NFS: fix the fault nrequests decreasing for nfs_inode COPY
  NFSv4: fix a reference leak caused WARNING messages
  nfs4: fix a typo of NFS_ATTR_FATTR_GROUP_NAME

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 17 Mar 2017 21:05:03 +0000 (14:05 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "An assorted pile of fixes along with some hardware enablement:

   - a fix for a KASAN / branch profiling related boot failure

   - some more fallout of the PUD rework

   - a fix for the Always Running Timer which is not initialized when
     the TSC frequency is known at boot time (via MSR/CPUID)

   - a resource leak fix for the RDT filesystem

   - another unwinder corner case fixup

   - removal of the warning for duplicate NMI handlers because there are
     legitimate cases where more than one handler can be registered at
     the last level

   - make a function static - found by sparse

   - a set of updates for the Intel MID platform which got delayed due
     to merge ordering constraints. It's hardware enablement for a non
     mainstream platform, so there is no risk"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mpx: Make unnecessarily global function static
  x86/intel_rdt: Put group node in rdtgroup_kn_unlock
  x86/unwind: Fix last frame check for aligned function stacks
  mm, x86: Fix native_pud_clear build error
  x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y
  x86/platform/intel-mid: Add power button support for Merrifield
  x86/platform/intel-mid: Use common power off sequence
  x86/platform: Remove warning message for duplicate NMI handlers
  x86/tsc: Fix ART for TSC_KNOWN_FREQ
  x86/platform/intel-mid: Correct MSI IRQ line for watchdog device

8 years agoMerge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 17 Mar 2017 21:01:40 +0000 (14:01 -0700)]
Merge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 acpi fixes from Thomas Gleixner:
 "This update deals with the fallout of the recent work to make
  cpuid/node mappings persistent.

  It turned out that the boot time ACPI based mapping tripped over ACPI
  inconsistencies and caused regressions. It's partially reverted and
  the fragile part replaced by an implementation which makes the mapping
  persistent when a CPU goes online for the first time"

* 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  acpi/processor: Check for duplicate processor ids at hotplug time
  acpi/processor: Implement DEVICE operator for processor enumeration
  x86/acpi: Restore the order of CPU IDs
  Revert"x86/acpi: Enable MADT APIs to return disabled apicids"
  Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"

8 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 17 Mar 2017 20:59:52 +0000 (13:59 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A set of perf related fixes:

   - fix a CR4.PCE propagation issue caused by usage of mm instead of
     active_mm and therefore propagated the wrong value.

   - perf core fixes, which plug a use-after-free issue and make the
     event inheritance on fork more robust.

   - a tooling fix for symbol handling"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf symbols: Fix symbols__fixup_end heuristic for corner cases
  x86/perf: Clarify why x86_pmu_event_mapped() isn't racy
  x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm
  perf/core: Better explain the inherit magic
  perf/core: Simplify perf_event_free_task()
  perf/core: Fix event inheritance on fork()
  perf/core: Fix use-after-free in perf_release()

8 years agobtrfs: add missing memset while reading compressed inline extents
Zygo Blaxell [Fri, 10 Mar 2017 21:45:44 +0000 (16:45 -0500)]
btrfs: add missing memset while reading compressed inline extents

This is a story about 4 distinct (and very old) btrfs bugs.

Commit c8b978188c ("Btrfs: Add zlib compression support") added
three data corruption bugs for inline extents (bugs #1-3).

Commit 93c82d5750 ("Btrfs: zero page past end of inline file items")
fixed bug #1:  uncompressed inline extents followed by a hole and more
extents could get non-zero data in the hole as they were read.  The fix
was to add a memset in btrfs_get_extent to zero out the hole.

Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption")
fixed bug #2:  compressed inline extents which contained non-zero bytes
might be replaced with zero bytes in some cases.  This patch removed an
unhelpful memset from uncompress_inline, but the case where memset is
required was missed.

There is also a memset in the decompression code, but this only covers
decompressed data that is shorter than the ram_bytes from the extent
ref record.  This memset doesn't cover the region between the end of the
decompressed data and the end of the page.  It has also moved around a
few times over the years, so there's no single patch to refer to.

This patch fixes bug #3:  compressed inline extents followed by a hole
and more extents could get non-zero data in the hole as they were read
(i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/).
The fix is the same:  zero out the hole in the compressed case too,
by putting a memset back in uncompress_inline, but this time with
correct parameters.

The last and oldest bug, bug #0, is the cause of the offending inline
extent/hole/extent pattern.  Bug #0 is a subtle and mostly-harmless quirk
of behavior somewhere in the btrfs write code.  In a few special cases,
an inline extent and hole are allowed to persist where they normally
would be combined with later extents in the file.

A fast reproducer for bug #0 is presented below.  A few offending extents
are also created in the wild during large rsync transfers with the -S
flag.  A Linux kernel build (git checkout; make allyesconfig; make -j8)
will produce a handful of offending files as well.  Once an offending
file is created, it can present different content to userspace each
time it is read.

Bug #0 is at least 4 and possibly 8 years old.  I verified every vX.Y
kernel back to v3.5 has this behavior.  There are fossil records of this
bug's effects in commits all the way back to v2.6.32.  I have no reason
to believe bug #0 wasn't present at the beginning of btrfs compression
support in v2.6.29, but I can't easily test kernels that old to be sure.

It is not clear whether bug #0 is worth fixing.  A fix would likely
require injecting extra reads into currently write-only paths, and most
of the exceptional cases caused by bug #0 are already handled now.

Whether we like them or not, bug #0's inline extents followed by holes
are part of the btrfs de-facto disk format now, and we need to be able
to read them without data corruption or an infoleak.  So enough about
bug #0, let's get back to bug #3 (this patch).

An example of on-disk structure leading to data corruption found in
the wild:

        item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160
                inode generation 50 transid 50 size 47424 nbytes 49141
                block group 0 mode 100644 links 1 uid 0 gid 0
                rdev 0 flags 0x0(none)
        item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20
                inode ref index 3 namelen 10 name: DB_File.so
        item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362
                inline extent data size 1341 ram 4085 compress(zlib)
        item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53
                extent data disk byte 5367308288 nr 20480
                extent data offset 0 nr 45056 ram 45056
                extent compression(zlib)

Different data appears in userspace during each read of the 11 bytes
between 4085 and 4096.  The extent in item 63 is not long enough to
fill the first page of the file, so a memset is required to fill the
space between item 63 (ending at 4085) and item 64 (beginning at 4096)
with zero.

Here is a reproducer from Liu Bo, which demonstrates another method
of creating the same inline extent and hole pattern:

Using 'page_poison=on' kernel command line (or enable
CONFIG_PAGE_POISONING) run the following:

# touch foo
# chattr +c foo
# xfs_io -f -c "pwrite -W 0 1000" foo
# xfs_io -f -c "falloc 4 8188" foo
# od -x foo
# echo 3 >/proc/sys/vm/drop_caches
# od -x foo

This produce the following on my box:

Correct output:  file contains 1000 data bytes followed
by zeros:

0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
*
0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000
0001760 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000

Actual output:  the data after the first 1000 bytes
will be different each run:

0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
*
0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d
0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400
0002000 435f 0056 5f74 6164 7400 645f 0062 5f74
(...)

Signed-off-by: Zygo Blaxell <[email protected]>
Reviewed-by: Liu Bo <[email protected]>
Reviewed-by: Chris Mason <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
This page took 0.133669 seconds and 4 git commands to generate.