Git Repo - linux.git/log

mm: numa: quickly fail allocations for NUMA balancing on full nodes

Commit 4167e9b2cf10 ("mm: remove GFP_THISNODE") removed the GFP_THISNODE
flag combination due to confusing semantics.  It noted that
alloc_misplaced_dst_page() was one such user after changes made by
commit e97ca8e5b864 ("mm: fix GFP_THISNODE callers and clarify").

Unfortunately when GFP_THISNODE was removed, users of
alloc_misplaced_dst_page() started waking kswapd and entering direct
reclaim because the wrong GFP flags are cleared.  The consequence is
that workloads that used to fit into memory now get reclaimed which is
addressed by this patch.

The problem can be demonstrated with "mutilate" that exercises memcached
which is software dedicated to memory object caching.  The configuration
uses 80% of memory and is run 3 times for varying numbers of clients.
The results on a 4-socket NUMA box are

mutilate
                            4.4.0                 4.4.0
                          vanilla           numaswap-v1
Hmean    1      8394.71 (  0.00%)     8395.32 (  0.01%)
Hmean    4     30024.62 (  0.00%)    34513.54 ( 14.95%)
Hmean    7     32821.08 (  0.00%)    70542.96 (114.93%)
Hmean    12    55229.67 (  0.00%)    93866.34 ( 69.96%)
Hmean    21    39438.96 (  0.00%)    85749.21 (117.42%)
Hmean    30    37796.10 (  0.00%)    50231.49 ( 32.90%)
Hmean    47    18070.91 (  0.00%)    38530.13 (113.22%)

The metric is queries/second with the more the better.  The results are
way outside of the noise and the reason for the improvement is obvious
from some of the vmstats

                                 4.4.0       4.4.0
                               vanillanumaswap-v1r1
Minor Faults                1929399272  2146148218
Major Faults                  19746529        3567
Swap Ins                      57307366        9913
Swap Outs                     50623229       17094
Allocation stalls                35909         443
DMA allocs                           0           0
DMA32 allocs                  72976349   170567396
Normal allocs               5306640898  5310651252
Movable allocs                       0           0
Direct pages scanned         404130893      799577
Kswapd pages scanned         160230174           0
Kswapd pages reclaimed        55928786           0
Direct pages reclaimed         1843936       41921
Page writes file                  2391           0
Page writes anon              50623229       17094

The vanilla kernel is swapping like crazy with large amounts of direct
reclaim and kswapd activity.  The figures are aggregate but it's known
that the bad activity is throughout the entire test.

Note that simple streaming anon/file memory consumers also see this
problem but it's not as obvious.  In those cases, kswapd is awake when
it should not be.

As there are at least two reclaim-related bugs out there, it's worth
spelling out the user-visible impact.  This patch only addresses bugs
related to excessive reclaim on NUMA hardware when the working set is
larger than a NUMA node.  There is a bug related to high kswapd CPU
usage but the reports are against laptops and other UMA hardware and is
not addressed by this patch.

Signed-off-by: Mel Gorman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]> [4.1+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED

pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
introduced to locklessy (but atomically) detect when a pmd is a regular
(stable) pmd or when the pmd is unstable and can infinitely transition
from pmd_none() and pmd_trans_huge() from under us, while only holding
the mmap_sem for reading (for writing not).

While holding the mmap_sem only for reading, MADV_DONTNEED can run from
under us and so before we can assume the pmd to be a regular stable pmd
we need to compare it against pmd_none() and pmd_trans_huge() in an
atomic way, with pmd_trans_unstable(). The old pmd_trans_huge() left a
tiny window for a race.

Useful applications are unlikely to notice the difference as doing
MADV_DONTNEED concurrently with a page fault would lead to undefined
behavior.

[[email protected]: tidy up comment grammar/layout]
Signed-off-by: Andrea Arcangeli <[email protected]>
Reported-by: Kirill A. Shutemov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

PCI: mvebu: Restrict build to 32-bit ARM

This driver uses PCI glue that is only available on 32-bit ARM. This used
to work fine as long as ARCH_MVEBU and ARCH_DOVE were exclusively 32-bit,
but there's a patch in the pipe to make ARCH_MVEBU also available on 64-bit
ARM.

[bhelgaas: changelog; patch is coming but not merged yet]
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Thomas Petazzoni <[email protected]>

Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"

991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") appeared in v4.3 and helps support IOAPIC hotplug.

Олег reported that the Elcus-1553 TA1-PCI driver worked in v4.2 but not
v4.3 and bisected it to 991de2e59090.  Sunjin reported that the RocketRAID
272x driver worked in v4.2 but not v4.3.  In both cases booting with
"pci=routirq" is a workaround.

I think the problem is that after 991de2e59090, we no longer call
pcibios_enable_irq() for upstream bridges.  Prior to 991de2e59090, when a
driver called pci_enable_device(), we recursively called
pcibios_enable_irq() for upstream bridges via pci_enable_bridge().

After 991de2e59090, we call pcibios_enable_irq() from pci_device_probe()
instead of the pci_enable_device() path, which does *not* call
pcibios_enable_irq() for upstream bridges.

Revert 991de2e59090 to fix these driver regressions.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211
Fixes: 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
Reported-and-tested-by: Олег Мороз <[email protected]>
Reported-by: Sunjin Yang <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
CC: Jiang Liu <[email protected]>

ipr: Fix regression when loading firmware

Commit d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite") removed
the end of line handling when storing the update_fw sysfs attribute.
This changed the userpace API because it started refusing writes
terminated by a line feed, which broke the update tools we already have.

This patch re-adds that handling, so both a write terminated by a line
feed or not can make it through with the update.

Fixes: d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite")
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Cc: Insu Yun <[email protected]>
Acked-by: Brian King <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

SCSI: Free resources when we return BLKPREP_INVALID

When called scsi_prep_fn return BLKPREP_INVALID, we should use the same
code with BLKPREP_KILL in scsi_prep_return.

Signed-off-by: Yiwen Jiang <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

x86/mpx: Fix off-by-one comparison with nr_registers

In the unlikely event that regno == nr_registers then we get an array
overrun on regoff because the invalid register check is currently
off-by-one. Fix this with a check that regno is >= nr_registers instead.

Detected with static analysis using CoverityScan.

Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information"
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

arm64: vmemmap: use virtual projection of linear region

Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual mapping.

However, since the size of the vmemmap area is proportional to the size of
the VA area, but it is populated relative to the physical space, we may
end up with the struct page array being mapped outside of the vmemmap
region. For instance, on my Seattle A0 box, I can see the following output
in the dmesg log.

vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)

We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can even reduce the size by half.

Cc: <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph fixes from Sage Weil:
"There are two small messenger bug fixes and a log spam regression fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: don't spam dmesg with stray reply warnings
  libceph: use the right footer size when skipping a message
  libceph: don't bail early from try_read() when skipping a message

Merge tag 'sound-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Things got calmed down for rc6, as it seems, and we have only a few
  HD-audio fixes at this time: a fix for Skylake codec probe errors, a
  fix for missing interrupt handling, and a few Dell and HP quirks"

* tag 'sound-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Loop interrupt handling until really cleared
  ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2
  ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225
  ALSA: hda - Fixing background noise on Dell Inspiron 3162
  ALSA: hda - Apply clock gate workaround to Skylake, too

Merge tag 'pm+acpi-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"These are two reverts of recent PCI-related ACPI core changes (one of
  which caused some systems to crash on boot and the other was a cleanup
  on top of it) and a devfreq fix for Tegra.

  Specifics:

   - Revert an ACPI core change related to IRQ management in PCI that
     introduced code relying on the use of kmalloc() which turned out to
     also run during early init when that's not available yet and caused
     some systems to crash on boot for this reason along with a cleanup
     on top of it (Rafael Wysocki).

   - Prevent devfreq from flooding the kernel log with useless messages
     on Tegra (which started to happen after some recent changes in the
     devfreq core) by fixing the driver to follow the documentation and
     the core's expectations in its ->target callback (Tomeu Vizoso)"

* tag 'pm+acpi-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI, PCI, irq: remove interrupt count restriction"
  Revert "ACPI / PCI: Simplify acpi_penalize_isa_irq()"
  PM / devfreq: tegra: Set freq in rate callback

Merge branches 'pm-devfreq' and 'acpi-pci'

* pm-devfreq:
  PM / devfreq: tegra: Set freq in rate callback

* acpi-pci:
  Revert "ACPI, PCI, irq: remove interrupt count restriction"
  Revert "ACPI / PCI: Simplify acpi_penalize_isa_irq()"

KVM: x86: fix root cause for missed hardware breakpoints

Commit 172b2386ed16 ("KVM: x86: fix missed hardware breakpoints",
2016-02-10) worked around a case where the debug registers are not loaded
correctly on preemption and on the first entry to KVM_RUN.

However, Xiao Guangrong pointed out that the root cause must be that
KVM_DEBUGREG_BP_ENABLED is not being set correctly. This can indeed
happen due to the lazy debug exit mechanism, which does not call
kvm_update_dr7. Fix it by replacing the existing loop (more or less
equivalent to kvm_update_dr0123) with calls to all the kvm_update_dr*
functions.

Cc: [email protected] # 4.1+
Fixes: 172b2386ed16a9143d9a456aae5ec87275c61489
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

fbcon: set a default value to blink interval

Since commit 27a4c827c34ac4256a190cc9d24607f953c1c459
fbcon: use the cursor blink interval provided by vt

two attempts have been made at fixing a possible hang caused by
cursor_timer_handler. That function registers a timer to be triggered at
"jiffies + fbcon_ops.cur_blink_jiffies".

A new case had been encountered during initialisation of clcd-pl11x:

    fbcon_fb_registered
    do_fbcon_takeover

    ->  do_register_con_driver
        fbcon_startup
    (A) add_cursor_timer (with cur_blink_jiffies = 0)

    ->  do_bind_con_driver
        visual_init
        fbcon_init
    (B) cur_blink_jiffies = msecs_to_jiffies(vc->vc_cur_blink_ms);

If we take an softirq anywhere between A and B (and we do),
cursor_timer_handler executes indefinitely.

Instead of patching all possible paths that lead to this case one at a
time, fix the issue at the source and initialise cur_blink_jiffies to
200ms when allocating fbcon_ops. This was its default value before
aforesaid commit. fbcon_cursor or fbcon_init will refine this value
downstream.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Cc: <[email protected]> # v4.2
Tested-by: Scot Doyle <[email protected]>
Signed-off-by: Tomi Valkeinen <[email protected]>

Merge branch 'stable-4.5' of git://git.infradead.org/users/pcmoore/selinux into for-linus

ALSA: hda - Loop interrupt handling until really cleared

Currently the interrupt handler of HD-audio driver assumes that no irq
update is needed while processing the irq.  But in reality, it has
been confirmed that the HW irq is issued even during the irq
handling.  Since we clear the irq status at the beginning, process the
interrupt, then exits from the handler, the lately issued interrupt is
left untouched without being properly processed.

This patch changes the interrupt handler code to loop over the
check-and-process.  The handler tries repeatedly as long as the IRQ
status are turned on, and either stream or CORB/RIRB is handled.

For checking the stream handling, snd_hdac_bus_handle_stream_irq()
returns a value indicating the stream indices bits.  Other than that,
the change is only in the irq handler itself.

Reported-by: Libin Yang <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"Another small bug reported to me by Chunyu Hu.

  When perf added a "reg" function to the function tracing event (not a
  tracepoint), it caused that event to be displayed as a tracepoint and
  could cause errors in tracepoint handling.  That was solved by adding
  a flag to ignore ftrace non-tracepoint events.  But that flag was
  missed when displaying events in available_events, which should only
  contain tracepoint events.

  This broke a documented way to enable all events with:

      cat available_events > set_event

  As the function non-tracepoint event would cause that to error out.
  The commit here fixes that by having the available_events file not
  list events that have the ignore flag set"

* tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix showing function event in available_events

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"KVM/ARM fixes:
   - Fix per-vcpu vgic bitmap allocation
   - Do not give copy random memory on MMIO read
   - Fix GICv3 APR register restore order

  KVM/x86 fixes:
   - Fix ubsan warning
   - Fix hardware breakpoints in a guest vs. preempt notifiers
   - Fix Hurd

  Generic:
   - use __GFP_NOWARN together with GFP_NOWAIT"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: MMU: fix ubsan index-out-of-range warning
  arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2
  KVM: async_pf: do not warn on page allocation failures
  KVM: x86: fix conversion of addresses to linear in 32-bit protected mode
  KVM: x86: fix missed hardware breakpoints
  arm/arm64: KVM: Feed initialized memory to MMIO accesses
  KVM: arm/arm64: vgic: Ensure bitmaps are long enough

Merge tag 'renesas-sh-drivers-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas

Pull SuperH driver fix from Simon Horman:
"Restore legacy clock domain on SuperH platforms"

* tag 'renesas-sh-drivers-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
drivers: sh: Restore legacy clock domain on SuperH platforms

Merge tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
- eeh: Fix partial hotplug criterion from Gavin Shan
- mm: Clear the invalid slot information correctly from Aneesh Kumar K.V

* tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash: Clear the invalid slot information correctly
powerpc/eeh: Fix partial hotplug criterion

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 bugfixes from Martin Schwidefsky:
"Two critical bug fixes for the signal handling"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/fpu: signals vs. floating point control register
s390/compat: correct restore of high gprs on signal return

Merge tag 'nfsd-4.5-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from Bruce Fields:
"One fix for a bug that could cause a NULL write past the end of a
  buffer in case of unusually long writes to some system interfaces used
  by mountd and other nfs support utilities"

* tag 'nfsd-4.5-1' of git://linux-nfs.org/~bfields/linux:
  sunrpc/cache: fix off-by-one in qword_get()

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"This is a bit larger than Id like, but I asked the Intel guys to pull
  in some Skylake fixes in the possibly vain hope that Skylake might be
  more functional now that I'm seeing production hardware shipping.

  For i915, it's mostly the same patch in a few places, making sure the
  hw doesn't turn off when we are programming it.

  Apart from that are two nouveau fixes, one for a module defer bug, and
  one for using nouveau on new Lenovo P50 models.

  Then there are a bunch of AMDGPU fixes, one is a fix for v4.4 vblank
  regressions, and some PM fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (26 commits)
  drm/nouveau/disp/dp: ensure sink is powered up before attempting link training
  drm/nouveau: platform: Fix deferred probe
  drm/amdgpu: disable direct VM updates when vm_debug is set
  amdgpu: fix NULL pointer dereference at tonga_check_states_equal
  drm/i915/gen9: Verify and enforce dc6 state writes
  drm/i915/gen9: Check for DC state mismatch
  drm/radeon/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: add some checks for PX
  drm/amdgpu: fix locking in force performance level
  drm/amdgpu/gfx8: fix priv reg interrupt enable
  drm/i915/skl: Ensure HW is powered during DDB HW state readout
  drm/i915/lvds: Ensure the HW is powered during HW state readout
  drm/i915/hdmi: Ensure the HW is powered during HW state readout
  drm/i915/dsi: Ensure the HW is powered during HW state readout
  drm/i915/dp: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered when accessing the CRC HW block
  drm/i915/ddi: Ensure the HW is powered during HW state readout
  drm/i915/crt: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered during HW access in assert_pipe
  ...

Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:

- Two fixes for compatibility with the ACPI 6.1 specification.

   Without these fixes multi-interface DIMMs will fail to be probed, and
   address range scrub commands to find memory errors will give results
   that the kernel will mis-interpret.  For multi-interface DIMMs Linux
   will accept either the original 6.0 implementation or 6.1.

   For address range scrub we'll only support 6.1 since ACPI formalized
   this DSM differently than the original example [1] implemented in
   v4.2.  The expectation is that production systems will only ever ship
   the ACPI 6.1 address range scrub command definition.

- The wider async address range scrub work targeting 4.6 discovered
   that the original synchronous implementation in 4.5 is not sizing its
   return buffer correctly.

- Arnd caught that my recent fix to the size of the pfn_t flags missed
   updating the flags variable used in the pmem driver.

- Toshi found that we mishandle the memremap() return value in
   devm_memremap().

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: use 'u64' for pfn flags
  devm_memremap: Fix error value when memremap failed
  nfit: update address range scrub commands to the acpi 6.1 format
  libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
  nfit: fix multi-interface dimm handling, acpi6.1 compatibility

ASoC: cs4271: add regulator consumer support

The cs4271 has three power domains: vd, vl and va.
Enable them all, as long as the codec is in use.

While at it, factored out the reset code into its own function.

Signed-off-by: Pascal Huerst <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

Merge tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:
"Add a regression fix for changed sysfs path of bq27xxx_battery and
  update MAINTAINERS file"

* tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: bq27xxx_battery: Restore device name
  MAINTAINERS: update bq27xxx driver

regulator: max77620: Remove duplicate module alias

The same alias is already in .id_table.

Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

regulator: max77620: Eliminate duplicate code

Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

regulator: max77620: Remove unused fields

These fields are never used and not required at all, remove them.

Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

libata: Align ata_device's id on a cacheline

The id buffer in ata_device is a DMA target, but it isn't explicitly
cacheline aligned. Due to this, adjacent fields can be overwritten with
stale data from memory on non coherent architectures. As a result, the
kernel is sometimes unable to communicate with an ATA device.

Fix this by ensuring that the id buffer is cacheline aligned.

This issue is similar to that fixed by Commit 84bda12af31f
("libata: align ap->sector_buf").

Signed-off-by: Harvey Hunt <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # 2.6.18
Signed-off-by: Tejun Heo <[email protected]>

MAINTAINERS: add maintainer entry for FREESCALE GPMI NAND driver

Add a maintainer entry for FREESCALE GPMI NAND driver and add myself as a
maintainer.

Signed-off-by: Han Xu <[email protected]>
Acked-by: Huang Shijie <[email protected]>
Signed-off-by: Brian Norris <[email protected]>

x86/mm: Fix slow_virt_to_phys() for X86_PAE again

"d1cd12108346: x86, pageattr: Prevent overflow in slow_virt_to_phys() for
X86_PAE" was unintentionally removed by the recent "34437e67a672: x86/mm: Fix
slow_virt_to_phys() to handle large PAT bit".

And, the variable 'phys_addr' was defined as "unsigned long" by mistake -- it should
be "phys_addr_t".

As a result, Hyper-V network driver in 32-PAE Linux guest can't work again.

Fixes: commit 34437e67a672: "x86/mm: Fix slow_virt_to_phys() to handle large PAT bit"
Signed-off-by: Dexuan Cui <[email protected]>
Reviewed-by: Toshi Kani <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

iommu/amd: Apply workaround for ATS write permission check

The AMD Family 15h Models 30h-3Fh (Kaveri) BIOS and Kernel Developer's
Guide omitted part of the BIOS IOMMU L2 register setup specification.
Without this setup the IOMMU L2 does not fully respect write permissions
when handling an ATS translation request.

The IOMMU L2 will set PTE dirty bit when handling an ATS translation with
write permission request, even when PTE RW bit is clear. This may occur by
direct translation (which would cause a PPR) or by prefetch request from
the ATC.

This is observed in practice when the IOMMU L2 modifies a PTE which maps a
pagecache page. The ext4 filesystem driver BUGs when asked to writeback
these (non-modified) pages.

Enable ATS write permission check in the Kaveri IOMMU L2 if BIOS has not.

Signed-off-by: Jay Cornwall <[email protected]>
Cc: <[email protected]> # v3.19+
Signed-off-by: Joerg Roedel <[email protected]>

iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered

The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.

Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.

Cc: [email protected]
Signed-off-by: Suravee Suthikulpanit <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>

gpio: rcar: Add Runtime PM handling for interrupts

The R-Car GPIO driver handles Runtime PM for requested GPIOs only.

When using a GPIO purely as an interrupt source, no Runtime PM handling
is done, and the GPIO module's clock may not be enabled.

To fix this:
  - Add .irq_request_resources() and .irq_release_resources() callbacks
    to handle Runtime PM when an interrupt is requested,
  - Add irq_bus_lock() and sync_unlock() callbacks to handle Runtime PM
    when e.g. disabling/enabling an interrupt, or configuring the
    interrupt type.

Fixes: d5c3d84657db57bd "net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS"
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2

HP EliteBook 755 G2 with ALC3228 (ALC280) codec [103c:221c] requires
the known fixup (ALC269_FIXUP_HEADSET_MIC) for making the headset mic
working. Also, it suffers from the loopback noise problem, so we
should disable aamix path as well.

Reported-by: Derick Eddington <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

Fix directory hardlinks from deleted directories

When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.

This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.

To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.

Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.

Reported-by: Liu Song <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Cc: [email protected]

jffs2: Fix page lock / f->sem deadlock

With this fix, all code paths should now be obtaining the page lock before
f->sem.

Reported-by: Szabó Tamás <[email protected]>
Tested-by: Thomas Betker <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Cc: [email protected]

Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"

This reverts commit 5ffd3412ae55
("jffs2: Fix lock acquisition order bug in jffs2_write_begin").

The commit modified jffs2_write_begin() to remove a deadlock with
jffs2_garbage_collect_live(), but this introduced new deadlocks found
by multiple users. page_lock() actually has to be called before
mutex_lock(&c->alloc_sem) or mutex_lock(&f->sem) because
jffs2_write_end() and jffs2_readpage() are called with the page locked,
and they acquire c->alloc_sem and f->sem, resp.

In other words, the lock order in jffs2_write_begin() was correct, and
it is the jffs2_garbage_collect_live() path that has to be changed.

Revert the commit to get rid of the new deadlocks, and to clear the way
for a better fix of the original deadlock.

Reported-by: Deng Chao <[email protected]>
Reported-by: Ming Liu <[email protected]>
Reported-by: wangzaiwei <[email protected]>
Signed-off-by: Thomas Betker <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Cc: [email protected]

ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225

On one of the machines we enable, we found that the actual speaker volume
did not always correspond to the volume set in alsamixer. This patch
fixes that problem.

This patch was orginally written by Kailang @ Realtek, I've rebased it
to fit sound git master.

Cc: [email protected]
BugLink: https://bugs.launchpad.net/bugs/1549660
Co-Authored-By: Kailang <[email protected]>
Signed-off-by: David Henningsson <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'kvm-arm-for-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for 4.5-rc6

- Fix per-vcpu vgic bitmap allocation
- Do not give copy random memory on MMIO read
- Fix GICv3 APR register restore order

KVM: x86: MMU: fix ubsan index-out-of-range warning

Ubsan reports the following warning due to a typo in
update_accessed_dirty_bits template, the patch fixes
the typo:

[  168.791851] ================================================================================
[  168.791862] UBSAN: Undefined behaviour in arch/x86/kvm/paging_tmpl.h:252:15
[  168.791866] index 4 is out of range for type 'u64 [4]'
[  168.791871] CPU: 0 PID: 2950 Comm: qemu-system-x86 Tainted: G           O L  4.5.0-rc5-next-20160222 #7
[  168.791873] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013
[  168.791876]  0000000000000000 ffff8801cfcaf208 ffffffff81c9f780 0000000041b58ab3
[  168.791882]  ffffffff82eb2cc1 ffffffff81c9f6b4 ffff8801cfcaf230 ffff8801cfcaf1e0
[  168.791886]  0000000000000004 0000000000000001 0000000000000000 ffffffffa1981600
[  168.791891] Call Trace:
[  168.791899]  [<ffffffff81c9f780>] dump_stack+0xcc/0x12c
[  168.791904]  [<ffffffff81c9f6b4>] ? _atomic_dec_and_lock+0xc4/0xc4
[  168.791910]  [<ffffffff81da9e81>] ubsan_epilogue+0xd/0x8a
[  168.791914]  [<ffffffff81daafa2>] __ubsan_handle_out_of_bounds+0x15c/0x1a3
[  168.791918]  [<ffffffff81daae46>] ? __ubsan_handle_shift_out_of_bounds+0x2bd/0x2bd
[  168.791922]  [<ffffffff811287ef>] ? get_user_pages_fast+0x2bf/0x360
[  168.791954]  [<ffffffffa1794050>] ? kvm_largepages_enabled+0x30/0x30 [kvm]
[  168.791958]  [<ffffffff81128530>] ? __get_user_pages_fast+0x360/0x360
[  168.791987]  [<ffffffffa181b818>] paging64_walk_addr_generic+0x1b28/0x2600 [kvm]
[  168.792014]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
[  168.792019]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
[  168.792044]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
[  168.792076]  [<ffffffffa181c36d>] paging64_gva_to_gpa+0x7d/0x110 [kvm]
[  168.792121]  [<ffffffffa181c2f0>] ? paging64_walk_addr_generic+0x2600/0x2600 [kvm]
[  168.792130]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792178]  [<ffffffffa17d9a4a>] emulator_read_write_onepage+0x27a/0x1150 [kvm]
[  168.792208]  [<ffffffffa1794d44>] ? __kvm_read_guest_page+0x54/0x70 [kvm]
[  168.792234]  [<ffffffffa17d97d0>] ? kvm_task_switch+0x160/0x160 [kvm]
[  168.792238]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792263]  [<ffffffffa17daa07>] emulator_read_write+0xe7/0x6d0 [kvm]
[  168.792290]  [<ffffffffa183b620>] ? em_cr_write+0x230/0x230 [kvm]
[  168.792314]  [<ffffffffa17db005>] emulator_write_emulated+0x15/0x20 [kvm]
[  168.792340]  [<ffffffffa18465f8>] segmented_write+0xf8/0x130 [kvm]
[  168.792367]  [<ffffffffa1846500>] ? em_lgdt+0x20/0x20 [kvm]
[  168.792374]  [<ffffffffa14db512>] ? vmx_read_guest_seg_ar+0x42/0x1e0 [kvm_intel]
[  168.792400]  [<ffffffffa1846d82>] writeback+0x3f2/0x700 [kvm]
[  168.792424]  [<ffffffffa1846990>] ? em_sidt+0xa0/0xa0 [kvm]
[  168.792449]  [<ffffffffa185554d>] ? x86_decode_insn+0x1b3d/0x4f70 [kvm]
[  168.792474]  [<ffffffffa1859032>] x86_emulate_insn+0x572/0x3010 [kvm]
[  168.792499]  [<ffffffffa17e71dd>] x86_emulate_instruction+0x3bd/0x2110 [kvm]
[  168.792524]  [<ffffffffa17e6e20>] ? reexecute_instruction.part.110+0x2e0/0x2e0 [kvm]
[  168.792532]  [<ffffffffa14e9a81>] handle_ept_misconfig+0x61/0x460 [kvm_intel]
[  168.792539]  [<ffffffffa14e9a20>] ? handle_pause+0x450/0x450 [kvm_intel]
[  168.792546]  [<ffffffffa15130ea>] vmx_handle_exit+0xd6a/0x1ad0 [kvm_intel]
[  168.792572]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
[  168.792597]  [<ffffffffa17f6bcd>] kvm_arch_vcpu_ioctl_run+0xd3d/0x6090 [kvm]
[  168.792621]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
[  168.792627]  [<ffffffff8293b530>] ? __ww_mutex_lock_interruptible+0x1630/0x1630
[  168.792651]  [<ffffffffa17f5e90>] ? kvm_arch_vcpu_runnable+0x4f0/0x4f0 [kvm]
[  168.792656]  [<ffffffff811eeb30>] ? preempt_notifier_unregister+0x190/0x190
[  168.792681]  [<ffffffffa17e0447>] ? kvm_arch_vcpu_load+0x127/0x650 [kvm]
[  168.792704]  [<ffffffffa178e9a3>] kvm_vcpu_ioctl+0x553/0xda0 [kvm]
[  168.792727]  [<ffffffffa178e450>] ? vcpu_put+0x40/0x40 [kvm]
[  168.792732]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
[  168.792735]  [<ffffffff82946087>] ? _raw_spin_unlock+0x27/0x40
[  168.792740]  [<ffffffff8163a943>] ? handle_mm_fault+0x1673/0x2e40
[  168.792744]  [<ffffffff8129daa8>] ? trace_hardirqs_on_caller+0x478/0x6c0
[  168.792747]  [<ffffffff8129dcfd>] ? trace_hardirqs_on+0xd/0x10
[  168.792751]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792756]  [<ffffffff81725a80>] do_vfs_ioctl+0x1b0/0x12b0
[  168.792759]  [<ffffffff817258d0>] ? ioctl_preallocate+0x210/0x210
[  168.792763]  [<ffffffff8174aef3>] ? __fget+0x273/0x4a0
[  168.792766]  [<ffffffff8174acd0>] ? __fget+0x50/0x4a0
[  168.792770]  [<ffffffff8174b1f6>] ? __fget_light+0x96/0x2b0
[  168.792773]  [<ffffffff81726bf9>] SyS_ioctl+0x79/0x90
[  168.792777]  [<ffffffff82946880>] entry_SYSCALL_64_fastpath+0x23/0xc1
[  168.792780] ================================================================================

Signed-off-by: Mike Krinkin <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

ALSA: hda - Fixing background noise on Dell Inspiron 3162

After login to the desktop on Dell Inspiron 3162,
there's a very loud background noise comes from the builtin speaker.
The noise does not go away even if the speaker is muted.

The noise disappears after using the aamix fixup.

Codec: Realtek ALC3234
Address: 0
AFG Function Id: 0x1 (unsol 1)
    Vendor Id: 0x10ec0255
    Subsystem Id: 0x10280725
    Revision Id: 0x100002
    No Modem Function Group found

BugLink: http://bugs.launchpad.net/bugs/1549620
Signed-off-by: Kai-Heng Feng <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

perf: Robustify task_function_call()

Since there is no serialization between task_function_call() doing
task_curr() and the other CPU doing context switches, we could end
up not sending an IPI even if we had to.

And I'm not sure I still buy my own argument we're OK.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix scaling vs. perf_install_in_context()

Completely reworks perf_install_in_context() (again!) in order to
ensure that there will be no ctx time hole between add_event_to_ctx()
and any potential ctx_sched_in().

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix scaling vs. perf_event_enable()

Similar to the perf_enable_on_exec(), ensure that event timings are
consistent across perf_event_enable().

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix scaling vs. perf_event_enable_on_exec()

The recent commit 3e349507d12d ("perf: Fix perf_enable_on_exec() event
scheduling") caused this by moving task_ctx_sched_out() from before
__perf_event_mask_enable() to after it.

The overlooked consequence of that change is that task_ctx_sched_out()
would update the ctx time fields, and now __perf_event_mask_enable()
uses stale time.

In order to fix this, explicitly stop our context's time before
enabling the event(s).

Reported-by: Oleg Nesterov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix ctx time tracking by introducing EVENT_TIME

Currently any ctx_sched_in() call will re-start the ctx time tracking,
this means that calls like:

ctx_sched_in(.event_type = EVENT_PINNED);
ctx_sched_in(.event_type = EVENT_FLEXIBLE);

will have a hole in their ctx time tracking. This is likely harmless
but can confuse things a little. By adding EVENT_TIME, we can have the
first ctx_sched_in() (is_active: 0 -> !0) start the time and any
further ctx_sched_in() will leave the timestamps alone.

Secondly, this allows for an early disable like:

ctx_sched_out(.event_type = EVENT_TIME);

which would update the ctx time (if the ctx is active) and any further
calls to ctx_sched_out() would not further modify the ctx time.

For ctx_sched_in() any 0 -> !0 transition will automatically include
EVENT_TIME.

For ctx_sched_out(), any transition that clears EVENT_ALL will
automatically clear EVENT_TIME.

These two rules ensure that under normal circumstances we need not
bother with EVENT_TIME and get natural ctx time behaviour.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Cure event->pending_disable race

Because event_sched_out() checks event->pending_disable _before_
actually disabling the event, it can happen that the event fires after
it checks but before it gets disabled.

This would leave event->pending_disable set and the queued irq_work
will try and process it.

However, if the event trigger was during schedule(), the event might
have been de-scheduled by the time the irq_work runs, and
perf_event_disable_local() will fail.

Fix this by checking event->pending_disable _after_ we call
event->pmu->del(). This depends on the latter being a compiler
barrier, such that the compiler does not lift the load and re-creates
the problem.

Tested-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix race between event install and jump_labels

perf_install_in_context() relies upon the context switch hooks to have
scheduled in events when the IPI misses its target -- after all, if
the task has moved from the CPU (or wasn't running at all), it will
have to context switch to run elsewhere.

This however doesn't appear to be happening.

It is possible for the IPI to not happen (task wasn't running) only to
later observe the task running with an inactive context.

The only possible explanation is that the context switch hooks are not
called. Therefore put in a sync_sched() after toggling the jump_label
to guarantee all CPUs will have them enabled before we install an
event.

A simple if (0->1) sync_sched() will not in fact work, because any
further increment can race and complete before the sync_sched().
Therefore we must jump through some hoops.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Fix cloning

Alexander reported that when the 'original' context gets destroyed, no
new clones happen.

This can happen irrespective of the ctx switch optimization, any task
can die, even the parent, and we want to continue monitoring the task
hierarchy until we either close the event or no tasks are left in the
hierarchy.

perf_event_init_context() will attempt to pin the 'parent' context
during clone(). At that point current is the parent, and since current
cannot have exited while executing clone(), its context cannot have
passed through perf_event_exit_task_context(). Therefore
perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE.

However, since inherit_event() does:

if (parent_event->parent)
parent_event = parent_event->parent;

it looks at the 'original' event when it does: is_orphaned_event().
This can return true if the context that contains the this event has
passed through perf_event_exit_task_context(). And thus we'll fail to
clone the perf context.

Fix this by adding a new state: STATE_DEAD, which is set by
perf_release() to indicate that the filedesc (or kernel reference) is
dead and there are no observers for our data left.

Only for STATE_DEAD will is_orphaned_event() be true and inhibit
cloning.

STATE_EXIT is otherwise preserved such that is_event_hup() remains
functional and will report when the observed task hierarchy becomes
empty.

Reported-by: Alexander Shishkin <[email protected]>
Tested-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Only update context time when active

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Allow perf_release() with !event->ctx

In the err_file: fput(event_file) case, the event will not yet have
been attached to a context. However perf_release() does assume it has
been. Cure this.

Tested-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Do not double free

In case of: err_file: fput(event_file), we'll end up calling
perf_release() which in turn will free the event.

Do not then free the event _again_.

Tested-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf: Close install vs. exit race

Consider the following scenario:

  CPU0 CPU1

  ctx = find_get_ctx();
perf_event_exit_task_context()
  mutex_lock(&ctx->mutex);
  perf_install_in_context(ctx, ...);
    /* NO-OP */
  mutex_unlock(&ctx->mutex);

  ...

  perf_release()
    WARN_ON_ONCE(event->state != STATE_EXIT);

Since the event doesn't pass through perf_remove_from_context()
because perf_install_in_context() NO-OPs because the ctx is dead, and
perf_event_exit_task_context() will not observe the event because its
not attached yet, the event->state will not be set.

Solve this by revalidating ctx->task after we acquire ctx->mutex and
failing the event creation as a whole.

Tested-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

x86/entry/compat: Add missing CLAC to entry_INT80_32

This doesn't seem to fix a regression -- I don't think the CLAC was
ever there.

I double-checked in a debugger: entries through the int80 gate do
not automatically clear AC.

Stable maintainers: I can provide a backport to 4.3 and earlier if
needed. This needs to be backported all the way to 3.10.

Reported-by: Brian Gerst <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: <[email protected]> # v3.10 and later
Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access")
Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>

Merge branch 'linux-4.5' of git://github.com/skeggsb/linux into drm-fixes

single for for eDP panel issues on Lenovo P50
* 'linux-4.5' of git://github.com/skeggsb/linux:
drm/nouveau/disp/dp: ensure sink is powered up before attempting link training

drm/nouveau/disp/dp: ensure sink is powered up before attempting link training

This can happen under some annoying circumstances, and is a quick fix
until more substantial changes can be made.

Fixed eDP mode changes on (at least) the Lenovo P50.

Signed-off-by: Ben Skeggs <[email protected]>
Cc: [email protected]

drm/nouveau: platform: Fix deferred probe

The error cleanup paths aren't quite correct and will crash upon
deferred probe.

Cc: [email protected] # v4.3+
Reviewed-by: Ben Skeggs <[email protected]>
Reviewed-by: Alexandre Courbot <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

regulator: core: fix crash in error path of regulator_register

This problem was introduced by:
commit daad134d6649 ("regulator: core: Request GPIO before creating
sysfs entries")

The error path was not updated correctly after moving GPIO registration
code and in case regulator_ena_gpio_free failed, device_unregister() was
called even though device_register() was not yet called.

This problem breaks the boot at least on all Tegra 32-bit devices. It
will also crash each device that specifices GPIO that is unavaiable at
regulator_register call. Here's error log I've got when forced GPIO to
be invalid:

[    1.116612] usb-otg-vbus-reg: Failed to request enable GPIO10: -22
[    1.122794] Unable to handle kernel NULL pointer dereference at
virtual address 00000044
[    1.130894] pgd = c0004000
[    1.133598] [00000044] *pgd=00000000
[    1.137205] Internal error: Oops: 5 [#1] SMP ARM

and here's backtrace from KDB:

Exception stack(0xef11fbd0 to 0xef11fc18)
fbc0:                                     00000000 c0738a14 00000000 00000000
fbe0: c0b2a0b0 00000000 00000000 c0738a14 c0b5fdf8 00000001 ef7f6074 ef11fc4c
fc00: ef11fc50 ef11fc20 c02a8344 c02a7f1c 60000013 ffffffff
[<c010cee0>] (__dabt_svc) from [<c02a7f1c>] (kernfs_find_ns+0x18/0xf8)
[<c02a7f1c>] (kernfs_find_ns) from [<c02a8344>] (kernfs_find_and_get_ns+0x40/0x58)
[<c02a8344>] (kernfs_find_and_get_ns) from [<c02ac4a4>] (sysfs_unmerge_group+0x28/0x68)
[<c02ac4a4>] (sysfs_unmerge_group) from [<c044389c>] (dpm_sysfs_remove+0x30/0x5c)
[<c044389c>] (dpm_sysfs_remove) from [<c0436ba8>] (device_del+0x48/0x1f4)
[<c0436ba8>] (device_del) from [<c0436d84>] (device_unregister+0x30/0x6c)
[<c0436d84>] (device_unregister) from [<c0403910>] (regulator_register+0x6d0/0xdac)
[<c0403910>] (regulator_register) from [<c04052d4>] (devm_regulator_register+0x50/0x84)
[<c04052d4>] (devm_regulator_register) from [<c0406298>] (reg_fixed_voltage_probe+0x25c/0x3c0)
[<c0406298>] (reg_fixed_voltage_probe) from [<c043d21c>] (platform_drv_probe+0x60/0xb0)
[<c043d21c>] (platform_drv_probe) from [<c043b078>] (driver_probe_device+0x24c/0x440)
[<c043b078>] (driver_probe_device) from [<c043b5e8>] (__device_attach_driver+0xc0/0x120)
[<c043b5e8>] (__device_attach_driver) from [<c043901c>] (bus_for_each_drv+0x6c/0x98)
[<c043901c>] (bus_for_each_drv) from [<c043ad20>] (__device_attach+0xac/0x138)
[<c043ad20>] (__device_attach) from [<c043b664>] (device_initial_probe+0x1c/0x20)
[<c043b664>] (device_initial_probe) from [<c043a074>] (bus_probe_device+0x94/0x9c)
[<c043a074>] (bus_probe_device) from [<c043a610>] (deferred_probe_work_func+0x80/0xcc)
[<c043a610>] (deferred_probe_work_func) from [<c01381d0>] (process_one_work+0x158/0x454)
[<c01381d0>] (process_one_work) from [<c013854c>] (worker_thread+0x38/0x510)
[<c013854c>] (worker_thread) from [<c013e154>] (kthread+0xe8/0x104)
[<c013e154>] (kthread) from [<c0108638>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Krzysztof Adamski <[email protected]>
Reported-by: Jon Hunter <[email protected]>
Acked-by: Jon Hunter <[email protected]>
Tested-by: Jon Hunter <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

Merge branch 'fix/core' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator into regulator-core

usb: chipidea: otg: change workqueue ci_otg as freezable

If we use USB ID pin as wakeup source, and there is a USB block
device on this USB OTG (ID) cable, the system will be deadlock
after system resume.

The root cause for this problem is: the workqueue ci_otg may try
to remove hcd before the driver resume has finished, and hcd will
disconnect the device on it, then, it will call device_release_driver,
and holds the device lock "dev->mutex", but it is never unlocked since
it waits workqueue writeback to run to flush the block information, but
the workqueue writeback is freezable, it is not thawed before driver
resume has finished.

When the driver (device: sd 0:0:0:0:) resume goes to dpm_complete, it
tries to get its device lock "dev->mutex", but it can't get it forever,
then the deadlock occurs. Below call stacks show the situation.

So, in order to fix this problem, we need to change workqueue ci_otg
as freezable, then the work item in this workqueue will be run after
driver's resume, this workqueue will not be blocked forever like above
case since the workqueue writeback has been thawed too.

Tested at: i.mx6qdl-sabresd and i.mx6sx-sdb.

[  555.178869] kworker/u2:13   D c07de74c     0   826      2 0x00000000
[  555.185310] Workqueue: ci_otg ci_otg_work
[  555.189353] Backtrace:
[  555.191849] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  555.198912]  r10:ee471ba0 r9:00000000 r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4
[  555.206867]  r4:ee470000
[  555.209453] [<c07dec24>] (schedule) from [<c07e2fc4>] (schedule_timeout+0x15c/0x1e0)
[  555.217212]  r4:7fffffff r3:edc2b000
[  555.220862] [<c07e2e68>] (schedule_timeout) from [<c07df6c8>] (wait_for_common+0x94/0x144)
[  555.229140]  r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4 r4:7fffffff
[  555.235980] [<c07df634>] (wait_for_common) from [<c07df790>] (wait_for_completion+0x18/0x1c)
[  555.244430]  r10:00000001 r9:c0b5563c r8:c0042e48 r7:ef086000 r6:eea4372c r5:ef131b00
[  555.252383]  r4:00000000
[  555.254970] [<c07df778>] (wait_for_completion) from [<c0043cb8>] (flush_work+0x19c/0x234)
[  555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)
[  555.271106]  r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
[  555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
[  555.286236]  r4:eea43520 r3:20000153
[  555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
[  555.298250]  r5:eea43808 r4:eea43400
[  555.301909] [<c02c1fd4>] (blk_cleanup_queue) from [<c0417914>] (__scsi_remove_device+0x48/0xb8)
[  555.310623]  r7:00000000 r6:20000153 r5:ededa950 r4:ededa800
[  555.316403] [<c04178cc>] (__scsi_remove_device) from [<c0415e90>] (scsi_forget_host+0x64/0x68)
[  555.325028]  r5:ededa800 r4:ed5b5000
[  555.328689] [<c0415e2c>] (scsi_forget_host) from [<c0409828>] (scsi_remove_host+0x78/0x104)
[  555.337054]  r5:ed5b5068 r4:ed5b5000
[  555.340709] [<c04097b0>] (scsi_remove_host) from [<c04cdfcc>] (usb_stor_disconnect+0x50/0xb4)
[  555.349247]  r6:ed5b56e4 r5:ed5b5818 r4:ed5b5690 r3:00000008
[  555.355025] [<c04cdf7c>] (usb_stor_disconnect) from [<c04b3bc8>] (usb_unbind_interface+0x78/0x25c)
[  555.363997]  r8:c13919b4 r7:edd3c000 r6:edd3c020 r5:ee551c68 r4:ee551c00 r3:c04cdf7c
[  555.371892] [<c04b3b50>] (usb_unbind_interface) from [<c03dc248>] (__device_release_driver+0x8c/0x118)
[  555.381213]  r10:00000001 r9:edd90c00 r8:c13919b4 r7:ee551c68 r6:c0b546e0 r5:c0b5563c
[  555.389167]  r4:edd3c020
[  555.391752] [<c03dc1bc>] (__device_release_driver) from [<c03dc2fc>] (device_release_driver+0x28/0x34)
[  555.401071]  r5:edd3c020 r4:edd3c054
[  555.404721] [<c03dc2d4>] (device_release_driver) from [<c03db304>] (bus_remove_device+0xe0/0x110)
[  555.413607]  r5:edd3c020 r4:ef17f04c
[  555.417253] [<c03db224>] (bus_remove_device) from [<c03d8128>] (device_del+0x114/0x21c)
[  555.425270]  r6:edd3c028 r5:edd3c020 r4:ee551c00 r3:00000000
[  555.431045] [<c03d8014>] (device_del) from [<c04b1560>] (usb_disable_device+0xa4/0x1e8)
[  555.439061]  r8:edd3c000 r7:eded8000 r6:00000000 r5:00000001 r4:ee551c00
[  555.445906] [<c04b14bc>] (usb_disable_device) from [<c04a8e54>] (usb_disconnect+0x74/0x224)
[  555.454271]  r9:edd90c00 r8:ee551000 r7:ee551c68 r6:ee551c9c r5:ee551c00 r4:00000001
[  555.462156] [<c04a8de0>] (usb_disconnect) from [<c04a8fb8>] (usb_disconnect+0x1d8/0x224)
[  555.470259]  r10:00000001 r9:edd90000 r8:ee471e2c r7:ee551468 r6:ee55149c r5:ee551400
[  555.478213]  r4:00000001
[  555.480797] [<c04a8de0>] (usb_disconnect) from [<c04ae5ec>] (usb_remove_hcd+0xa0/0x1ac)
[  555.488813]  r10:00000001 r9:ee471eb0 r8:00000000 r7:ef3d9500 r6:eded810c r5:eded80b0
[  555.496765]  r4:eded8000
[  555.499351] [<c04ae54c>] (usb_remove_hcd) from [<c04d4158>] (host_stop+0x28/0x64)
[  555.506847]  r6:eeb50010 r5:eded8000 r4:eeb51010
[  555.511563] [<c04d4130>] (host_stop) from [<c04d09b8>] (ci_otg_work+0xc4/0x124)
[  555.518885]  r6:00000001 r5:eeb50010 r4:eeb502a0 r3:c04d4130
[  555.524665] [<c04d08f4>] (ci_otg_work) from [<c00454f0>] (process_one_work+0x194/0x420)
[  555.532682]  r6:ef086000 r5:eeb502a0 r4:edc44480
[  555.537393] [<c004535c>] (process_one_work) from [<c00457b0>] (worker_thread+0x34/0x514)
[  555.545496]  r10:edc44480 r9:ef086000 r8:c0b1a100 r7:ef086034 r6:00000088 r5:edc44498
[  555.553450]  r4:ef086000
[  555.556032] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[  555.563268]  r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44480 r5:eddc15c0
[  555.571221]  r4:00000000
[  555.573804] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[  555.581040]  r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0

[  553.429383] sh              D c07de74c     0   694    691 0x00000000
[  553.435801] Backtrace:
[  553.438295] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  553.445358]  r10:edd3c054 r9:edd3c078 r8:edddbd50 r7:edcbbc00 r6:c1377c34 r5:60000153
[  553.453313]  r4:eddda000
[  553.455896] [<c07dec24>] (schedule) from [<c07deff8>] (schedule_preempt_disabled+0x10/0x14)
[  553.464261]  r4:edd3c058 r3:0000000a
[  553.467910] [<c07defe8>] (schedule_preempt_disabled) from [<c07e0bbc>] (mutex_lock_nested+0x1a0/0x3e8)
[  553.477254] [<c07e0a1c>] (mutex_lock_nested) from [<c03e927c>] (dpm_complete+0xc0/0x1b0)
[  553.485358]  r10:00561408 r9:edd3c054 r8:c0b4863c r7:edddbd90 r6:c0b485d8 r5:edd3c020
[  553.493313]  r4:edd3c0d0
[  553.495896] [<c03e91bc>] (dpm_complete) from [<c03e9388>] (dpm_resume_end+0x1c/0x20)
[  553.503652]  r9:00000000 r8:c0b1a9d0 r7:c1334ec0 r6:c1334edc r5:00000003 r4:00000010
[  553.511544] [<c03e936c>] (dpm_resume_end) from [<c0079894>] (suspend_devices_and_enter+0x158/0x504)
[  553.520604]  r4:00000000 r3:c1334efc
[  553.524250] [<c007973c>] (suspend_devices_and_enter) from [<c0079e74>] (pm_suspend+0x234/0x2cc)
[  553.532961]  r10:00561408 r9:ed6b7300 r8:00000004 r7:c1334eec r6:00000000 r5:c1334ee8
[  553.540914]  r4:00000003
[  553.543493] [<c0079c40>] (pm_suspend) from [<c0078a6c>] (state_store+0x6c/0xc0)

[  555.703684] 7 locks held by kworker/u2:13/826:
[  555.708140]  #0:  ("%s""ci_otg"){++++.+}, at: [<c0045484>] process_one_work+0x128/0x420
[  555.716277]  #1:  ((&ci->work)){+.+.+.}, at: [<c0045484>] process_one_work+0x128/0x420
[  555.724317]  #2:  (usb_bus_list_lock){+.+.+.}, at: [<c04ae5e4>] usb_remove_hcd+0x98/0x1ac
[  555.732626]  #3:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[  555.740403]  #4:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[  555.748179]  #5:  (&dev->mutex){......}, at: [<c03dc2f4>] device_release_driver+0x20/0x34
[  555.756487]  #6:  (&shost->scan_mutex){+.+.+.}, at: [<c04097d0>] scsi_remove_host+0x20/0x104

Cc: <[email protected]> #v3.14+
Cc: Jun Li <[email protected]>
Signed-off-by: Peter Chen <[email protected]>

drivers: sh: Restore legacy clock domain on SuperH platforms

CONFIG_ARCH_SHMOBILE is not only enabled for Renesas ARM platforms
(which are DT based and multi-platform), but also on a select set of
Renesas SuperH platforms (SH7722/SH7723/SH7724/SH7343/SH7366). Hence
since commit 0ba58de231066e47 ("drivers: sh: Get rid of
CONFIG_ARCH_SHMOBILE_MULTI"), the legacy clock domain is no longer
installed on these SuperH platforms, and module clocks may not be
enabled when needed, leading to driver failures.

To fix this, add an additional check for CONFIG_OF.

Fixes: 0ba58de231066e47 ("drivers: sh: Get rid of CONFIG_ARCH_SHMOBILE_MULTI").
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Simon Horman <[email protected]>

Merge tag 'drm-intel-fixes-2016-02-22' of git://anongit.freedesktop.org/drm-intel into drm-fixes

This is a bit large, but it really helps Skylake bugs we are seeing
on a number of laptops.

Most of the commits are quite similar, ensuring the display power
doesn't vanish under us during hardware access. Also do note that it's
not just Skylake that's affected.

* tag 'drm-intel-fixes-2016-02-22' of git://anongit.freedesktop.org/drm-intel:
  drm/i915/gen9: Verify and enforce dc6 state writes
  drm/i915/gen9: Check for DC state mismatch
  drm/i915/skl: Ensure HW is powered during DDB HW state readout
  drm/i915/lvds: Ensure the HW is powered during HW state readout
  drm/i915/hdmi: Ensure the HW is powered during HW state readout
  drm/i915/dsi: Ensure the HW is powered during HW state readout
  drm/i915/dp: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered when accessing the CRC HW block
  drm/i915/ddi: Ensure the HW is powered during HW state readout
  drm/i915/crt: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered during HW access in assert_pipe
  drm/i915: Ensure the HW is powered when disabling VGA
  drm/i915/ibx: Ensure the HW is powered during PLL HW readout
  drm/i915: Ensure the HW is powered during display pipe HW readout
  drm/i915: Add helper to get a display power ref if it was already enabled

Merge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

A few radeon and amdgpu fixes for 4.5.  A few further fixes for the vblank
regressions in 4.4 and a couple of other minor fixes.

* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: disable direct VM updates when vm_debug is set
  amdgpu: fix NULL pointer dereference at tonga_check_states_equal
  drm/radeon/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: add some checks for PX
  drm/amdgpu: fix locking in force performance level
  drm/amdgpu/gfx8: fix priv reg interrupt enable
  drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
  drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)

Merge tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
- Fix for csd deadlock due to missing self IPI
- Accompanying IPI cleanups / optimization
- Brown paper bag bug in one of the cleanups above
- Boot reporting updates for new hardware features
- Don't force DEVTMPFS if INITRAMFS

* tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: SMP: CONFIG_ARC_IPI_DBG cleanup
  ARC: SMP: No need for CONFIG_ARC_IPI_DBG
  ARCv2: Elide sending new cross core intr if receiver didn't ack prev
  ARCv2: SMP: Push IPI_IRQ into IPI provider
  ARC: [intc-compact] Remove IPI setup from ARCompact port
  ARCv2: SMP: Emulate IPI to self using software triggered interrupt
  arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCE
  ARCv2: boot report CCMs (Closely Coupled Memories)
  ARCv2: boot print Low Latency Memory
  ARC: Assume multiplier is always present

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Assorted fixes - xattr one from this cycle, the rest - stable fodder"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/pnode.c: treat zero mnt_group_id-s as unequal
  affs_do_readpage_ofs(): just use kmap_atomic() around memcpy()
  xattr handlers: plug a lock leak in simple_xattr_list
  fs: allow no_seek_end_llseek to actually seek

libceph: don't spam dmesg with stray reply warnings

Commit d15f9d694b77 ("libceph: check data_len in ->alloc_msg()")
mistakenly bumped the log level on the "tid %llu unknown, skipping"
message. Turn it back into a dout() - stray replies are perfectly
normal when OSDs flap, crash, get killed for testing purposes, etc.

Cc: [email protected] # 4.3+
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Alex Elder <[email protected]>

libceph: use the right footer size when skipping a message

ceph_msg_footer is 21 bytes long, while ceph_msg_footer_old is only 13.
Don't skip too much when CEPH_FEATURE_MSG_AUTH isn't negotiated.

Cc: [email protected] # 3.19+
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Alex Elder <[email protected]>

libceph: don't bail early from try_read() when skipping a message

The contract between try_read() and try_write() is that when called
each processes as much data as possible. When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more. try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.

Cc: [email protected] # 3.10+
Reported-by: Varada Kari <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Tested-by: Varada Kari <[email protected]>
Reviewed-by: Alex Elder <[email protected]>

thp: call pmdp_invalidate() with correct virtual address

Sebastian Ott and Gerald Schaefer reported random crashes on s390.
It was bisected to my THP refcounting patchset.

The problem is that pmdp_invalidated() called with wrong virtual
address. It got offset up by HPAGE_PMD_SIZE by loop over ptes.

The solution is to introduce new variable to be used in loop and don't
touch 'haddr'.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reported-and-tested-by: Gerald Schaefer <[email protected]>
Reported-and-tested-by Sebastian Ott <[email protected]>
Reviewed-by: Will Deacon <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drm/amdgpu: disable direct VM updates when vm_debug is set

That should make user space bugs more obvious.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>

amdgpu: fix NULL pointer dereference at tonga_check_states_equal

The event_data passed from pem_fini was not cleared upon initialization.
This caused NULL checks to pass and cast_const_phw_tonga_power_state to
attempt to dereference an invalid pointer. Clear the event_data in
pem_init and pem_fini before calling pem_handle_event.

Reviewed-by: Rex Zhu <[email protected]>
Signed-off-by: Bradley Pankow <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2

The GICv3 architecture spec says:

Writing to the active priority registers in any order other than
the following order will result in UNPREDICTABLE behavior:
- ICH_AP0R<n>_EL2.
- ICH_AP1R<n>_EL2.

So let's not pointlessly go against the rule...

Acked-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

Merge tag 'fixes-for-v4.5-rc6' of http://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

usb: fixes for v4.5-rc6

The most important fixes here are:

a) yet another fix to dwc3's EP transfer resource
assignment logic. This time around we will be
pre-allocating transfer resources to avoid any
future issues;

b) two DMA fixes for the old MUSB driver.

c) dwc2's data toggle fix for FS

Other than these, we have a few other minor fixes
elsewhere.

Merge tag 'renesas-soc-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Renesas ARM Based SoC Fixes for v4.5

* Avoid writing to .text

* tag 'renesas-soc-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: Remove shmobile_boot_arg
  ARM: shmobile: Move shmobile_smp_{mpidr, fn, arg}[] from .text to .bss
  ARM: shmobile: r8a7779: Remove remainings of removed SCU boot setup code
  ARM: shmobile: Move shmobile_scu_base from .text to .bss

Signed-off-by: Olof Johansson <[email protected]>

tracing: Fix showing function event in available_events

The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.

Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions where
it was not compatible for.

Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" directory,
which is used to show what events can be enabled by set_event.

One documented way to enable all events is to:

cat available_events > set_event

But because the function event is displayed in the available_events, this
now causes an INVALID error:

cat: write error: Invalid argument

Reported-by: Chunyu Hu <[email protected]>
Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable"
Cc: [email protected] # 3.4+
Signed-off-by: Steven Rostedt <[email protected]>

KVM: async_pf: do not warn on page allocation failures

In async_pf we try to allocate with NOWAIT to get an element quickly
or fail. This code also handle failures gracefully. Lets silence
potential page allocation failures under load.

qemu-system-s39: page allocation failure: order:0,mode:0x2200000
[...]
Call Trace:
([<00000000001146b8>] show_trace+0xf8/0x148)
[<000000000011476a>] show_stack+0x62/0xe8
[<00000000004a36b8>] dump_stack+0x70/0x98
[<0000000000272c3a>] warn_alloc_failed+0xd2/0x148
[<000000000027709e>] __alloc_pages_nodemask+0x94e/0xb38
[<00000000002cd36a>] new_slab+0x382/0x400
[<00000000002cf7ac>] ___slab_alloc.constprop.30+0x2dc/0x378
[<00000000002d03d0>] kmem_cache_alloc+0x160/0x1d0
[<0000000000133db4>] kvm_setup_async_pf+0x6c/0x198
[<000000000013dee8>] kvm_arch_vcpu_ioctl_run+0xd48/0xd58
[<000000000012fcaa>] kvm_vcpu_ioctl+0x372/0x690
[<00000000002f66f6>] do_vfs_ioctl+0x3be/0x510
[<00000000002f68ec>] SyS_ioctl+0xa4/0xb8
[<0000000000781c5e>] system_call+0xd6/0x264
[<000003ffa24fa06a>] 0x3ffa24fa06a

Cc: [email protected]
Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Dominik Dingel <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: fix conversion of addresses to linear in 32-bit protected mode

Commit e8dd2d2d641c ("Silence compiler warning in arch/x86/kvm/emulate.c",
2015-09-06) broke boot of the Hurd. The bug is that the "default:"
case actually could modify "la", but after the patch this change is
not reflected in *linear.

The bug is visible whenever a non-zero segment base causes the linear
address to wrap around the 4GB mark.

Fixes: e8dd2d2d641cb2724ee10e76c0ad02e04289c017
Cc: [email protected]
Reported-by: Aurelien Jarno <[email protected]>
Tested-by: Aurelien Jarno <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: fix missed hardware breakpoints

Sometimes when setting a breakpoint a process doesn't stop on it.
This is because the debug registers are not loaded correctly on
VCPU load.

The following simple reproducer from Oleg Nesterov tries using debug
registers in two threads.  To see the bug, run a 2-VCPU guest with
"taskset -c 0" and run "./bp 0 1" inside the guest.

    #include <unistd.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <sys/ptrace.h>
    #include <sys/user.h>
    #include <asm/debugreg.h>
    #include <assert.h>

    #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

    unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len)
    {
        unsigned long dr7;

        dr7 = ((len | type) & 0xf)
            << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE);
        if (enable)
            dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE));

        return dr7;
    }

    int write_dr(int pid, int dr, unsigned long val)
    {
        return ptrace(PTRACE_POKEUSER, pid,
                offsetof (struct user, u_debugreg[dr]),
                val);
    }

    void set_bp(pid_t pid, void *addr)
    {
        unsigned long dr7;
        assert(write_dr(pid, 0, (long)addr) == 0);
        dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1);
        assert(write_dr(pid, 7, dr7) == 0);
    }

    void *get_rip(int pid)
    {
        return (void*)ptrace(PTRACE_PEEKUSER, pid,
                offsetof(struct user, regs.rip), 0);
    }

    void test(int nr)
    {
        void *bp_addr = &&label + nr, *bp_hit;
        int pid;

        printf("test bp %d\n", nr);
        assert(nr < 16); // see 16 asm nops below

        pid = fork();
        if (!pid) {
            assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
            kill(getpid(), SIGSTOP);
            for (;;) {
                label: asm (
                    "nop; nop; nop; nop;"
                    "nop; nop; nop; nop;"
                    "nop; nop; nop; nop;"
                    "nop; nop; nop; nop;"
                );
            }
        }

        assert(pid == wait(NULL));
        set_bp(pid, bp_addr);

        for (;;) {
            assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0);
            assert(pid == wait(NULL));

            bp_hit = get_rip(pid);
            if (bp_hit != bp_addr)
                fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n",
                    bp_hit - &&label, nr);
        }
    }

    int main(int argc, const char *argv[])
    {
        while (--argc) {
            int nr = atoi(*++argv);
            if (!fork())
                test(nr);
        }

        while (wait(NULL) > 0)
            ;
        return 0;
    }

Cc: [email protected]
Suggested-by: Nadav Amit <[email protected]>
Reported-by: Andrey Wagin <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Revert "ACPI, PCI, irq: remove interrupt count restriction"

Revert commit b5bd02695471 (ACPI, PCI, irq: remove interrupt count
restriction) that introduced a boot regression on some systems
where it caused kmalloc() to be used too early.

Link: http://marc.info/?l=linux-acpi&m=145580159209240&w=2
Reported-by: Nalla, Ravikanth <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Revert "ACPI / PCI: Simplify acpi_penalize_isa_irq()"

Revert commit 0971686954f9 "ACPI / PCI: Simplify acpi_penalize_isa_irq()"
that depends on commit b5bd02695471 (ACPI, PCI, irq: remove interrupt
count restriction) which introduced a regression and needs to be
reverted for this reason.

Signed-off-by: Rafael J. Wysocki <[email protected]>

arm/arm64: KVM: Feed initialized memory to MMIO accesses

On an MMIO access, we always copy the on-stack buffer info
the shared "run" structure, even if this is a read access.
This ends up leaking up to 8 bytes of uninitialized memory
into userspace, depending on the size of the access.

An obvious fix for this one is to only perform the copy if
this is an actual write.

Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

arc: SMP: CONFIG_ARC_IPI_DBG cleanup

Previous Commit ("ARC: SMP: No need for CONFIG_ARC_IPI_DBG") removed
the Kconfig option ARC_IPI_DBG. Remove the last reference on this
option.

Signed-off-by: Valentin Rothberg <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

MAINTAINERS: Extend info, add wiki and ml for meson arch

Update the maintainers info with wiki and mailing list for the meson
platform. Fix a wrong file attribution and add maintainership for the
generic meson platforms.

Signed-off-by: Carlo Caione <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'omap-for-v4.5/fixes-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Two omap fixes for omaps against v4.5-rc5:

- Yet another fix for n900 onenand to avoid corruption. This time to
  fix the issue of mounting onenand back and forth between the original
  maemo kernel and mainline Linux kernel. And it also seems there will
  be two more fixes coming via the MTD tree as issues were discovered
  also in the onenand driver during testing.

- Revert tps65217 regulator clean up as it breaks MMC for am335x
  variants. The proper way to clean this up is just to rename the
  tps65217.dtsi file into tps65217-am335x.dtsi as a similar setup
  is used on many am335x boards.

* tag 'omap-for-v4.5/fixes-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption
  Revert "regulator: tps65217: remove tps65217.dtsi file"

Signed-off-by: Olof Johansson <[email protected]>

MAINTAINERS: alpine: add a new maintainer and update the entry

Add myself as a co-maintainer for the Alpine support. Also update the
entry to take in account Alpine ARM64 boards, Alpine ARM device trees
and Alpine-specific drivers.

Signed-off-by: Antoine Tenart <[email protected]>
Acked-by: Tsahee Zidenberg <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

ARM: at91/dt: fix typo in sama5d2 pinmux descriptions

PIN_PA15 macro has the same value as PIN_PA14 so we were overriding PA14
mux/configuration.

Signed-off-by: Ludovic Desroches <[email protected]>
Reported-by: Cyrille Pitchen <[email protected]>
Fixes: 7f16cb676c00 ("ARM: at91/dt: add sama5d2 pinmux")
Cc: <[email protected]> # v4.4+
Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'imx-fixes-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for v4.5:
- Drop the bogus interrupt-parent from i.MX6 CAAM node, which leads to
the CAAM IRQs not getting unmasked at the GPC level.

* tag 'imx-fixes-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6: remove bogus interrupt-parent from CAAM node

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'omap-for-v4.5/fixes-rc3-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Few fixes for omaps against v4.5-rc3:

- Improve omap_device error message to tell driver writers what is
  wrong after commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM
  states at probe error and driver unbind"). There will be also a
  handful of driver related fixes also queued separately. But adding
  this error message makes it easy to fix any omap_device using
  drivers suffering from this issue so I think it's important to
  have.

- Also related to commit 5de85b9d57ab discussion, let's fix a bug
  where disabling PM runtime via sysfs will also cause the hardware
  state to be different from PM runtime state.

- Fix audio clocks for beagle-x15.

- Use wakeup-source instead of gpio-key,wakeup for the new entries
  that sneaked in during the merge window.

- Fix a legacy booting vs device tree based booting regression for
  n900 where the legacy user space expects to have the device
  revision available in /proc/atags also when booted with device
  tree.

* tag 'omap-for-v4.5/fixes-rc3-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: Fix omap_device for module reload on PM runtime forbid
  ARM: OMAP2+: Improve omap_device error for driver writers
  ARM: DTS: am57xx-beagle-x15: Select SYS_CLK2 for audio clocks
  ARM: dts: am335x/am57xx: replace gpio-key,wakeup with wakeup-source property
  ARM: OMAP2+: Set system_rev from ATAGS for n900

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'mvebu-fixes-4.5-2' of git://git.infradead.org/linux-mvebu into fixes

mvebu fixes for 4.5 (part 2)

- Fix the missing mtd flash on linkstation lswtgl
- Use unique machine name for the kirkwood ds112 (for Debian flash-kernel tool)

* tag 'mvebu-fixes-4.5-2' of git://git.infradead.org/linux-mvebu:
ARM: dts: orion5x: fix the missing mtd flash on linkstation lswtgl
ARM: dts: kirkwood: use unique machine name for ds112

Signed-off-by: Olof Johansson <[email protected]>

x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32

Both before and after 5f310f739b4c ("x86/entry/32: Re-implement
SYSENTER using the new C path"), we relied on a uaccess very early
in the SYSENTER path to clear AC. After that change, though, we can
potentially make it all the way into C code with AC set, which
enlarges the attack surface for SMAP bypass by doing SYSENTER with
AC set.

Strengthen the SMAP protection by addding the missing ASM_CLAC right
at the beginning.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/3e36be110724896e32a4a1fe73bacb349d3cba94.1456262295.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>

ARC: SMP: No need for CONFIG_ARC_IPI_DBG

This was more relevant during SMP bringup.

The warning for bogus msg better be visible always.

Signed-off-by: Vineet Gupta <[email protected]>

ARCv2: Elide sending new cross core intr if receiver didn't ack prev

ARConnect/MCIP IPI sending has a retry-wait loop in case caller had
not seen a previous such interrupt. Turns out that it is not needed at
all. Linux cross core calling allows coalescing multiple IPIs to same
receiver - it is fine as long as there is one.

This logic is built into upper layer already, at a higher level of
abstraction. ipi_send_msg_one() sets the actual msg payload, but it only
calls MCIP IPI sending if msg holder was empty (using
atomic-set-new-and-get-old construct). Thus it is unlikely that the
retry-wait looping was ever getting exercised at all.

Cc: Chuck Jordan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARCv2: SMP: Push IPI_IRQ into IPI provider

Signed-off-by: Vineet Gupta <[email protected]>

ARC: [intc-compact] Remove IPI setup from ARCompact port

There is no real ARC700 based SMP SoC so remove IPI definition.
EZChip's SMP ARC700 is going to use a different intc and IPI provider
anyways.

Signed-off-by: Vineet Gupta <[email protected]>

ARCv2: SMP: Emulate IPI to self using software triggered interrupt

ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to
local core. So use core intc capability to trigger software
interrupt to self, using an unsued IRQ #21.

This showed up as csd deadlock with LTP trace_sched on a dual core
system. This test acts as scheduler fuzzer, triggering all sorts of
schedulting activity. Trouble starts with IPI to self, which doesn't get
delivered (effectively lost due to H/w capability), but the msg intended
to be sent remain enqueued in per-cpu @ipi_data.

All subsequent IPIs to this core from other cores get elided due to the
IPI coalescing optimization in ipi_send_msg_one() where a pending msg
implies an IPI already sent and assumes other core is yet to ack it.
After the elided IPI, other core simply goes into csd_lock_wait()
but never comes out as this core never sees the interrupt.

Fixes STAR 9001008624

Cc: Peter Zijlstra <[email protected]>
Cc: <[email protected]> [4.2]
Signed-off-by: Vineet Gupta <[email protected]>

Merge tag 'dm-4.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
"Fix a 112 byte leak for each IO request that is requeued while DM
  multipath is handling faults due to path failures.

  This leak does not happen if blk-mq DM multipath is used.  It only
  occurs if .request_fn DM multipath is stacked ontop of blk-mq paths
  (e.g. scsi-mq devices)"

* tag 'dm-4.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths

Merge tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fix from Ulf Hansson:
"Here's an mmc fix intended for v4.5 rc6.

  MMC host:
   - omap_hsmmc: Fix PM regression for deferred probe"

* tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: omap_hsmmc: Fix PM regression with deferred probe for pm_runtime_reinit

nvdimm: use 'u64' for pfn flags

A recent bugfix changed pfn_t to always be 64-bit wide, but did not
change the code in pmem.c, which is now broken on 32-bit architectures
as reported by gcc:

In file included from ../drivers/nvdimm/pmem.c:28:0:
drivers/nvdimm/pmem.c: In function 'pmem_alloc':
include/linux/pfn_t.h:15:17: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
#define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))

This changes the intermediate pfn_flags in struct pmem_device to
be 64 bit wide as well, so they can store the flags correctly.

Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: db78c22230d0 ("mm: fix pfn_t vs highmem")
Signed-off-by: Dan Williams <[email protected]>