Michal Hocko [Fri, 12 May 2017 22:46:41 +0000 (15:46 -0700)]
mm, vmalloc: fix vmalloc users tracking properly
Commit 1f5307b1e094 ("mm, vmalloc: properly track vmalloc users") has
pulled asm/pgtable.h include dependency to linux/vmalloc.h and that
turned out to be a bad idea for some architectures. E.g. m68k fails
with
In file included from arch/m68k/include/asm/pgtable_mm.h:145:0,
from arch/m68k/include/asm/pgtable.h:4,
from include/linux/vmalloc.h:9,
from arch/m68k/kernel/module.c:9:
arch/m68k/include/asm/mcf_pgtable.h: In function 'nocache_page':
>> arch/m68k/include/asm/mcf_pgtable.h:339:43: error: 'init_mm' undeclared (first use in this function)
#define pgd_offset_k(address) pgd_offset(&init_mm, address)
as spotted by kernel build bot. nios2 fails for other reason
In file included from include/asm-generic/io.h:767:0,
from arch/nios2/include/asm/io.h:61,
from include/linux/io.h:25,
from arch/nios2/include/asm/pgtable.h:18,
from include/linux/mm.h:70,
from include/linux/pid_namespace.h:6,
from include/linux/ptrace.h:9,
from arch/nios2/include/uapi/asm/elf.h:23,
from arch/nios2/include/asm/elf.h:22,
from include/linux/elf.h:4,
from include/linux/module.h:15,
from init/main.c:16:
include/linux/vmalloc.h: In function '__vmalloc_node_flags':
include/linux/vmalloc.h:99:40: error: 'PAGE_KERNEL' undeclared (first use in this function); did you mean 'GFP_KERNEL'?
which is due to the newly added #include <asm/pgtable.h>, which on nios2
includes <linux/io.h> and thus <asm/io.h> and <asm-generic/io.h> which
again includes <linux/vmalloc.h>.
Tweaking that around just turns out a bigger headache than necessary.
This patch reverts 1f5307b1e094 and reimplements the original fix in a
different way. __vmalloc_node_flags can stay static inline which will
cover vmalloc* functions. We only have one external user
(kvmalloc_node) and we can export __vmalloc_node_flags_caller and
provide the caller directly. This is much simpler and it doesn't really
need any games with header files.
SeongJae Park [Fri, 12 May 2017 22:46:38 +0000 (15:46 -0700)]
mm/khugepaged: add missed tracepoint for collapse_huge_page_swapin
One return case of `__collapse_huge_page_swapin()` does not invoke
tracepoint while every other return case does. This commit adds a
tracepoint invocation for the case.
Reza Arbab [Fri, 12 May 2017 22:46:32 +0000 (15:46 -0700)]
mm, vmstat: Remove spurious WARN() during zoneinfo print
After commit e2ecc8a79ed4 ("mm, vmstat: print non-populated zones in
zoneinfo"), /proc/zoneinfo will show unpopulated zones.
A memoryless node, having no populated zones at all, was previously
ignored, but will now trigger the WARN() in is_zone_first_populated().
Remove this warning, as its only purpose was to warn of a situation that
has since been enabled.
Aside: The "per-node stats" are still printed under the first populated
zone, but that's not necessarily the first stanza any more. I'm not
sure which criteria is more important with regard to not breaking
parsers, but it looks a little weird to the eye.
Michal Hocko [Fri, 12 May 2017 22:46:26 +0000 (15:46 -0700)]
hwpoison, memcg: forcibly uncharge LRU pages
Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page. While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away. Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.
Linus Torvalds [Fri, 12 May 2017 22:43:10 +0000 (15:43 -0700)]
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"Incremental fixes and a small feature addition on top of the main
libnvdimm 4.12 pull request:
- Geert noticed that tinyconfig was bloated by BLOCK selecting DAX.
The size regression is fixed by moving all dax helpers into the
dax-core and only specifying "select DAX" for FS_DAX and
dax-capable drivers. He also asked for clarification of the
NR_DEV_DAX config option which, on closer look, does not need to be
a config option at all. Mike also throws in a DEV_DAX_PMEM fixup
for good measure.
- Ben's attention to detail on -stable patch submissions caught a
case where the recent fixes to arch_copy_from_iter_pmem() missed a
condition where we strand dirty data in the cache. This is tagged
for -stable and will also be included in the rework of the pmem api
to a proposed {memcpy,copy_user}_flushcache() interface for 4.13.
- Vishal adds a feature that missed the initial pull due to pending
review feedback. It allows the kernel to clear media errors when
initializing a BTT (atomic sector update driver) instance on a pmem
namespace.
- Ross noticed that the dax_device + dax_operations conversion broke
__dax_zero_page_range(). The nvdimm unit tests fail to check this
path, but xfstests immediately trips over it. No excuse for missing
this before submitting the 4.12 pull request.
These all pass the nvdimm unit tests and an xfstests spot check. The
set has received a build success notification from the kbuild robot"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
filesystem-dax: fix broken __dax_zero_page_range() conversion
libnvdimm, btt: ensure that initializing metadata clears poison
libnvdimm: add an atomic vs process context flag to rw_bytes
x86, pmem: Fix cache flushing for iovec write < 8 bytes
device-dax: kill NR_DEV_DAX
block, dax: move "select DAX" from BLOCK to FS_DAX
device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX
Linus Torvalds [Fri, 12 May 2017 19:10:38 +0000 (12:10 -0700)]
Merge tag 'sound-fix-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This contains a one-liner change that has a significant impact:
disabling the build of OSS. It's been unmaintained for long time, and
we'd like to drop the stuff. Finally, as the first step, stop the
build. Let's see whether it works without much complaints.
Other than that, there are two small fixes for HD-audio"
* tag 'sound-fix-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
sound: Disable the build of OSS drivers
ALSA: hda: Fix cpu lockup when stopping the cmd dmas
ALSA: hda - Add mute led support for HP EliteBook 840 G3
Linus Torvalds [Fri, 12 May 2017 19:02:21 +0000 (12:02 -0700)]
Merge tag 'for-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull more power-supply updates from Sebastian Reichel:
"The power-supply subsystem has a few more changes for the v4.12 merge
window:
- New battery driver for AXP20X and AXP22X PMICs
- Improve max17042_battery for usage on x86
- Misc small cleanups & fixes"
* tag 'for-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (34 commits)
power: supply: cpcap-charger: Keep trickle charger bits disabled
power: supply: cpcap-charger: Fix enable for 3.8V charge setting
power: supply: cpcap-charger: Fix charge voltage configuration
power: supply: cpcap-charger: Fix charger name
power: supply: twl4030-charger: make twl4030_bci_property_is_writeable static
power: supply: sbs-battery: Add alert callback
mailmap: add Sebastian Reichel
power: supply: avoid unused twl4030-madc.h
power: supply: sbs-battery: Correct supply status with current draw
power: supply: sbs-battery: Don't ignore the first external power change
power: supply: pda_power: move from timer to delayed_work
power: supply: max17042_battery: Add support for the SCOPE property
power: supply: max17042_battery: Add support for the CHARGE_NOW property
power: supply: max17042_battery: Add support for the CHARGE_FULL_DESIGN property
power: supply: max17042_battery: mAh readings depend on r_sns value
power: supply: max17042_battery: Add support for the VOLT_MIN property
power: supply: max17042_battery: Add support for the TECHNOLOGY attribute
power: supply: max17042_battery: Add external_power_changed callback
power: supply: max17042_battery: Add support for the STATUS property
power: supply: max17042_battery: Add default platform_data fallback data
...
Linus Torvalds [Fri, 12 May 2017 18:58:45 +0000 (11:58 -0700)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
- Fix a problem where orderly_shutdown() is called for multiple times
due to multiple critical overheating events raised in a short period
by platform thermal driver. (Keerthy)
- Introduce a backup thermal shutdown mechanism, which invokes
kernel_power_off()/emergency_restart() directly, after
orderly_shutdown() being issued for certain amount of time(specified
via Kconfig). This is useful in certain conditions that userspace may
be unable to power off the system in a clean manner and leaves the
system in a critical state, like in the middle of driver probing
phase. (Keerthy)
- Introduce a new interface in thermal devfreq_cooling code so that the
driver can provide more precise data regarding actual power to the
thermal governor every time the power budget is calculated. (Lukasz
Luba)
- Introduce BCM 2835 soc thermal driver and northstar thermal driver,
within a new sub-folder. (Rafał Miłecki)
- Small fix on MTK and intel-soc-dts thermal driver. (Dawei Chien,
Brian Bian)
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
thermal: core: Add a back up thermal shutdown mechanism
thermal: core: Allow orderly_poweroff to be called only once
Thermal: Intel SoC DTS: Change interrupt request behavior
trace: thermal: add another parameter 'power' to the tracing function
thermal: devfreq_cooling: add new interface for direct power read
thermal: devfreq_cooling: refactor code and add get_voltage function
thermal: mt8173: minor mtk_thermal.c cleanups
thermal: bcm2835: move to the broadcom subdirectory
thermal: broadcom: ns: specify myself as MODULE_AUTHOR
thermal: da9062/61: Thermal junction temperature monitoring driver
Documentation: devicetree: thermal: da9062/61 TJUNC temperature binding
thermal: broadcom: add Northstar thermal driver
dt-bindings: thermal: add support for Broadcom's Northstar thermal
thermal: bcm2835: add thermal driver for bcm2835 SoC
dt-bindings: Add thermal zone to bcm2835-thermal example
thermal: rcar_gen3_thermal: add suspend and resume support
thermal: rcar_gen3_thermal: store device match data in private structure
thermal: rcar_gen3_thermal: enable hardware interrupts for trip points
thermal: rcar_gen3_thermal: record and check number of TSCs found
thermal: rcar_gen3_thermal: check that TSC exists before memory allocation
...
Linus Torvalds [Fri, 12 May 2017 18:48:26 +0000 (11:48 -0700)]
Merge tag 'drm-fixes-for-v4.12-rc1' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"AMD, nouveau, one i915, and one EDID fix for v4.12-rc1
Some fixes that it would be good to have in rc1. It contains the i915
quiet fix that you reported.
It also has an amdgpu fixes pull, with lots of ongoing work on Vega10
which is new in this kernel and is preliminary support so may have a
fair bit of movement.
Otherwise a few non-Vega10 AMD fixes, one EDID fix and some nouveau
regression fixers"
* tag 'drm-fixes-for-v4.12-rc1' of git://people.freedesktop.org/~airlied/linux: (144 commits)
drm/i915: Make vblank evade warnings optional
drm/nouveau/therm: remove ineffective workarounds for alarm bugs
drm/nouveau/tmr: avoid processing completed alarms when adding a new one
drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
drm/nouveau/tmr: handle races with hw when updating the next alarm time
drm/nouveau/tmr: ack interrupt before processing alarms
drm/nouveau/core: fix static checker warning
drm/nouveau/fb/ram/gf100-: remove 0x10f200 read
drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes
drm/nouveau/kms/nv50: fix source-rect-only plane updates
drm/nouveau/kms/nv50: remove pointless argument to window atomic_check_acquire()
drm/amd/powerplay: refine pwm1_enable callback functions for CI.
drm/amd/powerplay: refine pwm1_enable callback functions for vi.
drm/amd/powerplay: refine pwm1_enable callback functions for Vega10.
drm/amdgpu: refine amdgpu pwm1_enable sysfs interface.
drm/amdgpu: add amd fan ctrl mode enums.
drm/amd/powerplay: add more smu message on Vega10.
drm/amdgpu: fix dependency issue
drm/amd: fix init order of sched job
drm/amdgpu: add some additional vega10 pci ids
...
Linus Torvalds [Fri, 12 May 2017 18:44:13 +0000 (11:44 -0700)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"Things were a lot more calm than previously expected. It's primarily
fixes in various areas, with most of the new functionality centering
around TCMU backend driver work that Xiubo Li has been driving.
Here's the summary on the feature side:
- Make T10-PI verify configurable for emulated (FILEIO + RD) backends
(Dmitry Monakhov)
- Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic
(Bryant Ly + MNC)
- Add TCMU support for growing ring buffer size (Xiubo Li + MNC)
- Add TCMU support for global block data pool (Xiubo Li + MNC)
and on the bug-fix side:
- Fix COMPARE_AND_WRITE non GOOD status handling for READ phase
failures (Gary Guo + nab)
- Fix iscsi-target hang with explicitly changing per NodeACL
CmdSN number depth with concurrent login driven session
reinstatement. (Gary Guo + nab)
- Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly)
- Fix target-core/FILEIO zero length handling (Bart Van Assche)
Also, there was an OOPs introduced with the WRITE_VERIFY changes that
I ended up reverting at the last minute, because as not unusual Bart
and I could not agree on the fix in time for -rc1. Since it's specific
to a conformance test, it's been reverted for now.
There is a separate patch in the queue to address the underlying
control CDB write overflow regression in >= v4.3 separate from the
WRITE_VERIFY revert here, that will be pushed post -rc1"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits)
Revert "target: Fix VERIFY and WRITE VERIFY command parsing"
IB/srpt: Avoid that aborting a command triggers a kernel warning
IB/srpt: Fix abort handling
target/fileio: Fix zero-length READ and WRITE handling
ibmvscsis: Do not send aborted task response
tcmu: fix module removal due to stuck thread
target: Don't force session reset if queue_depth does not change
iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
target: Fix compare_and_write_callback handling for non GOOD status
tcmu: Recalculate the tcmu_cmd size to save cmd area memories
tcmu: Add global data block pool support
tcmu: Add dynamic growing data area feature support
target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store()
target: fixup error message in target_tg_pt_gp_alua_access_type_store()
target/user: PGR Support
target: Add WRITE_VERIFY_16
Documentation/target: add an example script to configure an iSCSI target
target: Use kmalloc_array() in transport_kmap_data_sg()
target: Use kmalloc_array() in compare_and_write_callback()
target: Improve size determinations in two functions
...
Linus Torvalds [Fri, 12 May 2017 18:39:59 +0000 (11:39 -0700)]
Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Making sure that something like a referral point won't end up as pwd
or root.
The main part is the last commit (fixing mntns_install()); that one
fixes a hard-to-hit race. The fchdir() commit is making fchdir(2) a
bit more robust - it should be impossible to get opened files (even
O_PATH ones) for referral points in the first place, so the existing
checks are OK, but checking the same thing as in chdir(2) is just as
cheap.
The path_init() commit removes a redundant check that shouldn't have
been there in the first place"
* 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
make sure that mntns_install() doesn't end up with referral for root
path_init(): don't bother with checking MAY_EXEC for LOOKUP_ROOT
make sure that fchdir() won't accept referral points, etc.
Linus Torvalds [Fri, 12 May 2017 17:45:36 +0000 (10:45 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates/fixes from Ingo Molnar:
"Mostly tooling updates, but also two kernel fixes: a call chain
handling robustness fix and an x86 PMU driver event definition fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/callchain: Force USER_DS when invoking perf_callchain_user()
tools build: Fixup sched_getcpu feature test
perf tests kmod-path: Don't fail if compressed modules aren't supported
perf annotate: Fix AArch64 comment char
perf tools: Fix spelling mistakes
perf/x86: Fix Broadwell-EP DRAM RAPL events
perf config: Refactor a duplicated code for obtaining config file name
perf symbols: Allow user probes on versioned symbols
perf symbols: Accept symbols starting at address 0
tools lib string: Adopt prefixcmp() from perf and subcmd
perf units: Move parse_tag_value() to units.[ch]
perf ui gtk: Move gtk .so name to the only place where it is used
perf tools: Move HAS_BOOL define to where perl headers are used
perf memswap: Split the byteswap memory range wrappers from util.[ch]
perf tools: Move event prototypes from util.h to event.h
perf buildid: Move prototypes from util.h to build-id.h
Linus Torvalds [Fri, 12 May 2017 17:41:45 +0000 (10:41 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull stackprotector fixlet from Ingo Molnar:
"A single fix/enhancement to increase stackprotector canary randomness
on 64-bit kernels with very little cost"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
Linus Torvalds [Fri, 12 May 2017 17:11:50 +0000 (10:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- two boot crash fixes
- unwinder fixes
- kexec related kernel direct mappings enhancements/fixes
- more Clang support quirks
- minor cleanups
- Documentation fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Fix a typo in Documentation
x86/build: Don't add -maccumulate-outgoing-args w/o compiler support
x86/boot/32: Fix UP boot on Quark and possibly other platforms
x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init()
x86/kexec/64: Use gbpages for identity mappings if available
x86/mm: Add support for gbpages to kernel_ident_mapping_init()
x86/boot: Declare error() as noreturn
x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility
x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()
x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic()
x86/microcode/AMD: Remove redundant NULL check on mc
Linus Torvalds [Fri, 12 May 2017 17:09:14 +0000 (10:09 -0700)]
Merge tag 'for-linus-4.12b-rc0c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains two fixes for booting under Xen introduced during this
merge window and two fixes for older problems, where one is just much
more probable due to another merge window change"
* tag 'for-linus-4.12b-rc0c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: adjust early dom0 p2m handling to xen hypervisor behavior
x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
xen/x86: Do not call xen_init_time_ops() until shared_info is initialized
x86/xen: fix xsave capability setting
Linus Torvalds [Fri, 12 May 2017 17:04:09 +0000 (10:04 -0700)]
Merge tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
"The change to the Linux page table geometry was delayed for more
testing with 16G pages, and there's the new CPU features stuff which
just needed one more polish before going in. Plus a few changes from
Scott which came in a bit late. And then various fixes, mostly minor.
Summary highlights:
- rework the Linux page table geometry to lower memory usage on
64-bit Book3S (IBM chips) using the Hash MMU.
- support for a new device tree binding for discovering CPU features
on future firmwares.
- Freescale updates from Scott:
"Includes a fix for a powerpc/next mm regression on 64e, a fix for
a kernel hang on 64e when using a debugger inside a relocated
kernel, a qman fix, and misc qe improvements."
Thanks to: Christophe Leroy, Gavin Shan, Horia Geantă, LiuHailong,
Nicholas Piggin, Roy Pledge, Scott Wood, Valentin Longchamp"
* tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Support new device tree binding for discovering CPU features
powerpc: Don't print cpu_spec->cpu_name if it's NULL
of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle
powerpc/64s: Fix unnecessary machine check handler relocation branch
powerpc/mm/book3s/64: Rework page table geometry for lower memory usage
powerpc: Fix distclean with Makefile.postlink
powerpc/64e: Don't place the stack beyond TASK_SIZE
powerpc/powernv: Block PCI config access on BCM5718 during EEH recovery
powerpc/8xx: Adding support of IRQ in MPC8xx GPIO
soc/fsl/qbman: Disable IRQs for deferred QBMan work
soc/fsl/qe: add EXPORT_SYMBOL for the 2 qe_tdm functions
soc/fsl/qe: only apply QE_General4 workaround on affected SoCs
soc/fsl/qe: round brg_freq to 1kHz granularity
soc/fsl/qe: get rid of immrbar_virt_to_phys()
net: ethernet: ucc_geth: fix MEM_PART_MURAM mode
powerpc/64e: Fix hang when debugging programs with relocated kernel
Linus Torvalds [Fri, 12 May 2017 16:56:30 +0000 (09:56 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from James Hogan:
"math-emu:
- Add missing clearing of BLTZALL and BGEZALL emulation counters
- Fix BC1EQZ and BC1NEZ condition handling
- Fix BLEZL and BGTZL identification
BPF:
- Add JIT support for SKF_AD_HATYPE
- Use unsigned access for unsigned SKB fields
- Quit clobbering callee saved registers in JIT code
- Fix multiple problems in JIT skb access helpers
Loongson 3:
- Select MIPS_L1_CACHE_SHIFT_6
Octeon:
- Remove vestiges of CONFIG_CAVIUM_OCTEON_2ND_KERNEL
- Remove unused L2C types and macros.
- Remove unused SLI types and macros.
- Fix compile error when USB is not enabled.
- Octeon: Remove unused PCIERCX types and macros.
- Octeon: Clean up platform code.
SNI:
- Remove recursive include of cpu-feature-overrides.h
Sibyte:
- Export symbol periph_rev to sb1250-mac network driver.
- Fix Kconfig warning.
Generic platform:
- Enable Root FS on NFS in generic_defconfig
SMP-MT:
- Use CPU interrupt controller IPI IRQ domain support
UASM:
- Add support for LHU for uasm.
- Remove needless ISA abstraction
mm:
- Add 48-bit VA space and 4-level page tables for 4K pages.
PCI:
- Add controllers before the specified head
irqchip driver for MIPS CPU:
- Replace magic 0x100 with IE_SW0
- Prepare for non-legacy IRQ domains
- Introduce IPI IRQ domain support
MAINTAINERS:
- Update email-id of Rahul Bedarkar
NET:
- sb1250-mac: Add missing MODULE_LICENSE()
CPUFREQ:
- Loongson2: drop set_cpus_allowed_ptr()
Misc:
- Disable Werror when W= is set
- Opt into HAVE_COPY_THREAD_TLS
- Enable GENERIC_CPU_AUTOPROBE
- Use common outgoing-CPU-notification code
- Remove dead define of ST_OFF
- Remove CONFIG_ARCH_HAS_ILOG2_U{32,64}
- Stengthen IPI IRQ domain sanity check
- Remove confusing else statement in __do_page_fault()
- Don't unnecessarily include kmalloc.h into <asm/cache.h>.
- Delete unused definition of SMP_CACHE_SHIFT.
- Delete redundant definition of SMP_CACHE_BYTES"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (39 commits)
MIPS: Sibyte: Fix Kconfig warning.
MIPS: Sibyte: Export symbol periph_rev to sb1250-mac network driver.
NET: sb1250-mac: Add missing MODULE_LICENSE()
MAINTAINERS: Update email-id of Rahul Bedarkar
MIPS: Remove confusing else statement in __do_page_fault()
MIPS: Stengthen IPI IRQ domain sanity check
MIPS: smp-mt: Use CPU interrupt controller IPI IRQ domain support
irqchip: mips-cpu: Introduce IPI IRQ domain support
irqchip: mips-cpu: Prepare for non-legacy IRQ domains
irqchip: mips-cpu: Replace magic 0x100 with IE_SW0
MIPS: Remove CONFIG_ARCH_HAS_ILOG2_U{32,64}
MIPS: generic: Enable Root FS on NFS in generic_defconfig
MIPS: mach-rm: Remove recursive include of cpu-feature-overrides.h
MIPS: Opt into HAVE_COPY_THREAD_TLS
CPUFREQ: Loongson2: drop set_cpus_allowed_ptr()
MIPS: uasm: Remove needless ISA abstraction
MIPS: Remove dead define of ST_OFF
MIPS: Use common outgoing-CPU-notification code
MIPS: math-emu: Fix BC1EQZ and BC1NEZ condition handling
MIPS: r2-on-r6-emu: Clear BLTZALL and BGEZALL debugfs counters
...
Linus Torvalds [Fri, 12 May 2017 16:53:16 +0000 (09:53 -0700)]
Merge tag 'nios2-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull nios2 updates from Ley Foon Tan:
"nios2 fixes/enhancements and adding nios2 R2 support"
* tag 'nios2-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: remove custom early console implementation
nios2: use generic strncpy_from_user() and strnlen_user()
nios2: Add CDX support
nios2: Add BMX support
nios2: Add NIOS2_ARCH_REVISION to select between R1 and R2
nios2: implement flush_dcache_mmap_lock/unlock
nios2: enable earlycon support
nios2: constify irq_domain_ops
nios2: remove wrapper header for cmpxchg.h
nios2: add .gitignore entries for auto-generated files
Neil Horman [Fri, 12 May 2017 16:00:01 +0000 (12:00 -0400)]
vmxnet3: ensure that adapter is in proper state during force_close
There are several paths in vmxnet3, where settings changes cause the
adapter to be brought down and back up (vmxnet3_set_ringparam among
them). Should part of the reset operation fail, these paths call
vmxnet3_force_close, which enables all napi instances prior to calling
dev_close (with the expectation that vmxnet3_close will then properly
disable them again). However, vmxnet3_force_close neglects to clear
VMXNET3_STATE_BIT_QUIESCED prior to calling dev_close. As a result
vmxnet3_quiesce_dev (called from vmxnet3_close), returns early, and
leaves all the napi instances in a enabled state while the device itself
is closed. If a device in this state is activated again, napi_enable
will be called on already enabled napi_instances, leading to a BUG halt.
The fix is to simply enausre that the QUIESCED bit is cleared in
vmxnet3_force_close to allow quesence to be completed properly on close.
Bert Kenward [Fri, 12 May 2017 16:18:50 +0000 (17:18 +0100)]
sfc: revert changes to NIC revision numbers
The revision enum values (eg EFX_REV_HUNT_A0) form part of our API,
and are included in ethtool. If these are inconsistent then ethtool
will print garbage for a register dump (ethtool -d).
Fixes: 5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver") Signed-off-by: Edward Cree <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Add default case to switch in order to avoid any chance of using an
uninitialized variable _low_, in case s->type does not match any of
the listed case values.
Xin Long [Fri, 12 May 2017 06:39:52 +0000 (14:39 +0800)]
sctp: fix src address selection if using secondary addresses for ipv6
Commit 0ca50d12fe46 ("sctp: fix src address selection if using secondary
addresses") has fixed a src address selection issue when using secondary
addresses for ipv4.
Now sctp ipv6 also has the similar issue. When using a secondary address,
sctp_v6_get_dst tries to choose the saddr which has the most same bits
with the daddr by sctp_v6_addr_match_len. It may make some cases not work
as expected.
route from hostA to hostB:
fd21:356b:459a:cf30::/64 dev eth1 metric 1024 mtu 1500
The expected path should be:
fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2
But addr[2] matches addr[a] more bits than addr[1] does, according to
sctp_v6_addr_match_len. It causes the path to be:
fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2
This patch is to fix it with the same way as Marcelo's fix for sctp ipv4.
As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check
if the saddr is in a dev instead.
Note that for backwards compatibility, it will still do the addr_match_len
check here when no optimal is found.
Takashi Iwai [Thu, 11 May 2017 09:14:45 +0000 (11:14 +0200)]
sound: Disable the build of OSS drivers
OSS drivers are left as badly unmaintained, and now we're facing a
problem to clean up the hackish set_fs() usage in their codes. Since
most of drivers have been covered by ALSA, and the others are dead old
and inactive, let's leave them RIP.
This patch is the first step: disable the build of OSS drivers.
We'll eventually drop the whole codes and clean up later.
Note that sound/oss/dmasound is still kept, since it's a completely
different implementation of OSS, and it doesn't suffer from set_fs()
hack. Moreover, the build of ALSA is disabled on M68K by some reason,
thus disabling it shall result in a regression. This one will be
disabled / removed once when we add the support in ALSA side.
Ville Syrjälä [Sun, 7 May 2017 17:12:52 +0000 (20:12 +0300)]
drm/i915: Make vblank evade warnings optional
Add a new Kconfig option to enable/disable the extra warnings
from the vblank evade code. For now we'll keep the warning
about an actually missed vblank always enabled as that can have
an actual user visible impact. But if we miss the deadline
othrwise there's no real need to bother the user with that.
We'll want these warnings enabled during development however
so that we can catch regressions.
Based on the reports it looks like this is still very easy
to hit on SKL, so we have more work ahead of us to optimize
the crtiical section further.
Dave Airlie [Fri, 12 May 2017 04:25:22 +0000 (14:25 +1000)]
Merge branch 'linux-4.12' of git://github.com/skeggsb/linux into drm-next
Quite a few patches, but not much code changed:
- Fixes regression from atomic when only the source rect of a plane
changes (ie. xrandr --right-of)
- Fixes another issue where atomic changed behaviour underneath us,
potentially causing laggy cursor position updates
- Fixes for a bunch of races in thermal code, which lead to random
lockups for a lot of users
* 'linux-4.12' of git://github.com/skeggsb/linux:
drm/nouveau/therm: remove ineffective workarounds for alarm bugs
drm/nouveau/tmr: avoid processing completed alarms when adding a new one
drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
drm/nouveau/tmr: handle races with hw when updating the next alarm time
drm/nouveau/tmr: ack interrupt before processing alarms
drm/nouveau/core: fix static checker warning
drm/nouveau/fb/ram/gf100-: remove 0x10f200 read
drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes
drm/nouveau/kms/nv50: fix source-rect-only plane updates
drm/nouveau/kms/nv50: remove pointless argument to window atomic_check_acquire()
Dave Airlie [Fri, 12 May 2017 03:58:29 +0000 (13:58 +1000)]
Merge branch 'drm-next-4.12' of git://people.freedesktop.org/~agd5f/linux into drm-next
Fixes for 4.12. This is a bit bigger than usual since it's 3 weeks
worth of fixes and most of these changes are for vega10 which is
new for 4.12 and still in a fair amount of flux. It looks like
you missed my last pull request, so those patches are included here
as well. Highlights:
- Lots of vega10 fixes
- Fix interruptable wait mixup
- Fan control method fixes
- Misc display fixes for radeon and amdgpu
- Misc bug fixes
* 'drm-next-4.12' of git://people.freedesktop.org/~agd5f/linux: (132 commits)
drm/amd/powerplay: refine pwm1_enable callback functions for CI.
drm/amd/powerplay: refine pwm1_enable callback functions for vi.
drm/amd/powerplay: refine pwm1_enable callback functions for Vega10.
drm/amdgpu: refine amdgpu pwm1_enable sysfs interface.
drm/amdgpu: add amd fan ctrl mode enums.
drm/amd/powerplay: add more smu message on Vega10.
drm/amdgpu: fix dependency issue
drm/amd: fix init order of sched job
drm/amdgpu: add some additional vega10 pci ids
drm/amdgpu/soc15: use atomfirmware for setting bios scratch for reset
drm/amdgpu/atomfirmware: add function to update engine hang status
drm/radeon: only warn once in radeon_ttm_bo_destroy if va list not empty
drm/amdgpu: fix mutex list null pointer reference
drm/amd/powerplay: fix bug sclk/mclk level can't be set on vega10.
drm/amd/powerplay: Setup sw CTF to allow graceful exit when temperature exceeds maximum.
drm/amd/powerplay: delete dead code in powerplay.
drm/amdgpu: Use less generic enum definitions
drm/amdgpu/gfx9: derive tile pipes from golden settings
drm/amdgpu/gfx: drop max_gs_waves_per_vgt
drm/amd/powerplay: disable engine spread spectrum feature on Vega10.
...
Dave Airlie [Fri, 12 May 2017 03:57:20 +0000 (13:57 +1000)]
Merge tag 'drm-misc-next-fixes-2017-05-05' of git://anongit.freedesktop.org/git/drm-misc into drm-next
Core Changes:
- Add quirk for LGD 764 panel to default 10bpc (Mario)
Cc: Mario Kleiner <[email protected]>
* tag 'drm-misc-next-fixes-2017-05-05' of git://anongit.freedesktop.org/git/drm-misc:
drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2
Florian Fainelli [Thu, 11 May 2017 18:24:16 +0000 (11:24 -0700)]
net: phy: Call bus->reset() after releasing PHYs from reset
The API convention makes it that a given MDIO bus reset should be able
to access PHY devices in its reset() callback and perform additional
MDIO accesses in order to bring the bus and PHYs in a working state.
Commit 69226896ad63 ("mdio_bus: Issue GPIO RESET to PHYs.") broke that
contract by first calling bus->reset() and then release all PHYs from
reset using their shared GPIO line, so restore the expected
functionality here.
Fixes: 69226896ad63 ("mdio_bus: Issue GPIO RESET to PHYs.") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Jon Paul Maloy [Thu, 11 May 2017 18:28:15 +0000 (20:28 +0200)]
tipc: make macro tipc_wait_for_cond() smp safe
The macro tipc_wait_for_cond() is embedding the macro sk_wait_event()
to fulfil its task. The latter, in turn, is evaluating the stated
condition outside the socket lock context. This is problematic if
the condition is accessing non-trivial data structures which may be
altered by incoming interrupts, as is the case with the cong_links()
linked list, used by socket to keep track of the current set of
congested links. We sometimes see crashes when this list is accessed
by a condition function at the same time as a SOCK_WAKEUP interrupt
is removing an element from the list.
We fix this by expanding selected parts of sk_wait_event() into the
outer macro, while ensuring that all evaluations of a given condition
are performed under socket lock protection.
Fixes: commit 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") Reviewed-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Andy Gospodarek [Thu, 11 May 2017 19:52:30 +0000 (15:52 -0400)]
samples/bpf: run cleanup routines when receiving SIGTERM
Shahid Habib noticed that when xdp1 was killed from a different console the xdp
program was not cleaned-up properly in the kernel and it continued to forward
traffic.
Most of the applications in samples/bpf cleanup properly, but only when getting
SIGINT. Since kill defaults to using SIGTERM, add support to cleanup when the
application receives either SIGINT or SIGTERM.
Colin Ian King [Thu, 11 May 2017 18:29:40 +0000 (19:29 +0100)]
ethernet: aquantia: remove redundant checks on error status
The error status err is initialized as zero and then being checked
several times to see if it is less than zero even when it has not
been updated. It may seem that the err should be assigned to the
return code of the call to the various *offload_en_set calls and
then we check for failure, however, these functions are void and
never actually return any status.
Since these error checks are redundant we can remove these
as well as err and the error exit label err_exit.
Detected by CoverityScan, CID#1398313 and CID#1398306 ("Logically
dead code")
Chopra, Manish [Thu, 11 May 2017 14:12:47 +0000 (07:12 -0700)]
qlcnic: Fix link configuration with autoneg disabled
Currently driver returns error on speed configurations
for 83xx adapter's non XGBE ports, due to this link doesn't
come up on the ports using 1000Base-T as a connector with
autoneg disabled. This patch fixes this with initializing
appropriate port type based on queried module/connector
types from hardware before any speed/autoneg configuration.
Vitaly Kuznetsov [Thu, 11 May 2017 11:58:06 +0000 (13:58 +0200)]
xen-netfront: avoid crashing on resume after a failure in talk_to_netback()
Unavoidable crashes in netfront_resume() and netback_changed() after a
previous fail in talk_to_netback() (e.g. when we fail to read MAC from
xenstore) were discovered. The failure path in talk_to_netback() does
unregister/free for netdev but we don't reset drvdata and we try accessing
it after resume.
Fix the bug by removing the whole xen device completely with
device_unregister(), this guarantees we won't have any calls into netfront
after a failure.
Eric Dumazet [Thu, 11 May 2017 04:59:28 +0000 (21:59 -0700)]
net: sched: optimize class dumps
In commit 59cc1f61f09c ("net: sched: convert qdisc linked list to
hashtable") we missed the opportunity to considerably speed up
tc_dump_tclass_root() if a qdisc handle is provided by user.
Instead of iterating all the qdiscs, use qdisc_match_from_root()
to directly get the one we look for.
Yuchung Cheng [Thu, 11 May 2017 00:01:27 +0000 (17:01 -0700)]
tcp: avoid fragmenting peculiar skbs in SACK
This patch fixes a bug in splitting an SKB during SACK
processing. Specifically if an skb contains multiple
packets and is only partially sacked in the higher sequences,
tcp_match_sack_to_skb() splits the skb and marks the second fragment
as SACKed.
The current code further attempts rounding up the first fragment
to MSS boundaries. But it misses a boundary condition when the
rounded-up fragment size (pkt_len) is exactly skb size. Spliting
such an skb is pointless and causses a kernel warning and aborts
the SACK processing. This patch universally checks such over-split
before calling tcp_fragment to prevent these unnecessary warnings.
Eric Dumazet [Thu, 11 May 2017 22:24:41 +0000 (15:24 -0700)]
netem: fix skb_orphan_partial()
I should have known that lowering skb->truesize was dangerous :/
In case packets are not leaving the host via a standard Ethernet device,
but looped back to local sockets, bad things can happen, as reported
by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )
So instead of tweaking skb->truesize, lets change skb->destructor
and keep a reference on the owner socket via its sk_refcnt.
Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Michael Madsen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Fri, 12 May 2017 01:30:58 +0000 (21:30 -0400)]
Merge branch 'generic-xdp-followups'
Daniel Borkmann says:
====================
Two generic xdp related follow-ups
Two follow-ups for the generic XDP API, would be great if
both could still be considered, since the XDP API is not
frozen yet. For details please see individual patches.
v1 -> v2:
- Implemented feedback from Jakub Kicinski (reusing
attribute on dump), thanks!
- Rest as is.
====================
Daniel Borkmann [Thu, 11 May 2017 23:04:46 +0000 (01:04 +0200)]
xdp: refine xdp api with regards to generic xdp
While working on the iproute2 generic XDP frontend, I noticed that
as of right now it's possible to have native *and* generic XDP
programs loaded both at the same time for the case when a driver
supports native XDP.
The intended model for generic XDP from b5cdae3291f7 ("net: Generic
XDP") is, however, that only one out of the two can be present at
once which is also indicated as such in the XDP netlink dump part.
The main rationale for generic XDP is to ease accessibility (in
case a driver does not yet have XDP support) and to generically
provide a semantical model as an example for driver developers
wanting to add XDP support. The generic XDP option for an XDP
aware driver can still be useful for comparing and testing both
implementations.
However, it is not intended to have a second XDP processing stage
or layer with exactly the same functionality of the first native
stage. Only reason could be to have a partial fallback for future
XDP features that are not supported yet in the native implementation
and we probably also shouldn't strive for such fallback and instead
encourage native feature support in the first place. Given there's
currently no such fallback issue or use case, lets not go there yet
if we don't need to.
Therefore, change semantics for loading XDP and bail out if the
user tries to load a generic XDP program when a native one is
present and vice versa. Another alternative to bailing out would
be to handle the transition from one flavor to another gracefully,
but that would require to bring the device down, exchange both
types of programs, and bring it up again in order to avoid a tiny
window where a packet could hit both hooks. Given this complicates
the logic for just a debugging feature in the native case, I went
with the simpler variant.
For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae3291f7
and reuse IFLA_XDP_ATTACHED for indicating the mode. Dumping all
or just a subset of flags that were used for loading the XDP prog
is suboptimal in the long run since not all flags are useful for
dumping and if we start to reuse the same flag definitions for
load and dump, then we'll waste bit space. What we really just
want is to dump the mode for now.
Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0),
a program is running at the native driver layer (1). Thus, add a
mode that says that a program is running at generic XDP layer (2).
Applications will handle this fine in that older binaries will
just indicate that something is attached at XDP layer, effectively
this is similar to IFLA_XDP_FLAGS attr that we would have had
modulo the redundancy.
Daniel Borkmann [Thu, 11 May 2017 23:04:45 +0000 (01:04 +0200)]
xdp: add flag to enforce driver mode
After commit b5cdae3291f7 ("net: Generic XDP") we automatically fall
back to a generic XDP variant if the driver does not support native
XDP. Allow for an option where the user can specify that always the
native XDP variant should be selected and in case it's not supported
by a driver, just bail out.
Ben Skeggs [Thu, 11 May 2017 07:13:29 +0000 (17:13 +1000)]
drm/nouveau/tmr: avoid processing completed alarms when adding a new one
The idea here was to avoid having to "manually" program the HW if there's
a new earliest alarm. This was lazy and bad, as it leads to loads of fun
races between inter-related callers (ie. therm).
Turns out, it's not so difficult after all. Go figure ;)
This "optimisation" (which was originally meant to skip updating cursor
settings in the core channel on position-only updates) turned out to be
pointless in the final design of the code before it was merged.
Linus Torvalds [Thu, 11 May 2017 18:29:52 +0000 (11:29 -0700)]
Merge tag 'docs-4.12-2' of git://git.lwn.net/linux
Pull more documentation updates from Jonathan Corbet:
"Connect the newly RST-formatted documentation to the rest; this had to
wait until the input pull was done. There's also a few small fixes
that wandered in"
* tag 'docs-4.12-2' of git://git.lwn.net/linux:
doc: replace FTP URL to kernel.org with HTTPS one
docs: update references to the device io book
Documentation: earlycon: fix Marvell Armada 3700 UART name
docs-rst: add input docs at main index and use kernel-figure
David S. Miller [Wed, 10 May 2017 18:22:52 +0000 (11:22 -0700)]
bpf: Track alignment of register values in the verifier.
Currently if we add only constant values to pointers we can fully
validate the alignment, and properly check if we need to reject the
program on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.
However, once an unknown value is introduced we only allow byte sized
memory accesses which is too restrictive.
Add logic to track the known minimum alignment of register values,
and propagate this state into registers containing pointers.
The most common paradigm that makes use of this new logic is computing
the transport header using the IP header length field. For example:
The existing code will reject the load of th->dest because it cannot
validate that the alignment is at least 2 once "n * 4" is added the
the packet pointer.
In the new code, the register holding "n * 4" will have a reg->min_align
value of 4, because any value multiplied by 4 will be at least 4 byte
aligned. (actually, the eBPF code emitted by the compiler in this case
is most likely to use a shift left by 2, but the end result is identical)
At the critical addition:
th = ((void *)iph + (n * 4));
The register holding 'th' will start with reg->off value of 14. The
pointer addition will transform that reg into something that looks like:
reg->aux_off = 14
reg->aux_off_align = 4
Next, the verifier will look at the th->dest load, and it will see
a load offset of 2, and first check:
if (reg->aux_off_align % size)
which will pass because aux_off_align is 4. reg_off will be computed:
reg_off = reg->off;
...
reg_off += reg->aux_off;
plus we have off==2, and it will thus check:
if ((NET_IP_ALIGN + reg_off + off) % size != 0)
which evaluates to:
if ((NET_IP_ALIGN + 14 + 2) % size != 0)
On strict alignment architectures, NET_IP_ALIGN is 2, thus:
if ((2 + 14 + 2) % size != 0)
which passes.
These pointer transformations and checks work regardless of whether
the constant offset or the variable with known alignment is added
first to the pointer register.
Linus Torvalds [Thu, 11 May 2017 18:01:56 +0000 (11:01 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A smaller collection of fixes that should go into -rc1. This contains:
- A fix from Christoph, fixing a regression with the WRITE_SAME and
partial completions. Caused a BUG() on ppc.
- Fixup for __blk_mq_stop_hw_queues(), it should be static. From
Colin.
- Removal of dmesg error messages on elevator switching, when invoked
from sysfs. From me.
- Fix for blk-stat, using this_cpu_ptr() in a section only protected
by rcu_read_lock(). This breaks when PREEMPT_RCU is enabled. From
me.
- Two fixes for BFQ from Paolo, one fixing a crash and one updating
the documentation.
- An error handling lightnvm memory leak, from Rakesh.
- The previous blk-mq hot unplug lock reversal depends on the CPU
hotplug rework that isn't in mainline yet. This caused a lockdep
splat when people unplugged CPUs with blk-mq devices. From Wanpeng.
- A regression fix for DIF/DIX on blk-mq. From Wen"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: handle partial completions for special payload requests
blk-mq: NVMe 512B/4K+T10 DIF/DIX format returns I/O error on dd with split op
blk-stat: don't use this_cpu_ptr() in a preemptable section
elevator: remove redundant warnings on IO scheduler switch
block, bfq: stress that low_latency must be off to get max throughput
block, bfq: use pointer entity->sched_data only if set
nvme: lightnvm: fix memory leak
blk-mq: make __blk_mq_stop_hw_queues static
lightnvm: remove unused rq parameter of nvme_nvm_rqtocmd() to kill warning
block/mq: fix potential deadlock during cpu hotplug
Linus Torvalds [Thu, 11 May 2017 17:44:22 +0000 (10:44 -0700)]
Merge tag 'for-linus-20170510' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"NAND, from Boris:
- some minor fixes/improvements on existing drivers (fsmc, gpio, ifc,
davinci, brcmnand, omap)
- a huge cleanup/rework of the denali driver accompanied with core
fixes/improvements to simplify the driver code
- a complete rewrite of the atmel driver to support new DT bindings
make future evolution easier
- the addition of per-vendor detection/initialization steps to avoid
extending the nand_ids table with more extended-id entries
SPI NOR, from Cyrille:
- fixes in the hisi, intel and Mediatek SPI controller drivers
- fixes to some SPI flash memories not supporting the Chip Erase
command.
- add support to some new memory parts (Winbond, Macronix, Micron,
ESMT).
- add new driver for the STM32 QSPI controller
And a few fixes for Gemini and Versatile platforms on physmap-of"
* tag 'for-linus-20170510' of git://git.infradead.org/linux-mtd: (100 commits)
MAINTAINERS: Update NAND subsystem git repositories
mtd: nand: gpio: update binding
mtd: nand: add ooblayout for old hamming layout
mtd: oxnas_nand: Allocating more than necessary in probe()
dt-bindings: mtd: Document the STM32 QSPI bindings
mtd: mtk-nor: set controller's address width according to nor flash
mtd: spi-nor: add driver for STM32 quad spi flash controller
mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program
mtd: nand: davinci: add comment on NAND subpage write status on keystone
mtd: nand: omap2: Fix partition creation via cmdline mtdparts
mtd: nand: NULL terminate a of_device_id table
mtd: nand: Fix a couple error codes
mtd: nand: allow drivers to request minimum alignment for passed buffer
mtd: nand: allocate aligned buffers if NAND_OWN_BUFFERS is unset
mtd: nand: denali: allow to override revision number
mtd: nand: denali_dt: use pdev instead of ofdev for platform_device
mtd: nand: denali_dt: remove dma-mask DT property
mtd: nand: denali: support 64bit capable DMA engine
mtd: nand: denali_dt: enable HW_ECC_FIXUP for Altera SOCFPGA variant
mtd: nand: denali: support HW_ECC_FIXUP capability
...
Daniel Borkmann [Wed, 10 May 2017 23:53:15 +0000 (01:53 +0200)]
bpf, arm64: fix faulty emission of map access in tail calls
Shubham was recently asking on netdev why in arm64 JIT we don't multiply
the index for accessing the tail call map by 8. That led me into testing
out arm64 JIT wrt tail calls and it turned out I got a NULL pointer
dereference on the tail call.
The buggy access is at:
prog = array->ptrs[index];
if (prog == NULL)
goto out;
The code triggering the crash is f862694b. x1 at the time contains the
address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning,
above we load the pointer to the program at map slot 0 into x10. x10
can then be NULL if the slot is not occupied, which we later on try to
access with a user given offset in x2 that is the map index.
This basically adds the offset to ptrs to the base address of the bpf
array we got and we later on access the map with an index * 8 offset
relative to that. The tail call map itself is basically one large area
with meta data at the head followed by the array of prog pointers.
This makes tail calls working again, tested on Cavium ThunderX ARMv8.
Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper") Reported-by: Shubham Bansal <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Julian Wiedmann [Wed, 10 May 2017 17:07:52 +0000 (19:07 +0200)]
s390/qeth: unbreak OSM and OSN support
commit b4d72c08b358 ("qeth: bridgeport support - basic control")
broke the support for OSM and OSN devices as follows:
As OSM and OSN are L2 only, qeth_core_probe_device() does an early
setup by loading the l2 discipline and calling qeth_l2_probe_device().
In this context, adding the l2-specific bridgeport sysfs attributes
via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c,
since the basic sysfs infrastructure for the device hasn't been
established yet.
Note that OSN actually has its own unique sysfs attributes
(qeth_osn_devtype), so the additional attributes shouldn't be created
at all.
For OSM, add a new qeth_l2_devtype that contains all the common
and l2-specific sysfs attributes.
When qeth_core_probe_device() does early setup for OSM or OSN, assign
the corresponding devtype so that the ccwgroup probe code creates the
full set of sysfs attributes.
This allows us to skip qeth_l2_create_device_attributes() in case
of an early setup.
Any device that can't do early setup will initially have only the
generic sysfs attributes, and when it's probed later
qeth_l2_probe_device() adds the l2-specific attributes.
If an early-setup device is removed (by calling ccwgroup_ungroup()),
device_unregister() will - using the devtype - delete the
l2-specific attributes before qeth_l2_remove_device() is called.
So make sure to not remove them twice.
What complicates the issue is that qeth_l2_probe_device() and
qeth_l2_remove_device() is also called on a device when its
layer2 attribute changes (ie. its layer mode is switched).
For early-setup devices this wouldn't work properly - we wouldn't
remove the l2-specific attributes when switching to L3.
But switching the layer mode doesn't actually make any sense;
we already decided that the device can only operate in L2!
So just refuse to switch the layer mode on such devices. Note that
OSN doesn't have a layer2 attribute, so we only need to special-case
OSM.
Based on an initial patch by Ursula Braun.
Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Ursula Braun [Wed, 10 May 2017 17:07:51 +0000 (19:07 +0200)]
s390/qeth: handle sysfs error during initialization
When setting up the device from within the layer discipline's
probe routine, creating the layer-specific sysfs attributes can fail.
Report this error back to the caller, and handle it by
releasing the layer discipline.
Signed-off-by: Ursula Braun <[email protected]>
[jwi: updated commit msg, moved an OSN change to a subsequent patch] Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
There is a potential unnecessary refcount decrement on error path of
put_device(&pb->mii_bus->dev), as it is possible to avoid the
of_mdio_find_bus() call if mux_bus is specified by the calling function.
The same put_device() is not called in the error path if the
devm_kzalloc of pb fails. This caused the variable used in the
put_device() to be changed, as the pb pointer was obviously not set up.
There is an unnecessary of_node_get() on child_bus_node if the
of_mdiobus_register() is successful, as the
for_each_available_child_of_node() automatically increments this.
Thus the refcount on this node will always be +1 more than it should be.
There is no of_node_put() on child_bus_node if the of_mdiobus_register()
call fails.
Finally, it is lacking devm_kfree() of pb in the error path. While this
might not be technically necessary, it was present in other parts of the
function. So, I am adding it where necessary to make it uniform.
Signed-off-by: Jon Mason <[email protected]> Fixes: f20e6657a875 ("mdio: mux: Enhanced MDIO mux framework for integrated multiplexers") Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.") Signed-off-by: David S. Miller <[email protected]>
Colin Ian King [Tue, 9 May 2017 16:19:42 +0000 (17:19 +0100)]
netxen_nic: set rcode to the return status from the call to netxen_issue_cmd
Currently rcode is being initialized to NX_RCODE_SUCCESS and later it
is checked to see if it is not NX_RCODE_SUCCESS which is never true. It
appears that there is an unintentional missing assignment of rcode from
the return of the call to netxen_issue_cmd() that was dropped in
an earlier fix, so add it in.
Detected by CoverityScan, CID#401900 ("Logically dead code")
Fixes: 2dcd5d95ad6b2 ("netxen_nic: fix cdrp race condition") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Stefan Wahren [Tue, 9 May 2017 13:40:38 +0000 (15:40 +0200)]
net: qca_spi: Fix alignment issues in rx path
The qca_spi driver causes alignment issues on ARM devices.
So fix this by using netdev_alloc_skb_ip_align().
Signed-off-by: Stefan Wahren <[email protected]> Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: David S. Miller <[email protected]>
Gao Feng [Tue, 9 May 2017 10:27:33 +0000 (18:27 +0800)]
driver: vrf: Fix one possible use-after-free issue
The current codes only deal with the case that the skb is dropped, it
may meet one use-after-free issue when NF_HOOK returns 0 that means
the skb is stolen by one netfilter rule or hook.
When one netfilter rule or hook stoles the skb and return NF_STOLEN,
it means the skb is taken by the rule, and other modules should not
touch this skb ever. Maybe the skb is queued or freed directly by the
rule.
Now uses the nf_hook instead of NF_HOOK to get the result of netfilter,
and check the return value of nf_hook. Only when its value equals 1, it
means the skb could go ahead. Or reset the skb as NULL.
BTW, because vrf_rcv_finish is empty function, so needn't invoke it
even though nf_hook returns 1. But we need to modify vrf_rcv_finish
to deal with the NF_STOLEN case.
There are two cases when skb is stolen.
1. The skb is stolen and freed directly.
There is nothing we need to do, and vrf_rcv_finish isn't invoked.
2. The skb is queued and reinjected again.
The vrf_rcv_finish would be invoked as okfn, so need to free the
skb in it.
block: handle partial completions for special payload requests
SCSI devices can return short writes on Write Same just like for normal
writes, so we need to handle this case for our special payload requests
as well.
Juergen Gross [Wed, 10 May 2017 04:08:44 +0000 (06:08 +0200)]
xen: adjust early dom0 p2m handling to xen hypervisor behavior
When booted as pv-guest the p2m list presented by the Xen is already
mapped to virtual addresses. In dom0 case the hypervisor might make use
of 2M- or 1G-pages for this mapping. Unfortunately while being properly
aligned in virtual and machine address space, those pages might not be
aligned properly in guest physical address space.
So when trying to obtain the guest physical address of such a page
pud_pfn() and pmd_pfn() must be avoided as those will mask away guest
physical address bits not being zero in this special case.
x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
When running as Xen pv guest X86_BUG_SYSRET_SS_ATTRS must not be set
on AMD cpus.
This bug/feature bit is kind of special as it will be used very early
when switching threads. Setting the bit and clearing it a little bit
later leaves a critical window where things can go wrong. This time
window has enlarged a little bit by using setup_clear_cpu_cap() instead
of the hypervisor's set_cpu_features callback. It seems this larger
window now makes it rather easy to hit the problem.
The proper solution is to never set the bit in case of Xen.
arm64: Silence first allocation with CONFIG_ARM64_MODULE_PLTS=y
When CONFIG_ARM64_MODULE_PLTS is enabled, the first allocation using the
module space fails, because the module is too big, and then the module
allocation is attempted from vmalloc space. Silence the first allocation
failure in that case by setting __GFP_NOWARN.
ARM: Silence first allocation with CONFIG_ARM_MODULE_PLTS=y
When CONFIG_ARM_MODULE_PLTS is enabled, the first allocation using the
module space fails, because the module is too big, and then the module
allocation is attempted from vmalloc space. Silence the first allocation
failure in that case by setting __GFP_NOWARN.
mm: Silence vmap() allocation failures based on caller gfp_flags
If the caller has set __GFP_NOWARN don't print the following message:
vmap allocation for size 15736832 failed: use vmalloc=<size> to increase
size.
This can happen with the ARM/Linux or ARM64/Linux module loader built
with CONFIG_ARM{,64}_MODULE_PLTS=y which does a first attempt at loading
a large module from module space, then falls back to vmalloc space.
Tobias Klauser [Thu, 11 May 2017 09:40:16 +0000 (11:40 +0200)]
nios2: remove custom early console implementation
As of commits d8f347ba35cf ("nios2: enable earlycon support"), 0dcc0542a006 ("serial: altera_jtaguart: add earlycon support") and 4d9d7d896d77 ("serial: altera_uart: add earlycon support"), the nios2
architecture and the altera_uart/altera_jtaguart drivers support
earlycon. Thus, the custom early console implementation for nios2 is no
longer necessary to get early boot messages. Remove it and rely fully on
earlycon support.
Author: Bart Van Assche <[email protected]>
Date: Thu Mar 30 10:12:39 2017 -0700
target: Fix VERIFY and WRITE VERIFY command parsing
This patch broke existing behaviour for WRITE_VERIFY because
it dropped the original SCF_SCSI_DATA_CDB assignment for
bytchk = 0 so target_cmd_size_check() no longer rejected
this case, allowing an overflow case to trigger an OOPs
in iscsi-target.
Since the short term and long term fixes are still being
discussed, revert it for now since it's late in the merge
window and try again in v4.13-rc1.
The conversion of __dax_zero_page_range() to 'struct dax_operations'
caused it to frequently fail. The mistake was treating the @size
parameter as a dax mapping length rather than just a length of the
clear_pmem() operation. The dax mapping length is assumed to be hard
coded as PAGE_SIZE.
Without this fix any page unaligned zeroing request will trigger a
-EINVAL return from bdev_dax_pgoff().
Vishal Verma [Wed, 10 May 2017 21:01:31 +0000 (15:01 -0600)]
libnvdimm, btt: ensure that initializing metadata clears poison
If we had badblocks/poison in the metadata area of a BTT, recreating the
BTT would not clear the poison in all cases, notably the flog area. This
is because rw_bytes will only clear errors if the request being sent
down is 512B aligned and sized.
Make sure that when writing the map and info blocks, the rw_bytes being
sent are of the correct size/alignment. For the flog, instead of doing
the smaller log_entry writes only, first do a 'wipe' of the entire area
by writing zeroes in large enough chunks so that errors get cleared.
Vishal Verma [Wed, 10 May 2017 21:01:30 +0000 (15:01 -0600)]
libnvdimm: add an atomic vs process context flag to rw_bytes
nsio_rw_bytes can clear media errors, but this cannot be done while we
are in an atomic context due to locking within ACPI. From the BTT,
->rw_bytes may be called either from atomic or process context depending
on whether the calls happen during initialization or during IO.
During init, we want to ensure error clearing happens, and the flag
marking process context allows nsio_rw_bytes to do that. When called
during IO, we're in atomic context, and error clearing can be skipped.
Linus Torvalds [Thu, 11 May 2017 03:45:36 +0000 (20:45 -0700)]
Merge tag 'kbuild-uapi-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild UAPI updates from Masahiro Yamada:
"Improvement of headers_install by Nicolas Dichtel.
It has been long since the introduction of uapi directories, but the
de-coupling of exported headers has not been completed. Headers listed
in header-y are exported whether they exist in uapi directories or
not. His work fixes this inconsistency.
All (and only) headers under uapi directories are now exported. The
asm-generic wrappers are still exceptions, but this is a big step
forward"
* tag 'kbuild-uapi-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
arch/include: remove empty Kbuild files
uapi: export all arch specifics directories
uapi: export all headers under uapi directories
smc_diag.h: fix include from userland
btrfs_tree.h: fix include from userland
uapi: includes linux/types.h before exporting files
Makefile.headersinst: remove destination-y option
Makefile.headersinst: cleanup input files
x86: stop exporting msr-index.h to userland
nios2: put setup.h in uapi
h8300: put bitsperlong.h in uapi
Linus Torvalds [Thu, 11 May 2017 03:41:43 +0000 (20:41 -0700)]
Merge tag 'kbuild-misc-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull misc Kbuild updates from Masahiro Yamada:
- clean up builddeb script
- use full path for KBUILD_IMAGE to fix rpm-pkg build
- fix objdiff tool to ignore debug info
* tag 'kbuild-misc-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
builddeb: fix typo
builddeb: Update a few outdated and hardcoded strings
deb-pkg: Remove the KBUILD_IMAGE workaround
unicore32: Use full path in KBUILD_IMAGE definition
sh: Use full path in KBUILD_IMAGE definition
arc: Use full path in KBUILD_IMAGE definition
arm: Use full path in KBUILD_IMAGE definition
arm64: Use full path in KBUILD_IMAGE definition
scripts: objdiff: Ignore debug info when comparing
- improve compiler flag evaluation for better build performance
- fix GCC version-dependent warning
- fix genksyms
* tag 'kbuild-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (23 commits)
kbuild: dtbinst: remove unnecessary __dtbs_install_prep target
ia64: beatify build log for gate.so and gate-syms.o
alpha: make short build log available for division routines
alpha: merge build rules of division routines
alpha: add $(src)/ rather than $(obj)/ to make source file path
Makefile: evaluate LDFLAGS_BUILD_ID only once
objtool: make it visible in make V=1 output
kbuild: clang: add -no-integrated-as to KBUILD_[AC]FLAGS
kbuild: Add support to generate LLVM assembly files
kbuild: Add better clang cross build support
kbuild: drop -Wno-unknown-warning-option from clang options
kbuild: fix asm-offset generation to work with clang
kbuild: consolidate redundant sed script ASM offset generation
frv: Use OFFSET macro in DEF_*REG()
kbuild: avoid conflict between -ffunction-sections and -pg on gcc-4.7
kbuild: Consolidate header generation from ASM offset information
kbuild: use -Oz instead of -Os when using clang
kbuild, LLVMLinux: Add -Werror to cc-option to support clang
Kbuild: make designated_init attribute fatal
kbuild: drop unneeded patterns '.*.orig' and '.*.rej' from distclean
...
Linus Torvalds [Thu, 11 May 2017 02:37:14 +0000 (19:37 -0700)]
Merge tag 'rtc-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"RTC subsystem update:
- Add OF device ID table for i2c drivers
New RTC driver:
- Motorola CPCAP PMIC RTC
RTC driver updates:
- cmos: fix IRQ selection
- ds1307: Add ST m41t0 support
- ds1374: fix watchdog configuration
- sh: Add rza series support"
* tag 'rtc-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (33 commits)
rtc: gemini: add return value validation
rtc: snvs: fix an incorrect check of return value
rtc: ds1374: wdt: Fix stop/start ioctl always returning -EINVAL
rtc: ds1374: wdt: Fix issue with timeout scaling from secs to wdt ticks
rtc: sh: mark PM functions as unused
rtc: hid-sensor-time: remove some dead code
rtc: m41t80: Add proper compatible for rv4162
rtc: ds1307: Add m41t0 to OF device ID table
rtc: ds1307: support m41t0 variant
rtc: cpcap: fix improper use of IRQ_NONE for request_threaded_irq
rtc: cmos: Do not assume irq 8 for rtc when there are no legacy irqs
x86: i8259: export legacy_pic symbol
dt-bindings: rtc: document the rtc-sh bindings
rtc: sh: add support for rza series
rtc: cpcap: kfreeing devm allocated memory
rtc: wm8350: Remove unused to_wm8350_from_rtc_dev
rtc: cpcap: new rtc driver
dt-bindings: Add vendor prefix for Motorola
rtc: omap: mark PM methods as __maybe_unused
rtc: omap: remove incorrect __exit markups
...
Linus Torvalds [Thu, 11 May 2017 02:13:03 +0000 (19:13 -0700)]
Merge tag 'hwparam-20170420' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull hw lockdown support from David Howells:
"Annotation of module parameters that configure hardware resources
including ioports, iomem addresses, irq lines and dma channels.
This allows a future patch to prohibit the use of such module
parameters to prevent that hardware from being abused to gain access
to the running kernel image as part of locking the kernel down under
UEFI secure boot conditions.
where the module parameter refers to a hardware setting
hwtype specifies the type of the resource being configured. This can
be one of:
ioport Module parameter configures an I/O port
iomem Module parameter configures an I/O mem address
ioport_or_iomem Module parameter could be either (runtime set)
irq Module parameter configures an I/O port
dma Module parameter configures a DMA channel
dma_addr Module parameter configures a DMA buffer address
other Module parameter configures some other value
Note that the hwtype is compile checked, but not currently stored (the
lockdown code probably won't require it). It is, however, there for
future use.
A bonus is that the hwtype can also be used for grepping.
The intention is for the kernel to ignore or reject attempts to set
annotated module parameters if lockdown is enabled. This applies to
options passed on the boot command line, passed to insmod/modprobe or
direct twiddling in /sys/module/ parameter files.
The module initialisation then needs to handle the parameter not being
set, by (1) giving an error, (2) probing for a value or (3) using a
reasonable default.
What I can't do is just reject a module out of hand because it may
take a hardware setting in the module parameters. Some important
modules, some ipmi stuff for instance, both probe for hardware and
allow hardware to be manually specified; if the driver is aborts with
any error, you don't get any ipmi hardware.
Further, trying to do this entirely in the module initialisation code
doesn't protect against sysfs twiddling.
[!] Note that in and of itself, this series of patches should have no
effect on the the size of the kernel or code execution - that is
left to a patch in the next series to effect. It does mark
annotated kernel parameters with a KERNEL_PARAM_FL_HWPARAM flag in
an already existing field"
* tag 'hwparam-20170420' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (38 commits)
Annotate hardware config module parameters in sound/pci/
Annotate hardware config module parameters in sound/oss/
Annotate hardware config module parameters in sound/isa/
Annotate hardware config module parameters in sound/drivers/
Annotate hardware config module parameters in fs/pstore/
Annotate hardware config module parameters in drivers/watchdog/
Annotate hardware config module parameters in drivers/video/
Annotate hardware config module parameters in drivers/tty/
Annotate hardware config module parameters in drivers/staging/vme/
Annotate hardware config module parameters in drivers/staging/speakup/
Annotate hardware config module parameters in drivers/staging/media/
Annotate hardware config module parameters in drivers/scsi/
Annotate hardware config module parameters in drivers/pcmcia/
Annotate hardware config module parameters in drivers/pci/hotplug/
Annotate hardware config module parameters in drivers/parport/
Annotate hardware config module parameters in drivers/net/wireless/
Annotate hardware config module parameters in drivers/net/wan/
Annotate hardware config module parameters in drivers/net/irda/
Annotate hardware config module parameters in drivers/net/hamradio/
Annotate hardware config module parameters in drivers/net/ethernet/
...