Julian Wiedmann [Tue, 18 Jun 2019 09:25:59 +0000 (11:25 +0200)]
s390/qdio: (re-)initialize tiqdio list entries
When tiqdio_remove_input_queues() removes a queue from the tiq_list as
part of qdio_shutdown(), it doesn't re-initialize the queue's list entry
and the prev/next pointers go stale.
If a subsequent qdio_establish() fails while sending the ESTABLISH cmd,
it calls qdio_shutdown() again in QDIO_IRQ_STATE_ERR state and
tiqdio_remove_input_queues() will attempt to remove the queue entry a
second time. This dereferences the stale pointers, and bad things ensue.
Fix this by re-initializing the list entry after removing it from the
list.
For good practice also initialize the list entry when the queue is first
allocated, and remove the quirky checks that papered over this omission.
Note that prior to
commit e521813468f7 ("s390/qdio: fix access to uninitialized qdio_q fields"),
these checks were bogus anyway.
setup_queues_misc() clears the whole queue struct, and thus needs to
re-init the prev/next pointers as well.
Dan Carpenter [Wed, 26 Jun 2019 10:06:58 +0000 (13:06 +0300)]
s390/dasd: Fix a precision vs width bug in dasd_feature_list()
The "len" variable is the length of the option up to the next option or
to the end of the string which ever first. We want to print the invalid
option so we want precision "%.*s" but the format is width "%*s" so it
prints up to the end of the string.
Cornelia Huck [Thu, 13 Jun 2019 11:08:15 +0000 (13:08 +0200)]
s390/cio: introduce driver_override on the css bus
Sometimes, we want to control which of the matching drivers
binds to a subchannel device (e.g. for subchannels we want to
handle via vfio-ccw).
For pci devices, a mechanism to do so has been introduced in 782a985d7af2 ("PCI: Introduce new device binding path using
pci_dev.driver_override"). It makes sense to introduce the
driver_override attribute for subchannel devices as well, so
that we can easily extend the 'driverctl' tool (which makes
use of the driver_override attribute for pci).
Note that unlike pci we still require a driver override to
match the subchannel type; matching more than one subchannel
type is probably not useful anyway.
Despite what I think the prm recommends, commit f2253bd9859b
("drm/i915/ringbuffer: EMIT_INVALIDATE after switch context") turned out
to be a huge mistake when enabling Ironlake contexts as the GPU would
hang on either a MI_FLUSH or PIPE_CONTROL immediately following the
MI_SET_CONTEXT of an active mesa context (more vanilla contexts, e.g.
simple rendercopies with igt, do not suffer).
Ville found the following clue,
"[DevCTG+]: For the invalidate operation of the pipe control, the
following pointers are affected. The
invalidate operation affects the restore of these packets. If the pipe
control invalidate operation is completed
before the context save, the indirect pointers will not be restored from
memory.
1. Pipeline State Pointer
2. Media State Pointer
3. Constant Buffer Packet"
which suggests by us emitting the INVALIDATE prior to the MI_SET_CONTEXT,
we prevent the context-restore from chasing the dangling pointers within
the image, and explains why this likely prevents the GPU hang.
Arnd Bergmann [Mon, 17 Jun 2019 13:01:05 +0000 (15:01 +0200)]
soc: ti: fix irq-ti-sci link error
The irqchip driver depends on the SoC specific driver, but we want
to be able to compile-test it elsewhere:
WARNING: unmet direct dependencies detected for TI_SCI_INTA_MSI_DOMAIN
Depends on [n]: SOC_TI [=n]
Selected by [y]:
- TI_SCI_INTA_IRQCHIP [=y] && TI_SCI_PROTOCOL [=y]
drivers/irqchip/irq-ti-sci-inta.o: In function `ti_sci_inta_irq_domain_probe':
irq-ti-sci-inta.c:(.text+0x204): undefined reference to `ti_sci_inta_msi_create_irq_domain'
Rearrange the Kconfig and Makefile so we build the soc driver whenever
its users are there, regardless of the SOC_TI option.
Fixes: 49b323157bf1 ("soc: ti: Add MSI domain bus support for Interrupt Aggregator") Fixes: f011df6179bd ("irqchip/ti-sci-inta: Add msi domain support") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Lokesh Vutla <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: Olof Johansson <[email protected]>
Evan Green [Mon, 1 Jul 2019 17:30:30 +0000 (10:30 -0700)]
ALSA: hda: Fix widget_mutex incomplete protection
The widget_mutex was introduced to serialize callers to
hda_widget_sysfs_{re}init. However, its protection of the sysfs widget array
is incomplete. For example, it is acquired around the call to
hda_widget_sysfs_reinit(), which actually creates the new array, but isn't
still acquired when codec->num_nodes and codec->start_nid is updated. So
the lock ensures one thread sets up the new array at a time, but doesn't
ensure which thread's value will end up in codec->num_nodes. If a larger
num_nodes wins but a smaller array was set up, the next call to
refresh_widgets() will touch free memory as it iterates over codec->num_nodes
that aren't there.
The widget_lock really protects both the tree as well as codec->num_nodes,
start_nid, and end_nid, so make sure it's held across that update. It should
also be held during snd_hdac_get_sub_nodes(), so that a very old read from that
function doesn't end up clobbering a later update.
Fixes: ed180abba7f1 ("ALSA: hda: Fix race between creating and refreshing sysfs entries") Signed-off-by: Evan Green <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages
In IEC 61883-6, 8 MIDI data streams are multiplexed into single
MIDI conformant data channel. The index of stream is calculated by
modulo 8 of the value of data block counter.
In fireworks, the value of data block counter in CIP header has a quirk
with firmware version v5.0.0, v5.7.3 and v5.8.0. This brings ALSA
IEC 61883-1/6 packet streaming engine to miss detection of MIDI
messages.
This commit fixes the miss detection to modify the value of data block
counter for the modulo calculation.
For maintainers, this bug exists since a commit 18f5ed365d3f ("ALSA:
fireworks/firewire-lib: add support for recent firmware quirk") in Linux
kernel v4.2. There're many changes since the commit. This fix can be
backported to Linux kernel v4.4 or later. I tagged a base commit to the
backport for your convenience.
Besides, my work for Linux kernel v5.3 brings heavy code refactoring and
some structure members are renamed in 'sound/firewire/amdtp-stream.h'.
The content of this patch brings conflict when merging -rc tree with
this patch and the latest tree. I request maintainers to solve the
conflict to replace 'tx_first_dbc' with 'ctx_data.tx.first_dbc'.
sys_move_mount() crashes by dereferencing the pointer MNT_NS_INTERNAL,
a.k.a. ERR_PTR(-EINVAL), if the old mount is specified by fd for a
kernel object with an internal mount, such as a pipe or memfd.
Fix it by checking for this case and returning -EINVAL.
[AV: what we want is is_mounted(); use that instead of making the
condition even more convoluted]
Make sure to return a proper negative error code from copy_process()
when anon_inode_getfile() fails with CLONE_PIDFD.
Otherwise _do_fork() will not detect an error and get_task_pid() will
operator on a nonsensical pointer:
Lyude Paul [Thu, 20 Jun 2019 23:21:26 +0000 (19:21 -0400)]
drm/amdgpu: Don't skip display settings in hwmgr_resume()
I'm not entirely sure why this is, but for some reason:
921935dc6404 ("drm/amd/powerplay: enforce display related settings only on needed")
Breaks runtime PM resume on the Radeon PRO WX 3100 (Lexa) in one the
pre-production laptops I have. The issue manifests as the following
messages in dmesg:
[drm] UVD and UVD ENC initialized successfully.
amdgpu 0000:3b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vce1 test failed (-110)
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vce_v3_0> failed -110
[drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).
And happens after about 6-10 runtime PM suspend/resume cycles (sometimes
sooner, if you're lucky!). Unfortunately I can't seem to pin down
precisely which part in psm_adjust_power_state_dynamic that is causing
the issue, but not skipping the display setting setup seems to fix it.
Hopefully if there is a better fix for this, this patch will spark
discussion around it.
Paul Cercueil [Sat, 29 Jun 2019 01:22:48 +0000 (03:22 +0200)]
mtd: rawnand: ingenic: Fix ingenic_ecc dependency
If MTD_NAND_JZ4780 is y and MTD_NAND_JZ4780_BCH is m,
which select CONFIG_MTD_NAND_INGENIC_ECC to m, building fails:
drivers/mtd/nand/raw/ingenic/ingenic_nand.o: In function `ingenic_nand_remove':
ingenic_nand.c:(.text+0x177): undefined reference to `ingenic_ecc_release'
drivers/mtd/nand/raw/ingenic/ingenic_nand.o: In function `ingenic_nand_ecc_correct':
ingenic_nand.c:(.text+0x2ee): undefined reference to `ingenic_ecc_correct'
To fix that, the ingenic_nand and ingenic_ecc modules have been fused
into one single module.
- The ingenic_ecc.c code is now compiled in only if
$(CONFIG_MTD_NAND_INGENIC_ECC) is set. This is now a boolean instead
of tristate.
- To avoid changing the module name, the ingenic_nand.c file is moved to
ingenic_nand_drv.c. Then the module name is still ingenic_nand.
- Since ingenic_ecc.c is no more a module, the module-specific macros
have been dropped, and the functions are no more exported for use by
the ingenic_nand driver.
mtd: spinand: Fix max_bad_eraseblocks_per_lun info in memorg
The 1Gb Macronix chip can have a maximum of 20 bad blocks, while
the 2Gb version has twice as many blocks and therefore the maximum
number of bad blocks is 40.
The 4Gb GigaDevice GD5F4GQ4xA has twice as many blocks as its 2Gb
counterpart and therefore a maximum of 80 bad blocks.
Fixes: 377e517b5fa5 ("mtd: nand: Add max_bad_eraseblocks_per_lun info to memorg") Reported-by: Emil Lenngren <[email protected]> Signed-off-by: Frieder Schrempf <[email protected]> Signed-off-by: Miquel Raynal <[email protected]>
When we remap memory as non-cached, to be used as a DMA coherent buffer,
we should writeback all cache and invalidate the cache lines so that we
make sure we have a clean slate. Implement this using the cache_push()
helper.
m68k: Use the generic dma coherent remap allocator
This switches m68k to using common code for the DMA allocations,
including potential use of the CMA allocator if configured.
Also add a comment where the existing behavior seems to be lacking.
Switching to the generic code enables DMA allocations from atomic
context, which is required by the DMA API documentation, and also
adds various other minor features drivers start relying upon. It
also makes sure we have a tested code base for all architectures
that require uncached pte bits for coherent DMA allocations.
Linus Torvalds [Sun, 30 Jun 2019 03:20:52 +0000 (11:20 +0800)]
Merge tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"One fix for a regression in my commit adding KUAP (Kernel User Access
Prevention) on Radix, which incorrectly touched the AMR in the early
machine check handler.
Thanks to Nicholas Piggin"
* tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/exception: Fix machine check early corrupting AMR
Linus Torvalds [Sun, 30 Jun 2019 03:19:17 +0000 (11:19 +0800)]
Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP fixes from Thomas Gleixner:
"Two small changes for the cpu hotplug code:
- Prevent out of bounds access which actually might crash the machine
caused by a missing bounds check in the fail injection code
- Warn about unsupported migitation mode command line arguments to
make people aware that they typoed the paramater. Not necessarily a
fix but quite some people tripped over that"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Fix out-of-bounds read when setting fail state
cpu/speculation: Warn on unsupported mitigations= parameter
Linus Torvalds [Sat, 29 Jun 2019 11:42:30 +0000 (19:42 +0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes all over the place:
- might_sleep() atomicity fix in the microcode loader
- resctrl boundary condition fix
- APIC arithmethics bug fix for frequencies >= 4.2 GHz
- three 5-level paging crash fixes
- two speculation fixes
- a perf/stacktrace fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/unwind/orc: Fall back to using frame pointers for generated code
perf/x86: Always store regs->ip in perf_callchain_kernel()
x86/speculation: Allow guests to use SSBD even if host does not
x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
x86/boot/64: Fix crash if kernel image crosses page table boundary
x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
x86/resctrl: Prevent possible overrun during bitmap operations
x86/microcode: Fix the microcode load on CPU hotplug for real
Linus Torvalds [Sat, 29 Jun 2019 11:39:17 +0000 (19:39 +0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Various fixes, most of them related to bugs perf fuzzing found in the
x86 code"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/regs: Use PERF_REG_EXTENDED_MASK
perf/x86: Remove pmu->pebs_no_xmm_regs
perf/x86: Clean up PEBS_XMM_REGS
perf/x86/regs: Check reserved bits
perf/x86: Disable extended registers for non-supported PMUs
perf/ioctl: Add check for the sample_period value
perf/core: Fix perf_sample_regs_user() mm check
Linus Torvalds [Sat, 29 Jun 2019 11:36:53 +0000 (19:36 +0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Diverse irqchip driver fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Fix command queue pointer comparison bug
irqchip/mips-gic: Use the correct local interrupt map registers
irqchip/ti-sci-inta: Fix kernel crash if irq_create_fwspec_mapping fail
irqchip/irq-csky-mpintc: Support auto irq deliver to all cpus
Linus Torvalds [Sat, 29 Jun 2019 11:32:09 +0000 (19:32 +0800)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Four fixes:
- fix a kexec crash on arm64
- fix a reboot crash on some Android platforms
- future-proof the code for upcoming ACPI 6.2 changes
- fix a build warning on x86"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efibc: Replace variable set function in notifier call
x86/efi: fix a -Wtype-limits compilation warning
efi/bgrt: Drop BGRT status field reserved bits check
efi/memreserve: deal with memreserve entries in unmapped memory
Linus Torvalds [Sat, 29 Jun 2019 11:29:45 +0000 (19:29 +0800)]
Merge tag 'pm-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Avoid skipping bus-level PCI power management during system resume for
PCIe ports left in D0 during the preceding suspend transition on
platforms where the power states of those ports can change out of the
PCI layer's control"
* tag 'pm-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PCI: PM: Avoid skipping bus-level PM on platforms without ACPI
Linus Torvalds [Sat, 29 Jun 2019 09:11:01 +0000 (17:11 +0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <[email protected]>:
linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL
mm, swap: fix THP swap out
fork,memcg: alloc_thread_stack_node needs to set tsk->stack
MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warning
mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
initramfs: fix populate_initrd_image() section mismatch
mm/oom_kill.c: fix uninitialized oc->constraint
mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
signal: remove the wrong signal_pending() check in restore_user_sigmask()
fs/binfmt_flat.c: make load_flat_shared_library() work
mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
fs/proc/array.c: allow reporting eip/esp for all coredumping threads
mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual address
Linus Torvalds [Sat, 29 Jun 2019 09:05:58 +0000 (17:05 +0800)]
Merge tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- hsdk platform unifying apertures
- build system CROSS_COMPILE prefix
* tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [plat-hsdk]: unify memory apertures configuration
ARC: build: Try to guess CROSS_COMPILE with cc-cross-prefix
Linus Torvalds [Sat, 29 Jun 2019 09:04:21 +0000 (17:04 +0800)]
Merge tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Minor RISC-V fixes and one defconfig update.
The fixes have no functional impact:
- Fix some comment text in the memory management vmalloc_fault path.
- Fix some warnings from the DT compiler in our newly-added DT files.
- Change the newly-added DT bindings such that SoC IP blocks with
external I/O are marked as "disabled" by default, then enable them
explicitly in board DT files when the devices are used on the
board. This aligns the bindings with existing upstream practice.
- Add the MIT license as an option for a minor header file, at the
request of one of the U-Boot maintainers.
The RISC-V defconfig update builds the SiFive SPI driver and the
MMC-SPI driver by default. The intention here is to make v5.2 more
usable for testers and users with RISC-V hardware"
* tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: mm: Fix code comment
dt-bindings: clock: sifive: add MIT license as an option for the header file
dt-bindings: riscv: resolve 'make dt_binding_check' warnings
riscv: dts: Re-organize the DT nodes
RISC-V: defconfig: enable MMC & SPI for RISC-V
Linus Torvalds [Sat, 29 Jun 2019 09:02:22 +0000 (17:02 +0800)]
Merge tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull two more NFS client fixes from Anna Schumaker:
"These are both stable fixes.
One to calculate the correct client message length in the case of
partial transmissions. And the other to set the proper TCP timeout for
flexfiles"
* tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
SUNRPC: Fix up calculation of client message length
Linus Torvalds [Sat, 29 Jun 2019 08:58:35 +0000 (16:58 +0800)]
Merge tag 'for-linus-20190628' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Just two small fixes.
One from Paolo, fixing a silly mistake in BFQ. The other one is from
me, ensuring that we have ->file cleared in the io_uring request a bit
earlier. That avoids a use-before-free, if we encounter an error
before ->file is assigned"
* tag 'for-linus-20190628' of git://git.kernel.dk/linux-block:
block, bfq: fix operator in BFQQ_TOTALLY_SEEKY
io_uring: ensure req->file is cleared on allocation
Linus Torvalds [Sat, 29 Jun 2019 08:51:10 +0000 (16:51 +0800)]
Merge tag 'pinctrl-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Sorry to bomb in fixes this late. Maybe I can comfort you by saying it
is only driver fixes, and mostly IRQ handling which is something GPIO
and pin control drivers never get right. You think it works and then
it doesn't.
Summary:
- Fix IRQ setup in the MCP23s08.
- Fix pin setup on pins > 31 in the Ocelot driver.
- Fix IRQs in the Mediatek driver"
* tag 'pinctrl-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: mediatek: Update cur_mask in mask/mask ops
pinctrl: mediatek: Ignore interrupts that are wake only during resume
pinctrl: ocelot: fix pinmuxing for pins after 31
pinctrl: ocelot: fix gpio direction for pins after 31
pinctrl: mcp23s08: Fix add_data and irqchip_add_nested call order
Vinod Koul [Fri, 28 Jun 2019 19:07:21 +0000 (12:07 -0700)]
linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL
DIV_ROUND_UP_ULL adds the two arguments and then invokes
DIV_ROUND_DOWN_ULL. But on a 32bit system the addition of two 32 bit
values can overflow. DIV_ROUND_DOWN_ULL does it correctly and stashes
the addition into a unsigned long long so cast the result to unsigned
long long here to avoid the overflow condition.
Huang Ying [Fri, 28 Jun 2019 19:07:18 +0000 (12:07 -0700)]
mm, swap: fix THP swap out
0-Day test system reported some OOM regressions for several THP
(Transparent Huge Page) swap test cases. These regressions are bisected
to 6861428921b5 ("block: always define BIO_MAX_PAGES as 256"). In the
commit, BIO_MAX_PAGES is set to 256 even when THP swap is enabled. So the
bio_alloc(gfp_flags, 512) in get_swap_bio() may fail when swapping out
THP. That causes the OOM.
As in the patch description of 6861428921b5 ("block: always define
BIO_MAX_PAGES as 256"), THP swap should use multi-page bvec to write THP
to swap space. So the issue is fixed via doing that in get_swap_bio().
BTW: I remember I have checked the THP swap code when 6861428921b5
("block: always define BIO_MAX_PAGES as 256") was merged, and thought the
THP swap code needn't to be changed. But apparently, I was wrong. I
should have done this at that time.
Andrea Arcangeli [Fri, 28 Jun 2019 19:07:14 +0000 (12:07 -0700)]
fork,memcg: alloc_thread_stack_node needs to set tsk->stack
Commit 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on
memcg charge fail") corrected two instances, but there was a third
instance of this bug.
Without setting tsk->stack, if memcg_charge_kernel_stack fails, it'll
execute free_thread_stack() on a dangling pointer.
Enterprise kernels are compiled with VMAP_STACK=y so this isn't
critical, but custom VMAP_STACK=n builds should have some performance
advantage, with the drawback of risking to fail fork because compaction
didn't succeed. So as long as VMAP_STACK=n is a supported option it's
worth fixing it upstream.
Nick Desaulniers [Fri, 28 Jun 2019 19:07:12 +0000 (12:07 -0700)]
MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
Add keyword support so that our mailing list gets cc'ed for clang/llvm
patches. We're pretty active on our mailing list so far as code review.
There are numerous Googlers like myself that are paid to support
building the Linux kernel with Clang and LLVM.
gcc gets confused in pcpu_get_vm_areas() because there are too many
branches that affect whether 'lva' was initialized before it gets used:
mm/vmalloc.c: In function 'pcpu_get_vm_areas':
mm/vmalloc.c:991:4: error: 'lva' may be used uninitialized in this function [-Werror=maybe-uninitialized]
insert_vmap_area_augment(lva, &va->rb_node,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
&free_vmap_area_root, &free_vmap_area_list);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/vmalloc.c:916:20: note: 'lva' was declared here
struct vmap_area *lva;
^~~
Add an intialization to NULL, and check whether this has changed before
the first use.
Colin Ian King [Fri, 28 Jun 2019 19:07:05 +0000 (12:07 -0700)]
mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
Currently the calcuation of end_pfn can round up the pfn number to more
than the actual maximum number of pfns, causing an Oops. Fix this by
ensuring end_pfn is never more than max_pfn.
This can be easily triggered when on systems where the end_pfn gets
rounded up to more than max_pfn using the idle-page stress-ng stress test:
WARNING: vmlinux.o(.text.unlikely+0x140): Section mismatch in reference from the function populate_initrd_image() to the variable .init.ramfs.info:__initramfs_size
The function populate_initrd_image() references
the variable __init __initramfs_size.
This is often because populate_initrd_image lacks a __init
annotation or the annotation of __initramfs_size is wrong.
WARNING: vmlinux.o(.text.unlikely+0x14c): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:unpack_to_rootfs()
The function populate_initrd_image() references
the function __init unpack_to_rootfs().
This is often because populate_initrd_image lacks a __init
annotation or the annotation of unpack_to_rootfs is wrong.
WARNING: vmlinux.o(.text.unlikely+0x198): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:xwrite()
The function populate_initrd_image() references
the function __init xwrite().
This is often because populate_initrd_image lacks a __init
annotation or the annotation of xwrite is wrong.
Indeed, if the compiler decides not to inline populate_initrd_image(), a
warning is generated.
Fix this by adding the missing __init annotations.
Yafang Shao [Fri, 28 Jun 2019 19:06:59 +0000 (12:06 -0700)]
mm/oom_kill.c: fix uninitialized oc->constraint
In dump_oom_summary() oc->constraint is used to show oom_constraint_text,
but it hasn't been set before. So the value of it is always the default
value 0. We should inititialize it before.
Bellow is the output when memcg oom occurs,
before this patch:
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=7997,uid=0
after this patch:
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=13681,uid=0
Naoya Horiguchi [Fri, 28 Jun 2019 19:06:56 +0000 (12:06 -0700)]
mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
madvise(MADV_SOFT_OFFLINE) often returns -EBUSY when calling soft offline
for hugepages with overcommitting enabled. That was caused by the
suboptimal code in current soft-offline code. See the following part:
ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
MIGRATE_SYNC, MR_MEMORY_FAILURE);
if (ret) {
...
} else {
/*
* We set PG_hwpoison only when the migration source hugepage
* was successfully dissolved, because otherwise hwpoisoned
* hugepage remains on free hugepage list, then userspace will
* find it as SIGBUS by allocation failure. That's not expected
* in soft-offlining.
*/
ret = dissolve_free_huge_page(page);
if (!ret) {
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
}
}
return ret;
Here dissolve_free_huge_page() returns -EBUSY if the migration source page
was freed into buddy in migrate_pages(), but even in that case we actually
has a chance that set_hwpoison_free_buddy_page() succeeds. So that means
current code gives up offlining too early now.
dissolve_free_huge_page() checks that a given hugepage is suitable for
dissolving, where we should return success for !PageHuge() case because
the given hugepage is considered as already dissolved.
This change also affects other callers of dissolve_free_huge_page(), which
are cleaned up together.
Naoya Horiguchi [Fri, 28 Jun 2019 19:06:53 +0000 (12:06 -0700)]
mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e. the result of
set_hwpoison_free_buddy_page()), but current code do not work like
that. It might lead us to misjudge the test result when
set_hwpoison_free_buddy_page() fails.
Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
not offline the original page and will not return an error.
Oleg Nesterov [Fri, 28 Jun 2019 19:06:50 +0000 (12:06 -0700)]
signal: remove the wrong signal_pending() check in restore_user_sigmask()
This is the minimal fix for stable, I'll send cleanups later.
Commit 854a6ed56839 ("signal: Add restore_user_sigmask()") introduced
the visible change which breaks user-space: a signal temporary unblocked
by set_user_sigmask() can be delivered even if the caller returns
success or timeout.
Change restore_user_sigmask() to accept the additional "interrupted"
argument which should be used instead of signal_pending() check, and
update the callers.
Eric said:
: For clarity. I don't think this is required by posix, or fundamentally to
: remove the races in select. It is what linux has always done and we have
: applications who care so I agree this fix is needed.
:
: Further in any case where the semantic change that this patch rolls back
: (aka where allowing a signal to be delivered and the select like call to
: complete) would be advantage we can do as well if not better by using
: signalfd.
:
: Michael is there any chance we can get this guarantee of the linux
: implementation of pselect and friends clearly documented. The guarantee
: that if the system call completes successfully we are guaranteed that no
: signal that is unblocked by using sigmask will be delivered?
Jann Horn [Fri, 28 Jun 2019 19:06:46 +0000 (12:06 -0700)]
fs/binfmt_flat.c: make load_flat_shared_library() work
load_flat_shared_library() is broken: It only calls load_flat_file() if
prepare_binprm() returns zero, but prepare_binprm() returns the number of
bytes read - so this only happens if the file is empty.
Instead, call into load_flat_file() if the number of bytes read is
non-negative. (Even if the number of bytes is zero - in that case,
load_flat_file() will see nullbytes and return a nice -ENOEXEC.)
In addition, remove the code related to bprm creds and stop using
prepare_binprm() - this code is loading a library, not a main executable,
and it only actually uses the members "buf", "file" and "filename" of the
linux_binprm struct. Instead, call kernel_read() directly.
zhong jiang [Fri, 28 Jun 2019 19:06:43 +0000 (12:06 -0700)]
mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
mpol_rebind_nodemask() is called for MPOL_BIND and MPOL_INTERLEAVE
mempoclicies when the tasks's cpuset's mems_allowed changes. For
policies created without MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES,
it works by remapping the policy's allowed nodes (stored in v.nodes)
using the previous value of mems_allowed (stored in
w.cpuset_mems_allowed) as the domain of map and the new mems_allowed
(passed as nodes) as the range of the map (see the comment of
bitmap_remap() for details).
The result of remapping is stored back as policy's nodemask in v.nodes,
and the new value of mems_allowed should be stored in
w.cpuset_mems_allowed to facilitate the next rebind, if it happens.
However, 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies
when updating cpusets") introduced a bug where the result of remapping
is stored in w.cpuset_mems_allowed instead. Thus, a mempolicy's
allowed nodes can evolve in an unexpected way after a series of
rebinding due to cpuset mems_allowed changes, possibly binding to a
wrong node or a smaller number of nodes which may e.g. overload them.
This patch fixes the bug so rebinding again works as intended.
John Ogness [Fri, 28 Jun 2019 19:06:40 +0000 (12:06 -0700)]
fs/proc/array.c: allow reporting eip/esp for all coredumping threads
0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat")
stopped reporting eip/esp and fd7d56270b52 ("fs/proc: Report eip/esp in
/prod/PID/stat for coredumping") reintroduced the feature to fix a
regression with userspace core dump handlers (such as minicoredumper).
Because PF_DUMPCORE is only set for the primary thread, this didn't fix
the original problem for secondary threads. Allow reporting the eip/esp
for all threads by checking for PF_EXITING as well. This is set for all
the other threads when they are killed. coredump_wait() waits for all the
tasks to become inactive before proceeding to invoke a core dumper.
mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual address
The presence of struct page does not guarantee linear mapping for the pfn
physical range. Device private memory which is non-coherent is excluded
from linear mapping during devm_memremap_pages() though they will still
have struct page coverage.
Change pfn_t_to_virt() to just check for device private memory before
giving out virtual address for a given pfn.
pfn_t_to_virt() actually has no callers. Let's fix it for the 5.2 kernel
and remove it in 5.3.
Boris Brezillon [Thu, 27 Jun 2019 17:24:14 +0000 (19:24 +0200)]
drm/panfrost: Fix a double-free error
drm_gem_shmem_create_with_handle() returns a GEM object and attach a
handle to it. When the user closes the DRM FD, the core releases all
GEM handles along with their backing GEM objs, which can lead to a
double-free issue if panfrost_ioctl_create_bo() failed and went
through the err_free path where drm_gem_object_put_unlocked() is
called without deleting the associate handle.
Replace this drm_gem_object_put_unlocked() call by a
drm_gem_handle_delete() one to fix that.
Eiichi Tsukata [Tue, 25 Jun 2019 01:29:10 +0000 (10:29 +0900)]
tracing/snapshot: Resize spare buffer if size changed
Current snapshot implementation swaps two ring_buffers even though their
sizes are different from each other, that can cause an inconsistency
between the contents of buffer_size_kb file and the current buffer size.
This patch adds resize_buffer_duplicate_size() to check if there is a
difference between current/spare buffer sizes and resize a spare buffer
if necessary.
ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()
Taking the text_mutex in ftrace_arch_code_modify_prepare() is to fix a
race against module loading and live kernel patching that might try to
change the text permissions while ftrace has it as read/write. This
really needs to be documented in the code. Add a comment that does such.
Petr Mladek [Thu, 27 Jun 2019 08:13:34 +0000 (10:13 +0200)]
ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text
permissions race") causes a possible deadlock between register_kprobe()
and ftrace_run_update_code() when ftrace is using stop_machine().
The existing dependency chain (in reverse order) is:
It is similar problem that has been solved by the commit 2d1e38f56622b9b
("kprobes: Cure hotplug lock ordering issues"). Many locks are involved.
To be on the safe side, text_mutex must become a low level lock taken
after cpu_hotplug_lock.rw_sem.
This can't be achieved easily with the current ftrace design.
For example, arm calls set_all_modules_text_rw() already in
ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c.
This functions is called:
+ outside stop_machine() from ftrace_run_update_code()
+ without stop_machine() from ftrace_module_enable()
Fortunately, the problematic fix is needed only on x86_64. It is
the only architecture that calls set_all_modules_text_rw()
in ftrace path and supports livepatching at the same time.
Therefore it is enough to move text_mutex handling from the generic
kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c:
This patch basically reverts the ftrace part of the problematic
commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module
text permissions race"). And provides x86_64 specific-fix.
Some refactoring of the ftrace code will be needed when livepatching
is implemented for arm or nds32. These architectures call
set_all_modules_text_rw() and use stop_machine() at the same time.
Trond Myklebust [Mon, 24 Jun 2019 23:15:44 +0000 (19:15 -0400)]
SUNRPC: Fix up calculation of client message length
In the case where a record marker was used, xs_sendpages() needs
to return the length of the payload + record marker so that we
operate correctly in the case of a partial transmission.
When the callers check return value, they therefore need to
take into account the record marker length.
Colin Ian King [Fri, 28 Jun 2019 09:54:29 +0000 (10:54 +0100)]
ALSA: seq: fix incorrect order of dest_client/dest_ports arguments
There are two occurrances of a call to snd_seq_oss_fill_addr where
the dest_client and dest_port arguments are in the wrong order. Fix
this by swapping them around.
Lucas Stach [Thu, 27 Jun 2019 14:42:00 +0000 (16:42 +0200)]
drm/etnaviv: add missing failure path to destroy suballoc
When something goes wrong in the GPU init after the cmdbuf suballocator
has been constructed, we fail to destroy it properly. This causes havok
later when the GPU is unbound due to a module unload or similar.
Fixes: e66774dd6f6a (drm/etnaviv: add cmdbuf suballocator) Signed-off-by: Lucas Stach <[email protected]> Tested-by: Russell King <[email protected]>
Colin Ian King [Thu, 27 Jun 2019 16:43:08 +0000 (17:43 +0100)]
ALSA: usb-audio: fix sign unintended sign extension on left shifts
There are a couple of left shifts of unsigned 8 bit values that
first get promoted to signed ints and hence get sign extended
on the shift if the top bit of the 8 bit values are set. Fix
this by casting the 8 bit values to unsigned ints to stop the
unintentional sign extension.
Ronnie Sahlberg [Thu, 27 Jun 2019 04:57:02 +0000 (14:57 +1000)]
cifs: fix crash querying symlinks stored as reparse-points
We never parsed/returned any data from .get_link() when the object is a windows reparse-point
containing a symlink. This results in the VFS layer oopsing accessing an uninitialized buffer:
Linus Torvalds [Fri, 28 Jun 2019 00:50:09 +0000 (08:50 +0800)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A handful of clk driver fixes and one core framework fix
- Do a DT/firmware lookup in clk_core_get() even when the DT index is
a nonsensical value
- Fix some clk data typos in the Amlogic DT headers/code
- Avoid returning junk in the TI clk driver when an invalid clk is
looked for
- Fix dividers for the emac clks on Stratix10 SoCs
- Fix default HDA rates on Tegra210 to correct distorted audio"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: socfpga: stratix10: fix divider entry for the emac clocks
clk: Do a DT parent lookup even when index < 0
clk: tegra210: Fix default rates for HDA clocks
clk: ti: clkctrl: Fix returning uninitialized data
clk: meson: meson8b: fix a typo in the VPU parent names array variable
clk: meson: fix MPLL 50M binding id typo
Linus Torvalds [Fri, 28 Jun 2019 00:48:21 +0000 (08:48 +0800)]
Merge tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix incorrect uses of kstrndup and DM logging macros in DM's early
init code.
- Fix DM log-writes target's handling of super block sectors so updates
are made in order through use of completion.
- Fix DM core's argument splitting code to avoid undefined behaviour
reported as a side-effect of UBSAN analysis on ppc64le.
- Fix DM verity target to limit the amount of error messages that can
result from a corrupt block being found.
* tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm verity: use message limit for data block corruption message
dm table: don't copy from a NULL pointer in realloc_argv()
dm log writes: make sure super sector log updates are written in order
dm init: remove trailing newline from calls to DMERR() and DMINFO()
dm init: fix incorrect uses of kstrndup()
Linus Torvalds [Fri, 28 Jun 2019 00:41:18 +0000 (08:41 +0800)]
Merge tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux
Pull pidfd fixes from Christian Brauner:
"Userspace tools and libraries such as strace or glibc need a cheap and
reliable way to tell whether CLONE_PIDFD is supported. The easiest way
is to pass an invalid fd value in the return argument, perform the
syscall and verify the value in the return argument has been changed
to a valid fd.
However, if CLONE_PIDFD is specified we currently check if pidfd == 0
and return EINVAL if not.
The check for pidfd == 0 was originally added to enable us to abuse
the return argument for passing additional flags along with
CLONE_PIDFD in the future.
However, extending legacy clone this way would be a terrible idea and
with clone3 on the horizon and the ability to reuse CLONE_DETACHED
with CLONE_PIDFD there's no real need for this clutch. So remove the
pidfd == 0 check and help userspace out.
Also, accordig to Al, anon_inode_getfd() should only be used past the
point of no failure and ksys_close() should not be used at all since
it is far too easy to get wrong. Al's motto being "basically, once
it's in descriptor table, it's out of your control". So Al's patch
switches back to what we already had in v1 of the original patchset
and uses a anon_inode_getfile() + put_user() + fd_install() sequence
in the success path and a fput() + put_unused_fd() in the failure
path.
The other two changes should be trivial"
* tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
proc: remove useless d_is_dir() check
copy_process(): don't use ksys_close() on cleanups
samples: make pidfd-metadata fail gracefully on older kernels
fork: don't check parent_tidptr with CLONE_PIDFD
Linus Torvalds [Fri, 28 Jun 2019 00:39:18 +0000 (08:39 +0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fix for one corner case in HID++ protocol with respect to handling
very long reports, from Hans de Goede
- power management fix in Intel-ISH driver, from Hyungwoo Yang
- use-after-free fix in Intel-ISH driver, from Dan Carpenter
- a couple of new device IDs/quirks from Kai-Heng Feng, Kyle Godbey and
Oleksandr Natalenko
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: intel-ish-hid: fix wrong driver_data usage
HID: multitouch: Add pointstick support for ALPS Touchpad
HID: logitech-dj: Fix forwarding of very long HID++ reports
HID: uclogic: Add support for Huion HS64 tablet
HID: chicony: add another quirk for PixArt mouse
HID: intel-ish-hid: Fix a use after free in load_fw_from_host()
Linus Torvalds [Fri, 28 Jun 2019 00:37:04 +0000 (08:37 +0800)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A smaller batch of fixes, nothing that stands out as risky or scary.
Mostly DTS tweaks for a few issues:
- GPU fixlets for Meson
- CPU idle fix for LS1028A
- PWM interrupt fixes for i.MX6UL
Also, enable a driver (FSL_EDMA) on arm64 defconfig, and a warning and
two MAINTAINER tweaks"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: imx6ul: fix PWM[1-4] interrupts
ARM: omap2: remove incorrect __init annotation
ARM: dts: gemini Fix up DNS-313 compatible string
ARM: dts: Blank D-Link DIR-685 console
arm64: defconfig: Enable FSL_EDMA driver
arm64: dts: ls1028a: Fix CPU idle fail.
MAINTAINERS: BCM53573: Add internal Broadcom mailing list
MAINTAINERS: BCM2835: Add internal Broadcom mailing list
ARM: dts: meson8b: fix the operating voltage of the Mali GPU
ARM: dts: meson8b: drop undocumented property from the Mali GPU node
ARM: dts: meson8: fix GPU interrupts and drop an undocumented property
Linus Torvalds [Fri, 28 Jun 2019 00:34:12 +0000 (08:34 +0800)]
Merge tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
"The in-kernel AFS client has been undergoing testing on opendev.org on
one of their mirror machines. They are using AFS to hold data that is
then served via apache, and Ian Wienand had reported seeing oopses,
spontaneous machine reboots and updates to volumes going missing. This
patch series appears to have fixed the problem, very probably due to
patch (2), but it's not 100% certain.
(1) Fix the printing of the "vnode modified" warning to exclude checks
on files for which we don't have a callback promise from the
server (and so don't expect the server to tell us when it
changes).
Without this, for every file or directory for which we still have
an in-core inode that gets changed on the server, we may get a
message logged when we next look at it. This can happen in bulk
if, for instance, someone does "vos release" to update a R/O
volume from a R/W volume and a whole set of files are all changed
together.
We only really want to log a message if the file changed and the
server didn't tell us about it or we failed to track the state
internally.
(2) Fix accidental corruption of either afs_vlserver struct objects or
the the following memory locations (which could hold anything).
The issue is caused by a union that points to two different
structs in struct afs_call (to save space in the struct). The call
cleanup code assumes that it can simply call the cleanup for one
of those structs if not NULL - when it might be actually pointing
to the other struct.
This means that every Volume Location RPC op is going to corrupt
something.
(3) Fix an uninitialised spinlock. This isn't too bad, it just causes
a one-off warning if lockdep is enabled when "vos release" is
called, but the spinlock still behaves correctly.
(4) Fix the setting of i_block in the inode. This causes du, for
example, to produce incorrect results, but otherwise should not be
dangerous to the kernel"
* tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix setting of i_blocks
afs: Fix uninitialised spinlock afs_volume::cb_break_lock
afs: Fix vlserver record corruption
afs: Fix over zealous "vnode modified" warnings
1) Fix ppp_mppe crypto soft dependencies, from Takashi Iawi.
2) Fix TX completion to be finite, from Sergej Benilov.
3) Use register_pernet_device to avoid a dst leak in tipc, from Xin
Long.
4) Double free of TX cleanup in Dirk van der Merwe.
5) Memory leak in packet_set_ring(), from Eric Dumazet.
6) Out of bounds read in qmi_wwan, from Bjørn Mork.
7) Fix iif used in mcast/bcast looped back packets, from Stephen
Suryaputra.
8) Fix neighbour resolution on raw ipv6 sockets, from Nicolas Dichtel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits)
af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
sctp: change to hold sk after auth shkey is created successfully
ipv6: fix neighbour resolution with raw socket
ipv6: constify rt6_nexthop()
net: dsa: microchip: Use gpiod_set_value_cansleep()
net: aquantia: fix vlans not working over bridged network
ipv4: reset rt_iif for recirculated mcast/bcast out pkts
team: Always enable vlan tx offload
net/smc: Fix error path in smc_init
net/smc: hold conns_lock before calling smc_lgr_register_conn()
bonding: Always enable vlan tx offload
net/ipv6: Fix misuse of proc_dointvec "skip_notify_on_dev_down"
ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
qmi_wwan: Fix out-of-bounds read
tipc: check msg->req data len in tipc_nl_compat_bearer_disable
net: macb: do not copy the mac address if NULL
net/packet: fix memory leak in packet_set_ring()
net/tls: fix page double free on TX cleanup
net/sched: cbs: Fix error path of cbs_module_init
tipc: change to use register_pernet_device
...
Josh Poimboeuf [Thu, 27 Jun 2019 00:33:55 +0000 (19:33 -0500)]
x86/unwind/orc: Fall back to using frame pointers for generated code
The ORC unwinder can't unwind through BPF JIT generated code because
there are no ORC entries associated with the code.
If an ORC entry isn't available, try to fall back to frame pointers. If
BPF and other generated code always do frame pointer setup (even with
CONFIG_FRAME_POINTERS=n) then this will allow ORC to unwind through most
generated code despite there being no corresponding ORC entries.
Song Liu [Thu, 27 Jun 2019 00:33:52 +0000 (19:33 -0500)]
perf/x86: Always store regs->ip in perf_callchain_kernel()
The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by
perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel().
This was broken by the following commit:
d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
With that change, when starting with non-HW regs, the unwinder starts
with the current stack frame and unwinds until it passes up the frame
which called perf_arch_fetch_caller_regs(). So regs->ip needs to be
saved deliberately.
Jeff Layton [Thu, 9 May 2019 11:58:38 +0000 (07:58 -0400)]
ceph: fix ceph_mdsc_build_path to not stop on first component
When ceph_mdsc_build_path is handed a positive dentry, it will return a
zero-length path string with the base set to that dentry. This is not
what we want. Always include at least one path component in the string.
ceph_mdsc_build_path has behaved this way for a long time but it didn't
matter until recent d_name handling rework.
Fixes: 964fff7491e4 ("ceph: use ceph_mdsc_build_path instead of clone_dentry_name") Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
Jeremy Linton [Wed, 26 Jun 2019 21:37:17 +0000 (16:37 -0500)]
arm_pmu: acpi: spe: Add initial MADT/SPE probing
ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.
Jeremy Linton [Wed, 26 Jun 2019 21:37:16 +0000 (16:37 -0500)]
ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
ACPI 6.3 adds a flag to indicate that child nodes are all
identical cores. This is useful to authoritatively determine
if a set of (possibly offline) cores are identical or not.
Since the flag doesn't give us a unique id we can generate
one and use it to create bitmaps of sibling nodes, or simply
in a loop to determine if a subset of cores are identical.
Jeremy Linton [Wed, 26 Jun 2019 21:37:15 +0000 (16:37 -0500)]
ACPI/PPTT: Modify node flag detection to find last IDENTICAL
The ACPI specification implies that the IDENTICAL flag should be
set on all non leaf nodes where the children are identical.
This means that we need to be searching for the last node with
the identical flag set rather than the first one.
Since this flag is also dependent on the table revision, we
need to add a bit of extra code to verify the table revision,
and the next node's state in the traversal. Since we want to
avoid function pointers here, lets just special case
the IDENTICAL flag.
Nicolas Boichat [Wed, 26 Jun 2019 03:54:45 +0000 (11:54 +0800)]
pinctrl: mediatek: Update cur_mask in mask/mask ops
During suspend/resume, mtk_eint_mask may be called while
wake_mask is active. For example, this happens if a wake-source
with an active interrupt handler wakes the system:
irq/pm.c:irq_pm_check_wakeup would disable the interrupt, so
that it can be handled later on in the resume flow.
However, this may happen before mtk_eint_do_resume is called:
in this case, wake_mask is loaded, and cur_mask is restored
from an older copy, re-enabling the interrupt, and causing
an interrupt storm (especially for level interrupts).
Step by step, for a line that has both wake and interrupt enabled:
1. cur_mask[irq] = 1; wake_mask[irq] = 1; EINT_EN[irq] = 1 (interrupt
enabled at hardware level)
2. System suspends, resumes due to that line (at this stage EINT_EN
== wake_mask)
3. irq_pm_check_wakeup is called, and disables the interrupt =>
EINT_EN[irq] = 0, but we still have cur_mask[irq] = 1
4. mtk_eint_do_resume is called, and restores EINT_EN = cur_mask, so
it reenables EINT_EN[irq] = 1 => interrupt storm as the driver
is not yet ready to handle the interrupt.
This patch fixes the issue in step 3, by recording all mask/unmask
changes in cur_mask. This also avoids the need to read the current
mask in eint_do_suspend, and we can remove mtk_eint_chip_read_mask
function.
The interrupt will be re-enabled properly later on, sometimes after
mtk_eint_do_resume, when the driver is ready to handle it.
Fixes: 58a5e1b64bb0 ("pinctrl: mediatek: Implement wake handler and suspend resume") Signed-off-by: Nicolas Boichat <[email protected]> Acked-by: Sean Wang <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
Al Viro [Thu, 27 Jun 2019 02:22:09 +0000 (22:22 -0400)]
copy_process(): don't use ksys_close() on cleanups
anon_inode_getfd() should be used *ONLY* in situations when we are
guaranteed to be past the last failure point (including copying the
descriptor number to userland, at that). And ksys_close() should
not be used for cleanups at all.
anon_inode_getfile() is there for all nontrivial cases like that.
Just use that...
Eiichi Tsukata [Thu, 27 Jun 2019 02:47:32 +0000 (11:47 +0900)]
cpu/hotplug: Fix out-of-bounds read when setting fail state
Setting invalid value to /sys/devices/system/cpu/cpuX/hotplug/fail
can control `struct cpuhp_step *sp` address, results in the following
global-out-of-bounds read.
The buggy address belongs to the variable:
cpu_hotplug_lock+0x98/0xa0
Memory state around the buggy address: ffffffff89734300: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734380: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffff89734400: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
^ ffffffff89734480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Add a sanity check for the value written from user space.
Neil Horman [Tue, 25 Jun 2019 21:57:49 +0000 (17:57 -0400)]
af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, its possible for the application to hang
forever in the kernel. This occurs because when waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).
However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.
We can fix this by converting the schedule call to a completion
mechanism. By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.
Tested by myself and the reporter, with good results
Change Notes:
V1->V2:
Enhance the sleep logic to support being interruptible and
allowing for honoring to SK_SNDTIMEO (Willem de Bruijn)
V2->V3:
Rearrage the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status. Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)
Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_complete call (Neil Horman)
V3->V4:
Willem has requested that the control flow be restored to the
previous state. Doing so lets us eliminate the need for the
po->wait_on_complete flag variable, and lets us get rid of the
packet_next_frame function, but introduces another complexity.
Specifically, but using the packet pending count, we can, if an
applications calls sendmsg multiple times with MSG_DONTWAIT set, each
set of transmitted frames, when complete, will cause
tpacket_destruct_skb to issue a complete call, for which there will
never be a wait_on_completion call. This imbalance will lead to any
future call to wait_for_completion here to return early, when the frames
they sent may not have completed. To correct this, we need to re-init
the completion queue on every call to tpacket_snd before we enter the
loop so as to ensure we wait properly for the frames we send in this
iteration.
Change the timeout and interrupted gotos to out_put rather than
out_status so that we don't try to free a non-existant skb
Clean up some extra newlines (Willem de Bruijn)
Xin Long [Mon, 24 Jun 2019 16:21:45 +0000 (00:21 +0800)]
sctp: change to hold sk after auth shkey is created successfully
Now in sctp_endpoint_init(), it holds the sk then creates auth
shkey. But when the creation fails, it doesn't release the sk,
which causes a sk defcnf leak,
Here to fix it by only holding the sk when auth shkey is created
successfully.
PCI: PM: Avoid skipping bus-level PM on platforms without ACPI
There are platforms that do not call pm_set_suspend_via_firmware(),
so pm_suspend_via_firmware() returns 'false' on them, but the power
states of PCI devices (PCIe ports in particular) are changed as a
result of powering down core platform components during system-wide
suspend. Thus the pm_suspend_via_firmware() checks in
pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by
commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-
idle") are not sufficient to determine that devices left in D0
during suspend will remain in D0 during resume and so the bus-level
power management can be skipped for them.
For this reason, introduce a new global suspend flag,
PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only
and replace the pm_suspend_via_firmware() checks mentioned above
with checks against this flag.
Nicolas Dichtel [Mon, 24 Jun 2019 14:01:09 +0000 (16:01 +0200)]
ipv6: fix neighbour resolution with raw socket
The scenario is the following: the user uses a raw socket to send an ipv6
packet, destinated to a not-connected network, and specify a connected nh.
Here is the corresponding python script to reproduce this scenario:
fd00:175::/64 is a connected route and fd00:200::fa is not a connected
host.
With this scenario, the kernel starts by sending a NS to resolve
fd00:175::2. When it receives the NA, it flushes its queue and try to send
the initial packet. But instead of sending it, it sends another NS to
resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
the user sends again the packet, it now uses the right nh (fd00:175::2).
The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
:: because the associated route is a connected route, thus it uses the dst
addr of the packet. Let's use rt6_nexthop() to choose the right nh.
Marek Vasut [Sun, 23 Jun 2019 15:12:57 +0000 (17:12 +0200)]
net: dsa: microchip: Use gpiod_set_value_cansleep()
Replace gpiod_set_value() with gpiod_set_value_cansleep(), as the switch
reset GPIO can be connected to e.g. I2C GPIO expander and it is perfectly
fine for the kernel to sleep for a bit in ksz_switch_register().
Dmitry Bogdanov [Sat, 22 Jun 2019 08:46:37 +0000 (08:46 +0000)]
net: aquantia: fix vlans not working over bridged network
In configuration of vlan over bridge over aquantia device
it was found that vlan tagged traffic is dropped on chip.
The reason is that bridge device enables promisc mode,
but in atlantic chip vlan filters will still apply.
So we have to corellate promisc settings with vlan configuration.
The solution is to track in a separate state variable the
need of vlan forced promisc. And also consider generic
promisc configuration when doing vlan filter config.
Fixes: 7975d2aff5af ("net: aquantia: add support of rx-vlan-filter offload") Signed-off-by: Dmitry Bogdanov <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
ipv4: reset rt_iif for recirculated mcast/bcast out pkts
Multicast or broadcast egress packets have rt_iif set to the oif. These
packets might be recirculated back as input and lookup to the raw
sockets may fail because they are bound to the incoming interface
(skb_iif). If rt_iif is not zero, during the lookup, inet_iif() function
returns rt_iif instead of skb_iif. Hence, the lookup fails.
v2: Make it non vrf specific (David Ahern). Reword the changelog to
reflect it. Signed-off-by: Stephen Suryaputra <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yash Shah [Tue, 25 Jun 2019 09:31:31 +0000 (15:01 +0530)]
riscv: dts: Re-organize the DT nodes
As per the convention for any SOC device with external connection,
define only device DT node in SOC DTSi file with status = "disabled"
and enable device in Board DTS file with status = "okay"