ASoC: tegra: Use ADMAIF component for DMA allocations
DMA memory is currently allocated for the soundcard device, which is a
virtual device added for the sole purpose of "stitching" together the
audio device. It is not a real device and therefore doesn't have a DMA
mask or a description of the path to and from memory of accesses.
Memory accesses really originate from the ADMA controller that provides
the DMA channels used by the PCM component. However, since the DMA
memory is allocated up-front and the DMA channels aren't known at that
point, there is no way of knowing the DMA channel provider at allocation
time.
The next best physical device in the memory path is the ADMAIF. Use it
as the device to allocate DMA memory to. iommus and interconnects device
tree properties can thus be added to the ADMAIF device tree node to
describe the memory access path for audio.
Mark Brown [Mon, 28 Jun 2021 16:47:52 +0000 (17:47 +0100)]
Merge series "ASoC: Intel: machine driver corrections" from Pierre-Louis Bossart <[email protected]>:
The first fix solves an underflow in SoundWire platforms using the
max98373 amplifier, the rest of the patches are minor corrections in
machine drivers.
The fix should be queued for the 5.14 cycle, the rest should be
harmless but can be deferred for 5.15 if it's too late already.
Brent Lu (2):
ASoC: SOF: add a helper to get topology configured bclk
ASoC: Intel: sof_cs42l42: use helper function to get bclk frequency
Gongjun Song (1):
ASoC: Intel: soc-acpi: add support for SoundWire of TGL-H-RVP
Rander Wang (1):
ASoC: Intel: boards: fix xrun issue on platform with max98373
Charles Keepax [Sat, 26 Jun 2021 15:59:40 +0000 (16:59 +0100)]
ASoC: wm_adsp: Add CCM_CORE_RESET to Halo start core
When starting the Halo core it is advised to also write the core reset
bit, this ensures the part starts up in the appropriate state. Omitting
this doesn't cause issues on most parts but cs40l25 requires it and
it is advised on all Halo parts.
Charles Keepax [Sat, 26 Jun 2021 15:59:39 +0000 (16:59 +0100)]
ASoC: wm_adsp: Correct wm_coeff_tlv_get handling
When wm_coeff_tlv_get was updated it was accidentally switched to the _raw
version of the helper, causing it to ignore the current DSP state it
should be checking. Switch the code back to the correct helper so that
users can't read the controls when they aren't available.
Rander Wang [Fri, 25 Jun 2021 20:50:39 +0000 (15:50 -0500)]
ASoC: Intel: boards: fix xrun issue on platform with max98373
On TGL platforms with the max98373 codec, the trigger start sequence is
fe first, then the codec component, and the sdw link last. Recently
a delay was introduced in the max98373 codec driver. As a result,
the start of sdw stream transmission was delayed and the data
transmitted by fw couldn't be consumed by the sdw controller, so an xrun happened.
Adding a delay in the trigger function is a bad idea. This patch enables the spk
pin in the prepare function and disables it in hw_free to avoid the xrun issue
caused by the delay in trigger.
ASoC: qcom: lpass-cpu: mark IRQ_CLEAR register as volatile and readable
Currently the IRQ_CLEAR register is marked as write-only, however using
regmap_update_bits on this register will have some side effects.
So mark the IRQ_CLEAR register appropriately as readable and volatile.
Mark Brown [Wed, 23 Jun 2021 15:31:14 +0000 (16:31 +0100)]
Merge series "ASoC: tlv320aic32x4: Add support for TAS2505" from Claudius Heine <[email protected]>:
Hi,
this is v2 of my patchset that adds support for the TAS2505 to the tlv320aic32x4 driver.
kind regards,
Claudius
Changes from v1:
- clarified commit message of first patch, which adds the type value to the struct
- removed unnecessary code to put and get speaker volume
- removed 'Gain' from 'HP Driver Playback Volume' control
- fixed rebase issues
Claudius Heine (3):
ASoC: tlv320aic32x4: add type to device private data struct
ASoC: tlv320aic32x4: add support for TAS2505
ASoC: tlv320aic32x4: dt-bindings: add TAS2505 to compatible
Mark Brown [Wed, 23 Jun 2021 15:31:13 +0000 (16:31 +0100)]
Merge series "ASoC: tegra: Use devm_platform_get_and_ioremap_resource()" from Yang Yingliang <[email protected]>:
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Yang Yingliang (4):
ASoC: tegra20: i2s: Use devm_platform_get_and_ioremap_resource()
ASoC: tegra20: spdif: Use devm_platform_get_and_ioremap_resource()
ASoC: tegra: tegra210_admaif: Use
devm_platform_get_and_ioremap_resource()
ASoC: tegra30: ahub: Use devm_platform_get_and_ioremap_resource()
Claudius Heine [Thu, 17 Jun 2021 08:52:28 +0000 (10:52 +0200)]
ASoC: tlv320aic32x4: add type to device private data struct
While this driver can already handle different device variants, the
variant information is not available to the driver code, which therefore
cannot take different code paths depending on the device variant.
This change adds a `type` value into the `aic32x4_priv` structure, that
contains a device variant identifier, which was set when the driver was
bound to the device.
Guido Günther [Tue, 22 Jun 2021 08:27:09 +0000 (10:27 +0200)]
ASoC: simple-card: Fill in driver name
alsa-ucm groups by driver name so fill that in as well. Otherwise the
presented information is redundant and doesn't reflect the used
driver. We can't just use 'asoc-simple-card' since the driver name is
restricted to 15 characters.
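The length limit above can be checked with a small userspace sketch. It
assumes the driver name is stored in a 16-byte character array (15
characters plus the terminating NUL); DRIVER_NAME_SIZE and
driver_name_fits() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: 16-byte name buffer, i.e. 15 characters plus the NUL. */
#define DRIVER_NAME_SIZE 16

/* Return true if name fits into the buffer including its terminating NUL. */
bool driver_name_fits(const char *name)
{
	assert(name != NULL);
	return strlen(name) < DRIVER_NAME_SIZE;
}
```

With this check, 'asoc-simple-card' (16 characters) does not fit, while a
shorter name such as 'simple-card' does.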
The Kconfig documentation for SND_SOC_INTEL_SKL_HDA_DSP_GENERIC_MACH
is a bit misleading as it refers to a set of older platforms,
while in practice this machine driver supports all modern
Intel systems with Smart Sound Technology based DSP and HDA codecs.
ASoC: Intel: use MODULE_DEVICE_TABLE with platform_device_id tables
When we have a platform_device_id table, we can use
MODULE_DEVICE_TABLE to automatically generate the modalias. As a
result we can remove the manual insertion of MODULE_ALIAS.
sound/soc/intel/boards/sof_sdw.c:796:31: error: incorrect type in argument 6 (different signedness)
sound/soc/intel/boards/sof_sdw.c:796:31: expected int *group_id
sound/soc/intel/boards/sof_sdw.c:796:31: got unsigned int *
The group_id cannot be negative, use unsigned int.
Kai Vehmanen [Mon, 21 Jun 2021 19:40:49 +0000 (14:40 -0500)]
ASoC: Intel: sof_sdw: remove hdac-hdmi support
Remove support for using hdac_hdmi codec driver. No known products use
this configuration and hdac_hdmi cannot support all the platforms
sof_sdw does.
This change also fixes a bug in Kconfig rules.
SND_SOC_INTEL_SOUNDWIRE_SOF_MACH did not have a select SND_SOC_HDAC_HDMI
and this could cause build failures.
Kai Vehmanen [Mon, 21 Jun 2021 19:40:48 +0000 (14:40 -0500)]
ASoC: Intel: sof_sdw: use mach data for ADL RVP DMIC count
On the reference boards, the number of PCH DMICs may vary and the number
should be taken from driver machine data. Remove the SOF_SDW_PCH_DMIC
quirk to make the DMIC number configurable.
Fixes: d25bbe80485f8 ("ASoC: Intel: sof_sdw: add quirk for new ADL-P Rvp")
Mark Brown [Mon, 21 Jun 2021 18:16:54 +0000 (19:16 +0100)]
Merge series "ASoC: fsl: Use devm_platform_get_and_ioremap_resource()" from Yang Yingliang <[email protected]>:
patch #1 ~ #8:
Use devm_platform_get_and_ioremap_resource()
patch #9
check return value of platform_get_resource_byname()
v2:
change error message in patch #9
Yang Yingliang (9):
ASoC: fsl_asrc: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_aud2htx: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_easrc: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_esai: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_micfil: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_sai: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_spdif: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_ssi: Use devm_platform_get_and_ioremap_resource()
ASoC: fsl_xcvr: check return value after calling
platform_get_resource_byname()
Mark Brown [Mon, 21 Jun 2021 18:16:53 +0000 (19:16 +0100)]
Merge series "ASoC: tidyup snd_soc_of_parse_daifmt()" from Kuninori Morimoto <[email protected]>:
Hi Mark
These are v3 of parsing for daifmt.
I want to add new audio-graph-card2 sound card driver,
and this is last part of necessary soc-core cleanup for it.
Currently, some drivers use DT and rely on
snd_soc_of_parse_daifmt() to parse daifmt, but the bitclock/frame provider
parsing part is a headache, because we have to assume both of the cases below.
    A) node {
           bitclock-master;
           frame-master;
           ...
       };
    B) link {
           bitclock-master = <&xxx>;
           frame-master = <&xxx>;
           ...
       };
The original was style A), and style B) was added later.
snd_soc_of_parse_daifmt() parses it as the original style A),
and the user needs to update the clock_provider part to style B) if needed.
To handle this more flexibly, this patch-set adds new functions
which split up the snd_soc_of_parse_daifmt() helper function.
snd_soc_daifmt_parse_format() : format part
snd_soc_daifmt_parse_clock_provider_as_flag() : clock part for style A)
snd_soc_daifmt_parse_clock_provider_as_phandle() : clock part for style B)
snd_soc_daifmt_parse_clock_provider_as_bitmap() : clock part use with _from_bitmap
v1 -> v2
- tidyup parse_clock_provider functions to _as_flag/phandle/bitmap()
- don't exchange code style on each drivers.
v2 -> v3
- use daifmt as much as possible (don't use daiclk) on each driver.
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Kuninori Morimoto (8):
ASoC: soc-core: add snd_soc_daifmt_clock_provider_from_bitmap()
ASoC: soc-core: add snd_soc_daifmt_clock_provider_fliped()
ASoC: soc-core: add snd_soc_daifmt_parse_format/clock_provider()
ASoC: atmel: switch to use snd_soc_daifmt_parse_format/clock_provider()
ASoC: fsl: switch to use snd_soc_daifmt_parse_format/clock_provider()
ASoC: meson: switch to use snd_soc_daifmt_parse_format/clock_provider()
ASoC: simple-card-utils: switch to use snd_soc_daifmt_parse_format/clock_provider()
ASoC: soc-core: remove snd_soc_of_parse_daifmt()
Mark Brown [Mon, 21 Jun 2021 18:16:52 +0000 (19:16 +0100)]
Merge series "ASoC: sunxi: Use devm_platform_get_and_ioremap_resource()" from Yang Yingliang <[email protected]>:
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Yang Yingliang (3):
ASoC: sunxi: sun4i-codec: Use devm_platform_get_and_ioremap_resource()
ASoC: sun4i-i2s: Use devm_platform_get_and_ioremap_resource()
ASoC: sunxi: sun4i-spdif: Use devm_platform_get_and_ioremap_resource()
Mark Brown [Mon, 21 Jun 2021 18:16:51 +0000 (19:16 +0100)]
Merge series "ASoC: samsung: Use devm_platform_get_and_ioremap_resource()" from Yang Yingliang <[email protected]>:
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Yang Yingliang (4):
ASoC: samsung: i2s: Use devm_platform_get_and_ioremap_resource()
ASoC: samsung: pcm: Use devm_platform_get_and_ioremap_resource()
ASoC: samsung: s3c2412-i2s: Use
devm_platform_get_and_ioremap_resource()
ASoC: samsung: s3c24xx-i2s: Use
devm_platform_get_and_ioremap_resource()
Shuming Fan [Thu, 17 Jun 2021 09:08:22 +0000 (17:08 +0800)]
ASoC: rt711: add two jack detection modes
Some boards use different circuits for jack detection.
This patch adds two modes as below
1. JD2/2 ports/external resistor 100k
2. JD2/1 port/JD voltage 1.8V
snd_soc_of_parse_daifmt() parses daifmt, but the bitclock/frame provider
parsing part is a headache, because we have to assume both of the cases below.
    A) node {
           bitclock-master;
           frame-master;
           ...
       };
    B) link {
           bitclock-master = <&xxx>;
           frame-master = <&xxx>;
           ...
       };
The original was style A), and style B) was added later
by commit b3ca11ff59bc ("ASoC: simple-card: Move dai-link level
properties away from dai subnodes").
snd_soc_of_parse_daifmt() parses it as style A),
and the user needs to update it to style B) if needed.
To handle this more flexibly, this patch adds new functions
which split up the snd_soc_of_parse_daifmt() helper function.
snd_soc_daifmt_parse_format() :for DAI format
snd_soc_daifmt_parse_clock_provider_as_flag() :for style A)
snd_soc_daifmt_parse_clock_provider_as_phandle() :for style B)
snd_soc_daifmt_parse_clock_provider_as_bitmap() :use with _from_bitmap
This patch adds snd_soc_daifmt_clock_provider_from_bitmap() function
to judge clock/frame master from its bitmap.
This is preparation for the snd_soc_of_parse_daifmt() cleanup.
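As an illustration of judging the clock/frame provider from a bitmap, here
is a minimal userspace sketch. The bit layout (bit 4 set when the codec
provides the bitclock, bit 0 set when it provides the frame clock) and the
enum names are assumptions made for this example; they are stand-ins, not
the kernel's SND_SOC_DAIFMT_* definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the four codec clock-provider combinations. */
enum clock_provider {
	CODEC_BCLK_PROVIDER_FRAME_PROVIDER,
	CODEC_BCLK_PROVIDER_FRAME_CONSUMER,
	CODEC_BCLK_CONSUMER_FRAME_PROVIDER,
	CODEC_BCLK_CONSUMER_FRAME_CONSUMER,
};

/*
 * Assumed layout of bit_frame:
 *   bit 4: codec provides the bitclock
 *   bit 0: codec provides the frame clock
 */
enum clock_provider clock_provider_from_bitmap(unsigned int bit_frame)
{
	switch (bit_frame) {
	case 0x11:
		return CODEC_BCLK_PROVIDER_FRAME_PROVIDER;
	case 0x10:
		return CODEC_BCLK_PROVIDER_FRAME_CONSUMER;
	case 0x01:
		return CODEC_BCLK_CONSUMER_FRAME_PROVIDER;
	default:
		return CODEC_BCLK_CONSUMER_FRAME_CONSUMER;
	}
}
```

Any value other than the three listed cases falls back to the codec being
consumer of both clocks in this sketch.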
The snd_soc_dai_ops structure is only stored in the ops field of a
snd_soc_dai_driver structure, so make the snd_soc_dai_ops structure
const to allow the compiler to put it in read-only memory.
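The pattern can be sketched in plain C as follows. struct dai_ops and
struct dai_driver are simplified, hypothetical stand-ins for
snd_soc_dai_ops and snd_soc_dai_driver; the point is only that an ops
table which is never written after initialization can be declared const.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for snd_soc_dai_ops. */
struct dai_ops {
	int (*startup)(int stream);
};

/* Hypothetical, simplified stand-in for snd_soc_dai_driver. */
struct dai_driver {
	const char *name;
	const struct dai_ops *ops;	/* only ever read through this pointer */
};

int demo_startup(int stream)
{
	return stream + 1;
}

/* const: never modified, so the compiler may place it in read-only data. */
const struct dai_ops demo_dai_ops = {
	.startup = demo_startup,
};

const struct dai_driver demo_dai = {
	.name = "demo-dai",
	.ops = &demo_dai_ops,
};

/* Call through the const ops table, as a core layer would. */
int call_startup(const struct dai_driver *drv, int stream)
{
	assert(drv && drv->ops && drv->ops->startup);
	return drv->ops->startup(stream);
}
```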
Mark Brown [Thu, 17 Jun 2021 14:27:24 +0000 (15:27 +0100)]
Merge series "ASoC: stm32: Use devm_platform_get_and_ioremap_resource()" from Yang Yingliang <[email protected]>:
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Yang Yingliang (3):
ASoC: stm32: i2s: Use devm_platform_get_and_ioremap_resource()
ASoC: stm32: sai: Use devm_platform_get_and_ioremap_resource()
ASoC: stm32: spdifrx: Use devm_platform_get_and_ioremap_resource()
Linus Torvalds [Wed, 16 Jun 2021 16:40:28 +0000 (09:40 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"18 patches.
Subsystems affected by this patch series: mm (memory-failure, swap,
slub, hugetlb, memory-failure, slub, thp, sparsemem), and coredump"
* emailed patches from Andrew Morton <[email protected]>:
mm/sparse: fix check_usemap_section_nr warnings
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
mm/thp: fix page_address_in_vma() on file THP tails
mm/thp: fix vma_address() if virtual address below file offset
mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
mm/thp: make is_huge_zero_pmd() safe and quicker
mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
mm, thp: use head page in __migration_entry_wait()
mm/slub.c: include swab.h
crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo
mm/memory-failure: make sure wait for page writeback in memory_failure
mm/hugetlb: expand restore_reserve_on_error functionality
mm/slub: actually fix freelist pointer vs redzoning
mm/slub: fix redzoning for small allocations
mm/slub: clarify verification reporting
mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare
mm,hwpoison: fix race with hugetlb page allocation
If CONFIG_DEBUG_VIRTUAL is not enabled, __pa() can handle both
dynamically allocated linear addresses and symbol addresses. However,
if (CONFIG_DEBUG_VIRTUAL=y && CONFIG_NEED_MULTIPLE_NODES=n) we can see
the "virt_to_phys used for non-linear address" warning because
&contig_page_data is not a linear address on arm64.
Warning message:
virt_to_phys used for non-linear address: (contig_page_data+0x0/0x1c00)
WARNING: CPU: 0 PID: 0 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x58/0x68
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.13.0-rc1-00074-g1140ab592e2e #3
Hardware name: linux,dummy-virt (DT)
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
Call trace:
__virt_to_phys+0x58/0x68
check_usemap_section_nr+0x50/0xfc
sparse_init_nid+0x1ac/0x28c
sparse_init+0x1c4/0x1e0
bootmem_init+0x60/0x90
setup_arch+0x184/0x1f0
start_kernel+0x78/0x488
To fix it, create a small function to handle both translations.
Yang Shi [Wed, 16 Jun 2021 01:24:07 +0000 (18:24 -0700)]
mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
When debugging the bug reported by Wang Yugui [1], try_to_unmap() may
fail. The first VM_BUG_ON_PAGE() only checks page_mapcount(), so
it may miss the failure when the head page is unmapped but another subpage is
still mapped. The second DEBUG_VM BUG(), which checks the total mapcount, would
then catch it. This may cause some confusion.
As this is not a fatal issue, consolidate the two DEBUG_VM checks
into one VM_WARN_ON_ONCE_PAGE().
Hugh Dickins [Wed, 16 Jun 2021 01:24:03 +0000 (18:24 -0700)]
mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted. This commit fixes that, but not
in the way I hoped.
The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but there apply it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.
However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap. Set that aside.
The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end. Nice. But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage. It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too). Set that aside.
Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.
This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details. Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case. Not pretty, but
safe.
Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not. Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.
Jue Wang [Wed, 16 Jun 2021 01:24:00 +0000 (18:24 -0700)]
mm/thp: fix page_address_in_vma() on file THP tails
Anon THP tails were already supported, but memory-failure may need to
use page_address_in_vma() on file THP tails, which its page->mapping
check did not permit: fix it.
hughd adds: no current usage is known to hit the issue, but this does
fix a subtle trap in a general helper: best fixed in stable sooner than
later.
Hugh Dickins [Wed, 16 Jun 2021 01:23:56 +0000 (18:23 -0700)]
mm/thp: fix vma_address() if virtual address below file offset
Running certain tests with a DEBUG_VM kernel would crash within hours,
on the total_mapcount BUG() in split_huge_page_to_list(), while trying
to free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which,
on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).
When that BUG() was changed to a WARN(), it would later crash on the
VM_BUG_ON_VMA(end < vma->vm_start || start >= vma->vm_end, vma) in
mm/internal.h:vma_address(), used by rmap_walk_file() for
try_to_unmap().
vma_address() is usually correct, but there's a wraparound case when the
vm_start address is unusually low, but vm_pgoff not so low:
vma_address() chooses max(start, vma->vm_start), but that decides on the
wrong address, because start has become almost ULONG_MAX.
Rewrite vma_address() to be more careful about vm_pgoff; move the
VM_BUG_ON_VMA() out of it, returning -EFAULT for errors, so that it can
be safely used from page_mapped_in_vma() and page_address_in_vma() too.
Add vma_address_end() to apply similar care to end address calculation,
in page_vma_mapped_walk() and page_mkclean_one() and try_to_unmap_one();
though it raises a question of whether callers would do better to supply
pvmw->end to page_vma_mapped_walk() - I chose not, for a smaller patch.
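The wraparound and the more careful calculation can be reproduced with a
small userspace model. This is a sketch under stated assumptions, not the
kernel code: struct vma_model, vma_address_naive() and
vma_address_careful() are hypothetical names, a 4K page size is assumed,
and the page is described only by its file offset (pgoff) and number of
subpages.

```c
#include <assert.h>
#include <errno.h>

#define PAGE_SHIFT 12UL	/* assumption: 4K pages */

/* Hypothetical, reduced stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
};

/* Old-style calculation: wraps around when pgoff < vm_pgoff. */
unsigned long vma_address_naive(const struct vma_model *vma,
				unsigned long pgoff)
{
	unsigned long start = vma->vm_start +
			      ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);

	/* max(start, vm_start) picks the wrapped, almost-ULONG_MAX value */
	return start > vma->vm_start ? start : vma->vm_start;
}

/* Careful calculation: returns -EFAULT if the page is not mapped here. */
unsigned long vma_address_careful(const struct vma_model *vma,
				  unsigned long pgoff, unsigned long nr_pages)
{
	unsigned long address;

	assert(nr_pages >= 1);
	if (pgoff >= vma->vm_pgoff) {
		address = vma->vm_start +
			  ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
		/* Check for address beyond vma (or wrapped through 0) */
		if (address < vma->vm_start || address >= vma->vm_end)
			address = (unsigned long)-EFAULT;
	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
		/* Compound page starts before the vma but overlaps it */
		address = vma->vm_start;
	} else {
		address = (unsigned long)-EFAULT;
	}
	return address;
}
```

For a vma with vm_start 0x1000 and vm_pgoff 0x100, a 512-subpage head page
at pgoff 0 makes the naive version return an address far above vm_end,
while the careful version returns vm_start.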
An irritation is that their apparent generality breaks down on KSM
pages, which cannot be located by the page->index that page_to_pgoff()
uses: as commit 4b0ece6fa016 ("mm: migrate: fix remove_migration_pte()
for ksm pages") once discovered. I dithered over the best thing to do
about that, and have ended up with a VM_BUG_ON_PAGE(PageKsm) in both
vma_address() and vma_address_end(); though the only place in danger of
using it on them was try_to_unmap_one().
Sidenote: vma_address() and vma_address_end() now use compound_nr() on a
head page, instead of thp_size(): to make the right calculation on a
hugetlbfs page, whether or not THPs are configured. try_to_unmap() is
used on hugetlbfs pages, but perhaps the wrong calculation never
mattered.
Hugh Dickins [Wed, 16 Jun 2021 01:23:53 +0000 (18:23 -0700)]
mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
(!unmap_success): with dump_page() showing mapcount:1, but then its raw
struct page output showing _mapcount ffffffff i.e. mapcount 0.
And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
all indicative of some mapcount difficulty in development here perhaps.
But the !CONFIG_DEBUG_VM path handles the failures correctly and
silently.
I believe the problem is that once a racing unmap has cleared pte or
pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
from try_to_unmap() before the racing task has reached decrementing
mapcount.
Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
TTU_SYNC to the options, and passing that from unmap_page().
When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
for both: the slight overhead added should rarely matter, except perhaps
if splitting sparsely-populated multiply-mapped shmem. Once confident
that bugs are fixed, TTU_SYNC here can be removed, and the race
tolerated.
Hugh Dickins [Wed, 16 Jun 2021 01:23:49 +0000 (18:23 -0700)]
mm/thp: make is_huge_zero_pmd() safe and quicker
Most callers of is_huge_zero_pmd() supply a pmd already verified
present; but a few (notably zap_huge_pmd()) do not - it might be a pmd
migration entry, in which the pfn is encoded differently from a present
pmd: which might pass the is_huge_zero_pmd() test (though not on x86,
since L1TF forced us to protect against that); or perhaps even crash in
pmd_page() applied to a swap-like entry.
Make it safe by adding pmd_present() check into is_huge_zero_pmd()
itself; and make it quicker by saving huge_zero_pfn, so that
is_huge_zero_pmd() will not need to do that pmd_page() lookup each time.
__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() checks present.
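A toy model of the check described above, assuming a pmd can be reduced
to a "present" flag plus the pfn its bits would decode to. struct
pmd_model, huge_zero_pfn_model and is_huge_zero_pmd_model() are
hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a pmd reduced to the two properties that matter. */
struct pmd_model {
	bool present;		/* false for e.g. a pmd migration entry */
	unsigned long pfn;	/* pfn the raw bits would decode to */
};

/* Saved once when the huge zero page is set up; ~0UL means "not set". */
unsigned long huge_zero_pfn_model = ~0UL;

/* Compare against the saved pfn, and only trust it for present pmds. */
bool is_huge_zero_pmd_model(struct pmd_model pmd)
{
	return pmd.present && pmd.pfn == huge_zero_pfn_model;
}
```

A non-present entry whose bits happen to decode to the saved pfn is
rejected by the present check, and no per-call page lookup is needed.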
Hugh Dickins [Wed, 16 Jun 2021 01:23:45 +0000 (18:23 -0700)]
mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
Patch series "mm/thp: fix THP splitting unmap BUGs and related", v10.
Here is v2 batch of long-standing THP bug fixes that I had not got
around to sending before, but prompted now by Wang Yugui's report
https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/
Wang Yugui has tested a rollup of these fixes applied to 5.10.39, and
they have done no harm, but have *not* fixed that issue: something more
is needed and I have no idea of what.
This patch (of 7):
Stressing huge tmpfs page migration racing hole punch often crashed on
the VM_BUG_ON(!pmd_present) in pmdp_huge_clear_flush(), with DEBUG_VM=y
kernel; or shortly afterwards, on a bad dereference in
__split_huge_pmd_locked() when DEBUG_VM=n. They forgot to allow for pmd
migration entries in the non-anonymous case.
Full disclosure: those particular experiments were on a kernel with more
relaxed mmap_lock and i_mmap_rwsem locking, and were not repeated on the
vanilla kernel: it is conceivable that stricter locking happens to avoid
those cases, or makes them less likely; but __split_huge_pmd_locked()
already allowed for pmd migration entries when handling anonymous THPs,
so this commit brings the shmem and file THP handling into line.
And while there: use old_pmd rather than _pmd, as in the following
blocks; and make it clearer to the eye that the !vma_is_anonymous()
block is self-contained, making an early return after accounting for
unmapping.
Xu Yu [Wed, 16 Jun 2021 01:23:42 +0000 (18:23 -0700)]
mm, thp: use head page in __migration_entry_wait()
We notice that a hung task happens in a corner-case but practical scenario when
CONFIG_PREEMPT_NONE is enabled, as follows.
Process 0                       Process 1                       Process 2..Inf
split_huge_page_to_list
    unmap_page
        split_huge_pmd_address
                                __migration_entry_wait(head)
                                                                __migration_entry_wait(tail)
    remap_page (roll back)
        remove_migration_ptes
            rmap_walk_anon
                cond_resched
Where __migration_entry_wait(tail) occurs in kernel space, e.g.,
copy_to_user in fstat, which will immediately fault again without
rescheduling, and thus occupy the cpu fully.
When there are too many processes performing __migration_entry_wait on
tail page, remap_page will never be done after cond_resched.
This makes __migration_entry_wait operate on the compound head page,
so that it waits for remap_page to complete, whether the THP is split
successfully or rolled back.
Note that put_and_wait_on_page_locked helps to drop the page reference
acquired with get_page_unless_zero, as soon as the page is on the wait
queue, before actually waiting. So splitting the THP is only prevented
for a brief interval.
Besides SECTIONS_SHIFT, SECTION_SIZE_BITS is also used to calculate
PAGES_PER_SECTION in makedumpfile just like kernel.
Unfortunately, this arch-dependent macro SECTION_SIZE_BITS changes, e.g.
recently in kernel commit f0b13ee23241 ("arm64/sparsemem: reduce
SECTION_SIZE_BITS"). But user space wants a stable interface to get
this info. Such info cannot be deduced from a crashdump
vmcore. Hence append SECTION_SIZE_BITS to vmcoreinfo.
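The dependency of PAGES_PER_SECTION on SECTION_SIZE_BITS is simple
arithmetic, sketched below. pages_per_section() is a hypothetical helper;
the values 30, 27 and a PAGE_SHIFT of 12 used in the note that follows are
illustrative inputs, not values taken from this commit.

```c
#include <assert.h>

/* PAGES_PER_SECTION = 1 << (SECTION_SIZE_BITS - PAGE_SHIFT) */
unsigned long pages_per_section(unsigned int section_size_bits,
				unsigned int page_shift)
{
	assert(section_size_bits >= page_shift);
	return 1UL << (section_size_bits - page_shift);
}
```

For example, with 4K pages a change of SECTION_SIZE_BITS from 30 to 27
changes the result from 262144 to 32768 pages, which is why a tool that
hard-codes the value breaks when the macro changes.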
A crash dump of this problem shows that someone called __munlock_pagevec
to clear page LRU without lock_page: do_mmap -> mmap_region -> do_munmap
-> munlock_vma_pages_range -> __munlock_pagevec.
As a result memory_failure will call identify_page_state without
wait_on_page_writeback. Then, after truncate_error_page clears the mapping
of this page, end_page_writeback won't call sb_clear_inode_writeback to
clear inode->i_wb_list. That will trigger the BUG_ON in clear_inode!
Fix it by checking PageWriteback too, to help determine whether we should skip
wait_on_page_writeback.
The routine restore_reserve_on_error is called to restore reservation
information when an error occurs after page allocation. The routine
alloc_huge_page modifies the mapping reserve map and potentially the
reserve count during allocation. If code calling alloc_huge_page
encounters an error after allocation and needs to free the page, the
reservation information needs to be adjusted.
Currently, restore_reserve_on_error only takes action on pages for which
the reserve count was adjusted (HPageRestoreReserve flag). There is
nothing wrong with these adjustments. However, alloc_huge_page ALWAYS
modifies the reserve map during allocation even if the reserve count is
not adjusted. This can cause issues as observed during development of
this patch [1].
One specific series of operations causing an issue is:
- Create a shared hugetlb mapping
  Reservations for all pages created by default
- Fault in a page in the mapping
  Reservation exists so reservation count is decremented
- Punch a hole in the file/mapping at index previously faulted
  Reservation and any associated pages will be removed
- Allocate a page to fill the hole
  No reservation entry, so reserve count unmodified
  Reservation entry added to map by alloc_huge_page
- Error after allocation and before instantiating the page
  Reservation entry remains in map
- Allocate a page to fill the hole
  Reservation entry exists, so decrement reservation count
This will cause a reservation count underflow as the reservation count
was decremented twice for the same index.
A user would observe a very large number for HugePages_Rsvd in
/proc/meminfo. This would also likely cause subsequent allocations of
hugetlb pages to fail as it would 'appear' that all pages are reserved.
This sequence of operations is unlikely to happen, however it was
easily reproduced and observed using hacked up code as described in [1].
Address the issue by having the routine restore_reserve_on_error take
action on pages where HPageRestoreReserve is not set. In this case, we
need to remove any reserve map entry created by alloc_huge_page. A new
helper routine vma_del_reservation assists with this operation.
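The sequence of operations and the effect of the fix can be followed with
a toy model that tracks a single file index. It is a sketch under stated
assumptions: struct resv_model and the model_*() functions are
hypothetical, the reserve map is reduced to one boolean entry, and the
reserve count is an unsigned counter as shown in HugePages_Rsvd.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the reservation state for one file index. */
struct resv_model {
	unsigned long resv_count;	/* models the global reserve count */
	bool map_entry;			/* reserve map entry for this index */
};

/*
 * Models alloc_huge_page(): consume a reservation if the map has an
 * entry, and always leave an entry in the map afterwards.
 * Returns true if a reservation was consumed.
 */
bool model_alloc(struct resv_model *m)
{
	bool had_reservation = m->map_entry;

	if (had_reservation)
		m->resv_count--;	/* unsigned: wraps if already 0 */
	m->map_entry = true;
	return had_reservation;
}

/* Models hole punch: the reservation entry is removed. */
void model_punch_hole(struct resv_model *m)
{
	m->map_entry = false;
}

/* Models the error path with the fix applied. */
void model_restore_on_error(struct resv_model *m, bool had_reservation)
{
	if (had_reservation)
		m->resv_count++;	/* give the reservation back */
	else
		m->map_entry = false;	/* drop the entry alloc added */
}
```

Starting from one reservation, running fault, hole punch, allocate, error
and allocate again without calling model_restore_on_error() leaves
resv_count at the maximum unsigned value; with the call it stays at 0.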
There are three callers of alloc_huge_page which do not currently call
restore_reserve_on_error before freeing a page on error paths. Add
those missing calls.
Kees Cook [Wed, 16 Jun 2021 01:23:26 +0000 (18:23 -0700)]
mm/slub: actually fix freelist pointer vs redzoning
It turns out that SLUB redzoning ("slub_debug=Z") checks from
s->object_size rather than from s->inuse (which is normally bumped to
make room for the freelist pointer), so a cache created with an object
size less than 24 would have the freelist pointer written beyond
s->object_size, causing the redzone to be corrupted by the freelist
pointer. This was very visible with "slub_debug=ZF":
BUG test (Tainted: G B ): Right Redzone overwritten
-----------------------------------------------------------------------------
Kees Cook [Wed, 16 Jun 2021 01:23:22 +0000 (18:23 -0700)]
mm/slub: fix redzoning for small allocations
The redzone area for SLUB exists between s->object_size and s->inuse
(which is at least the word-aligned object_size). If a cache were
created with an object_size smaller than sizeof(void *), the in-object
stored freelist pointer would overwrite the redzone (e.g. with boot
param "slub_debug=ZF"):
BUG test (Tainted: G B ): Right Redzone overwritten
-----------------------------------------------------------------------------
Store the freelist pointer out of line when object_size is smaller than
sizeof(void *) and redzoning is enabled.
Additionally remove the "smaller than sizeof(void *)" check under
CONFIG_DEBUG_VM in kmem_cache_sanity_check() as it is now redundant:
SLAB and SLOB both handle small sizes.
(Note that no caches within this size range are known to exist in the
kernel currently.)
Kees Cook [Wed, 16 Jun 2021 01:23:19 +0000 (18:23 -0700)]
mm/slub: clarify verification reporting
Patch series "Actually fix freelist pointer vs redzoning", v4.
This fixes redzoning vs the freelist pointer (both for middle-position
and very small caches). Both are "theoretical" fixes, in that I see no
evidence of such small-sized caches actually being used in the kernel, but
that's no reason to let the bugs continue to exist, especially since
people doing local development keep tripping over it. :)
This patch (of 3):
Instead of repeating "Redzone" and "Poison", clarify which sides of
those zones got tripped. Additionally fix column alignment in the
trailer.
BUG test (Tainted: G B ): Right Redzone overwritten
...
Redzone (____ptrval____): bb bb bb bb bb bb bb bb ........
Object (____ptrval____): f6 f4 a5 40 1d e8 ...@..
Redzone (____ptrval____): 1a aa ..
Padding (____ptrval____): 00 00 00 00 00 00 00 00 ........
The earlier commits that slowly resulted in the "Before" reporting were:
d86bd1bece6f ("mm/slub: support left redzone")
ffc79d288000 ("slub: use print_hex_dump")
2492268472e7 ("SLUB: change error reporting format to follow lockdep loosely")
Peter Xu [Wed, 16 Jun 2021 01:23:16 +0000 (18:23 -0700)]
mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare
I found, by pure code review, that pte_same_as_swp() of unuse_vma()
didn't take the uffd-wp bit into account when comparing ptes.
pte_same_as_swp() returning a false negative could cause failure to
swapoff swap ptes that were wr-protected by userfaultfd.
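The false negative can be shown with a minimal model that treats a swap
pte as a 64-bit value carrying one software flag. The bit position of
SWP_UFFD_WP_MODEL and both function names are hypothetical; real bit
positions are architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position for the uffd-wp flag in a swap pte. */
#define SWP_UFFD_WP_MODEL (UINT64_C(1) << 2)

/* Comparison that does not account for the flag: false negative. */
bool pte_same_strict(uint64_t pte, uint64_t swp_pte)
{
	return pte == swp_pte;
}

/* Comparison that clears the flag from the pte before comparing. */
bool pte_same_ignoring_uffd_wp(uint64_t pte, uint64_t swp_pte)
{
	return (pte & ~SWP_UFFD_WP_MODEL) == swp_pte;
}
```

A wr-protected swap pte differs from the plain swap entry only in the flag
bit, so only the second comparison recognizes it as the same entry.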
__get_hwpoison_page() only checks the page refcount before taking an
additional one for memory error handling, which is not enough because
there's a time window where compound pages have non-zero refcount during
hugetlb page initialization.
So make __get_hwpoison_page() check page status a bit more for hugetlb
pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
under hugetlb_lock makes sure that the hugetlb page is not in a transitional state.
It's notable that another new function, HWPoisonHandlable(), is helpful
to prevent a race against other transitional page states (like a generic
compound page just before PageHuge becomes true).
Linus Torvalds [Wed, 16 Jun 2021 16:03:52 +0000 (09:03 -0700)]
Merge tag 'dmaengine-fix-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"A bunch of driver fixes, notably:
- More idxd fixes for driver unregister, error handling and bus
assignment
- HAS_IOMEM depends fix for few drivers
- lock fix in pl330 driver
- xilinx drivers fixes for initialize registers, missing dependencies
and limiting descriptor IDs
- mediatek descriptor management fixes"
* tag 'dmaengine-fix-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: mediatek: use GFP_NOWAIT instead of GFP_ATOMIC in prep_dma
dmaengine: mediatek: do not issue a new desc if one is still current
dmaengine: mediatek: free the proper desc in desc_free handler
dmaengine: ipu: fix doc warning in ipu_irq.c
dmaengine: rcar-dmac: Fix PM reference leak in rcar_dmac_probe()
dmaengine: idxd: Fix missing error code in idxd_cdev_open()
dmaengine: stedma40: add missing iounmap() on error in d40_probe()
dmaengine: SF_PDMA depends on HAS_IOMEM
dmaengine: QCOM_HIDMA_MGMT depends on HAS_IOMEM
dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM
dmaengine: idxd: Add missing cleanup for early error out in probe call
dmaengine: xilinx: dpdma: Limit descriptor IDs to 16 bits
dmaengine: xilinx: dpdma: Add missing dependencies to Kconfig
dmaengine: stm32-mdma: fix PM reference leak in stm32_mdma_alloc_chan_resourc()
dmaengine: zynqmp_dma: Fix PM reference leak in zynqmp_dma_alloc_chan_resourc()
dmaengine: xilinx: dpdma: initialize registers before request_irq
dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc
dmaengine: fsl-dpaa2-qdma: Fix error return code in two functions
dmaengine: idxd: add missing dsa driver unregister
dmaengine: idxd: add engine 'struct device' missing bus type assignment
Linus Torvalds [Wed, 16 Jun 2021 15:57:44 +0000 (08:57 -0700)]
Merge tag 'clang-features-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO fix from Kees Cook:
"It seems Clang has been scrubbing through the missing LTO IR flags for
Clang 13, and the last of these 'only with LTO' flags is fixed now.
I've asked that they please consider making these changes in a less
'break all the Clang kernel builds' kind of way in the future. :P
Summary:
- The '-warn-stack-size' option under LTO has moved in Clang 13 (Tor
Vic)"
* tag 'clang-features-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
Makefile: lto: Pass -warn-stack-size only on LLD < 13.0.0