Yury Norov [Tue, 10 May 2022 01:54:23 +0000 (18:54 -0700)]
KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
kvm_hv_flush_tlb() applies bitmap API to a u64 variable valid_bank_mask.
Since valid_bank_mask has a fixed size, we can use hweight64() and avoid
excessive bloating.
Yury Norov [Thu, 19 May 2022 17:15:04 +0000 (10:15 -0700)]
KVM: x86: hyper-v: fix type of valid_bank_mask
In kvm_hv_flush_tlb(), valid_bank_mask is declared as unsigned long,
but is used as u64, which is wrong for i386, and has been spotted by
LKP after applying "KVM: x86: hyper-v: replace bitmap_weight() with
hweight64()"
drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
The smu_v1X_0_set_allowed_mask() uses bitmap_copy() to convert
bitmap to 32-bit array. This may be wrong due to endiannes issues.
Fix it by switching to bitmap_{from,to}_arr32.
Manipulating 64-bit arrays with bitmap functions is potentially dangerous
because on 32-bit BE machines the order of halfwords doesn't match.
Another issue is that compiler may throw a warning about out-of-boundary
access.
This patch adds bitmap_{from,to}_arr64 functions in addition to existing
bitmap_{from,to}_arr32.
lib/bitmap: extend comment for bitmap_(from,to)_arr32()
On LE systems bitmaps are naturally ordered, therefore we can potentially
use bitmap_copy routines when converting from 32-bit arrays, even if host
system is 64-bit. But it may lead to out-of-bond access due to unsafe
typecast, and the bitmap_(from,to)_arr32 comment doesn't explain that
clearly
The order of the arguments in function documentation doesn't fit the
implementation. Change the documentation so that it corresponds to the
code. This prevent people to get confused when reading the documentation.
Yury Norov [Sun, 23 Jan 2022 18:39:25 +0000 (10:39 -0800)]
MAINTAINERS: add cpumask and nodemask files to BITMAP_API
cpumask and nodemask APIs are thin wrappers around basic bitmap API, and
corresponding files are not formally maintained. This patch adds them to
BITMAP_API section, so that bitmap folks would have closer look at it.
Yury Norov [Sun, 23 Jan 2022 18:38:57 +0000 (10:38 -0800)]
arch/x86: replace nodes_weight with nodes_empty where appropriate
mm code calls nodes_weight() to check if any bit of a given nodemask is
set. We can do it more efficiently with nodes_empty() because nodes_empty()
stops traversing the nodemask as soon as it finds first set bit, while
nodes_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:56 +0000 (10:38 -0800)]
mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
mm/vmstat.c code calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:55 +0000 (10:38 -0800)]
clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
clocksource_verify_percpu() calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:51 +0000 (10:38 -0800)]
genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds first set bit, while cpumask_weight() counts all bits
unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:50 +0000 (10:38 -0800)]
irq: mips: replace cpumask_weight with cpumask_empty where appropriate
bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:48 +0000 (10:38 -0800)]
drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:46 +0000 (10:38 -0800)]
arch/x86: replace cpumask_weight with cpumask_empty where appropriate
In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:45 +0000 (10:38 -0800)]
arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
setup_arch() calls cpumask_weight() to check if any bit of a given cpumask
is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Yury Norov [Sun, 23 Jan 2022 18:38:44 +0000 (10:38 -0800)]
arch/alpha: replace cpumask_weight with cpumask_empty where appropriate
common_shutdown_1() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Huacai Chen [Tue, 31 May 2022 10:04:12 +0000 (18:04 +0800)]
LoongArch: Add Non-Uniform Memory Access (NUMA) support
Add Non-Uniform Memory Access (NUMA) support for LoongArch. LoongArch
has 48-bit physical address, but the HyperTransport I/O bus only support
40-bit address, so we need a custom phys_to_dma() and dma_to_phys() to
extract the 4-bit node id (bit 44~47) from Loongson-3's 48-bit physical
address space and embed it into 40-bit. In the 40-bit dma address, node
id offset can be read from the LS7A_DMA_CFG register.
Huacai Chen [Tue, 31 May 2022 10:04:11 +0000 (18:04 +0800)]
LoongArch: Add misc common routines
Add some misc common routines for LoongArch, including: asm-offsets
routines, futex functions, i/o memory access functions, frame-buffer
functions, procfs information display, etc.
Huacai Chen [Tue, 31 May 2022 10:04:11 +0000 (18:04 +0800)]
LoongArch: Add system call support
Add system call support and related uaccess.h for LoongArch.
Q: Why keep _ARCH_WANT_SYS_CLONE definition while there is clone3:
A: The latest glibc release has some basic support for clone3 but it is
not complete. E.g., pthread_create() and spawni() have converted to
use clone3 but fork() will still use clone. Moreover, some seccomp
related applications can still not work perfectly with clone3. E.g.,
Chromium sandbox cannot work at all and there is no solution for it,
which is more terrible than the fork() story [1].
Huacai Chen [Tue, 31 May 2022 10:04:11 +0000 (18:04 +0800)]
LoongArch: Add boot and setup routines
Add basic boot, setup and reset routines for LoongArch. Now, LoongArch
machines use UEFI-based firmware. The firmware passes configuration
information to the kernel via ACPI and DMI/SMBIOS.
Currently an existing interface between the kernel and the bootloader
is implemented. Kernel gets 2 values from the bootloader, passed in
registers a0 and a1; a0 is an "EFI boot flag" distinguishing UEFI and
non-UEFI firmware, while a1 is a pointer to an FDT with systable,
memmap, cmdline and initrd information.
The standard UEFI boot protocol (EFISTUB) will be added later.
Huacai Chen [Tue, 31 May 2022 10:04:10 +0000 (18:04 +0800)]
LoongArch: Add writecombine support for drm
LoongArch maintains cache coherency in hardware, but its WUC attribute
(Weak-ordered UnCached, which is similar to WC) is out of the scope of
cache coherency machanism. This means WUC can only used for write-only
memory regions.
Add some basic documentation (zh_CN version) for LoongArch. LoongArch is
a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch includes a
reduced 32-bit version (LA32R), a standard 32-bit version (LA32S) and a
64-bit version (LA64).
Add some basic documentation for LoongArch. LoongArch is a new RISC ISA,
which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit
version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version
(LA64).
Huacai Chen [Tue, 31 May 2022 10:04:10 +0000 (18:04 +0800)]
irqchip: Adjust Kconfig for Loongson
HTVEC will be shared by both MIPS-based and LoongArch-based Loongson
processors (not only Loongson-3), so we adjust its description. HTPIC is
only used by MIPS-based Loongson, so we add a MIPS dependency.
Mikulas Patocka [Wed, 1 Jun 2022 17:18:22 +0000 (13:18 -0400)]
parisc: fix a crash with multicore scheduler
With the kernel 5.18, the system will hang on boot if it is compiled with
CONFIG_SCHED_MC. The last printed message is "Brought up 1 node, 1 CPU".
The crash happens in sd_init
tl->mask (which is cpu_coregroup_mask) returns an empty mask. This happens
because cpu_topology[0].core_sibling is empty.
Consequently, sd_span is set to an empty mask
sd_id = cpumask_first(sd_span) sets sd_id == NR_CPUS (because the mask is
empty)
sd->shared = *per_cpu_ptr(sdd->sds, sd_id); sets sd->shared to NULL
because sd_id is out of range
atomic_inc(&sd->shared->ref); crashes without printing anything
We can fix it by calling reset_cpu_topology() from init_cpu_topology() -
this will initialize the sibling masks on CPUs, so that they're not empty.
This patch also removes the variable "dualcores_found", it is useless,
because during boot, init_cpu_topology is called before
store_cpu_topology. Thus, set_sched_topology(parisc_mc_topology) is never
called. We don't need to call it at all because default_topology in
kernel/sched/topology.c contains the same items as parisc_mc_topology.
Note that we should not call store_cpu_topology() from init_per_cpu()
because it is called too early in the kernel initialization process and it
results in the message "Failure to register CPU0 device". Before this
patch, store_cpu_topology() would exit immediatelly because
cpuid_topo->core id was uninitialized and it was 0.
Damien Le Moal [Fri, 3 Jun 2022 02:19:05 +0000 (11:19 +0900)]
block: Fix potential deadlock in blk_ia_range_sysfs_show()
When being read, a sysfs attribute is already protected against removal
with the kobject node active reference counter. As a result, in
blk_ia_range_sysfs_show(), there is no need to take the queue sysfs
lock when reading the value of a range attribute. Using the queue sysfs
lock in this function creates a potential deadlock situation with the
disk removal, something that a lockdep signals with a splat when the
device is removed:
Dave Airlie [Fri, 3 Jun 2022 01:35:46 +0000 (11:35 +1000)]
Merge tag 'drm/tegra/for-5.19-prep-work' of https://gitlab.freedesktop.org/drm/tegra into drm-next
drm/tegra: Preparatory work for v5.19
This contains a single patch from a series that's ready to go for v5.10
but is also a shared build-time dependency for an IOMMU series that is
planned for v5.20. The idea is to take this into v5.19 to fulfill that
dependency and remove the need for close coordination for the two
series.
Dave Airlie [Fri, 3 Jun 2022 01:19:11 +0000 (11:19 +1000)]
Merge tag 'msm-next-5.19-fixes-06-01' of https://gitlab.freedesktop.org/abhinavk/msm into drm-next
5.19 fixes for msm-next
- Fix to add minimum ICC vote in the msm_mdss pm_resume path to address
bootup splats
- Fix to avoid dereferencing without checking in WB encoder
- Fix to avoid crash during suspend in DP driver by ensuring interrupt
mask bits are updated
- Remove unused code from dpu_encoder_virt_atomic_check()
- Fix to remove redundant init of dsc variable
riscv: Move alternative length validation into subsection
After commit 49b290e430d3 ("riscv: prevent compressed instructions in
alternatives"), builds with LLVM's integrated assembler fail:
In file included from arch/riscv/mm/init.c:10:
In file included from ./include/linux/mm.h:29:
In file included from ./include/linux/pgtable.h:6:
In file included from ./arch/riscv/include/asm/pgtable.h:108:
./arch/riscv/include/asm/tlbflush.h:23:2: error: expected assembly-time absolute expression
ALT_FLUSH_TLB_PAGE(__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory"));
^
./arch/riscv/include/asm/errata_list.h:33:5: note: expanded from macro 'ALT_FLUSH_TLB_PAGE'
asm(ALTERNATIVE("sfence.vma %0", "sfence.vma", SIFIVE_VENDOR_ID, \
^
./arch/riscv/include/asm/alternative-macros.h:187:2: note: expanded from macro 'ALTERNATIVE'
_ALTERNATIVE_CFG(old_content, new_content, vendor_id, errata_id, CONFIG_k)
^
./arch/riscv/include/asm/alternative-macros.h:113:2: note: expanded from macro '_ALTERNATIVE_CFG'
__ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, IS_ENABLED(CONFIG_k))
^
./arch/riscv/include/asm/alternative-macros.h:110:2: note: expanded from macro '__ALTERNATIVE_CFG'
ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c)
^
./arch/riscv/include/asm/alternative-macros.h:99:3: note: expanded from macro 'ALT_NEW_CONTENT'
".org . - (889b - 888b) + (887b - 886b)\n" \
^
<inline asm>:26:6: note: instantiated into assembly here
.org . - (889b - 888b) + (887b - 886b)
^
This error happens because LLVM's integrated assembler has a one-pass
design, which means it cannot figure out the instruction lengths when
the .org directive is outside of the subsection that contains the
instructions, which was changed by the .option directives added by the
above change.
Move the .org directives before the .previous directive so that these
directives are always within the same subsection, which resolves the
failures and does not introduce any new issues with GNU as. This was
done for arm64 in commit 966a0acce2fc ("arm64/alternatives: move length
validation inside the subsection") and commit 22315a2296f4 ("arm64:
alternatives: Move length validation in alternative_{insn, endif}").
While there is no error from the assembly versions of the macro, they
appear to have the same problem so just make the same change there as
well so that there are no problems in the future.
Linus Torvalds [Thu, 2 Jun 2022 22:36:06 +0000 (15:36 -0700)]
Merge tag 'docs-5.19-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of late-arriving documentation fixes and the addition of an
SVG tux logo which, I'm assured, we're going to want"
* tag 'docs-5.19-2' of git://git.lwn.net/linux:
documentation: Format button_dev as a pointer.
docs: add SVG version of the Linux logo
docs: move Linux logo into a new `images` folder
docs: blockdev: change title to match section content
docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0
Linus Torvalds [Thu, 2 Jun 2022 22:32:26 +0000 (15:32 -0700)]
Merge tag 'asm-generic-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
"The header cleanup series from Masahiro Yamada ended up causing some
regressions in the ABI because of an ambigous uid_t type.
This was only caught after the original patches got merged, but at
least the fixes are trivial and hopefully complete"
* tag 'asm-generic-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
binder: fix sender_euid type in uapi header
sparc: fix mis-use of __kernel_{uid,gid}_t in uapi/asm/stat.h
powerpc: use __kernel_{uid,gid}32_t in uapi/asm/stat.h
mips: use __kernel_{uid,gid}32_t in uapi/asm/stat.h
Linus Torvalds [Thu, 2 Jun 2022 22:27:44 +0000 (15:27 -0700)]
Merge tag 'arm-late-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more ARM SoC updates from Arnd Bergmann:
"This is the second part of the general SoC updates, containing
everything that did not make it in the initial pull request, or that
came in as a bugfix later.
- Devicetree updates for SoCFPGA, ASPEED, AT91 and Rockchip,
including a new machine using an ASPEED BMC.
- More DT fixes from Krzysztof Kozlowski across platforms
- A new SoC platform for the GXP baseboard management controller,
used in current server products from HPE"
* tag 'arm-late-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (56 commits)
ARM: configs: Enable more audio support for i.MX
tee: optee: Pass a pointer to virt_addr_valid()
arm64: dts: rockchip: rename Quartz64-A bluetooth gpios
arm64: dts: rockchip: add clocks property to cru node rk3368
arm64: dts: rockchip: add clocks property to cru node rk3308
arm64: dts: rockchip: add clocks to rk356x cru
ARM: dts: rockchip: add clocks property to cru node rk3228
ARM: dts: rockchip: add clocks property to cru node rk3036
ARM: dts: rockchip: add clocks property to cru node rk3066a/rk3188
ARM: dts: rockchip: add clocks property to cru node rk3288
ARM: dts: rockchip: Remove "amba" bus nodes from rv1108
ARM: dts: rockchip: add clocks property to cru node rv1108
arm64: dts: sprd: use new 'dma-channels' property
ARM: dts: da850: use new 'dma-channels' property
ARM: dts: pxa: use new 'dma-channels/requests' properties
soc: ixp4xx/qmgr: Fix unused match warning
ARM: ep93xx: Make ts72xx_register_flash() static
ARM: configs: enable support for Kontron KSwitch D10
ep93xx: clock: Do not return the address of the freed memory
arm64: dts: intel: add device tree for n6000
...
Linus Torvalds [Thu, 2 Jun 2022 22:23:54 +0000 (15:23 -0700)]
Merge tag 'arm-multiplatform-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more ARM multiplatform updates from Arnd Bergmann:
"The second part of the multiplatform changes now converts the
Intel/Marvell PXA platform along with the rest. The patches went
through several rebases before the merge window as bugs were found, so
they remained separate.
This has to touch a lot of drivers, in particular the touchscreen,
pcmcia, sound and clk bits, to detach the driver files from the
platform and board specific header files"
* tag 'arm-multiplatform-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (48 commits)
ARM: pxa/mmp: remove traces of plat-pxa
ARM: pxa: convert to multiplatform
ARM: pxa/sa1100: move I/O space to PCI_IOBASE
ARM: pxa: remove support for MTD_XIP
ARM: pxa: move mach/*.h to mach-pxa/
ARM: PXA: fix multi-cpu build of xsc3
ARM: pxa: move plat-pxa to drivers/soc/
ARM: mmp: rename pxa_register_device
ARM: mmp: remove tavorevb board support
ARM: pxa: remove unused mach/bitfield.h
ARM: pxa: move clk register definitions to driver
ARM: pxa: move smemc register access from clk to platform
cpufreq: pxa3: move clk register access to clk driver
ARM: pxa: remove get_clk_frequency_khz()
ARM: pxa: pcmcia: move smemc configuration back to arch
ASoC: pxa: i2s: use normal MMIO accessors
ASoC: pxa: ac97: use normal MMIO accessors
ASoC: pxa: use pdev resource for FIFO regs
Input: wm97xx - get rid of irq_enable method in wm97xx_mach_ops
Input: wm97xx - switch to using threaded IRQ
...
Jisheng Zhang [Mon, 16 May 2022 14:32:04 +0000 (22:32 +0800)]
riscv: mm: init: make pt_ops_set_[early|late|fixmap] static
These three functions are only used in init.c, so make them static.
Fix W=1 warnings like below:
arch/riscv/mm/init.c:721:13: warning: no previous prototype for function
'pt_ops_set_early' [-Wmissing-prototypes]
void __init pt_ops_set_early(void)
^
- Revert "net: af_key: add check for pfkey_broadcast in function
pfkey_process"
- tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd
- nf_tables: disallow non-stateful expression in sets earlier
- nft_limit: clone packet limits' cost value
- nf_tables: double hook unregistration in netns path
- ping6: fix ping -6 with interface name
Previous releases - always broken:
- sched: fix memory barriers to prevent skbs from getting stuck in
lockless qdiscs
- neigh: set lower cap for neigh_managed_work rearming, avoid
constantly scheduling the probe work
- bpf: fix probe read error on big endian in ___bpf_prog_run()
- amt: memory leak and error handling fixes
Misc:
- ipv6: expand & rename accept_unsolicited_na to accept_untracked_na"
* tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
net/af_packet: make sure to pull mac header
net: add debug info to __skb_pull()
net: CONFIG_DEBUG_NET depends on CONFIG_NET
stmmac: intel: Add RPL-P PCI ID
net: stmmac: use dev_err_probe() for reporting mdio bus registration failure
tipc: check attribute length for bearer name
ice: fix access-beyond-end in the switch code
nfp: remove padding in nfp_nfdk_tx_desc
ax25: Fix ax25 session cleanup problems
net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline
sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels
sfc/siena: fix considering that all channels have TX queues
socket: Don't use u8 type in uapi socket.h
net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()
net: ping6: Fix ping -6 with interface name
macsec: fix UAF bug for real_dev
octeontx2-af: fix error code in is_valid_offset()
wifi: mac80211: fix use-after-free in chanctx code
bonding: guard ns_targets by CONFIG_IPV6
tcp: tcp_rtx_synack() can be called from process context
...
Saravana Kannan [Thu, 2 Jun 2022 03:56:52 +0000 (20:56 -0700)]
module: Fix prefix for module.sig_enforce module param
Commit cfc1d277891e ("module: Move all into module/") changed the prefix
of the module param by moving/renaming files. A later commit also moves
the module_param() into a different file, thereby changing the prefix
yet again.
This would break kernel cmdline compatibility and also userspace
compatibility at /sys/module/module/parameters/sig_enforce.
Stephen Boyd [Thu, 2 Jun 2022 02:21:09 +0000 (19:21 -0700)]
arm64: Initialize jump labels before setup_machine_fdt()
A static key warning splat appears during early boot on arm64 systems
that credit randomness from devicetrees that contain an "rng-seed"
property. This is because setup_machine_fdt() is called before
jump_label_init() during setup_arch(). Let's swap the order of these two
calls so that jump labels are initialized before the devicetree is
unflattened and the rng seed is credited.
Linus Torvalds [Thu, 2 Jun 2022 19:11:25 +0000 (12:11 -0700)]
Merge tag 'pci-v5.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:
- Revert brcmstb patches that broke booting on Raspberry Pi Compute
Module 4 (Bjorn Helgaas)
- Fix bridge_d3_blacklist[] error that overwrote the existing Gigabyte
X299 entry instead of adding a new one (Bjorn Helgaas)
- Update Lorenzo Pieralisi's email address in MAINTAINERS (Lorenzo
Pieralisi)
* tag 'pci-v5.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Update Lorenzo Pieralisi's email address
PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299
Revert "PCI: brcmstb: Split brcm_pcie_setup() into two funcs"
Revert "PCI: brcmstb: Add mechanism to turn on subdev regulators"
Revert "PCI: brcmstb: Add control of subdevice voltage regulators"
Revert "PCI: brcmstb: Do not turn off WOL regulators on suspend"
x86/sgx: Set active memcg prior to shmem allocation
When the system runs out of enclave memory, SGX can reclaim EPC pages
by swapping to normal RAM. These backing pages are allocated via a
per-enclave shared memory area. Since SGX allows unlimited over
commit on EPC memory, the reclaimer thread can allocate a large
number of backing RAM pages in response to EPC memory pressure.
When the shared memory backing RAM allocation occurs during
the reclaimer thread context, the shared memory is charged to
the root memory control group, and the shmem usage of the enclave
is not properly accounted for, making cgroups ineffective at
limiting the amount of RAM an enclave can consume.
For example, when using a cgroup to launch a set of test
enclaves, the kernel does not properly account for 50% - 75% of
shmem page allocations on average. In the worst case, when
nearly all allocations occur during the reclaimer thread, the
kernel accounts less than a percent of the amount of shmem used
by the enclave's cgroup to the correct cgroup.
SGX stores a list of mm_structs that are associated with
an enclave. Pick one of them during reclaim and charge that
mm's memcg with the shmem allocation. The one that gets picked
is arbitrary, but this list almost always only has one mm. The
cases where there is more than one mm with different memcg's
are not worth considering.
Create a new function - sgx_encl_alloc_backing(). This function
is used whenever a new backing storage page needs to be
allocated. Previously the same function was used for page
allocation as well as retrieving a previously allocated page.
Prior to backing page allocation, if there is a mm_struct associated
with the enclave that is requesting the allocation, it is set
as the active memory control group.
[ dhansen: - fix merge conflict with ELDU fixes
- check against actual ksgxd_tsk, not ->mm ]
net: stmmac: use dev_err_probe() for reporting mdio bus registration failure
I have a board where these two lines are always printed during boot:
imx-dwmac 30bf0000.ethernet: Cannot register the MDIO bus
imx-dwmac 30bf0000.ethernet: stmmac_dvr_probe: MDIO bus (id: 1) registration failed
It's perfectly fine, and the device is successfully (and silently, as
far as the console goes) probed later.
Use dev_err_probe() instead, which will demote these messages to debug
level (thus removing the alarming messages from the console) when the
error is -EPROBE_DEFER, and also has the advantage of including the
error code if/when it happens to be something other than -EPROBE_DEFER.
While here, add the missing \n to one of the format strings.
Linus Torvalds [Thu, 2 Jun 2022 15:59:39 +0000 (08:59 -0700)]
Merge tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"A big pile of assorted fixes and improvements for the filesystem with
nothing in particular standing out, except perhaps that the fact that
the MDS never really maintained atime was made official and thus it's
no longer updated on the client either.
We also have a MAINTAINERS update: Jeff is transitioning his
filesystem maintainership duties to Xiubo"
* tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client: (23 commits)
MAINTAINERS: move myself from ceph "Maintainer" to "Reviewer"
ceph: fix decoding of client session messages flags
ceph: switch TASK_INTERRUPTIBLE to TASK_KILLABLE
ceph: remove redundant variable ino
ceph: try to queue a writeback if revoking fails
ceph: fix statfs for subdir mounts
ceph: fix possible deadlock when holding Fwb to get inline_data
ceph: redirty the page for writepage on failure
ceph: try to choose the auth MDS if possible for getattr
ceph: disable updating the atime since cephfs won't maintain it
ceph: flush the mdlog for filesystem sync
ceph: rename unsafe_request_wait()
libceph: use swap() macro instead of taking tmp variable
ceph: fix statx AT_STATX_DONT_SYNC vs AT_STATX_FORCE_SYNC check
ceph: no need to invalidate the fscache twice
ceph: replace usage of found with dedicated list iterator variable
ceph: use dedicated list iterator variable
ceph: update the dlease for the hashed dentry when removing
ceph: stop retrying the request when exceeding 256 times
ceph: stop forwarding the request when exceeding 256 times
...
Linus Torvalds [Thu, 2 Jun 2022 15:55:01 +0000 (08:55 -0700)]
Merge tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching cleanup from Petr Mladek:
- Remove duplicated livepatch code [Christophe]
* tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
Carlos Llamas [Wed, 1 Jun 2022 01:00:17 +0000 (01:00 +0000)]
binder: fix sender_euid type in uapi header
The {pid,uid}_t fields of struct binder_transaction were recently
replaced to use kernel types in commit 169adc2b6b3c ("android/binder.h:
add linux/android/binder(fs).h to UAPI compile-test coverage").
However, using __kernel_uid_t here breaks backwards compatibility in
architectures using 16-bits for this type, since glibc and some others
still expect a 32-bit uid_t. Instead, let's use __kernel_uid32_t which
avoids this compatibility problem.
Fixes: 169adc2b6b3c ("android/binder.h: add linux/android/binder(fs).h to UAPI compile-test coverage") Reported-by: Christopher Ferris <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Acked-by: Todd Kjos <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
Linus Torvalds [Thu, 2 Jun 2022 15:46:30 +0000 (08:46 -0700)]
Merge tag 'memblock-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock test suite updates from Mike Rapoport:
"Comment updates for memblock test suite
Update comments in the memblock tests so that they will have
consistent style"
* tag 'memblock-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock tests: remove completed TODO item
memblock tests: update style of comments for memblock_free_*() functions
memblock tests: update style of comments for memblock_remove_*() functions
memblock tests: update style of comments for memblock_reserve_*() functions
memblock tests: update style of comments for memblock_add_*() functions
Dan Carpenter [Thu, 2 Jun 2022 11:02:18 +0000 (14:02 +0300)]
i2c: ismt: prevent memory corruption in ismt_access()
The "data->block[0]" variable comes from the user and is a number
between 0-255. It needs to be capped to prevent writing beyond the end
of dma_buffer[].
Fixes: 5e9a97b1f449 ("i2c: ismt: Adding support for I2C_SMBUS_BLOCK_PROC_CALL") Reported-and-tested-by: Zheyu Ma <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Masahiro Yamada [Wed, 1 Jun 2022 18:19:40 +0000 (03:19 +0900)]
powerpc: use __kernel_{uid,gid}32_t in uapi/asm/stat.h
Commit c01013a2f8dd ("powerpc: add asm/stat.h to UAPI compile-test
coverage") converted as follows:
uid_t --> __kernel_uid_t
gid_t --> __kernel_gid_t
The bit width of __kernel_{uid,gid}_t is 16 or 32-bits depending on
architectures.
PPC uses 32-bits for them as in include/uapi/asm-generic/posix_types.h,
so the previous conversion is probably fine, but let's stick to the
arch-independent conversion just in case.
The safe replacements across all architectures are:
Masahiro Yamada [Wed, 1 Jun 2022 18:19:39 +0000 (03:19 +0900)]
mips: use __kernel_{uid,gid}32_t in uapi/asm/stat.h
Commit 8c1a381a4fbb ("mips: add asm/stat.h to UAPI compile-test
coverage") converted as follows:
uid_t --> __kernel_uid_t
gid_t --> __kernel_gid_t
The bit width of __kernel_{uid,gid}_t is 16 or 32-bits depending on
architectures.
MIPS uses 32-bits for them as in include/uapi/asm-generic/posix_types.h,
so the previous conversion is probably fine, but let's stick to the
arch-independent conversion just in case.
The safe replacements across all architectures are:
The 'unevaluatedProperties' schema checks is not fully working and doesn't
catch some cases where there's a $ref to another schema. A fix is pending,
but results in new warnings in examples.
The Apple PCIe host schema is missing 'power-domains' in the schema.
The example has 3 power domains. However, this is wrong too as actual
dts files have a single power domain and Sven confirmed 1 is correct.
Damien Le Moal [Thu, 2 Jun 2022 12:03:44 +0000 (21:03 +0900)]
block: null_blk: Fix null_zone_write()
The bio and rq fields of struct nullb_cmd are now overlapping in a
union. So we cannot use a test on ->bio being non-NULL to detect the
NULL_Q_BIO queue mode. null_zone_write() use such broken test to set the
sector position of a zone append write in the command bio or request.
When the null_blk device uses the NULL_Q_MQ queue mode,
null_zone_write() wrongly end up setting the bio sector position,
resulting in the command request to be broken and random crashes
following.
Fix this by testing the device queue mode directly.
Catalin Marinas [Wed, 1 Jun 2022 17:13:38 +0000 (18:13 +0100)]
arm64: Remove the __user annotation for the restore_za_context() argument
The struct user_ctx *user pointer passed to restore_za_context() is not
a user point but a structure containing several __user pointers. Remove
the __user annotation.
Global `-Warray-bounds` enablement revealed some problems, one of
which is the way we define and use AQC rules messages.
In fact, they have a shared header, followed by the actual message,
which can be of one of several different formats. So it is
straightforward enough to define that header as a separate struct
and then embed it into message structures as needed, but currently
all the formats reside in one union coupled with the header. Then,
the code allocates only the memory needed for a particular message
format, leaving the union potentially incomplete.
There are no actual reads or writes beyond the end of an allocated
chunk, but at the same time, the whole implementation is fragile and
backed by an equilibrium rather than strong type and memory checks.
Define the structures the other way around: one for the common
header and the rest for the actual formats with the header embedded.
There are no places where several union members would be used at the
same time anyway. This allows to use proper struct_size() and let
the compiler know what is going to be done.
Finally, unsilence `-Warray-bounds` back for ice_switch.c.
Other little things worth mentioning:
* &ice_sw_rule_vsi_list_query is not used anywhere, remove it. It's
weird anyway to talk to hardware with purely kernel types
(bitmaps);
* expand the ICE_SW_RULE_*_SIZE() macros to pass a structure
variable name to struct_size() to let it do strict typechecking;
* rename ice_sw_rule_lkup_rx_tx::hdr to ::hdr_data to keep ::hdr
for the header structure to have the same name for it constistenly
everywhere;
* drop the duplicate of %ICE_SW_RULE_RX_TX_NO_HDR_SIZE residing in
ice_switch.h.
Fei Qin [Wed, 1 Jun 2022 08:34:49 +0000 (10:34 +0200)]
nfp: remove padding in nfp_nfdk_tx_desc
NFDK firmware supports 48-bit dma addressing and
parses 16 high bits of dma addresses.
In nfp_nfdk_tx_desc, dma related structure and tso
related structure are union. When "mss" be filled
with nonzero value due to enable tso, the memory used
by "padding" may be also filled. Then, firmware may
parse wrong dma addresses which causes TX watchdog
timeout problem.
This patch removes padding and unifies the dma_addr_hi
bits with the one in firmware. nfp_nfdk_tx_desc_set_dma_addr
is also added to match this change.
Duoming Zhou [Mon, 30 May 2022 15:21:58 +0000 (23:21 +0800)]
ax25: Fix ax25 session cleanup problems
There are session cleanup problems in ax25_release() and
ax25_disconnect(). If we setup a session and then disconnect,
the disconnected session is still in "LISTENING" state that
is shown below.
Active AX.25 sockets
Dest Source Device State Vr/Vs Send-Q Recv-Q
DL9SAU-4 DL9SAU-3 ??? LISTENING 000/000 0 0
DL9SAU-3 DL9SAU-4 ??? LISTENING 000/000 0 0
The first reason is caused by del_timer_sync() in ax25_release().
The timers of ax25 are used for correct session cleanup. If we use
ax25_release() to close ax25 sessions and ax25_dev is not null,
the del_timer_sync() functions in ax25_release() will execute.
As a result, the sessions could not be cleaned up correctly,
because the timers have stopped.
In order to solve this problem, this patch adds a device_up flag
in ax25_dev in order to judge whether the device is up. If there
are sessions to be cleaned up, the del_timer_sync() in
ax25_release() will not execute. What's more, we add ax25_cb_del()
in ax25_kill_by_device(), because the timers have been stopped
and there are no functions that could delete ax25_cb if we do not
call ax25_release(). Finally, we reorder the position of
ax25_list_lock in ax25_cb_del() in order to synchronize among
different functions that call ax25_cb_del().
The second reason is caused by improper check in ax25_disconnect().
The incoming ax25 sessions which ax25->sk is null will close
heartbeat timer, because the check "if(!ax25->sk || ..)" is
satisfied. As a result, the session could not be cleaned up properly.
In order to solve this problem, this patch changes the improper
check to "if(ax25->sk && ..)" in ax25_disconnect().
What`s more, the ax25_disconnect() may be called twice, which is
not necessary. For example, ax25_kill_by_device() calls
ax25_disconnect() and sets ax25->state to AX25_STATE_0, but
ax25_release() calls ax25_disconnect() again.
In order to solve this problem, this patch add a check in
ax25_release(). If the flag of ax25->sk equals to SOCK_DEAD,
the ax25_disconnect() in ax25_release() should not be executed.
Jan Kara [Thu, 2 Jun 2022 08:12:42 +0000 (10:12 +0200)]
block: fix bio_clone_blkg_association() to associate with proper blkcg_gq
Commit d92c370a16cb ("block: really clone the block cgroup in
bio_clone_blkg_association") changed bio_clone_blkg_association() to
just clone bio->bi_blkg reference from source to destination bio. This
is however wrong if the source and destination bios are against
different block devices because struct blkcg_gq is different for each
bdev-blkcg pair. This will result in IOs being accounted (and throttled
as a result) multiple times against the same device (src bdev) while
throttling of the other device (dst bdev) is ignored. In case of BFQ the
inconsistency can even result in crashes in bfq_bic_update_cgroup().
Fix the problem by looking up correct blkcg_gq for the cloned bio.
Damien Le Moal [Thu, 2 Jun 2022 07:51:59 +0000 (16:51 +0900)]
block: remove useless BUG_ON() in blk_mq_put_tag()
Since the if condition in blk_mq_put_tag() checks that the tag to put is
not a reserved one, the BUG_ON() check in the else branch checking if
the tag is indeed a reserved one is useless. Remove it.
Jens Axboe [Thu, 2 Jun 2022 07:48:35 +0000 (01:48 -0600)]
Merge tag 'nvme-5.19-2022-06-02' of git://git.infradead.org/nvme into for-5.19/drivers
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- set controller enable bit in a separate write (Niklas Cassel)
- disable namespace identifiers for the MAXIO MAP1001 (me)
- fix a comment typo (Julia Lawall)"
* tag 'nvme-5.19-2022-06-02' of git://git.infradead.org/nvme:
nvmet: fix typo in comment
nvme: set controller enable bit in a separate write
nvme-pci: disable namespace identifiers for the MAXIO MAP1001
Uwe Kleine-König [Mon, 23 May 2022 08:39:47 +0000 (10:39 +0200)]
gpio: adp5588: Remove support for platform setup and teardown callbacks
If the teardown callback failed in the gpio driver, it fails to free the
irq (if there is one). The device is removed anyhow. If later on the irq
triggers, all sorts of unpleasant things might happen (e.g. accessing
the struct adp5588_gpio which is already freed in the meantime or starting
i2c bus transfers for an unregistered device). Even before irq support was
added to this driver, exiting early was wrong; back then it failed to
unregister the gpiochip.
Fortunately these callbacks aren't used any more since at least blackfin
was removed in 2018. So just drop them.
Note that they are not removed from struct adp5588_gpio_platform_data
because the keyboard driver adp5588-keys.c also makes use of them.
(I didn't check if the callbacks might have been called twice, maybe there
is another reason hidden to better not call these functions.)
This patch is a preparation for making i2c remove callbacks return void.
Fixes: 80884094e344 ("gpio: adp5588-gpio: new driver for ADP5588 GPIO expanders") Signed-off-by: Uwe Kleine-König <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Acked-by: Michael Hennerich <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
Jens Axboe [Thu, 2 Jun 2022 05:57:02 +0000 (23:57 -0600)]
io_uring: reinstate the inflight tracking
After some debugging, it was realized that we really do still need the
old inflight tracking for any file type that has io_uring_fops assigned.
If we don't, then trivial circular references will mean that we never get
the ctx cleaned up and hence it'll leak.
Just bring back the inflight tracking, which then also means we can
eliminate the conditional dropping of the file when task_work is queued.
Fixes: d5361233e9ab ("io_uring: drop the old style inflight file tracking") Signed-off-by: Jens Axboe <[email protected]>
Xianting Tian [Wed, 18 May 2022 01:34:28 +0000 (09:34 +0800)]
RISC-V: Mark IORESOURCE_EXCLUSIVE for reserved mem instead of IORESOURCE_BUSY
Commit 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree")
marked IORESOURCE_BUSY for reserved memory, which caused resource map
failed in subsequent operations of related driver, so remove the
IORESOURCE_BUSY flag. In order to prohibit userland mapping reserved
memory, mark IORESOURCE_EXCLUSIVE for it.
The code to reproduce the issue,
dts:
mem0: memory@a0000000 {
reg = <0x0 0xa0000000 0 0x1000000>;
no-map;
};
&test {
status = "okay";
memory-region = <&mem0>;
};
code:
np = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
ret = of_address_to_resource(np, 0, &r);
base = devm_ioremap_resource(&pdev->dev, &r);
// base = -EBUSY
Tobias Klauser [Thu, 5 May 2022 08:18:15 +0000 (10:18 +0200)]
riscv: Wire up memfd_secret in UAPI header
Move the __ARCH_WANT_MEMFD_SECRET define added in commit 7bb7f2ac24a0
("arch, mm: wire up memfd_secret system call where relevant") to
<uapi/asm/unistd.h> so __NR_memfd_secret is defined when including
<unistd.h> in userspace.
This allows the memfd_secret selftest to pass on riscv.
Samuel Holland [Sat, 30 Apr 2022 03:00:23 +0000 (22:00 -0500)]
riscv: Fix irq_work when SMP is disabled
irq_work is triggered via an IPI, but the IPI infrastructure is not
included in uniprocessor kernels. As a result, irq_work never runs.
Fall back to the tick-based irq_work implementation on uniprocessor
configurations.
Alexandre Ghiti [Mon, 6 Dec 2021 10:46:54 +0000 (11:46 +0100)]
riscv: Improve virtual kernel memory layout dump
With the arrival of sv48 and its large address space, it would be
cumbersome to statically define the unit size to use to print the different
portions of the virtual memory layout: instead, determine it dynamically.
Alexandre Ghiti [Mon, 6 Dec 2021 10:46:56 +0000 (11:46 +0100)]
riscv: Initialize thread pointer before calling C functions
Because of the stack canary feature that reads from the current task
structure the stack canary value, the thread pointer register "tp" must
be set before calling any C function from head.S: by chance, setup_vm
and all the functions that it calls does not seem to be part of the
functions where the canary check is done, but in the following commits,
some functions will.