]> Git Repo - linux.git/log
linux.git
2 years agoMerge tag 'io_uring-6.1-2022-10-28' of git://git.kernel.dk/linux
Linus Torvalds [Sun, 30 Oct 2022 01:01:16 +0000 (18:01 -0700)]
Merge tag 'io_uring-6.1-2022-10-28' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:
 "Just a fix for a locking regression introduced with the deferred
  task_work running from this merge window"

* tag 'io_uring-6.1-2022-10-28' of git://git.kernel.dk/linux:
  io_uring: unlock if __io_run_local_work locked inside
  io_uring: use io_run_local_work_locked helper

2 years agoMerge tag 'mm-hotfixes-stable-2022-10-28' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Sun, 30 Oct 2022 00:49:33 +0000 (17:49 -0700)]
Merge tag 'mm-hotfixes-stable-2022-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "Eight fix pre-6.0 bugs and the remainder address issues which were
  introduced in the 6.1-rc merge cycle, or address issues which aren't
  considered sufficiently serious to warrant a -stable backport"

* tag 'mm-hotfixes-stable-2022-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  mm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off region
  lib: maple_tree: remove unneeded initialization in mtree_range_walk()
  mmap: fix remap_file_pages() regression
  mm/shmem: ensure proper fallback if page faults
  mm/userfaultfd: replace kmap/kmap_atomic() with kmap_local_page()
  x86: fortify: kmsan: fix KMSAN fortify builds
  x86: asm: make sure __put_user_size() evaluates pointer once
  Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default
  x86/purgatory: disable KMSAN instrumentation
  mm: kmsan: export kmsan_copy_page_meta()
  mm: migrate: fix return value if all subpages of THPs are migrated successfully
  mm/uffd: fix vma check on userfault for wp
  mm: prep_compound_tail() clear page->private
  mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
  mm/page_isolation: fix clang deadcode warning
  fs/ext4/super.c: remove unused `deprecated_msg'
  ipc/msg.c: fix percpu_counter use after free
  memory tier, sysfs: rename attribute "nodes" to "nodelist"
  MAINTAINERS: git://github.com -> https://github.com for nilfs2
  mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops
  ...

2 years agoMerge tag 'powerpc-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 29 Oct 2022 17:35:17 +0000 (10:35 -0700)]
Merge tag 'powerpc-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix a case of rescheduling with user access unlocked, when preempt is
   enabled.

 - A follow-up fix for a recent fix, which could lead to IRQ state
   assertions firing incorrectly.

 - Two fixes for lockdep warnings seen when using kfence with the Hash
   MMU.

 - Two fixes for preempt warnings seen when using the Hash MMU.

 - Two fixes for the VAS coprocessor mechanism used on pseries.

 - Prevent building some of our older KVM backends when
   CONTEXT_TRACKING_USER is enabled, as it's known to cause crashes.

 - A couple of fixes for issues seen with PMU NMIs.

Thanks to Nicholas Piggin, Guenter Roeck, Frederic Barrat Haren Myneni,
Sachin Sant, and Samuel Holland.

* tag 'powerpc-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/interrupt: Fix clear of PACA_IRQS_HARD_DIS when returning to soft-masked context
  powerpc/64s/interrupt: Perf NMI should not take normal exit path
  powerpc/64/interrupt: Prevent NMI PMI causing a dangerous warning
  KVM: PPC: BookS PR-KVM and BookE do not support context tracking
  powerpc: Fix reschedule bug in KUAP-unlocked user copy
  powerpc/64s: Fix hash__change_memory_range preemption warning
  powerpc/64s: Disable preemption in hash lazy mmu mode
  powerpc/64s: make linear_map_hash_lock a raw spinlock
  powerpc/64s: make HPTE lock and native_tlbie_lock irq-safe
  powerpc/64s: Add lockdep for HPTE lock
  powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU
  powerpc/pseries/vas: Add VAS IRQ primary handler

2 years agoplatform/loongarch: laptop: Fix possible UAF and simplify generic_acpi_laptop_init()
Yang Yingliang [Sat, 29 Oct 2022 08:29:31 +0000 (16:29 +0800)]
platform/loongarch: laptop: Fix possible UAF and simplify generic_acpi_laptop_init()

Currently the return value of 'sub_driver->init' is not checked. If
sparse_keymap_setup() called in the init function fails, 'generic_
inputdev' is freed, then it will lead a UAF when using it in generic_
acpi_laptop_init(). Fix it by checking the return value and setting
generic_inputdev to NULL after free, so as to avoid double free it.

The error code in generic_subdriver_init() is always negative, so the
return of generic_subdriver_init() can be simplified.

Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
2 years agoplatform/loongarch: laptop: Adjust resume order for loongson_hotkey_resume()
Huacai Chen [Sat, 29 Oct 2022 08:29:31 +0000 (16:29 +0800)]
platform/loongarch: laptop: Adjust resume order for loongson_hotkey_resume()

Some laptops don't support SW_LID, but still have backlight control,
move backlight resuming before SW_LID event handling so as to avoid
backlight mistake due to early return.

Signed-off-by: Huacai Chen <[email protected]>
2 years agoLoongArch: BPF: Avoid declare variables in switch-case
Huacai Chen [Sat, 29 Oct 2022 08:29:31 +0000 (16:29 +0800)]
LoongArch: BPF: Avoid declare variables in switch-case

Not all compilers support declare variables in switch-case, so move
declarations to the beginning of a function. Otherwise we may get such
build errors:

arch/loongarch/net/bpf_jit.c: In function ‘emit_atomic’:
arch/loongarch/net/bpf_jit.c:362:3: error: a label can only be part of a statement and a declaration is not a statement
   u8 r0 = regmap[BPF_REG_0];
   ^~
arch/loongarch/net/bpf_jit.c: In function ‘build_insn’:
arch/loongarch/net/bpf_jit.c:727:3: error: a label can only be part of a statement and a declaration is not a statement
   u8 t7 = -1;
   ^~
arch/loongarch/net/bpf_jit.c:778:3: error: a label can only be part of a statement and a declaration is not a statement
   int ret;
   ^~~
arch/loongarch/net/bpf_jit.c:779:3: error: expected expression before ‘u64’
   u64 func_addr;
   ^~~
arch/loongarch/net/bpf_jit.c:780:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   bool func_addr_fixed;
   ^~~~
arch/loongarch/net/bpf_jit.c:784:11: error: ‘func_addr’ undeclared (first use in this function); did you mean ‘in_addr’?
          &func_addr, &func_addr_fixed);
           ^~~~~~~~~
           in_addr
arch/loongarch/net/bpf_jit.c:784:11: note: each undeclared identifier is reported only once for each function it appears in
arch/loongarch/net/bpf_jit.c:814:3: error: a label can only be part of a statement and a declaration is not a statement
   u64 imm64 = (u64)(insn + 1)->imm << 32 | (u32)insn->imm;
   ^~~

Signed-off-by: Huacai Chen <[email protected]>
2 years agoLoongArch: Use flexible-array member instead of zero-length array
Yushan Zhou [Sat, 29 Oct 2022 08:29:31 +0000 (16:29 +0800)]
LoongArch: Use flexible-array member instead of zero-length array

Eliminate the following coccicheck warning:
./arch/loongarch/include/asm/ptrace.h:32:15-21: WARNING use flexible-array member instead

Reviewed-by: WANG Xuerui <[email protected]>
Signed-off-by: Yushan Zhou <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
2 years agoLoongArch: Remove unused kernel stack padding
Jinyang He [Sat, 29 Oct 2022 08:29:31 +0000 (16:29 +0800)]
LoongArch: Remove unused kernel stack padding

The current LoongArch kernel stack is padded as if obeying the MIPS o32
calling convention (32 bytes), signifying the port's MIPS lineage but no
longer making sense. Remove the padding for clarity.

Reviewed-by: WANG Xuerui <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
2 years agonet: dsa: fall back to default tagger if we can't load the one from DT
Vladimir Oltean [Thu, 27 Oct 2022 14:54:39 +0000 (17:54 +0300)]
net: dsa: fall back to default tagger if we can't load the one from DT

DSA tagging protocol drivers can be changed at runtime through sysfs and
at probe time through the device tree (support for the latter was added
later).

When changing through sysfs, it is assumed that the module for the new
tagging protocol was already loaded into the kernel (in fact this is
only a concern for Ocelot/Felix switches, where we have tag_ocelot.ko
and tag_ocelot_8021q.ko; for every other switch, the default and
alternative protocols are compiled within the same .ko, so there is
nothing for the user to load).

The kernel cannot currently call request_module(), because it has no way
of constructing the modalias name of the tagging protocol driver
("dsa_tag-%d", where the number is one of DSA_TAG_PROTO_*_VALUE).
The device tree only contains the string name of the tagging protocol
("ocelot-8021q"), and the only mapping between the string and the
DSA_TAG_PROTO_OCELOT_8021Q_VALUE is present in tag_ocelot_8021q.ko.
So this is a chicken-and-egg situation and dsa_core.ko has nothing based
on which it can automatically request the insertion of the module.

As a consequence, if CONFIG_NET_DSA_TAG_OCELOT_8021Q is built as module,
the switch will forever defer probing.

The long-term solution is to make DSA call request_module() somehow,
but that probably needs some refactoring.

What we can do to keep operating with existing device tree blobs is to
cancel the attempt to change the tagging protocol with the one specified
there, and to remain operating with the default one. Depending on the
situation, the default protocol might still allow some functionality
(in the case of ocelot, it does), and it's better to have that than to
fail to probe.

Fixes: deff710703d8 ("net: dsa: Allow default tag protocol to be overridden from DT")
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Heiko Thiery <[email protected]>
Reported-by: Michael Walle <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Tested-by: Michael Walle <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet: ethernet: adi: adin1110: Fix notifiers
Alexandru Tachici [Thu, 27 Oct 2022 09:56:55 +0000 (12:56 +0300)]
net: ethernet: adi: adin1110: Fix notifiers

ADIN1110 was registering netdev_notifiers on each device probe.
This leads to warnings/probe failures because of double registration
of the same notifier when to adin1110/2111 devices are connected to
the same system.

Move the registration of netdev_notifiers in module init call,
in this way multiple driver instances can use the same notifiers.

Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
Signed-off-by: Alexandru Tachici <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agoMerge branch 'a-few-corrections-for-sock_support_zc'
Jakub Kicinski [Sat, 29 Oct 2022 03:21:29 +0000 (20:21 -0700)]
Merge branch 'a-few-corrections-for-sock_support_zc'

Pavel Begunkov says:

====================
a few corrections for SOCK_SUPPORT_ZC

There are several places/cases that got overlooked in regards to
SOCK_SUPPORT_ZC. We're lacking the flag for IPv6 UDP sockets and
accepted TCP sockets. We also should clear the flag when someone
tries to hijack a socket by replacing the ->sk_prot callbacks.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet: also flag accepted sockets supporting msghdr originated zerocopy
Stefan Metzmacher [Wed, 26 Oct 2022 23:25:59 +0000 (00:25 +0100)]
net: also flag accepted sockets supporting msghdr originated zerocopy

Without this only the client initiated tcp sockets have SOCK_SUPPORT_ZC.
The listening socket on the server also has it, but the accepted
connections didn't, which meant IORING_OP_SEND[MSG]_ZC will always
fails with -EOPNOTSUPP.

Fixes: e993ffe3da4b ("net: flag sockets supporting msghdr originated zerocopy")
Cc: <[email protected]> # 6.0
CC: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/io-uring/[email protected]/T/#m38aa19b0b825758fb97860a38ad13122051f9dda
Signed-off-by: Stefan Metzmacher <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet/ulp: remove SOCK_SUPPORT_ZC from tls sockets
Pavel Begunkov [Wed, 26 Oct 2022 23:25:58 +0000 (00:25 +0100)]
net/ulp: remove SOCK_SUPPORT_ZC from tls sockets

Remove SOCK_SUPPORT_ZC when we're setting ulp as it might not support
msghdr::ubuf_info, e.g. like TLS replacing ->sk_prot with a new set of
handlers.

Cc: <[email protected]> # 6.0
Reported-by: Jakub Kicinski <[email protected]>
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet: remove SOCK_SUPPORT_ZC from sockmap
Pavel Begunkov [Wed, 26 Oct 2022 23:25:57 +0000 (00:25 +0100)]
net: remove SOCK_SUPPORT_ZC from sockmap

sockmap replaces ->sk_prot with its own callbacks, we should remove
SOCK_SUPPORT_ZC as the new proto doesn't support msghdr::ubuf_info.

Cc: <[email protected]> # 6.0
Reported-by: Jakub Kicinski <[email protected]>
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agoudp: advertise ipv6 udp support for msghdr::ubuf_info
Pavel Begunkov [Wed, 26 Oct 2022 23:25:56 +0000 (00:25 +0100)]
udp: advertise ipv6 udp support for msghdr::ubuf_info

Mark udp ipv6 as supporting msghdr::ubuf_info. In the original commit
SOCK_SUPPORT_ZC was supposed to be set by a udp_init_sock() call from
udp6_init_sock(), but
d38afeec26ed4 ("tcp/udp: Call inet6_destroy_sock() in IPv6 ...")
removed it and so ipv6 udp misses the flag.

Cc: <[email protected]> # 6.0
Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy")
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agoenic: MAINTAINERS: Update enic maintainers
Govindarajulu Varadarajan [Fri, 28 Oct 2022 04:21:59 +0000 (21:21 -0700)]
enic: MAINTAINERS: Update enic maintainers

Update enic maintainers.

Signed-off-by: Govindarajulu Varadarajan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet: openvswitch: add missing .resv_start_op
Jakub Kicinski [Fri, 28 Oct 2022 03:25:01 +0000 (20:25 -0700)]
net: openvswitch: add missing .resv_start_op

I missed one of the families in OvS when annotating .resv_start_op.
This triggers the warning added in commit ce48ebdd5651 ("genetlink:
limit the use of validation workarounds to old ops").

Reported-by: [email protected]
Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonetlink: hide validation union fields from kdoc
Jakub Kicinski [Thu, 27 Oct 2022 21:21:07 +0000 (14:21 -0700)]
netlink: hide validation union fields from kdoc

Mark the validation fields as private, users shouldn't set
them directly and they are too complicated to explain in
a more succinct way (there's already a long explanation
in the comment above).

The strict_start_type field is set directly and has a dedicated
comment so move that above the "private" section.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agoMerge tag 's390-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 29 Oct 2022 00:11:26 +0000 (17:11 -0700)]
Merge tag 's390-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Remove outdated linux390 link from MAINTAINERS

 - Add few missing EX_TABLE entries to inline assemblies

 - Fix raw data collection for pai_ext PMU

 - Add kernel image secure boot trailer for future firmware versions

 - Fix out-of-bounds access on cio_ignore free

 - Fix memory allocation of mdev_types array in vfio-ap

* tag 's390-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/vfio-ap: Fix memory allocation for mdev_types array
  s390/cio: fix out-of-bounds access on cio_ignore free
  s390/pai: fix raw data collection for PMU pai_ext
  s390/boot: add secure boot trailer
  s390/pci: add missing EX_TABLE entries to __pcistg_mio_inuser()/__pcilg_mio_inuser()
  s390/futex: add missing EX_TABLE entry to __futex_atomic_op()
  s390/uaccess: add missing EX_TABLE entries to __clear_user()
  MAINTAINERS: remove outdated linux390 link

2 years agoMerge tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 29 Oct 2022 00:03:00 +0000 (17:03 -0700)]
Merge tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for a build warning in the jump_label code

 - One of the git://github -> https://github cleanups, for the SiFive
   drivers

 - A fix for the kasan initialization code, this still likely warrants
   some cleanups but that's a bigger problem and at least this fixes the
   crashes in the short term

 - A pair of fixes for extension support detection on mixed LLVM/GNU
   toolchains

 - A fix for a runtime warning in the /proc/cpuinfo code

* tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Fix /proc/cpuinfo cpumask warning
  riscv: fix detection of toolchain Zihintpause support
  riscv: fix detection of toolchain Zicbom support
  riscv: mm: add missing memcpy in kasan_init
  MAINTAINERS: git://github.com -> https://github.com for sifive
  riscv: jump_label: mark arguments as const to satisfy asm constraints

2 years agoMerge tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 28 Oct 2022 23:48:29 +0000 (16:48 -0700)]
Merge tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and device properties fixes from Rafael Wysocki:
 "These fix device properties documentation and the ACPI PCC code, add a
  new IRQ override quirk for resource handling and add one more item to
  the list of device IDs to be ignored when returned by _DEP.

  Specifics:

   - Fix the documentation of the *_match_string() family of functions
     to properly cover the return value (Andy Shevchenko)

   - Fix a possible integer overflow during multiplication in the ACPI
     PCC code (Manank Patel)

   - Make the ACPI device resources code skip IRQ override on Asus
     Vivobook S5602ZA (Tamim Khan)

   - Add LATT2021 to the list of device IDs that are ignored when
     returned by _DEP, because there are no drivers for them in the
     kernel and no plans to add such drivers (Hans de Goede)"

* tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: scan: Add LATT2021 to acpi_ignore_dep_ids[]
  ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA
  ACPI: PCC: Fix unintentional integer overflow
  device property: Fix documentation for *_match_string() APIs

2 years agoMerge tag 'pm-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 28 Oct 2022 23:44:12 +0000 (16:44 -0700)]
Merge tag 'pm-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These make the intel_pstate driver work as expected on all hybrid
  platforms to date (regardless of possible platform firmware issues),
  fix hybrid sleep on systems using suspend-to-idle by default, make the
  generic power domains code handle disabled idle states properly and
  update pm-graph.

  Specifics:

   - Make intel_pstate use what is known about the hardware instead of
     relying on information from the platform firmware (ACPI CPPC in
     particular) to establish the relationship between the HWP CPU
     performance levels and frequencies on all hybrid platforms
     available to date (Rafael Wysocki)

   - Allow hybrid sleep to use suspend-to-idle as a system suspend
     method if it is the current suspend method of choice (Mario
     Limonciello)

   - Fix handling of unavailable/disabled idle states in the generic
     power domains code (Sudeep Holla)

   - Update the pm-graph suite of utilities to version 5.10 which is
     fixes-mostly and does not add any new features (Todd Brandt)"

* tag 'pm-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: domains: Fix handling of unavailable/disabled idle states
  pm-graph v5.10
  cpufreq: intel_pstate: hybrid: Use known scaling factor for P-cores
  cpufreq: intel_pstate: Read all MSRs on the target CPU
  PM: hibernate: Allow hybrid sleep to work with s2idle

2 years agorandom: use arch_get_random*_early() in random_init()
Jean-Philippe Brucker [Fri, 28 Oct 2022 16:00:42 +0000 (17:00 +0100)]
random: use arch_get_random*_early() in random_init()

While reworking the archrandom handling, commit d349ab99eec7 ("random:
handle archrandom with multiple longs") switched to the non-early
archrandom helpers in random_init(), which broke initialization of the
entropy pool from the arm64 random generator.

Indeed at that point the arm64 CPU features, which verify that all CPUs
have compatible capabilities, are not finalized so arch_get_random_seed_longs()
is unsuccessful. Instead random_init() should use the _early functions,
which check only the boot CPU on arm64. On other architectures the
_early functions directly call the normal ones.

Fixes: d349ab99eec7 ("random: handle archrandom with multiple longs")
Cc: [email protected]
Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
2 years agotools/nolibc/string: Fix memcmp() implementation
Rasmus Villemoes [Fri, 21 Oct 2022 06:01:53 +0000 (08:01 +0200)]
tools/nolibc/string: Fix memcmp() implementation

The C standard says that memcmp() must treat the buffers as consisting
of "unsigned chars". If char happens to be unsigned, the casts are ok,
but then obviously the c1 variable can never contain a negative
value. And when char is signed, the casts are wrong, and there's still
a problem with using an 8-bit quantity to hold the difference, because
that can range from -255 to +255.

For example, assuming char is signed, comparing two 1-byte buffers,
one containing 0x00 and another 0x80, the current implementation would
return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one
of those should of course return something positive.

Signed-off-by: Rasmus Villemoes <[email protected]>
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Cc: [email protected] # v5.0+
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
2 years agotools/nolibc: Fix missing strlen() definition and infinite loop with gcc-12
Willy Tarreau [Sun, 9 Oct 2022 18:29:36 +0000 (20:29 +0200)]
tools/nolibc: Fix missing strlen() definition and infinite loop with gcc-12

When built at -Os, gcc-12 recognizes an strlen() pattern in nolibc_strlen()
and replaces it with a jump to strlen(), which is not defined as a symbol
and breaks compilation. Worse, when the function is called strlen(), the
function is simply replaced with a jump to itself, hence becomes an
infinite loop.

One way to avoid this is to always set -ffreestanding, but the calling
code doesn't know this and there's no way (either via attributes or
pragmas) to globally enable it from include files, effectively leaving
a painful situation for the caller.

Alexey suggested to place an empty asm() statement inside the loop to
stop gcc from recognizing a well-known pattern, which happens to work
pretty fine. At least it allows us to make sure our local definition
is not replaced with a self jump.

The function only needs to be renamed back to strlen() so that the symbol
exists, which implies that nolibc_strlen() which is used on variable
strings has to be declared as a macro that points back to it before the
strlen() macro is redifined.

It was verified to produce valid code with gcc 3.4 to 12.1 at different
optimization levels, and both with constant and variable strings.

In case this problem surfaces again in the future, an alternate approach
consisting in adding an optimize("no-tree-loop-distribute-patterns")
function attribute for gcc>=12 worked as well but is less pretty.

Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Fixes: 96980b833a21 ("tools/nolibc/string: do not use __builtin_strlen() at -O0")
Cc: "Paul E. McKenney" <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
2 years agomm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off region
Sebastian Andrzej Siewior [Wed, 26 Oct 2022 13:48:30 +0000 (15:48 +0200)]
mm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off region

lru_gen_add_mm() has been added within an IRQ-off region in the commit
mentioned below.  The other invocations of lru_gen_add_mm() are not within
an IRQ-off region.

The invocation within IRQ-off region is problematic on PREEMPT_RT because
the function is using a spin_lock_t which must not be used within
IRQ-disabled regions.

The other invocations of lru_gen_add_mm() occur while
task_struct::alloc_lock is acquired.  Move lru_gen_add_mm() after
interrupts are enabled and before task_unlock().

Link: https://lkml.kernel.org/r/[email protected]
Fixes: bd74fdaea1460 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Yu Zhao <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agolib: maple_tree: remove unneeded initialization in mtree_range_walk()
Lukas Bulwahn [Wed, 26 Oct 2022 12:00:29 +0000 (14:00 +0200)]
lib: maple_tree: remove unneeded initialization in mtree_range_walk()

Before the do-while loop in mtree_range_walk(), the variables next, min,
max need to be initialized.  The variables last, prev_min and prev_max are
set within the loop body before they are eventually used after exiting the
loop body.

As it is a do-while loop, the loop body is executed at least once, so the
variables last, prev_min and prev_max do not need to be initialized before
the loop body.

Remove unneeded initialization of last and prev_min.

The needless initialization was reported by clang-analyzer as Dead Stores.

As the compiler already identifies these assignments as unneeded, it
optimizes the assignments away.  Hence:

No functional change. No change in object code.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Lukas Bulwahn <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agommap: fix remap_file_pages() regression
Liam Howlett [Tue, 25 Oct 2022 16:12:49 +0000 (16:12 +0000)]
mmap: fix remap_file_pages() regression

When using the VMA iterator, the final execution will set the variable
'next' to NULL which causes the function to fail out.  Restore the break
in the loop to exit the VMA iterator early without clearing NULL fixes the
issue.

Link: https://lore.kernel.org/lkml/29344.1666681759@jrobl/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 763ecb035029 (mm: remove the vma linked list)
Signed-off-by: Liam R. Howlett <[email protected]>
Reported-by: "J. R. Okajima" <[email protected]>
Tested-by: "J. R. Okajima" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm/shmem: ensure proper fallback if page faults
Ira Weiny [Tue, 25 Oct 2022 22:01:08 +0000 (15:01 -0700)]
mm/shmem: ensure proper fallback if page faults

The kernel test robot flagged a recursive lock as a result of a conversion
from kmap_atomic() to kmap_local_folio()[Link]

The cause was due to the code depending on the kmap_atomic() side effect
of disabling page faults.  In that case the code expects the fault to fail
and take the fallback case.

git archaeology implied that the recursion may not be an actual bug.[1]
However, depending on the implementation of the mmap_lock and the
condition of the call there may still be a deadlock.[2] So this is not
purely a lockdep issue.  Considering a single threaded call stack there
are 3 options.

1) Different mm's are in play (no issue)
2) Readlock implementation is recursive and same mm is in play
   (no issue)
3) Readlock implementation is _not_ recursive (issue)

The mmap_lock is recursive so with a single thread there is no issue.

However, Matthew pointed out a deadlock scenario when you consider
additional process' and threads thusly.

"The readlock implementation is only recursive if nobody else has taken a
write lock.  If you have a multithreaded process, one of the other threads
can call mmap() and that will prevent recursion (due to fairness).  Even
if it's a different process that you're trying to acquire the mmap read
lock on, you can still get into a deadly embrace.  eg:

process A thread 1 takes read lock on own mmap_lock
process A thread 2 calls mmap, blocks taking write lock
process B thread 1 takes page fault, read lock on own mmap lock
process B thread 2 calls mmap, blocks taking write lock
process A thread 1 blocks taking read lock on process B
process B thread 1 blocks taking read lock on process A

Now all four threads are blocked waiting for each other."

Regardless using pagefault_disable() ensures that no matter what locking
implementation is used a deadlock will not occur.  Add an explicit
pagefault_disable() and a big comment to explain this for future souls
looking at this code.

[1] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/
[2] https://lore.kernel.org/lkml/Y1bXBtGTCym77%[email protected]/

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Fixes: 7a7256d5f512 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
Signed-off-by: Ira Weiny <[email protected]>
Reported-by: Matthew Wilcox (Oracle) <[email protected]>
Reported-by: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm/userfaultfd: replace kmap/kmap_atomic() with kmap_local_page()
Ira Weiny [Mon, 24 Oct 2022 04:34:52 +0000 (21:34 -0700)]
mm/userfaultfd: replace kmap/kmap_atomic() with kmap_local_page()

kmap() and kmap_atomic() are being deprecated in favor of
kmap_local_page() which is appropriate for any thread local context.[1]

A recent locking bug report with userfaultfd showed that the conversion of
the kmap_atomic()'s in those code flows requires care with regard to the
prevention of deadlock.[2]

git archaeology implied that the recursion may not be an actual bug.[3]
However, depending on the implementation of the mmap_lock and the
condition of the call there may still be a deadlock.[4] So this is not
purely a lockdep issue.  Considering a single threaded call stack there
are 3 options.

1) Different mm's are in play (no issue)
2) Readlock implementation is recursive and same mm is in play
   (no issue)
3) Readlock implementation is _not_ recursive (issue)

The mmap_lock is recursive so with a single thread there is no issue.

However, Matthew pointed out a deadlock scenario when you consider
additional process' and threads thusly.

"The readlock implementation is only recursive if nobody else has taken a
write lock.  If you have a multithreaded process, one of the other threads
can call mmap() and that will prevent recursion (due to fairness).  Even
if it's a different process that you're trying to acquire the mmap read
lock on, you can still get into a deadly embrace.  eg:

process A thread 1 takes read lock on own mmap_lock
process A thread 2 calls mmap, blocks taking write lock
process B thread 1 takes page fault, read lock on own mmap lock
process B thread 2 calls mmap, blocks taking write lock
process A thread 1 blocks taking read lock on process B
process B thread 1 blocks taking read lock on process A

Now all four threads are blocked waiting for each other."

Regardless using pagefault_disable() ensures that no matter what locking
implementation is used a deadlock will not occur.

Complete kmap conversion in userfaultfd by replacing the kmap() and
kmap_atomic() calls with kmap_local_page().  When replacing the
kmap_atomic() call ensure page faults continue to be disabled to support
the correct fall back behavior and add a comment to inform future souls of
the requirement.

[1] https://lore.kernel.org/all/20220813220034[email protected]/
[2] https://lore.kernel.org/all/Y1Mh2S7fUGQ%2FiKFR@iweiny-desk3/
[3] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/
[4] https://lore.kernel.org/lkml/Y1bXBtGTCym77%[email protected]/

[[email protected]: v2]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agox86: fortify: kmsan: fix KMSAN fortify builds
Alexander Potapenko [Mon, 24 Oct 2022 21:21:44 +0000 (23:21 +0200)]
x86: fortify: kmsan: fix KMSAN fortify builds

Ensure that KMSAN builds replace memset/memcpy/memmove calls with the
respective __msan_XXX functions, and that none of the macros are redefined
twice.  This should allow building kernel with both CONFIG_KMSAN and
CONFIG_FORTIFY_SOURCE.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.com/google/kmsan/issues/89
Signed-off-by: Alexander Potapenko <[email protected]>
Reported-by: Tamas K Lengyel <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agox86: asm: make sure __put_user_size() evaluates pointer once
Alexander Potapenko [Mon, 24 Oct 2022 21:21:43 +0000 (23:21 +0200)]
x86: asm: make sure __put_user_size() evaluates pointer once

User access macros must ensure their arguments are evaluated only once if
they are used more than once in the macro body.  Adding
instrument_put_user() to __put_user_size() resulted in double evaluation
of the `ptr` argument, which led to correctness issues when performing
e.g.  unsafe_put_user(..., p++, ...).

To fix those issues, evaluate the `ptr` argument of __put_user_size() at
the beginning of the macro.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 888f84a6da4d ("x86: asm: instrument usercopy in get_user() and put_user()")
Signed-off-by: Alexander Potapenko <[email protected]>
Reported-by: youling257 <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agoKconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default
Alexander Potapenko [Mon, 24 Oct 2022 21:21:42 +0000 (23:21 +0200)]
Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default

KMSAN adds a lot of instrumentation to the code, which results in
increased stack usage (up to 2048 bytes and more in some cases).  It's
hard to predict how big the stack frames can be, so we disable the
warnings for KMSAN instead.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.com/google/kmsan/issues/89
Signed-off-by: Alexander Potapenko <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agox86/purgatory: disable KMSAN instrumentation
Alexander Potapenko [Mon, 24 Oct 2022 21:21:41 +0000 (23:21 +0200)]
x86/purgatory: disable KMSAN instrumentation

The stand-alone purgatory.ro does not contain the KMSAN runtime, therefore
it can't be built with KMSAN compiler instrumentation.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.com/google/kmsan/issues/89
Signed-off-by: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm: kmsan: export kmsan_copy_page_meta()
Alexander Potapenko [Mon, 24 Oct 2022 21:21:40 +0000 (23:21 +0200)]
mm: kmsan: export kmsan_copy_page_meta()

Certain modules call copy_user_highpage(), which calls
kmsan_copy_page_meta() under KMSAN, so we need to export the latter.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.com/google/kmsan/issues/89
Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations")
Signed-off-by: Alexander Potapenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm: migrate: fix return value if all subpages of THPs are migrated successfully
Baolin Wang [Mon, 24 Oct 2022 08:34:21 +0000 (16:34 +0800)]
mm: migrate: fix return value if all subpages of THPs are migrated successfully

During THP migration, if THPs are not migrated but they are split and all
subpages are migrated successfully, migrate_pages() will still return the
number of THP pages that were not migrated.  This will confuse the callers
of migrate_pages().  For example, the longterm pinning will failed though
all pages are migrated successfully.

Thus we should return 0 to indicate that all pages are migrated in this
case

Link: https://lkml.kernel.org/r/de386aa864be9158d2f3b344091419ea7c38b2f7.1666599848.git.baolin.wang@linux.alibaba.com
Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Baolin Wang <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm/uffd: fix vma check on userfault for wp
Peter Xu [Mon, 24 Oct 2022 19:33:35 +0000 (15:33 -0400)]
mm/uffd: fix vma check on userfault for wp

We used to have a report that pte-marker code can be reached even when
uffd-wp is not compiled in for file memories, here:

https://lore.kernel.org/all/YzeR+R6b4bwBlBHh@x1n/T/#u

I just got time to revisit this and found that the root cause is we simply
messed up with the vma check, so that for !PTE_MARKER_UFFD_WP system, we
will allow UFFDIO_REGISTER of MINOR & WP upon shmem as the check was
wrong:

    if (vm_flags & VM_UFFD_MINOR)
        return is_vm_hugetlb_page(vma) || vma_is_shmem(vma);

Where we'll allow anything to pass on shmem as long as minor mode is
requested.

Axel did it right when introducing minor mode but I messed it up in
b1f9e876862d when moving code around.  Fix it.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: Peter Xu <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm: prep_compound_tail() clear page->private
Hugh Dickins [Sat, 22 Oct 2022 07:51:06 +0000 (00:51 -0700)]
mm: prep_compound_tail() clear page->private

Although page allocation always clears page->private in the first page or
head page of an allocation, it has never made a point of clearing
page->private in the tails (though 0 is often what is already there).

But now commit 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t
during THP split") issues a warning when page_tail->private is found to be
non-0 (unless it's swapcache).

Change that warning to dump page_tail (which also dumps head), instead of
just the head: so far we have seen dead000000000122dead000000000003,
dead000000000001 or 0000000000000002 in the raw output for tail private.

We could just delete the warning, but today's consensus appears to want
page->private to be 0, unless there's a good reason for it to be set: so
now clear it in prep_compound_tail() (more general than just for THP; but
not for high order allocation, which makes no pass down the tails).

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t during THP split")
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
Rik van Riel [Fri, 21 Oct 2022 23:28:05 +0000 (19:28 -0400)]
mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs

A common use case for hugetlbfs is for the application to create
memory pools backed by huge pages, which then get handed over to
some malloc library (eg. jemalloc) for further management.

That malloc library may be doing MADV_DONTNEED calls on memory
that is no longer needed, expecting those calls to happen on
PAGE_SIZE boundaries.

However, currently the MADV_DONTNEED code rounds up any such
requests to HPAGE_PMD_SIZE boundaries. This leads to undesired
outcomes when jemalloc expects a 4kB MADV_DONTNEED, but 2MB of
memory get zeroed out, instead.

Use of pre-built shared libraries means that user code does not
always know the page size of every memory arena in use.

Avoid unexpected data loss with MADV_DONTNEED by rounding up
only to PAGE_SIZE (in do_madvise), and rounding down to huge
page granularity.

That way programs will only get as much memory zeroed out as
they requested.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Rik van Riel <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm/page_isolation: fix clang deadcode warning
Maria Yu [Fri, 21 Oct 2022 10:15:55 +0000 (18:15 +0800)]
mm/page_isolation: fix clang deadcode warning

When !CONFIG_VM_BUG_ON, there is warning of
clang-analyzer-deadcode.DeadStores:
Value stored to 'mt' during its initialization is never read.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Maria Yu <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Doug Berger <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agofs/ext4/super.c: remove unused `deprecated_msg'
Andrew Morton [Fri, 21 Oct 2022 15:05:49 +0000 (08:05 -0700)]
fs/ext4/super.c: remove unused `deprecated_msg'

fs/ext4/super.c:1744:19: warning: 'deprecated_msg' defined but not used [-Wunused-const-variable=]

Reported-by: kernel test robot <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agoipc/msg.c: fix percpu_counter use after free
Andrew Morton [Fri, 21 Oct 2022 04:19:22 +0000 (21:19 -0700)]
ipc/msg.c: fix percpu_counter use after free

These percpu counters are referenced in free_ipcs->freeque, so destroy
them later.

Fixes: 72d1e611082e ("ipc/msg: mitigate the lock contention with percpu counter")
Reported-by: [email protected]
Tested-by: Mark Rutland <[email protected]>
Cc: Jiebin Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomemory tier, sysfs: rename attribute "nodes" to "nodelist"
Huang Ying [Thu, 20 Oct 2022 01:51:22 +0000 (09:51 +0800)]
memory tier, sysfs: rename attribute "nodes" to "nodelist"

In sysfs, we use attribute name "cpumap" or "cpus" for cpu mask and
"cpulist" or "cpus_list" for cpu list.  For example, in my system,

 $ cat /sys/devices/system/node/node0/cpumap
 f,ffffffff
 $ cat /sys/devices/system/cpu/cpu2/topology/core_cpus
 0,00100004
 $ cat cat /sys/devices/system/node/node0/cpulist
 0-35
 $ cat /sys/devices/system/cpu/cpu2/topology/core_cpus_list
 2,20

It looks reasonable to use "nodemap" for node mask and "nodelist" for
node list.  So, rename the attribute to follow the naming convention.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 9832fb87834e2b ("mm/demotion: expose memory tier details via sysfs")
Signed-off-by: "Huang, Ying" <[email protected]>
Acked-by: Wei Xu <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Bharata B Rao <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Hesham Almatary <[email protected]>
Cc: Jagdish Gediya <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Tim Chen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agoMAINTAINERS: git://github.com -> https://github.com for nilfs2
Palmer Dabbelt [Thu, 20 Oct 2022 02:42:55 +0000 (11:42 +0900)]
MAINTAINERS: git://github.com -> https://github.com for nilfs2

Github deprecated the git:// links about a year ago, so let's move to the
https:// URLs instead.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.blog/2021-09-01-improving-git-protocol-security-github/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: Conor Dooley <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agomm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops
Waiman Long [Thu, 20 Oct 2022 17:56:19 +0000 (13:56 -0400)]
mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops

Commit 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object
iteration loop of kmemleak_scan()") adds cond_resched() in the first
object iteration loop of kmemleak_scan().  However, it turns that the 2nd
objection iteration loop can still cause soft lockup to happen in some
cases.  So add a cond_resched() call in the 2nd and 3rd loops as well to
prevent that and for completeness.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()")
Signed-off-by: Waiman Long <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agosquashfs: fix buffer release race condition in readahead code
Phillip Lougher [Thu, 20 Oct 2022 22:36:16 +0000 (23:36 +0100)]
squashfs: fix buffer release race condition in readahead code

Fix a buffer release race condition, where the error value was used after
release.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: b09a7a036d20 ("squashfs: support reading fragments in readahead call")
Signed-off-by: Phillip Lougher <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Reported-by: Marc Miltenberger <[email protected]>
Cc: Dimitri John Ledkov <[email protected]>
Cc: Hsin-Yi Wang <[email protected]>
Cc: Mirsad Goran Todorovac <[email protected]>
Cc: Slade Watkins <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agosquashfs: fix extending readahead beyond end of file
Phillip Lougher [Thu, 20 Oct 2022 22:36:15 +0000 (23:36 +0100)]
squashfs: fix extending readahead beyond end of file

The readahead code will try to extend readahead to the entire size of the
Squashfs data block.

But, it didn't take into account that the last block at the end of the
file may not be a whole block.  In this case, the code would extend
readahead to beyond the end of the file, leaving trailing pages.

Fix this by only requesting the expected number of pages.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
Signed-off-by: Phillip Lougher <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Reported-by: Marc Miltenberger <[email protected]>
Cc: Dimitri John Ledkov <[email protected]>
Cc: Hsin-Yi Wang <[email protected]>
Cc: Mirsad Goran Todorovac <[email protected]>
Cc: Slade Watkins <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agosquashfs: fix read regression introduced in readahead code
Phillip Lougher [Thu, 20 Oct 2022 22:36:14 +0000 (23:36 +0100)]
squashfs: fix read regression introduced in readahead code

Patch series "squashfs: fix some regressions introduced in the readahead
code".

This patchset fixes 3 regressions introduced by the recent readahead code
changes.  The first regression is causing "snaps" to randomly fail after a
couple of hours or days, which how the regression came to light.

This patch (of 3):

If a file isn't a whole multiple of the page size, the last page will have
trailing bytes unfilled.

There was a mistake in the readahead code which did this.  In particular
it incorrectly assumed that the last page in the readahead page array
(page[nr_pages - 1]) will always contain the last page in the block, which
if we're at file end, will be the page that needs to be zero filled.

But the readahead code may not return the last page in the block, which
means it is unmapped and will be skipped by the decompressors (a temporary
buffer used).

In this case the zero filling code will zero out the wrong page, leading
to data corruption.

Fix this by by extending the "page actor" to return the last page if
present, or NULL if a temporary buffer was used.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Phillip Lougher <[email protected]>
Reported-by: Mirsad Goran Todorovac <[email protected]>
Tested-by: Mirsad Goran Todorovac <[email protected]>
Tested-by: Slade Watkins <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Reported-by: Marc Miltenberger <[email protected]>
Cc: Dimitri John Ledkov <[email protected]>
Cc: Hsin-Yi Wang <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 years agoMerge tag 'rtc-6.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni...
Linus Torvalds [Fri, 28 Oct 2022 20:24:57 +0000 (13:24 -0700)]
Merge tag 'rtc-6.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC fixes from Alexandre Belloni:
 "Fix wakeup support that broke on multiple platforms"

* tag 'rtc-6.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: cmos: fix build on non-ACPI platforms
  rtc: cmos: Fix wake alarm breakage

2 years agoMerge tag 'mmc-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 28 Oct 2022 20:00:54 +0000 (13:00 -0700)]
Merge tag 'mmc-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Cancel recovery work on cleanup to avoid NULL pointer dereference
   - Fix error path in the read/write error recovery path
   - Fix kernel panic when remove non-standard SDIO card
   - Fix WRITE_ZEROES handling for CQE

  MMC host:
   - sdhci_am654: Fixup Kconfig dependency for REGMAP_MMIO
   - sdhci-esdhc-imx: Avoid warning of misconfigured bus-width
   - sdhci-pci: Disable broken HS400 ES mode for ASUS BIOS on Jasper
     Lake"

* tag 'mmc-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci_am654: 'select', not 'depends' REGMAP_MMIO
  mmc: core: Fix WRITE_ZEROES CQE handling
  mmc: core: Fix kernel panic when remove non-standard SDIO card
  mmc: sdhci-pci-core: Disable ES for ASUS BIOS on Jasper Lake
  mmc: sdhci-esdhc-imx: Propagate ESDHC_FLAG_HS400* only on 8bit bus
  mmc: queue: Cancel recovery work on cleanup
  mmc: block: Remove error check of hw_reset on reset

2 years agoMerge tag 'mtd/fixes-for-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 28 Oct 2022 19:24:19 +0000 (12:24 -0700)]
Merge tag 'mtd/fixes-for-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull mtd fixes from Miquel Raynal:
 "MTD core:
   - partitions: Add missing of_node_get() in dynamic partitions code

  Parser drivers:
   - bcm47xxpart: Fix halfblock reads

  Raw NAND controller drivers:
   - marvell: Use correct logic for nand-keep-config
   - tegra: Fix PM disable depth imbalance in probe
   - intel: Add missing of_node_put() in ebu_nand_probe()

  SPI-NOR core changes:
   - Ignore -ENOTSUPP in spi_nor_init()"

* tag 'mtd/fixes-for-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: parsers: bcm47xxpart: Fix halfblock reads
  mtd: rawnand: marvell: Use correct logic for nand-keep-config
  mtd: rawnand: tegra: Fix PM disable depth imbalance in probe
  mtd: rawnand: intel: Add missing of_node_put() in ebu_nand_probe()
  mtd: core: add missing of_node_get() in dynamic partitions code
  mtd: spi-nor: core: Ignore -ENOTSUPP in spi_nor_init()

2 years agoMerge tag 'sound-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 28 Oct 2022 19:20:31 +0000 (12:20 -0700)]
Merge tag 'sound-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes:

   - fixes for regressions by the recent ALSA control hash usages

   - fixes for UAF with del_timer() at removals in a few drivers

   - char signedness fixes

   - a few memory leak fixes in error paths

   - device-specific fixes / quirks for Intel SOF, AMD, HD-audio,
     USB-audio, and various ASoC codecs"

* tag 'sound-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (50 commits)
  ALSA: aoa: Fix I2S device accounting
  ALSA: Use del_timer_sync() before freeing timer
  ALSA: aoa: i2sbus: fix possible memory leak in i2sbus_add_dev()
  ALSA: rme9652: use explicitly signed char
  ALSA: au88x0: use explicitly signed char
  ALSA: hda/realtek: Add another HP ZBook G9 model quirks
  ALSA: usb-audio: Add quirks for M-Audio Fast Track C400/600
  ASoC: SOF: Intel: hda-codec: fix possible memory leak in hda_codec_device_init()
  ASoC: amd: yc: Add Lenovo Thinkbook 14+ 2022 21D0 to quirks table
  ASoC: Intel: Skylake: fix possible memory leak in skl_codec_device_init()
  ALSA: ac97: Use snd_ctl_rename() to rename a control
  ALSA: ca0106: Use snd_ctl_rename() to rename a control
  ALSA: emu10k1: Use snd_ctl_rename() to rename a control
  ALSA: hda/realtek: Use snd_ctl_rename() to rename a control
  ALSA: usb-audio: Use snd_ctl_rename() to rename a control
  ALSA: control: add snd_ctl_rename()
  ALSA: ac97: fix possible memory leak in snd_ac97_dev_register()
  ASoC: SOF: Intel: pci-tgl: fix ADL-N descriptor
  ASoC: qcom: lpass-cpu: Mark HDMI TX parity register as volatile
  ASoC: amd: yc: Adding Lenovo ThinkBook 14 Gen 4+ ARA and Lenovo ThinkBook 16 Gen 4+ ARA to the Quirks List
  ...

2 years agoMerge tag 'drm-fixes-2022-10-28' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 28 Oct 2022 19:10:43 +0000 (12:10 -0700)]
Merge tag 'drm-fixes-2022-10-28' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regularly scheduled fixes for drm, live from a Red Hat office for the
  first time in a while.

  The core has two fixes, one for scheduler leak and one for aperture
  uninit read.

  Otherwise a single bridge fix, and msm, amdgpu/kfd and i915 have a set
  of fixes each.

  sched:
   - Stop leaking fences when killing a sched entity.

  aperture:
   - Avoid uninitialized read in aperture_remove_conflicting_pci_device()

  bridge:
   - Fix HPD on bridge/ps8640.

  msm:
   - Fix shrinker deadlock
   - Fix crash during suspend after unbind
   - Fix IRQ lifetime issues
   - Fix potential memory corruption with too many bridges
   - Fix memory corruption on GPU state capture

  amdgpu:
   - Stable pstate fix
   - SMU 13.x updates
   - SR-IOV fixes
   - PCI AER fix
   - GC 11.x fixes
   - Display fixes
   - Expose IMU firmware version for debugging
   - Plane modifier fix
   - S0i3 fix

  amdkfd:
   - Fix possible memory leak
   - Fix GC 10.x cache info reporting

  i915:
   - Extend Wa_1607297627 to Alderlake-P
   - Keep PCI autosuspend control 'on' by default on all dGPU
   - Reset frl trained flag before restarting FRL training"

* tag 'drm-fixes-2022-10-28' of git://anongit.freedesktop.org/drm/drm: (39 commits)
  fbdev/core: Avoid uninitialized read in aperture_remove_conflicting_pci_device()
  drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resume
  drm/scheduler: fix fence ref counting
  drm/amd/display: Revert logic for plane modifiers
  drm/amdkfd: correct the cache info for gfx1036
  drm/amdkfd: update gfx1037 Lx cache setting
  drm/amdgpu: skip mes self test for gc 11.0.3 in recover
  drm/amd: Add IMU fw version to fw version queries
  drm/amd/display: Don't return false if no stream
  drm/amd/display: Remove wrong pipe control lock
  drm/amd/pm: allow gfxoff on gc_11_0_3
  drm/amdkfd: Fix memory leak in kfd_mem_dmamap_userptr()
  drm/amdgpu: Remove ATC L2 access for MMHUB 2.1.x
  drm/i915/dp: Reset frl trained flag before restarting FRL training
  drm/i915/dgfx: Keep PCI autosuspend control 'on' by default on all dGPU
  drm/i915: Extend Wa_1607297627 to Alderlake-P
  drm/amdgpu: Adjust MES polling timeout for sriov
  drm/amd/pm: update driver-if header for smu_v13_0_10
  drm/amdgpu: fix pstate setting issue
  drm/bridge: ps8640: Add back the 50 ms mystery delay after HPD
  ...

2 years agoMerge tag 'v6.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 28 Oct 2022 16:53:30 +0000 (09:53 -0700)]
Merge tag 'v6.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "Fix an alignment crash in x86/polyval"

* tag 'v6.1-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/polyval - Fix crashes when keys are not 16-byte aligned

2 years agoRDMA/qedr: clean up work queue on failure in qedr_alloc_resources()
Dan Carpenter [Tue, 25 Oct 2022 15:32:32 +0000 (18:32 +0300)]
RDMA/qedr: clean up work queue on failure in qedr_alloc_resources()

Add a check for if create_singlethread_workqueue() fails and also destroy
the work queue on failure paths.

Fixes: e411e0587e0d ("RDMA/qedr: Add iWARP connection management functions")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/Y1gBkDucQhhWj5YM@kili
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
2 years agoRDMA/core: Fix null-ptr-deref in ib_core_cleanup()
Chen Zhongjin [Tue, 25 Oct 2022 02:41:46 +0000 (10:41 +0800)]
RDMA/core: Fix null-ptr-deref in ib_core_cleanup()

KASAN reported a null-ptr-deref error:

  KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
  CPU: 1 PID: 379
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  RIP: 0010:destroy_workqueue+0x2f/0x740
  RSP: 0018:ffff888016137df8 EFLAGS: 00000202
  ...
  Call Trace:
   ib_core_cleanup+0xa/0xa1 [ib_core]
   __do_sys_delete_module.constprop.0+0x34f/0x5b0
   do_syscall_64+0x3a/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7fa1a0d221b7
  ...

It is because the fail of roce_gid_mgmt_init() is ignored:

 ib_core_init()
   roce_gid_mgmt_init()
     gid_cache_wq = alloc_ordered_workqueue # fail
 ...
 ib_core_cleanup()
   roce_gid_mgmt_cleanup()
     destroy_workqueue(gid_cache_wq)
     # destroy an unallocated wq

Fix this by catching the fail of roce_gid_mgmt_init() in ib_core_init().

Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Signed-off-by: Chen Zhongjin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
2 years agoMAINTAINERS: Change myself to a maintainer
Matti Vaittinen [Fri, 28 Oct 2022 05:15:45 +0000 (08:15 +0300)]
MAINTAINERS: Change myself to a maintainer

After some off-list discussion with Marek Vasut and Geert Uytterhoeven
and finally a kx022a driver related discussion with Joe Perches
https://lore.kernel.org/lkml/92c3f72e60bc99bf4a21da259b4d78c1bdca447d[email protected]/
it seems that my status as a reviewer has been wrong. I do look after
the ROHM/Kionix drivers I've authored and currently I am also paid to do
so as is reflected by the 'S: Supported'. According to Joe, the reviewer
entry in MAINTAINERS do not indicate such level of support and having a
reviewer supporting an IC is a contradiction.

Switch undersigned from a reviewer to a maintainer for IC drivers I am
taking care of.

Signed-off-by: Matti Vaittinen <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
2 years agoMerge branches 'acpi-resource', 'acpi-pcc' and 'devprop'
Rafael J. Wysocki [Fri, 28 Oct 2022 14:37:02 +0000 (16:37 +0200)]
Merge branches 'acpi-resource', 'acpi-pcc' and 'devprop'

Merge an IRQ override quirk, an ACPI PCC code fix and a device
properties documentation update for 6.1-rc3:

 - Make the ACPI device resources code skip IRQ override on Asus
   Vivobook S5602ZA (Tamim Khan).

 - Fix a possible integer overflow during multiplication in the ACPI
   PCC code (Manank Patel).

 - Fix the documentation of the *_match_string() family of functions to
   properly cover the return value (Andy Shevchenko).

* acpi-resource:
  ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA

* acpi-pcc:
  ACPI: PCC: Fix unintentional integer overflow

* devprop:
  device property: Fix documentation for *_match_string() APIs

2 years agoMerge branches 'pm-sleep', 'pm-domains' and 'pm-tools'
Rafael J. Wysocki [Fri, 28 Oct 2022 14:31:25 +0000 (16:31 +0200)]
Merge branches 'pm-sleep', 'pm-domains' and 'pm-tools'

Merge a hiberantion-related fix, a generic power domains code fix and
a pm-graph update for 6.1-rc1:

 - Allow hybrid sleep to use suspend-to-idle as a system suspend method
   if it is the current suspend method of choice (Mario Limonciello).

 - Fix handling of unavailable/disabled idle states in the generic
   power domains code (Sudeep Holla).

 - Update the pm-graph suite of utilities to version 5.10 which is
   fixes-mostly and does not add any new features (Todd Brandt).

* pm-sleep:
  PM: hibernate: Allow hybrid sleep to work with s2idle

* pm-domains:
  PM: domains: Fix handling of unavailable/disabled idle states

* pm-tools:
  pm-graph v5.10

2 years agoblk-mq: Properly init requests from blk_mq_alloc_request_hctx()
John Garry [Wed, 26 Oct 2022 10:35:13 +0000 (18:35 +0800)]
blk-mq: Properly init requests from blk_mq_alloc_request_hctx()

Function blk_mq_alloc_request_hctx() is missing zeroing/init of rq->bio,
biotail, __sector, and __data_len members, which blk_mq_alloc_request()
has, so duplicate what we do in blk_mq_alloc_request().

Fixes: 1f5bd336b9150 ("blk-mq: add blk_mq_alloc_request_hctx")
Signed-off-by: John Garry <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
2 years agoKVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign()
Eiichi Tsukata [Fri, 28 Oct 2022 09:26:31 +0000 (09:26 +0000)]
KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign()

Should not call eventfd_ctx_put() in case of error.

Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Reported-by: [email protected]
Signed-off-by: Eiichi Tsukata <[email protected]>
Message-Id: <20221028092631[email protected]>
[Introduce new goto target instead. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agocapabilities: fix potential memleak on error path from vfs_getxattr_alloc()
Gaosheng Cui [Tue, 25 Oct 2022 13:33:57 +0000 (21:33 +0800)]
capabilities: fix potential memleak on error path from vfs_getxattr_alloc()

In cap_inode_getsecurity(), we will use vfs_getxattr_alloc() to
complete the memory allocation of tmpbuf, if we have completed
the memory allocation of tmpbuf, but failed to call handler->get(...),
there will be a memleak in below logic:

  |-- ret = (int)vfs_getxattr_alloc(mnt_userns, ...)
    |           /* ^^^ alloc for tmpbuf */
    |-- value = krealloc(*xattr_value, error + 1, flags)
    |           /* ^^^ alloc memory */
    |-- error = handler->get(handler, ...)
    |           /* error! */
    |-- *xattr_value = value
    |           /* xattr_value is &tmpbuf (memory leak!) */

So we will try to free(tmpbuf) after vfs_getxattr_alloc() fails to fix it.

Cc: [email protected]
Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
Signed-off-by: Gaosheng Cui <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
[PM: subject line and backtrace tweaks]
Signed-off-by: Paul Moore <[email protected]>
2 years agonet: emaclite: update reset_lock member documentation
Radhey Shyam Pandey [Wed, 26 Oct 2022 15:15:24 +0000 (20:45 +0530)]
net: emaclite: update reset_lock member documentation

Instead of generic description, mention what reset_lock actually
protects i.e. lock to serialize xmit and tx_timeout execution.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2 years agoKVM: x86: smm: number of GPRs in the SMRAM image depends on the image format
Maxim Levitsky [Tue, 25 Oct 2022 12:47:32 +0000 (15:47 +0300)]
KVM: x86: smm: number of GPRs in the SMRAM image depends on the image format

On 64 bit host, if the guest doesn't have X86_FEATURE_LM, KVM will
access 16 gprs to 32-bit smram image, causing out-ouf-bound ram
access.

On 32 bit host, the rsm_load_state_64/enter_smm_save_state_64
is compiled out, thus access overflow can't happen.

Fixes: b443183a25ab61 ("KVM: x86: Reduce the number of emulator GPRs to '8' for 32-bit KVM")
Signed-off-by: Maxim Levitsky <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Message-Id: <20221025124741[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: x86: emulator: update the emulation mode after CR0 write
Maxim Levitsky [Tue, 25 Oct 2022 12:47:31 +0000 (15:47 +0300)]
KVM: x86: emulator: update the emulation mode after CR0 write

Update the emulation mode when handling writes to CR0, because
toggling CR0.PE switches between Real and Protected Mode, and toggling
CR0.PG when EFER.LME=1 switches between Long and Protected Mode.

This is likely a benign bug because there is no writeback of state,
other than the RIP increment, and when toggling CR0.PE, the CPU has
to execute code from a very low memory address.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20221025124741[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: x86: emulator: update the emulation mode after rsm
Maxim Levitsky [Tue, 25 Oct 2022 12:47:30 +0000 (15:47 +0300)]
KVM: x86: emulator: update the emulation mode after rsm

Update the emulation mode after RSM so that RIP will be correctly
written back, because the RSM instruction can switch the CPU mode from
32 bit (or less) to 64 bit.

This fixes a guest crash in case the #SMI is received while the guest
runs a code from an address > 32 bit.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20221025124741[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: x86: emulator: introduce emulator_recalc_and_set_mode
Maxim Levitsky [Tue, 25 Oct 2022 12:47:29 +0000 (15:47 +0300)]
KVM: x86: emulator: introduce emulator_recalc_and_set_mode

Some instructions update the cpu execution mode, which needs to update the
emulation mode.

Extract this code, and make assign_eip_far use it.

assign_eip_far now reads CS, instead of getting it via a parameter,
which is ok, because callers always assign CS to the same value
before calling this function.

No functional change is intended.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20221025124741[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: x86: emulator: em_sysexit should update ctxt->mode
Maxim Levitsky [Tue, 25 Oct 2022 12:47:28 +0000 (15:47 +0300)]
KVM: x86: emulator: em_sysexit should update ctxt->mode

SYSEXIT is one of the instructions that can change the
processor mode, thus ctxt->mode should be updated after it.

Note that this is likely a benign bug, because the only problematic
mode change is from 32 bit to 64 bit which can lead to truncation of RIP,
and it is not possible to do with sysexit,
since sysexit running in 32 bit mode will be limited to 32 bit version.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20221025124741[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_test
Sean Christopherson [Thu, 13 Oct 2022 21:12:34 +0000 (21:12 +0000)]
KVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_test

Tag "guest_saw_irq" as "volatile" to ensure that the compiler will never
optimize away lookups.  Relying on the compiler thinking that the flag
is global and thus might change also works, but it's subtle, less robust,
and looks like a bug at first glance, e.g. risks being "fixed" and
breaking the test.

Make the flag "static" as well since convincing the compiler it's global
is no longer necessary.

Alternatively, the flag could be accessed with {READ,WRITE}_ONCE(), but
literally every access would need the wrappers, and eking out performance
isn't exactly top priority for selftests.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20221013211234.1318131[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agoKVM: selftests: Add tests in xen_shinfo_test to detect lock races
Michal Luczaj [Thu, 13 Oct 2022 21:12:33 +0000 (21:12 +0000)]
KVM: selftests: Add tests in xen_shinfo_test to detect lock races

Tests for races between shinfo_cache (de)activation and hypercall+ioctl()
processing.  KVM has had bugs where activating the shared info cache
multiple times and/or with concurrent users results in lock corruption,
NULL pointer dereferences, and other fun.

For the timer injection testcase (#22), re-arm the timer until the IRQ
is successfully injected.  If the timer expires while the shared info
is deactivated (invalid), KVM will drop the event.

Signed-off-by: Michal Luczaj <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20221013211234.1318131[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2 years agonet: dsa: Fix possible memory leaks in dsa_loop_init()
Chen Zhongjin [Wed, 26 Oct 2022 02:03:21 +0000 (10:03 +0800)]
net: dsa: Fix possible memory leaks in dsa_loop_init()

kmemleak reported memory leaks in dsa_loop_init():

kmemleak: 12 new suspected memory leaks

unreferenced object 0xffff8880138ce000 (size 2048):
  comm "modprobe", pid 390, jiffies 4295040478 (age 238.976s)
  backtrace:
    [<000000006a94f1d5>] kmalloc_trace+0x26/0x60
    [<00000000a9c44622>] phy_device_create+0x5d/0x970
    [<00000000d0ee2afc>] get_phy_device+0xf3/0x2b0
    [<00000000dca0c71f>] __fixed_phy_register.part.0+0x92/0x4e0
    [<000000008a834798>] fixed_phy_register+0x84/0xb0
    [<0000000055223fcb>] dsa_loop_init+0xa9/0x116 [dsa_loop]
    ...

There are two reasons for memleak in dsa_loop_init().

First, fixed_phy_register() create and register phy_device:

fixed_phy_register()
  get_phy_device()
    phy_device_create() # freed by phy_device_free()
  phy_device_register() # freed by phy_device_remove()

But fixed_phy_unregister() only calls phy_device_remove().
So the memory allocated in phy_device_create() is leaked.

Second, when mdio_driver_register() fail in dsa_loop_init(),
it just returns and there is no cleanup for phydevs.

Fix the problems by catching the error of mdio_driver_register()
in dsa_loop_init(), then calling both fixed_phy_unregister() and
phy_device_free() to release phydevs.
Also add a function for phydevs cleanup to avoid duplacate.

Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver")
Signed-off-by: Chen Zhongjin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2 years agocifs: fix use-after-free caused by invalid pointer `hostname`
Zeng Heng [Thu, 27 Oct 2022 12:45:28 +0000 (20:45 +0800)]
cifs: fix use-after-free caused by invalid pointer `hostname`

`hostname` needs to be set as null-pointer after free in
`cifs_put_tcp_session` function, or when `cifsd` thread attempts
to resolve hostname and reconnect the host, the thread would deref
the invalid pointer.

Here is one of practical backtrace examples as reference:

Task 477
---------------------------
 do_mount
  path_mount
   do_new_mount
    vfs_get_tree
     smb3_get_tree
      smb3_get_tree_common
       cifs_smb3_do_mount
        cifs_mount
         mount_put_conns
          cifs_put_tcp_session
          --> kfree(server->hostname)

cifsd
---------------------------
 kthread
  cifs_demultiplex_thread
   cifs_reconnect
    reconn_set_ipaddr_from_hostname
    --> if (!server->hostname)
    --> if (server->hostname[0] == '\0')  // !! UAF fault here

CIFS: VFS: cifs_mount failed w/return code = -112
mount error(112): Host is down
BUG: KASAN: use-after-free in reconn_set_ipaddr_from_hostname+0x2ba/0x310
Read of size 1 at addr ffff888108f35380 by task cifsd/480
CPU: 2 PID: 480 Comm: cifsd Not tainted 6.1.0-rc2-00106-gf705792f89dd-dirty #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0x85
 print_report+0x16c/0x4a3
 kasan_report+0x95/0x190
 reconn_set_ipaddr_from_hostname+0x2ba/0x310
 __cifs_reconnect.part.0+0x241/0x800
 cifs_reconnect+0x65f/0xb60
 cifs_demultiplex_thread+0x1570/0x2570
 kthread+0x2c5/0x380
 ret_from_fork+0x22/0x30
 </TASK>
Allocated by task 477:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7e/0x90
 __kmalloc_node_track_caller+0x52/0x1b0
 kstrdup+0x3b/0x70
 cifs_get_tcp_session+0xbc/0x19b0
 mount_get_conns+0xa9/0x10c0
 cifs_mount+0xdf/0x1970
 cifs_smb3_do_mount+0x295/0x1660
 smb3_get_tree+0x352/0x5e0
 vfs_get_tree+0x8e/0x2e0
 path_mount+0xf8c/0x1990
 do_mount+0xee/0x110
 __x64_sys_mount+0x14b/0x1f0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 477:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x50
 __kasan_slab_free+0x10a/0x190
 __kmem_cache_free+0xca/0x3f0
 cifs_put_tcp_session+0x30c/0x450
 cifs_mount+0xf95/0x1970
 cifs_smb3_do_mount+0x295/0x1660
 smb3_get_tree+0x352/0x5e0
 vfs_get_tree+0x8e/0x2e0
 path_mount+0xf8c/0x1990
 do_mount+0xee/0x110
 __x64_sys_mount+0x14b/0x1f0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888108f35380
 which belongs to the cache kmalloc-16 of size 16
The buggy address is located 0 bytes inside of
 16-byte region [ffff888108f35380ffff888108f35390)
The buggy address belongs to the physical page:
page:00000000333f8e58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888108f350e0 pfn:0x108f35
flags: 0x200000000000200(slab|node=0|zone=2)
raw: 0200000000000200 0000000000000000 dead000000000122 ffff8881000423c0
raw: ffff888108f350e0 000000008080007a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 ffff888108f35280: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
 ffff888108f35300: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
>ffff888108f35380: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
                   ^
 ffff888108f35400: fa fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888108f35480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches")
Signed-off-by: Zeng Heng <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
2 years agoMerge tag 'drm-misc-fixes-2022-10-27' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 28 Oct 2022 02:58:33 +0000 (12:58 +1000)]
Merge tag 'drm-misc-fixes-2022-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v6.1-rc3:
- Fix HPD on bridge/ps8640.
- Stop leaking fences when killing a sched entity.
- Avoid uninitialized read in aperture_remove_conflicting_pci_device()

Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2 years agoMerge tag 'drm-intel-fixes-2022-10-27-1' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Fri, 28 Oct 2022 02:57:18 +0000 (12:57 +1000)]
Merge tag 'drm-intel-fixes-2022-10-27-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Extend Wa_1607297627 to Alderlake-P (José Roberto de Souza)
- Keep PCI autosuspend control 'on' by default on all dGPU (Anshuman Gupta)
- Reset frl trained flag before restarting FRL training (Ankit Nautiyal)

Signed-off-by: Dave Airlie <[email protected]>
From: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/Y1o+teE2Z11pT1MN@tursulin-desk
2 years agoRISC-V: Fix /proc/cpuinfo cpumask warning
Andrew Jones [Fri, 14 Oct 2022 15:58:44 +0000 (17:58 +0200)]
RISC-V: Fix /proc/cpuinfo cpumask warning

Commit 78e5a3399421 ("cpumask: fix checking valid cpu range") has
started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1
are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's
start and next seq operations implement a pattern like

  n = cpumask_next(n - 1, mask);
  show(n);
  while (1) {
      ++n;
      n = cpumask_next(n - 1, mask);
      if (n >= nr_cpu_ids)
          break;
      show(n);
  }

which will issue the warning when reading /proc/cpuinfo. Ensure no
warning is generated by validating the cpu index before calling
cpumask_next().

[*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled.

Signed-off-by: Andrew Jones <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Tested-by: Conor Dooley <[email protected]>
Acked-by: Yury Norov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Fixes: 78e5a3399421 ("cpumask: fix checking valid cpu range")
Cc: [email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
2 years agoMerge patch series "Fix RISC-V toolchain extension support detection"
Palmer Dabbelt [Thu, 27 Oct 2022 22:09:50 +0000 (15:09 -0700)]
Merge patch series "Fix RISC-V toolchain extension support detection"

Conor Dooley <[email protected]> says:

From: Conor Dooley <[email protected]>

This came up due to a report from Kevin @ kernel-ci, who had been
running a mixed configuration of GNU binutils and clang. Their compiler
was relatively recent & supports Zicbom but binutils @ 2.35.2 did not.

Our current checks for extension support only cover the compiler, but it
appears to me that we need to check both the compiler & linker support
in case of "pot-luck" configurations that mix different versions of
LD,AS,CC etc.

Linker support does not seem possible to actually check, since the ISA
string is emitted into the object files - so I put in version checks for
that. The checks have gotten a bit ugly since 32 & 64 bit support need
to be checked independently but ahh well.

As I was going, I fell into the trap of there being duplicated checks
for CC support in both the Makefile and Kconfig, so as part of renaming
the Kconfig symbol to TOOLCHAIN_HAS_FOO, I dropped the extra checks in
the Makefile. This has the added advantage of the TOOLCHAIN_HAS_FOO
symbol for Zihintpause appearing in .config.

I pushed out a version of this that specificly checked for assember
support for LKP to test & it looked /okay/ - but I did some more testing
today and realised that this is redudant & have since dropped the as
check.

I tested locally with a fair few different combinations, to try and
cover each of AS, LD, CC missing support for the extension.

* b4-shazam-merge:
  riscv: fix detection of toolchain Zihintpause support
  riscv: fix detection of toolchain Zicbom support

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
2 years agoriscv: fix detection of toolchain Zihintpause support
Conor Dooley [Thu, 6 Oct 2022 17:35:21 +0000 (18:35 +0100)]
riscv: fix detection of toolchain Zihintpause support

It is not sufficient to check if a toolchain supports a particular
extension without checking if the linker supports that extension
too. For example, Clang 15 supports Zihintpause but GNU bintutils
2.35.2 does not, leading build errors like so:

riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zihintpause'

Add a TOOLCHAIN_HAS_ZIHINTPAUSE which checks if each of the compiler,
assembler and linker support the extension. Replace the ifdef in the
vdso with one depending on this new symbol.

Fixes: 8eb060e10185 ("arch/riscv: add Zihintpause support")
Signed-off-by: Conor Dooley <[email protected]>
Reviewed-by: Heiko Stuebner <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
2 years agoriscv: fix detection of toolchain Zicbom support
Conor Dooley [Thu, 6 Oct 2022 17:35:20 +0000 (18:35 +0100)]
riscv: fix detection of toolchain Zicbom support

It is not sufficient to check if a toolchain supports a particular
extension without checking if the linker supports that extension too.
For example, Clang 15 supports Zicbom but GNU bintutils 2.35.2 does
not, leading build errors like so:

riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicbom1p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zicbom'

Convert CC_HAS_ZICBOM to TOOLCHAIN_HAS_ZICBOM & check if the linker
also supports Zicbom.

Reported-by: Kevin Hilman <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/1714
Link: https://storage.kernelci.org/next/master/next-20220920/riscv/defconfig+CONFIG_EFI=n/clang-16/logs/kernel.log
Fixes: 1631ba1259d6 ("riscv: Add support for non-coherent devices using zicbom extension")
Signed-off-by: Conor Dooley <[email protected]>
Reviewed-by: Heiko Stuebner <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[Palmer: Check for ld-2.38, not 2.39, as 2.38 no longer errors.]
Signed-off-by: Palmer Dabbelt <[email protected]>
2 years agoriscv: mm: add missing memcpy in kasan_init
Qinglin Pan [Sun, 9 Oct 2022 08:30:50 +0000 (16:30 +0800)]
riscv: mm: add missing memcpy in kasan_init

Hi Atish,

It seems that the panic is due to the missing memcpy during kasan_init.
Could you please check whether this patch is helpful?

When doing kasan_populate, the new allocated base_pud/base_p4d should
contain kasan_early_shadow_{pud, p4d}'s content. Add the missing memcpy
to avoid page fault when read/write kasan shadow region.

Tested on:
 - qemu with sv57 and CONFIG_KASAN on.
 - qemu with sv48 and CONFIG_KASAN on.

Signed-off-by: Qinglin Pan <[email protected]>
Tested-by: Atish Patra <[email protected]>
Fixes: 8fbdccd2b173 ("riscv: mm: Support kasan for sv57")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
2 years agoMerge tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 27 Oct 2022 20:36:59 +0000 (13:36 -0700)]
Merge tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from 802.15.4 (Zigbee et al).

  Current release - regressions:

   - ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1

  Current release - new code bugs:

   - mptcp: fix abba deadlock on fastopen

   - eth: stmmac: rk3588: allow multiple gmac controllers in one system

  Previous releases - regressions:

   - ip: rework the fix for dflt addr selection for connected nexthop

   - net: couple more fixes for misinterpreting bits in struct page
     after the signature was added

  Previous releases - always broken:

   - ipv6: ensure sane device mtu in tunnels

   - openvswitch: switch from WARN to pr_warn on a user-triggerable path

   - ethtool: eeprom: fix null-deref on genl_info in dump

   - ieee802154: more return code fixes for corner cases in
     dgram_sendmsg

   - mac802154: fix link-quality-indicator recording

   - eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack
     offload

   - eth: fec: limit register access on i.MX6UL

   - eth: bcm4908_enet: update TX stats after actual transmission

   - can: rcar_canfd: improve IRQ handling for RZ/G2L

  Misc:

   - genetlink: piggy back on the newly added resv_op_start to enforce
     more sanity checks on new commands"

* tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  net: enetc: survive memory pressure without crashing
  kcm: do not sense pfmemalloc status in kcm_sendpage()
  net: do not sense pfmemalloc status in skb_append_pagefrags()
  net/mlx5e: Fix macsec sci endianness at rx sa update
  net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
  net/mlx5e: Fix macsec rx security association (SA) update/delete
  net/mlx5e: Fix macsec coverity issue at rx sa update
  net/mlx5: Fix crash during sync firmware reset
  net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
  net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
  net/mlx5e: TC, Reject forwarding from internal port to internal port
  net/mlx5: Fix possible use-after-free in async command interface
  net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
  net/mlx5e: Update restore chain id for slow path packets
  net/mlx5e: Extend SKB room check to include PTP-SQ
  net/mlx5: DR, Fix matcher disconnect error flow
  net/mlx5: Wait for firmware to enable CRS before pci_restore_state
  net/mlx5e: Do not increment ESN when updating IPsec ESN state
  netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
  netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed
  ...

2 years agoMerge tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 27 Oct 2022 20:16:36 +0000 (13:16 -0700)]
Merge tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve fixes from Kees Cook:

 - Fix an ancient signal action copy race (Bernd Edlinger)

 - Fix a memory leak in ELF loader, when under memory pressure (Li
   Zetao)

* tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  fs/binfmt_elf: Fix memory leak in load_elf_binary()
  exec: Copy oldsighand->action under spin-lock

2 years agonfs4: Fix kmemleak when allocate slot failed
Zhang Xiaoxu [Thu, 20 Oct 2022 03:20:54 +0000 (11:20 +0800)]
nfs4: Fix kmemleak when allocate slot failed

If one of the slot allocate failed, should cleanup all the other
allocated slots, otherwise, the allocated slots will leak:

  unreferenced object 0xffff8881115aa100 (size 64):
    comm ""mount.nfs"", pid 679, jiffies 4294744957 (age 115.037s)
    hex dump (first 32 bytes):
      00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
      [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
      [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
      [<00000000128486db>] nfs4_init_client+0xce/0x270
      [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
      [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
      [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
      [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
      [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
      [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
      [<000000005d56bdec>] do_syscall_64+0x35/0x80
      [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: abf79bb341bf ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking")
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFSv4.2: Fixup CLONE dest file size for zero-length count
Benjamin Coddington [Thu, 13 Oct 2022 15:58:01 +0000 (11:58 -0400)]
NFSv4.2: Fixup CLONE dest file size for zero-length count

When holding a delegation, the NFS client optimizes away setting the
attributes of a file from the GETATTR in the compound after CLONE, and for
a zero-length CLONE we will end up setting the inode's size to zero in
nfs42_copy_dest_done().  Handle this case by computing the resulting count
from the server's reported size after CLONE's GETATTR.

Suggested-by: Trond Myklebust <[email protected]>
Signed-off-by: Benjamin Coddington <[email protected]>
Fixes: 94d202d5ca39 ("NFSv42: Copy offload should update the file size when appropriate")
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoSUNRPC: Fix crasher in gss_unwrap_resp_integ()
Chuck Lever [Sat, 8 Oct 2022 18:58:29 +0000 (14:58 -0400)]
SUNRPC: Fix crasher in gss_unwrap_resp_integ()

If a zero length is passed to kmalloc() it returns 0x10, which is
not a valid address. gss_unwrap_resp_integ() subsequently crashes
when it attempts to dereference that pointer.

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFSv4: Retry LOCK on OLD_STATEID during delegation return
Benjamin Coddington [Wed, 19 Oct 2022 16:09:18 +0000 (12:09 -0400)]
NFSv4: Retry LOCK on OLD_STATEID during delegation return

There's a small window where a LOCK sent during a delegation return can
race with another OPEN on client, but the open stateid has not yet been
updated.  In this case, the client doesn't handle the OLD_STATEID error
from the server and will lose this lock, emitting:
"NFS: nfs4_handle_delegation_recall_error: unhandled error -10024".

Fix this by sending the task through the nfs4 error handling in
nfs4_lock_done() when we may have to reconcile our stateid with what the
server believes it to be.  For this case, the result is a retry of the
LOCK operation with the updated stateid.

Reported-by: Gonzalo Siero Humet <[email protected]>
Signed-off-by: Benjamin Coddington <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoSUNRPC: Fix null-ptr-deref when xps sysfs alloc failed
Zhang Xiaoxu [Thu, 20 Oct 2022 03:42:17 +0000 (11:42 +0800)]
SUNRPC: Fix null-ptr-deref when xps sysfs alloc failed

There is a null-ptr-deref when xps sysfs alloc failed:
  BUG: KASAN: null-ptr-deref in sysfs_do_create_link_sd+0x40/0xd0
  Read of size 8 at addr 0000000000000030 by task gssproxy/457

  CPU: 5 PID: 457 Comm: gssproxy Not tainted 6.0.0-09040-g02357b27ee03 #9
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   kasan_report+0xa3/0x120
   sysfs_do_create_link_sd+0x40/0xd0
   rpc_sysfs_client_setup+0x161/0x1b0
   rpc_new_client+0x3fc/0x6e0
   rpc_create_xprt+0x71/0x220
   rpc_create+0x1d4/0x350
   gssp_rpc_create+0xc3/0x160
   set_gssp_clnt+0xbc/0x140
   write_gssp+0x116/0x1a0
   proc_reg_write+0xd6/0x130
   vfs_write+0x177/0x690
   ksys_write+0xb9/0x150
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

When the xprt_switch sysfs alloc failed, should not add xprt and
switch sysfs to it, otherwise, maybe null-ptr-deref; also initialize
the 'xps_sysfs' to NULL to avoid oops when destroy it.

Fixes: 2a338a543163 ("sunrpc: add a symlink from rpc-client directory to the xprt_switch")
Fixes: d408ebe04ac5 ("sunrpc: add add sysfs directory per xprt under each xprt_switch")
Fixes: baea99445dd4 ("sunrpc: add xprt_switch direcotry to sunrpc's sysfs")
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFSv4.1: We must always send RECLAIM_COMPLETE after a reboot
Trond Myklebust [Sun, 16 Oct 2022 18:44:33 +0000 (14:44 -0400)]
NFSv4.1: We must always send RECLAIM_COMPLETE after a reboot

Currently, we are only guaranteed to send RECLAIM_COMPLETE if we have
open state to recover. Fix the client to always send RECLAIM_COMPLETE
after setting up the lease.

Fixes: fce5c838e133 ("nfs41: RECLAIM_COMPLETE functionality")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFSv4.1: Handle RECLAIM_COMPLETE trunking errors
Trond Myklebust [Sun, 16 Oct 2022 18:44:32 +0000 (14:44 -0400)]
NFSv4.1: Handle RECLAIM_COMPLETE trunking errors

If RECLAIM_COMPLETE sets the NFS4CLNT_BIND_CONN_TO_SESSION flag, then we
need to loop back in order to handle it.

Fixes: 0048fdd06614 ("NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFSv4: Fix a potential state reclaim deadlock
Trond Myklebust [Sun, 16 Oct 2022 18:44:31 +0000 (14:44 -0400)]
NFSv4: Fix a potential state reclaim deadlock

If the server reboots while we are engaged in a delegation return, and
there is a pNFS layout with return-on-close set, then the current code
can end up deadlocking in pnfs_roc() when nfs_inode_set_delegation()
tries to return the old delegation.
Now that delegreturn actually uses its own copy of the stateid, it
should be safe to just always update the delegation stateid in place.

Fixes: 078000d02d57 ("pNFS: We want return-on-close to complete when evicting the inode")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoNFS: Avoid memcpy() run-time warning for struct sockaddr overflows
Kees Cook [Mon, 17 Oct 2022 04:36:50 +0000 (21:36 -0700)]
NFS: Avoid memcpy() run-time warning for struct sockaddr overflows

The 'nfs_server' and 'mount_server' structures include a union of
'struct sockaddr' (with the older 16 bytes max address size) and
'struct sockaddr_storage' which is large enough to hold all the
supported sa_family types (128 bytes max size). The runtime memcpy()
buffer overflow checker is seeing attempts to write beyond the 16
bytes as an overflow, but the actual expected size is that of 'struct
sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
completely through-out NFS, which results in adjusting the memcpy()
buffers to the correct union members. Avoids this false positive run-time
warning under CONFIG_FORTIFY_SOURCE:

  memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)

Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Cc: Trond Myklebust <[email protected]>
Cc: Anna Schumaker <[email protected]>
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agonfs: Remove redundant null checks before kfree
Yushan Zhou [Tue, 18 Oct 2022 04:07:08 +0000 (12:07 +0800)]
nfs: Remove redundant null checks before kfree

Fix the following coccicheck warning:
fs/nfs/dir.c:2494:2-7: WARNING:
NULL check before some freeing functions is not needed.

Signed-off-by: Yushan Zhou <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
2 years agoMerge tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 27 Oct 2022 19:31:57 +0000 (12:31 -0700)]
Merge tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Fix older Clang vs recent overflow KUnit test additions (Nick
   Desaulniers, Kees Cook)

 - Fix kern-doc visibility for overflow helpers (Kees Cook)

* tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  overflow: Refactor test skips for Clang-specific issues
  overflow: disable failing tests for older clang versions
  overflow: Fix kern-doc markup for functions

2 years agoMerge tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 27 Oct 2022 19:21:57 +0000 (12:21 -0700)]
Merge tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "A bunch of patches addressing issues in the vivid driver and adding
  new checks in V4L2 to validate the input parameters from some ioctls"

* tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: vivid.rst: loop_video is set on the capture devnode
  media: vivid: set num_in/outputs to 0 if not supported
  media: vivid: drop GFP_DMA32
  media: vivid: fix control handler mutex deadlock
  media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced'
  media: v4l2-dv-timings: add sanity checks for blanking values
  media: vivid: dev->bitmap_cap wasn't freed in all cases
  media: vivid: s_fbuf: add more sanity checks

2 years agoMerge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Linus Torvalds [Thu, 27 Oct 2022 18:44:18 +0000 (11:44 -0700)]
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt fix from Eric Biggers:
 "Fix a memory leak that was introduced by a change that went into -rc1"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: fix keyring memory leak on mount failure

2 years agonet: enetc: survive memory pressure without crashing
Vladimir Oltean [Thu, 27 Oct 2022 18:29:25 +0000 (21:29 +0300)]
net: enetc: survive memory pressure without crashing

Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.

An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.

This does not happen, because the reset-default value of the consumer
and produces index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).

The hardware guide explains this best:

| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| of the BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
|   value, it indicates that all BDs in the ring have been prepared and
|   hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
|   indicate N BDs have been prepared. Note that hardware cannot start if
|   only a single buffer is prepared due to the restrictions described in
|   (2).
| - Software may write consumer index to match producer index anytime
|   while the ring is operational to indicate all received BDs prior have
|   been processed and new BDs prepared for hardware.

Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.

When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.

The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.

To operate using no buffer, we must initialize the CI to PI + 1, which
will block the hardware from advancing the CI any further, and drop
everything.

The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).

Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Claudiu Manoil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agofbdev: cyber2000fb: fix missing pci_disable_device()
Yang Yingliang [Mon, 24 Oct 2022 14:00:28 +0000 (22:00 +0800)]
fbdev: cyber2000fb: fix missing pci_disable_device()

Add missing pci_disable_device() in error path of probe() and remove() path.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
2 years agokcm: do not sense pfmemalloc status in kcm_sendpage()
Eric Dumazet [Thu, 27 Oct 2022 04:06:37 +0000 (04:06 +0000)]
kcm: do not sense pfmemalloc status in kcm_sendpage()

Similar to changes done in TCP in blamed commit.
We should not sense pfmemalloc status in sendpage() methods.

Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status")
Signed-off-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet: do not sense pfmemalloc status in skb_append_pagefrags()
Eric Dumazet [Thu, 27 Oct 2022 04:03:46 +0000 (04:03 +0000)]
net: do not sense pfmemalloc status in skb_append_pagefrags()

skb_append_pagefrags() is used by af_unix and udp sendpage()
implementation so far.

In commit 326140063946 ("tcp: TX zerocopy should not sense
pfmemalloc status") we explained why we should not sense
pfmemalloc status for pages owned by user space.

We should also use skb_fill_page_desc_noacc()
in skb_append_pagefrags() to avoid following KCSAN report:

BUG: KCSAN: data-race in lru_add_fn / skb_append_pagefrags

write to 0xffffea00058fc1c8 of 8 bytes by task 17319 on cpu 0:
__list_add include/linux/list.h:73 [inline]
list_add include/linux/list.h:88 [inline]
lruvec_add_folio include/linux/mm_inline.h:323 [inline]
lru_add_fn+0x327/0x410 mm/swap.c:228
folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
lru_add_drain_cpu+0x73/0x250 mm/swap.c:669
lru_add_drain+0x21/0x60 mm/swap.c:773
free_pages_and_swap_cache+0x16/0x70 mm/swap_state.c:311
tlb_batch_pages_flush mm/mmu_gather.c:59 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:256 [inline]
tlb_flush_mmu+0x5b2/0x640 mm/mmu_gather.c:263
tlb_finish_mmu+0x86/0x100 mm/mmu_gather.c:363
exit_mmap+0x190/0x4d0 mm/mmap.c:3098
__mmput+0x27/0x1b0 kernel/fork.c:1185
mmput+0x3d/0x50 kernel/fork.c:1207
copy_process+0x19fc/0x2100 kernel/fork.c:2518
kernel_clone+0x166/0x550 kernel/fork.c:2671
__do_sys_clone kernel/fork.c:2812 [inline]
__se_sys_clone kernel/fork.c:2796 [inline]
__x64_sys_clone+0xc3/0xf0 kernel/fork.c:2796
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffffea00058fc1c8 of 8 bytes by task 17325 on cpu 1:
page_is_pfmemalloc include/linux/mm.h:1817 [inline]
__skb_fill_page_desc include/linux/skbuff.h:2432 [inline]
skb_fill_page_desc include/linux/skbuff.h:2453 [inline]
skb_append_pagefrags+0x210/0x600 net/core/skbuff.c:3974
unix_stream_sendpage+0x45e/0x990 net/unix/af_unix.c:2338
kernel_sendpage+0x184/0x300 net/socket.c:3561
sock_sendpage+0x5a/0x70 net/socket.c:1054
pipe_to_sendpage+0x128/0x160 fs/splice.c:361
splice_from_pipe_feed fs/splice.c:415 [inline]
__splice_from_pipe+0x222/0x4d0 fs/splice.c:559
splice_from_pipe fs/splice.c:594 [inline]
generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
do_splice_from fs/splice.c:764 [inline]
direct_splice_actor+0x80/0xa0 fs/splice.c:931
splice_direct_to_actor+0x305/0x620 fs/splice.c:886
do_splice_direct+0xfb/0x180 fs/splice.c:974
do_sendfile+0x3bf/0x910 fs/read_write.c:1255
__do_sys_sendfile64 fs/read_write.c:1323 [inline]
__se_sys_sendfile64 fs/read_write.c:1309 [inline]
__x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1309
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x0000000000000000 -> 0xffffea00058fc188

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17325 Comm: syz-executor.0 Not tainted 6.1.0-rc1-syzkaller-00158-g440b7895c990-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022

Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet/mlx5e: Fix macsec sci endianness at rx sa update
Raed Salem [Wed, 26 Oct 2022 13:51:53 +0000 (14:51 +0100)]
net/mlx5e: Fix macsec sci endianness at rx sa update

The cited commit at rx sa update operation passes the sci object
attribute, in the wrong endianness and not as expected by the HW
effectively create malformed hw sa context in case of update rx sa
consequently, HW produces unexpected MACsec packets which uses this
sa.

Fix by passing sci to create macsec object with the correct endianness,
while at it add __force u64 to prevent sparse check error of type
"sparse: error: incorrect type in assignment".

Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 years agonet/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
Raed Salem [Wed, 26 Oct 2022 13:51:52 +0000 (14:51 +0100)]
net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function

The cited commit produces a sparse check error of type
"sparse: error: restricted __be64 degrades to integer". The
offending line wrongly did a bitwise operation between two different
storage types one of 64 bit when the other smaller side is 16 bit
which caused the above sparse error, furthermore bitwise operation
usage here is wrong in the first place as the constant MACSEC_PORT_ES
is not a bitwise field.

Fix by using the right mask to get the lower 16 bit if the sci number,
and use comparison operator '==' instead of bitwise '&' operator.

Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules")
Signed-off-by: Raed Salem <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
This page took 0.146913 seconds and 4 git commands to generate.