]> Git Repo - linux.git/log
linux.git
10 months agoplatform/x86: p2sb: Don't init until unassigned resources have been assigned
Ben Fradella [Thu, 9 May 2024 16:49:34 +0000 (16:49 +0000)]
platform/x86: p2sb: Don't init until unassigned resources have been assigned

The P2SB could get an invalid BAR from the BIOS, and that won't be fixed
up until pcibios_assign_resources(), which is an fs_initcall().

- Move p2sb_fs_init() to an fs_initcall_sync(). This is still early
  enough to avoid a race with any dependent drivers.

- Add a check for IORESOURCE_UNSET in p2sb_valid_resource() to catch
  unset BARs going forward.

- Return error values from p2sb_fs_init() so that the 'initcall_debug'
  cmdline arg provides useful data.

Signed-off-by: Ben Fradella <[email protected]>
Acked-by: Andy Shevchenko <[email protected]>
Tested-by: Klara Modin <[email protected]>
Reviewed-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
10 months agoplatform/surface: aggregator: Log critical errors during SAM probing
Weifeng Liu [Sun, 5 May 2024 13:07:50 +0000 (21:07 +0800)]
platform/surface: aggregator: Log critical errors during SAM probing

Emits messages upon errors during probing of SAM.  Hopefully this could
provide useful context to user for the purpose of diagnosis when
something miserable happen.

Reviewed-by: Maximilian Luz <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Weifeng Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
10 months agofirmware: dmi: Add info message for number of populated and total memory slots
Heiner Kallweit [Tue, 14 May 2024 09:23:02 +0000 (11:23 +0200)]
firmware: dmi: Add info message for number of populated and total memory slots

As part of adding support for calling i2c_register_spd() on muxed SMBUS
segments the same message has been removed from i2c_register_spd().
However users may find it useful, therefore reintroduce it as part of
the DMI scan code.

[JD: Static variable dmi_memdev_populated_nr is only used in __init
 functions, so it can be marked __initdata.]

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jean Delvare <[email protected]>
10 months agobpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:28 +0000 (19:06 +0300)]
bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of

BPF just-in-time compiler depended on CONFIG_MODULES because it used
module_alloc() to allocate memory for the generated code.

Since code allocations are now implemented with execmem, drop dependency of
CONFIG_BPF_JIT on CONFIG_MODULES and make it select CONFIG_EXECMEM.

Suggested-by: Björn Töpel <[email protected]>
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agokprobes: remove dependency on CONFIG_MODULES
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:27 +0000 (19:06 +0300)]
kprobes: remove dependency on CONFIG_MODULES

kprobes depended on CONFIG_MODULES because it has to allocate memory for
code.

Since code allocations are now implemented with execmem, kprobes can be
enabled in non-modular kernels.

Add #ifdef CONFIG_MODULE guards for the code dealing with kprobes inside
modules, make CONFIG_KPROBES select CONFIG_EXECMEM and drop the
dependency of CONFIG_KPROBES on CONFIG_MODULES.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
[mcgrof: rebase in light of NEED_TASKS_RCU ]
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agopowerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:26 +0000 (19:06 +0300)]
powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate

There are places where CONFIG_MODULES guards the code that depends on
memory allocation being done with module_alloc().

Replace CONFIG_MODULES with CONFIG_EXECMEM in such places.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agox86/ftrace: enable dynamic ftrace without CONFIG_MODULES
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:25 +0000 (19:06 +0300)]
x86/ftrace: enable dynamic ftrace without CONFIG_MODULES

Dynamic ftrace must allocate memory for code and this was impossible
without CONFIG_MODULES.

With execmem separated from the modules code, execmem_text_alloc() is
available regardless of CONFIG_MODULES.

Remove dependency of dynamic ftrace on CONFIG_MODULES and make
CONFIG_DYNAMIC_FTRACE select CONFIG_EXECMEM in Kconfig.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agoarch: make execmem setup available regardless of CONFIG_MODULES
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:24 +0000 (19:06 +0300)]
arch: make execmem setup available regardless of CONFIG_MODULES

execmem does not depend on modules, on the contrary modules use
execmem.

To make execmem available when CONFIG_MODULES=n, for instance for
kprobes, split execmem_params initialization out from
arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agopowerpc: extend execmem_params for kprobes allocations
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:23 +0000 (19:06 +0300)]
powerpc: extend execmem_params for kprobes allocations

powerpc overrides kprobes::alloc_insn_page() to remove writable
permissions when STRICT_MODULE_RWX is on.

Add definition of EXECMEM_KRPOBES to execmem_params to allow using the
generic kprobes::alloc_insn_page() with the desired permissions.

As powerpc uses breakpoint instructions to inject kprobes, it does not
need to constrain kprobe allocations to the modules area and can use the
entire vmalloc address space.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agoarm64: extend execmem_info for generated code allocations
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:22 +0000 (19:06 +0300)]
arm64: extend execmem_info for generated code allocations

The memory allocations for kprobes and BPF on arm64 can be placed
anywhere in vmalloc address space and currently this is implemented with
overrides of alloc_insn_page() and bpf_jit_alloc_exec() in arm64.

Define EXECMEM_KPROBES and EXECMEM_BPF ranges in arm64::execmem_info and
drop overrides of alloc_insn_page() and bpf_jit_alloc_exec().

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agoriscv: extend execmem_params for generated code allocations
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:21 +0000 (19:06 +0300)]
riscv: extend execmem_params for generated code allocations

The memory allocations for kprobes and BPF on RISC-V are not placed in
the modules area and these custom allocations are implemented with
overrides of alloc_insn_page() and  bpf_jit_alloc_exec().

Define MODULES_VADDR and MODULES_END as VMALLOC_START and VMALLOC_END for
32 bit and slightly reorder execmem_params initialization to support both
32 and 64 bit variants, define EXECMEM_KPROBES and EXECMEM_BPF ranges in
riscv::execmem_params and drop overrides of alloc_insn_page() and
bpf_jit_alloc_exec().

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Alexandre Ghiti <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomm/execmem, arch: convert remaining overrides of module_alloc to execmem
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:20 +0000 (19:06 +0300)]
mm/execmem, arch: convert remaining overrides of module_alloc to execmem

Extend execmem parameters to accommodate more complex overrides of
module_alloc() by architectures.

This includes specification of a fallback range required by arm, arm64
and powerpc, EXECMEM_MODULE_DATA type required by powerpc, support for
allocation of KASAN shadow required by s390 and x86 and support for
late initialization of execmem required by arm64.

The core implementation of execmem_alloc() takes care of suppressing
warnings when the initial allocation fails but there is a fallback range
defined.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Liviu Dudau <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomm/execmem, arch: convert simple overrides of module_alloc to execmem
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:19 +0000 (19:06 +0300)]
mm/execmem, arch: convert simple overrides of module_alloc to execmem

Several architectures override module_alloc() only to define address
range for code allocations different than VMALLOC address space.

Provide a generic implementation in execmem that uses the parameters for
address space ranges, required alignment and page protections provided
by architectures.

The architectures must fill execmem_info structure and implement
execmem_arch_setup() that returns a pointer to that structure. This way the
execmem initialization won't be called from every architecture, but rather
from a central place, namely a core_initcall() in execmem.

The execmem provides execmem_alloc() API that wraps __vmalloc_node_range()
with the parameters defined by the architectures.  If an architecture does
not implement execmem_arch_setup(), execmem_alloc() will fall back to
module_alloc().

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Song Liu <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomm: introduce execmem_alloc() and execmem_free()
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:18 +0000 (19:06 +0300)]
mm: introduce execmem_alloc() and execmem_free()

module_alloc() is used everywhere as a mean to allocate memory for code.

Beside being semantically wrong, this unnecessarily ties all subsystems
that need to allocate code, such as ftrace, kprobes and BPF to modules and
puts the burden of code allocation to the modules code.

Several architectures override module_alloc() because of various
constraints where the executable memory can be located and this causes
additional obstacles for improvements of code allocation.

Start splitting code allocation from modules by introducing execmem_alloc()
and execmem_free() APIs.

Initially, execmem_alloc() is a wrapper for module_alloc() and
execmem_free() is a replacement of module_memfree() to allow updating all
call sites to use the new APIs.

Since architectures define different restrictions on placement,
permissions, alignment and other parameters for memory that can be used by
different subsystems that allocate executable memory, execmem_alloc() takes
a type argument, that will be used to identify the calling subsystem and to
allow architectures define parameters for ranges suitable for that
subsystem.

No functional changes.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomodule: make module_memory_{alloc,free} more self-contained
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:17 +0000 (19:06 +0300)]
module: make module_memory_{alloc,free} more self-contained

Move the logic related to the memory allocation and freeing into
module_memory_alloc() and module_memory_free().

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agosparc: simplify module_alloc()
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:16 +0000 (19:06 +0300)]
sparc: simplify module_alloc()

Define MODULES_VADDR and MODULES_END as VMALLOC_START and VMALLOC_END
for 32-bit and reduce module_alloc() to

__vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, ...)

as with the new defines the allocations becomes identical for both 32
and 64 bits.

While on it, drop unused include of <linux/jump_label.h>

Suggested-by: Sam Ravnborg <[email protected]>
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Sam Ravnborg <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agonios2: define virtual address space for modules
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:15 +0000 (19:06 +0300)]
nios2: define virtual address space for modules

nios2 uses kmalloc() to implement module_alloc() because CALL26/PCREL26
cannot reach all of vmalloc address space.

Define module space as 32MiB below the kernel base and switch nios2 to
use vmalloc for module allocations.

Suggested-by: Thomas Gleixner <[email protected]>
Acked-by: Dinh Nguyen <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomips: module: rename MODULE_START to MODULES_VADDR
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:14 +0000 (19:06 +0300)]
mips: module: rename MODULE_START to MODULES_VADDR

and MODULE_END to MODULES_END to match other architectures that define
custom address space for modules.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agoarm64: module: remove unneeded call to kasan_alloc_module_shadow()
Mike Rapoport (IBM) [Sun, 5 May 2024 16:06:13 +0000 (19:06 +0300)]
arm64: module: remove unneeded call to kasan_alloc_module_shadow()

Since commit f6f37d9320a1 ("arm64: select KASAN_VMALLOC for SW/HW_TAGS
modes") KASAN_VMALLOC is always enabled when KASAN is on. This means
that allocations in module_alloc() will be tracked by KASAN protection
for vmalloc() and that kasan_alloc_module_shadow() will be always an
empty inline and there is no point in calling it.

Drop meaningless call to kasan_alloc_module_shadow() from
module_alloc().

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agokallsyms: replace deprecated strncpy with strscpy
Justin Stitt [Fri, 12 Apr 2024 18:53:47 +0000 (18:53 +0000)]
kallsyms: replace deprecated strncpy with strscpy

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces. The goal is to remove its use completely [2].

namebuf is eventually cleaned of any trailing llvm suffixes using
strstr(). This hints that namebuf should be NUL-terminated.

static void cleanup_symbol_name(char *s)
{
char *res;
...
res = strstr(s, ".llvm.");
...
}

Due to this, use strscpy() over strncpy() as it guarantees
NUL-termination on the destination buffer. Drop the -1 from the length
calculation as it is no longer needed to ensure NUL-termination.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Cc: [email protected]
Signed-off-by: Justin Stitt <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agomodule: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.
Yifan Hong [Wed, 10 Apr 2024 19:48:02 +0000 (19:48 +0000)]
module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.

If UNUSED_KSYMS_WHITELIST is a file generated
before Kbuild runs, and the source tree is in
a read-only filesystem, the developer must put
the file somewhere and specify an absolute
path to UNUSED_KSYMS_WHITELIST. This worked,
but if IKCONFIG=y, an absolute path is embedded
into .config and eventually into vmlinux, causing
the build to be less reproducible when building
on a different machine.

This patch makes the handling of
UNUSED_KSYMS_WHITELIST to be similar to
MODULE_SIG_KEY.

First, check if UNUSED_KSYMS_WHITELIST is an
absolute path, just as before this patch. If so,
use the path as is.

If it is a relative path, use wildcard to check
the existence of the file below objtree first.
If it does not exist, fall back to the original
behavior of adding $(srctree)/ before the value.

After this patch, the developer can put the generated
file in objtree, then use a relative path against
objtree in .config, eradicating any absolute paths
that may be evaluated differently on different machines.

Signed-off-by: Yifan Hong <[email protected]>
Reviewed-by: Elliot Berman <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
10 months agopower: supply: sbs-manager: Remove class argument from i2c_mux_add_adapter()
Wolfram Sang [Thu, 18 Apr 2024 20:55:39 +0000 (22:55 +0200)]
power: supply: sbs-manager: Remove class argument from i2c_mux_add_adapter()

Commit 99a741aa7a2d ("i2c: mux: gpio: remove support for class-based
device instantiation") removed the last call to i2c_mux_add_adapter()
with a non-null class argument. Therefore the class argument can be
removed.

Note: Class-based device instantiation is a legacy mechanism which
shouldn't be used in new code, so we can rule out that this argument
may be needed again in the future.

This driver was forgotten by the patch in the Fixes tag.

Fixes: fec1982d7072 ("i2c: mux: Remove class argument from i2c_mux_add_adapter()")
Signed-off-by: Wolfram Sang <[email protected]>
Acked-by: Sebastian Reichel <[email protected]>
10 months agoftrace: Remove unused global 'ftrace_direct_func_count'
Dr. David Alan Gilbert [Mon, 6 May 2024 23:33:05 +0000 (00:33 +0100)]
ftrace: Remove unused global 'ftrace_direct_func_count'

Commit 8788ca164eb4b ("ftrace: Remove the legacy _ftrace_direct API")
stopped setting the 'ftrace_direct_func_count' variable, but left
it around.  Clean it up.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agomedia: bcm2835-unicam: Depend on COMMON_CLK
Laurent Pinchart [Sun, 12 May 2024 22:21:04 +0000 (01:21 +0300)]
media: bcm2835-unicam: Depend on COMMON_CLK

The bcm2835-unicam driver calls the clk_set_min_rate() function, which
is declared but not implemented on platforms that don't provide
COMMON_CLK. This causes linkage failures with some configurations.

Fix it by depending on COMMON_CLK. This only slightly restricts
compilation testing, but not usage of the driver as all platforms on
which the hardware can be found provide COMMON_CLK.

Fixes: 392cd78d495f ("media: bcm2835-unicam: Add support for CCP2/CSI2 camera interface")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Dave Stevenson <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
10 months agoftrace: Remove unused list 'ftrace_direct_funcs'
Dr. David Alan Gilbert [Sat, 4 May 2024 13:23:03 +0000 (14:23 +0100)]
ftrace: Remove unused list 'ftrace_direct_funcs'

Commit 8788ca164eb4b ("ftrace: Remove the legacy _ftrace_direct API")
stopped using 'ftrace_direct_funcs' (and the associated
struct ftrace_direct_func).  Remove them.

Build tested only (on x86-64 with FTRACE and DYNAMIC_FTRACE
enabled)

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agomodules: Drop the .export_symbol section from the final modules
Wang Yao [Wed, 17 Apr 2024 05:35:30 +0000 (13:35 +0800)]
modules: Drop the .export_symbol section from the final modules

Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
forget drop the .export_symbol section from the final modules.

Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
Signed-off-by: Wang Yao <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
10 months agoMerge tag 'idxd-for-linus-may2024' of git bundle from Arjan
Linus Torvalds [Tue, 14 May 2024 03:10:19 +0000 (20:10 -0700)]
Merge tag 'idxd-for-linus-may2024' of git bundle from Arjan

Pull DSA and IAA accelerator mis-alignment fix from Arjan van de Ven:
 "The DSA (memory copy/zero/etc) and IAA (compression) accelerators in
  the Sapphire Rapids and Emerald Rapids SOCs turn out to have a bug
  that has security implications.

  Both of these accelerators work by the application submitting a 64
  byte command to the device; this command contains an opcode as well as
  the virtual address of the return value that the device will update on
  completion... and a set of opcode specific values.

  In a typical scenario a ring 3 application mmaps the device file and
  uses the ENQCMD or MOVDIR64 instructions (which are variations of a 64
  byte atomic write) on this mmap'd memory region to directly submit
  commands to a device hardware.

  The return value as specified in the command, is supposed to be 32 (or
  64) bytes aligned in memory, and generally the hardware checks and
  enforces this alignment.

  However in testing it has been found that there are conditions
  (controlled by the submitter) where this enforcement does not
  happen... which makes it possible for the return value to span a page
  boundary. And this is where it goes wrong - the accelerators will
  perform the virtual to physical address lookup on the first of the two
  pages, but end up continue writing to the next consecutive physical
  (host) page rather than the consecutive virtual page. In addition, the
  device will end up in a hung state on such unaligned write of the
  return value.

  This patch series has the proposed software side solution consisting
  of three parts:

   - Don't allow these two PCI devices to be assigned to VM guests (we
     cannot trust a VM guest to behave correctly and not cause this
     condition)

   - Don't allow ring 3 applications to set up the mmap unless they have
     CAP_SYS_RAWIO permissions. This makes it no longer possible for
     non-root applications to directly submit commands to the
     accelerator

   - Add a write() method to the device so that an application can
     submit its commands to the kernel driver, which performs the needed
     sanity checks before submitting it to the hardware.

  This switch from mmap to write is an incompatible interface change to
  non-root userspace, but we have not found a way to avoid this. All
  software we know of uses a small set of accessor libraries for these
  accelerators, for which libqpl and libdml (on github) are the most
  common. As part of the security release, updated versions of these
  libraries will be released that transparently fall back to write().

  Intel has assigned CVE-2024-21823 to this hardware issue"

* tag 'idxd-for-linus-may2024' of git bundle from Arjan:
  dmaengine: idxd: add a write() method for applications to submit work
  dmaengine: idxd: add a new security check to deal with a hardware erratum
  VFIO: Add the SPR_DSA and SPR_IAX devices to the denylist

10 months agoMerge tag 'x86-shstk-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 02:33:23 +0000 (19:33 -0700)]
Merge tag 'x86-shstk-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 shadow stacks from Ingo Molnar:
 "Enable shadow stacks for x32.

  While we normally don't do such feature-enabling for 32-bit anymore,
  this change is small, straightforward & tested on upstream glibc"

* tag 'x86-shstk-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/shstk: Enable shadow stacks for x32

10 months agoMerge tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 14 May 2024 02:29:08 +0000 (19:29 -0700)]
Merge tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:

 - Improve the DeviceTree (OF) NUMA enumeration code to address
   kernel warnings & mis-mappings on DeviceTree platforms

 - Migrate x86 platform drivers to the .remove_new callback API

 - Misc cleanups & fixes

* tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/olpc-xo1-sci: Convert to platform remove callback returning void
  x86/platform/olpc-x01-pm: Convert to platform remove callback returning void
  x86/platform/iris: Convert to platform remove callback returning void
  x86/of: Change x86_dtb_parse_smp_config() to static
  x86/of: Map NUMA node to CPUs as per DeviceTree
  x86/of: Set the parse_smp_cfg for all the DeviceTree platforms by default
  x86/hyperv/vtl: Correct x86_init.mpparse.parse_smp_cfg assignment

10 months agoMerge tag 'x86-percpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 02:16:02 +0000 (19:16 -0700)]
Merge tag 'x86-percpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 percpu updates from Ingo Molnar:

 - Expand the named address spaces optimizations down to
   GCC 9.1+.

 - Re-enable named address spaces with sanitizers for GCC 13.3+

 - Generate better this_percpu_xchg_op() code

 - Introduce raw_cpu_read_long() to reduce ifdeffery

 - Simplify the x86_this_cpu_test_bit() et al macros

 - Address Sparse warnings

 - Misc cleanups & fixes

* tag 'x86-percpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/percpu: Introduce raw_cpu_read_long() to reduce ifdeffery
  x86/percpu: Rewrite x86_this_cpu_test_bit() and friends as macros
  x86/percpu: Fix x86_this_cpu_variable_test_bit() asm template
  x86/percpu: Re-enable named address spaces with sanitizers for GCC 13.3+
  x86/percpu: Use __force to cast from __percpu address space
  x86/percpu: Do not use this_cpu_read_stable_8() for 32-bit targets
  x86/percpu: Unify arch_raw_cpu_ptr() defines
  x86/percpu: Enable named address spaces for GCC 9.1+
  x86/percpu: Re-enable named address spaces with KASAN for GCC 13.3+
  x86/percpu: Move raw_percpu_xchg_op() to a better place
  x86/percpu: Convert this_percpu_xchg_op() from asm() to C code, to generate better code

10 months agoMerge tag 'x86-mm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 14 May 2024 02:02:49 +0000 (19:02 -0700)]
Merge tag 'x86-mm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mm updates from Ingo Molnar:

 - Fix W^X violation check false-positives in the CPA code
   when running as a Xen PV guest

 - Fix W^X violation warning false-positives in show_fault_oops()

* tag 'x86-mm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pat: Fix W^X violation false-positives when running as Xen PV guest
  x86/pat: Restructure _lookup_address_cpa()
  x86/mm: Use lookup_address_in_pgd_attr() in show_fault_oops()
  x86/pat: Introduce lookup_address_in_pgd_attr()

10 months agocerts: Add ECDSA signature verification self-test
Joachim Vandersmissen [Mon, 13 May 2024 04:55:07 +0000 (23:55 -0500)]
certs: Add ECDSA signature verification self-test

Commit c27b2d2012e1 ("crypto: testmgr - allow ecdsa-nist-p256 and -p384
in FIPS mode") enabled support for ECDSA in crypto/testmgr.c. The
PKCS#7 signature verification API builds upon the KCAPI primitives to
perform its high-level operations. Therefore, this change in testmgr.c
also allows ECDSA to be used by the PKCS#7 signature verification API
(in FIPS mode).

However, from a FIPS perspective, the PKCS#7 signature verification API
is a distinct "service" from the KCAPI primitives. This is because the
PKCS#7 API performs a "full" signature verification, which consists of
both hashing the data to be verified, and the public key operation.
On the other hand, the KCAPI primitive does not perform this hashing
step - it accepts pre-hashed data from the caller and only performs the
public key operation.

For this reason, the ECDSA self-tests in crypto/testmgr.c are not
sufficient to cover ECDSA signature verification offered by the PKCS#7
API. This is reflected by the self-test already present in this file
for RSA PKCS#1 v1.5 signature verification.

The solution is simply to add a second self-test here for ECDSA. P-256
with SHA-256 hashing was chosen as those parameters should remain
FIPS-approved for the foreseeable future, while keeping the performance
impact to a minimum. The ECDSA certificate and PKCS#7 signed data was
generated using OpenSSL. The input data is identical to the input data
for the existing RSA self-test.

Signed-off-by: Joachim Vandersmissen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
10 months agocerts: Move RSA self-test data to separate file
Joachim Vandersmissen [Mon, 13 May 2024 04:55:06 +0000 (23:55 -0500)]
certs: Move RSA self-test data to separate file

In preparation of adding new ECDSA self-tests, the existing data for
the RSA self-tests is moved to a separate file. This file is only
compiled if the new CONFIG_FIPS_SIGNATURE_SELFTEST_RSA configuration
option is set, which ensures that the required dependencies (RSA,
SHA-256) are present. Otherwise, the kernel would panic when trying to
execute the self-test.
The introduction of this new option, rather than adding the
dependencies to the existing CONFIG_FIPS_SIGNATURE_SELFTEST option,
allows for additional self-tests to be added for different algorithms.
The kernel can then be configured to only execute the self-tests for
those algorithms that are included.

Signed-off-by: Joachim Vandersmissen <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
10 months agoMerge tag 'x86-fpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 14 May 2024 02:00:26 +0000 (19:00 -0700)]
Merge tag 'x86-fpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Ingo Molnar:

 - Fix asm() constraints & modifiers in restore_fpregs_from_fpstate()

 - Update comments

 - Robustify the free_vm86() definition

* tag 'x86-fpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Update fpu_swap_kvm_fpu() uses in comments as well
  x86/vm86: Make sure the free_vm86(task) definition uses its parameter even in the !CONFIG_VM86 case
  x86/fpu: Fix AMD X86_BUG_FXSAVE_LEAK fixup

10 months agoMerge tag 'x86-entry-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 01:58:31 +0000 (18:58 -0700)]
Merge tag 'x86-entry-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 entry cleanup from Ingo Molnar:

 - Merge thunk_64.S and thunk_32.S into thunk.S

* tag 'x86-entry-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Merge thunk_64.S and thunk_32.S into thunk.S

10 months agoMerge tag 'x86-cpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 14 May 2024 01:44:44 +0000 (18:44 -0700)]
Merge tag 'x86-cpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Ingo Molnar:

 - Rework the x86 CPU vendor/family/model code: introduce the 'VFM'
   value that is an 8+8+8 bit concatenation of the vendor/family/model
   value, and add macros that work on VFM values. This simplifies the
   addition of new Intel models & families, and simplifies existing
   enumeration & quirk code.

 - Add support for the AMD 0x80000026 leaf, to better parse topology
   information

 - Optimize the NUMA allocation layout of more per-CPU data structures

 - Improve the workaround for AMD erratum 1386

 - Clear TME from /proc/cpuinfo as well, when disabled by the firmware

 - Improve x86 self-tests

 - Extend the mce_record tracepoint with the ::ppin and ::microcode fields

 - Implement recovery for MCE errors in TDX/SEAM non-root mode

 - Misc cleanups and fixes

* tag 'x86-cpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  x86/mm: Switch to new Intel CPU model defines
  x86/tsc_msr: Switch to new Intel CPU model defines
  x86/tsc: Switch to new Intel CPU model defines
  x86/cpu: Switch to new Intel CPU model defines
  x86/resctrl: Switch to new Intel CPU model defines
  x86/microcode/intel: Switch to new Intel CPU model defines
  x86/mce: Switch to new Intel CPU model defines
  x86/cpu: Switch to new Intel CPU model defines
  x86/cpu/intel_epb: Switch to new Intel CPU model defines
  x86/aperfmperf: Switch to new Intel CPU model defines
  x86/apic: Switch to new Intel CPU model defines
  perf/x86/msr: Switch to new Intel CPU model defines
  perf/x86/intel/uncore: Switch to new Intel CPU model defines
  perf/x86/intel/pt: Switch to new Intel CPU model defines
  perf/x86/lbr: Switch to new Intel CPU model defines
  perf/x86/intel/cstate: Switch to new Intel CPU model defines
  x86/bugs: Switch to new Intel CPU model defines
  x86/bugs: Switch to new Intel CPU model defines
  x86/cpu/vfm: Update arch/x86/include/asm/intel-family.h
  x86/cpu/vfm: Add new macros to work with (vendor/family/model) values
  ...

10 months agonet: revert partially applied PHY topology series
Jakub Kicinski [Mon, 13 May 2024 15:41:55 +0000 (08:41 -0700)]
net: revert partially applied PHY topology series

The series is causing issues with PHY drivers built as modules.
Since it was only partially applied and the merge window has
opened let's revert and try again for v6.11.

Revert 6916e461e793 ("net: phy: Introduce ethernet link topology representation")
Revert 0ec5ed6c130e ("net: sfp: pass the phy_device when disconnecting an sfp module's PHY")
Revert e75e4e074c44 ("net: phy: add helpers to handle sfp phy connect/disconnect")
Revert fdd353965b52 ("net: sfp: Add helper to return the SFP bus name")
Revert 841942bc6212 ("net: ethtool: Allow passing a phy index for some commands")

Link: https://lore.kernel.org/all/171242462917.4000.9759453824684907063.git-patchwork-notify@kernel.org/
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge branch 'move-est-lock-and-est-structure-to-struct-stmmac_priv'
Jakub Kicinski [Tue, 14 May 2024 01:33:14 +0000 (18:33 -0700)]
Merge branch 'move-est-lock-and-est-structure-to-struct-stmmac_priv'

Xiaolei Wang says:

====================
Move EST lock and EST structure to struct stmmac_priv

1. Pulling the mutex protecting the EST structure out to avoid
    clearing it during reinit/memset of the EST structure,and
    reacquire the mutex lock when doing this initialization.

2. Moving the EST structure to a more logical location
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: move the EST structure to struct stmmac_priv
Xiaolei Wang [Mon, 13 May 2024 01:43:46 +0000 (09:43 +0800)]
net: stmmac: move the EST structure to struct stmmac_priv

Move the EST structure to struct stmmac_priv, because the
EST configs don't look like platform config, but EST is
enabled in runtime with the settings retrieved for the TC
TAPRIO feature also in runtime. So it's better to have the
EST-data preserved in the driver private data instead of
the platform data storage.

Signed-off-by: Xiaolei Wang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Andrew Halaney <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: move the EST lock to struct stmmac_priv
Xiaolei Wang [Mon, 13 May 2024 01:43:45 +0000 (09:43 +0800)]
net: stmmac: move the EST lock to struct stmmac_priv

Reinitialize the whole EST structure would also reset the mutex
lock which is embedded in the EST structure, and then trigger
the following warning. To address this, move the lock to struct
stmmac_priv. We also need to reacquire the mutex lock when doing
this initialization.

DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 3 PID: 505 at kernel/locking/mutex.c:587 __mutex_lock+0xd84/0x1068
 Modules linked in:
 CPU: 3 PID: 505 Comm: tc Not tainted 6.9.0-rc6-00053-g0106679839f7-dirty #29
 Hardware name: NXP i.MX8MPlus EVK board (DT)
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __mutex_lock+0xd84/0x1068
 lr : __mutex_lock+0xd84/0x1068
 sp : ffffffc0864e3570
 x29: ffffffc0864e3570 x28: ffffffc0817bdc78 x27: 0000000000000003
 x26: ffffff80c54f1808 x25: ffffff80c9164080 x24: ffffffc080d723ac
 x23: 0000000000000000 x22: 0000000000000002 x21: 0000000000000000
 x20: 0000000000000000 x19: ffffffc083bc3000 x18: ffffffffffffffff
 x17: ffffffc08117b080 x16: 0000000000000002 x15: ffffff80d2d40000
 x14: 00000000000002da x13: ffffff80d2d404b8 x12: ffffffc082b5a5c8
 x11: ffffffc082bca680 x10: ffffffc082bb2640 x9 : ffffffc082bb2698
 x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
 x5 : ffffff8178fe0d48 x4 : 0000000000000000 x3 : 0000000000000027
 x2 : ffffff8178fe0d50 x1 : 0000000000000000 x0 : 0000000000000000
 Call trace:
  __mutex_lock+0xd84/0x1068
  mutex_lock_nested+0x28/0x34
  tc_setup_taprio+0x118/0x68c
  stmmac_setup_tc+0x50/0xf0
  taprio_change+0x868/0xc9c

Fixes: b2aae654a479 ("net: stmmac: add mutex lock to protect est parameters")
Signed-off-by: Xiaolei Wang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Andrew Halaney <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge branch 'mptcp-small-improvements-fix-and-clean-ups'
Jakub Kicinski [Tue, 14 May 2024 01:29:25 +0000 (18:29 -0700)]
Merge branch 'mptcp-small-improvements-fix-and-clean-ups'

Mat Martineau says:

====================
mptcp: small improvements, fix and clean-ups

This series contain mostly unrelated patches:

- The two first patches can be seen as "fixes". They are part of this
  series for -next because it looks like the last batch of fixes for
  v6.9 has already been sent. These fixes are not urgent, so they can
  wait if an unlikely v6.9-rc8 is published. About the two patches:
    - Patch 1 fixes getsockopt(SO_KEEPALIVE) support on MPTCP sockets
    - Patch 2 makes sure the full TCP keep-alive feature is supported,
      not just SO_KEEPALIVE.

- Patch 3 is a small optimisation when getsockopt(MPTCP_INFO) is used
  without buffer, just to check if MPTCP is still being used: no
  fallback to TCP.

- Patch 4 adds net.mptcp.available_schedulers sysctl knob to list packet
  schedulers, similar to net.ipv4.tcp_available_congestion_control.

- Patch 5 and 6 fix CheckPatch warnings: "prefer strscpy over strcpy"
  and "else is not generally useful after a break or return".

- Patch 7 and 8 remove and add header includes to avoid unused ones, and
  add missing ones to be self-contained.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: include inet_common in mib.h
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:32 +0000 (18:13 -0700)]
mptcp: include inet_common in mib.h

So this file is now self-contained: it can be compiled alone with
analytic tools.

Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: move mptcp_pm_gen.h's include
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:31 +0000 (18:13 -0700)]
mptcp: move mptcp_pm_gen.h's include

Nothing from protocol.h depends on mptcp_pm_gen.h, only code from
pm_netlink.c and pm_userspace.c depends on it.

So this include can be moved where it is needed to avoid a "unused
includes" warning.

Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: remove unnecessary else statements
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:30 +0000 (18:13 -0700)]
mptcp: remove unnecessary else statements

The 'else' statements are not needed here, because their previous 'if'
block ends with a 'return'.

This fixes CheckPatch warnings:

  WARNING: else is not generally useful after a break or return

Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: prefer strscpy over strcpy
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:29 +0000 (18:13 -0700)]
mptcp: prefer strscpy over strcpy

strcpy() performs no bounds checking on the destination buffer. This
could result in linear overflows beyond the end of the buffer, leading
to all kinds of misbehaviors. The safe replacement is strscpy() [1].

This is in preparation of a possible future step where all strcpy() uses
will be removed in favour of strscpy() [2].

This fixes CheckPatch warnings:

  WARNING: Prefer strscpy over strcpy

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Link: https://github.com/KSPP/linux/issues/88
Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: add net.mptcp.available_schedulers
Gregory Detal [Tue, 14 May 2024 01:13:28 +0000 (18:13 -0700)]
mptcp: add net.mptcp.available_schedulers

The sysctl lists the available schedulers that can be set using
net.mptcp.scheduler similarly to net.ipv4.tcp_available_congestion_control.

Signed-off-by: Gregory Detal <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Tested-by: Geliang Tang <[email protected]>
Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: sockopt: info: stop early if no buffer
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:27 +0000 (18:13 -0700)]
mptcp: sockopt: info: stop early if no buffer

Up to recently, it has been recommended to use getsockopt(MPTCP_INFO) to
check if a fallback to TCP happened, or if the client requested to use
MPTCP.

In this case, the userspace app is only interested by the returned value
of the getsocktop() call, and can then give 0 for the option length, and
NULL for the buffer address. An easy optimisation is then to stop early,
and avoid filling a local buffer -- which now requires two different
locks -- if it is not needed.

Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: fix full TCP keep-alive support
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:26 +0000 (18:13 -0700)]
mptcp: fix full TCP keep-alive support

SO_KEEPALIVE support has been added a while ago, as part of a series
"adding SOL_SOCKET" support. To have a full control of this keep-alive
feature, it is important to also support TCP_KEEP* socket options at the
SOL_TCP level.

Supporting them on the setsockopt() part is easy, it is just a matter of
remembering each value in the MPTCP sock structure, and calling
tcp_sock_set_keep*() helpers on each subflow. If the value is not
modified (0), calling these helpers will not do anything. For the
getsockopt() part, the corresponding value from the MPTCP sock structure
or the default one is simply returned. All of this is very similar to
other TCP_* socket options supported by MPTCP.

It looks important for kernels supporting SO_KEEPALIVE, to also support
TCP_KEEP* options as well: some apps seem to (wrongly) consider that if
the former is supported, the latter ones will be supported as well. But
also, not having this simple and isolated change is preventing MPTCP
support in some apps, and libraries like GoLang [1]. This is why this
patch is seen as a fix.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/383
Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
Link: https://github.com/golang/go/issues/56539
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agomptcp: SO_KEEPALIVE: fix getsockopt support
Matthieu Baerts (NGI0) [Tue, 14 May 2024 01:13:25 +0000 (18:13 -0700)]
mptcp: SO_KEEPALIVE: fix getsockopt support

SO_KEEPALIVE support has to be set on each subflow: on each TCP socket,
where sk_prot->keepalive is defined. Technically, nothing has to be done
on the MPTCP socket. That's why mptcp_sol_socket_sync_intval() was
called instead of mptcp_sol_socket_intval().

Except that when nothing is done on the MPTCP socket, the
getsockopt(SO_KEEPALIVE), handled in net/core/sock.c:sk_getsockopt(),
will not know if SO_KEEPALIVE has been set on the different subflows or
not.

The fix is simple: simply call mptcp_sol_socket_intval() which will end
up calling net/core/sock.c:sk_setsockopt() where the SOCK_KEEPOPEN flag
will be set, the one used in sk_getsockopt().

So now, getsockopt(SO_KEEPALIVE) on an MPTCP socket will return the same
value as the one previously set with setsockopt(SO_KEEPALIVE).

Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'x86-cleanups-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 14 May 2024 01:21:24 +0000 (18:21 -0700)]
Merge tag 'x86-cleanups-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Ingo Molnar:

 - Fix function prototypes to address clang function type cast
   warnings in the math-emu code

 - Reorder definitions in <asm/msr-index.h>

 - Remove unused code

 - Fix typos

 - Simplify #include sections

* tag 'x86-cleanups-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pci/ce4100: Remove unused 'struct sim_reg_op'
  x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place
  x86/math-emu: Fix function cast warnings
  x86/extable: Remove unused fixup type EX_TYPE_COPY
  x86/rtc: Remove unused intel-mid.h
  x86/32: Remove unused IA32_STACK_TOP and two externs
  x86/head: Simplify relative include path to xen-head.S
  x86/fred: Fix typo in Kconfig description
  x86/syscall/compat: Remove ia32_unistd.h
  x86/syscall/compat: Remove unused macro __SYSCALL_ia32_NR
  x86/virt/tdx: Remove duplicate include
  x86/xen: Remove duplicate #include

10 months agoMerge tag 'x86-build-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 01:05:08 +0000 (18:05 -0700)]
Merge tag 'x86-build-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 build updates from Ingo Molnar:

 - Use -fpic to build the kexec 'purgatory' (the self-contained
   code that runs between two kernels)

 - Clean up vmlinux.lds.S generation

 - Simplify the X86_EXTENDED_PLATFORM section of the x86 Kconfig

 - Misc cleanups & fixes

* tag 'x86-build-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Merge the two CONFIG_X86_EXTENDED_PLATFORM entries
  x86/purgatory: Switch to the position-independent small code model
  x86/boot: Replace __PHYSICAL_START with LOAD_PHYSICAL_ADDR
  x86/vmlinux.lds.S: Take __START_KERNEL out conditional definition
  x86/vmlinux.lds.S: Remove conditional definition of LOAD_OFFSET
  vmlinux.lds.h: Fix a typo in comment

10 months agoMerge tag 'x86-bugs-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 01:03:15 +0000 (18:03 -0700)]
Merge tag 'x86-bugs-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 oops message cleanup from Ingo Molnar:

 - Use uniform "Oops: " prefix for die() messages

* tag 'x86-bugs-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Use uniform "Oops: " prefix for die() messages

10 months agoMerge tag 'x86-boot-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 00:50:36 +0000 (17:50 -0700)]
Merge tag 'x86-boot-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Ingo Molnar:

 - Move the kernel cmdline setup earlier in the boot process (again),
   to address a split_lock_detect= boot parameter bug

 - Ignore relocations in .notes sections

 - Simplify boot stack setup

 - Re-introduce a bootloader quirk wrt CR4 handling

 - Miscellaneous cleanups & fixes

* tag 'x86-boot-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/64: Clear most of CR4 in startup_64(), except PAE, MCE and LA57
  x86/boot: Move kernel cmdline setup earlier in the boot process (again)
  x86/build: Clean up arch/x86/tools/relocs.c a bit
  x86/boot: Ignore relocations in .notes sections in walk_relocs() too
  x86: Rename __{start,end}_init_task to __{start,end}_init_stack
  x86/boot: Simplify boot stack setup

10 months agonet: mana: Enable MANA driver on ARM64 with 4K page size
Haiyang Zhang [Mon, 13 May 2024 20:29:01 +0000 (13:29 -0700)]
net: mana: Enable MANA driver on ARM64 with 4K page size

Change the Kconfig dependency, so this driver can be built and run on ARM64
with 4K page size.
16/64K page sizes are not supported yet.

Signed-off-by: Haiyang Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: prestera: Add flex arrays to some structs
Erick Archer [Sun, 12 May 2024 16:10:27 +0000 (18:10 +0200)]
net: prestera: Add flex arrays to some structs

The "struct prestera_msg_vtcam_rule_add_req" uses a dynamically sized
set of trailing elements. Specifically, it uses an array of structures
of type "prestera_msg_acl_action actions_msg".

The "struct prestera_msg_flood_domain_ports_set_req" also uses a
dynamically sized set of trailing elements. Specifically, it uses an
array of structures of type "prestera_msg_acl_action actions_msg".

So, use the preferred way in the kernel declaring flexible arrays [1].

At the same time, prepare for the coming implementation by GCC and Clang
of the __counted_by attribute. Flexible array members annotated with
__counted_by can have their accesses bounds-checked at run-time via
CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for
strcpy/memcpy-family functions). In this case, it is important to note
that the attribute used is specifically __counted_by_le since the
counters are of type __le32.

The logic does not need to change since the counters for the flexible
arrays are asigned before any access to the arrays.

The order in which the structure prestera_msg_vtcam_rule_add_req and the
structure prestera_msg_flood_domain_ports_set_req are defined must be
changed to avoid incomplete type errors.

Also, avoid the open-coded arithmetic in memory allocator functions [2]
using the "struct_size" macro.

Moreover, the new structure members also allow us to avoid the open-
coded arithmetic on pointers. So, take advantage of this refactoring
accordingly.

This code was detected with the help of Coccinelle, and audited and
modified manually.

Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Signed-off-by: Erick Archer <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/AS8PR02MB7237E8469568A59795F1F0408BE12@AS8PR02MB7237.eurprd02.prod.outlook.com
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: fec: remove .ndo_poll_controller to avoid deadlocks
Wei Fang [Sat, 11 May 2024 06:20:09 +0000 (14:20 +0800)]
net: fec: remove .ndo_poll_controller to avoid deadlocks

There is a deadlock issue found in sungem driver, please refer to the
commit ac0a230f719b ("eth: sungem: remove .ndo_poll_controller to avoid
deadlocks"). The root cause of the issue is that netpoll is in atomic
context and disable_irq() is called by .ndo_poll_controller interface
of sungem driver, however, disable_irq() might sleep. After analyzing
the implementation of fec_poll_controller(), the fec driver should have
the same issue. Due to the fec driver uses NAPI for TX completions, the
.ndo_poll_controller is unnecessary to be implemented in the fec driver,
so fec_poll_controller() can be safely removed.

Fixes: 7f5c6addcdc0 ("net/fec: add poll controller function for fec nic")
Signed-off-by: Wei Fang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 14 May 2024 00:36:32 +0000 (17:36 -0700)]
Merge tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 asm updates from Ingo Molnar:

 - Clean up & fix asm() operand modifiers & constraints

 - Misc cleanups

* tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Remove a superfluous newline in _static_cpu_has()
  x86/asm/64: Clean up memset16(), memset32(), memset64() assembly constraints in <asm/string_64.h>
  x86/asm: Use "m" operand constraint in WRUSSQ asm template
  x86/asm: Use %a instead of %P operand modifier in asm templates
  x86/asm: Use %c/%n instead of %P operand modifier in asm templates
  x86/asm: Remove %P operand modifier from altinstr asm templates

10 months agoMerge branch 'tcp-support-rstreasons-in-the-passive-logic'
Jakub Kicinski [Tue, 14 May 2024 00:34:10 +0000 (17:34 -0700)]
Merge branch 'tcp-support-rstreasons-in-the-passive-logic'

Jason Xing says:

====================
tcp: support rstreasons in the passive logic

In this series, I split all kinds of reasons into five part which,
I think, can be easily reviewed. I respectively implement corresponding
rstreasons in those functions. After this, we can trace the whole tcp
passive reset with clear reasons.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agotcp: rstreason: fully support in tcp_check_req()
Jason Xing [Fri, 10 May 2024 12:25:02 +0000 (20:25 +0800)]
tcp: rstreason: fully support in tcp_check_req()

We're going to send an RST due to invalid syn packet which is already
checked whether 1) it is in sequence, 2) it is a retransmitted skb.

As RFC 793 says, if the state of socket is not CLOSED/LISTEN/SYN-SENT,
then we should send an RST when receiving bad syn packet:
"fourth, check the SYN bit,...If the SYN is in the window it is an
error, send a reset"

Signed-off-by: Jason Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agotcp: rstreason: handle timewait cases in the receive path
Jason Xing [Fri, 10 May 2024 12:25:01 +0000 (20:25 +0800)]
tcp: rstreason: handle timewait cases in the receive path

There are two possible cases where TCP layer can send an RST. Since they
happen in the same place, I think using one independent reason is enough
to identify this special situation.

Signed-off-by: Jason Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agotcp: rstreason: fully support in tcp_rcv_state_process()
Jason Xing [Fri, 10 May 2024 12:25:00 +0000 (20:25 +0800)]
tcp: rstreason: fully support in tcp_rcv_state_process()

Like the previous patch does in this series, finish the conversion map is
enough to let rstreason mechanism work in this function.

Signed-off-by: Jason Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agotcp: rstreason: fully support in tcp_ack()
Jason Xing [Fri, 10 May 2024 12:24:59 +0000 (20:24 +0800)]
tcp: rstreason: fully support in tcp_ack()

Based on the existing skb drop reason, updating the rstreason map can
help us finish the rstreason job in this function.

Signed-off-by: Jason Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agotcp: rstreason: fully support in tcp_rcv_synsent_state_process()
Jason Xing [Fri, 10 May 2024 12:24:58 +0000 (20:24 +0800)]
tcp: rstreason: fully support in tcp_rcv_synsent_state_process()

In this function, only updating the map can finish the job for socket
reset reason because the corresponding drop reasons are ready.

Signed-off-by: Jason Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'x86-misc-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 00:33:48 +0000 (17:33 -0700)]
Merge tag 'x86-misc-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull tip tree documentation update from Ingo Molnar:

 - Update the -tip maintainers merge policy document wrt
   merge window timing

* tag 'x86-misc-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/maintainer-tip: Clarify merge window policy

10 months agoMerge branch 'net-stmmac-add-support-for-rzn1-gmac-devices'
Jakub Kicinski [Tue, 14 May 2024 00:20:03 +0000 (17:20 -0700)]
Merge branch 'net-stmmac-add-support-for-rzn1-gmac-devices'

Romain Gantois says:

====================
net: stmmac: Add support for RZN1 GMAC devices

This is version seven of my series that adds support for a Gigabit Ethernet
controller featured in the Renesas r9a06g032 SoC, of the RZ/N1 family. This
GMAC device is based on a Synopsys IP and is compatible with the stmmac driver.

My former colleague Clément Léger originally sent a series for this driver,
but an issue in bringing up the PCS clock had blocked the upstreaming
process. This issue has since been resolved by the following series:

https://lore.kernel.org/all/20240326-rxc_bugfix-v6-0-24a74e5c761f@bootlin.com/

This series consists of a devicetree binding describing the RZN1 GMAC
controller IP, a node for the GMAC1 device in the r9a06g032 SoC device
tree, and the GMAC driver itself which is a glue layer in stmmac.

There are also two patches by Russell that improve pcs initialization handling
in stmmac.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: add support for RZ/N1 GMAC
Clément Léger [Mon, 13 May 2024 07:25:17 +0000 (09:25 +0200)]
net: stmmac: add support for RZ/N1 GMAC

Add support for the Renesas RZ/N1 GMAC. This support can make use of a
custom RZ/N1 PCS which is fetched by parsing the pcs-handle device tree
property.

Signed-off-by: Clément Léger <[email protected]>
Co-developed-by: Romain Gantois <[email protected]>
Signed-off-by: Romain Gantois <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Hariprasad Kelam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: dwmac-socfpga: use pcs_init/pcs_exit
Russell King (Oracle) [Mon, 13 May 2024 07:25:16 +0000 (09:25 +0200)]
net: stmmac: dwmac-socfpga: use pcs_init/pcs_exit

Use the newly introduced pcs_init() and pcs_exit() operations to
create and destroy the PCS instance at a more appropriate moment during
the driver lifecycle, thereby avoiding publishing a network device to
userspace that has not yet finished its PCS initialisation.

There are other similar issues with this driver which remain
unaddressed, but these are out of scope for this patch.

Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Maxime Chevallier <[email protected]>
[rgantois: removed second parameters of new callbacks]
Signed-off-by: Romain Gantois <[email protected]>
Reviewed-by: Hariprasad Kelam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: introduce pcs_init/pcs_exit stmmac operations
Russell King (Oracle) [Mon, 13 May 2024 07:25:15 +0000 (09:25 +0200)]
net: stmmac: introduce pcs_init/pcs_exit stmmac operations

Introduce a mechanism whereby platforms can create their PCS instances
prior to the network device being published to userspace, but after
some of the core stmmac initialisation has been completed. This means
that the data structures that platforms need will be available.

Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Maxime Chevallier <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Co-developed-by: Romain Gantois <[email protected]>
Signed-off-by: Romain Gantois <[email protected]>
Reviewed-by: Hariprasad Kelam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: Make stmmac_xpcs_setup() generic to all PCS devices
Serge Semin [Mon, 13 May 2024 07:25:14 +0000 (09:25 +0200)]
net: stmmac: Make stmmac_xpcs_setup() generic to all PCS devices

A pcs_init() callback will be introduced to stmmac in a future patch. This
new function will be called during the hardware initialization phase.
Instead of separately initializing XPCS and PCS components, let's group all
PCS-related hardware initialization logic in the current
stmmac_xpcs_setup() function.

Rename stmmac_xpcs_setup() to stmmac_pcs_setup() and move the conditional
call to stmmac_xpcs_setup() inside the function itself.

Signed-off-by: Serge Semin <[email protected]>
Co-developed-by: Romain Gantois <[email protected]>
Signed-off-by: Romain Gantois <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Hariprasad Kelam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: stmmac: Add dedicated XPCS cleanup method
Serge Semin [Mon, 13 May 2024 07:25:13 +0000 (09:25 +0200)]
net: stmmac: Add dedicated XPCS cleanup method

Currently the XPCS handler destruction is performed in the
stmmac_mdio_unregister() method. It doesn't look good because the handler
isn't originally created in the corresponding protagonist
stmmac_mdio_unregister(), but in the stmmac_xpcs_setup() function. In
order to have more coherent MDIO and XPCS setup/cleanup procedures,
let's move the DW XPCS destruction to the dedicated stmmac_pcs_clean()
method.

This method will also be used to cleanup PCS hardware using the
pcs_exit() callback that will be introduced to stmmac in a subsequent
patch.

Signed-off-by: Serge Semin <[email protected]>
Co-developed-by: Romain Gantois <[email protected]>
Signed-off-by: Romain Gantois <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Hariprasad Kelam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agodt-bindings: net: renesas,rzn1-gmac: Document RZ/N1 GMAC support
Clément Léger [Mon, 13 May 2024 07:25:12 +0000 (09:25 +0200)]
dt-bindings: net: renesas,rzn1-gmac: Document RZ/N1 GMAC support

The RZ/N1 series of MPUs feature up to two Gigabit Ethernet controllers.
These controllers are based on Synopsys IPs. They can be connected to
RZ/N1 RGMII/RMII converters.

Add a binding that describes these GMAC devices.

Signed-off-by: Clément Léger <[email protected]>
[rgantois: commit log]
Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Romain Gantois <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoio_uring/net: wire up IORING_CQE_F_SOCK_NONEMPTY for accept
Jens Axboe [Thu, 9 May 2024 15:41:10 +0000 (09:41 -0600)]
io_uring/net: wire up IORING_CQE_F_SOCK_NONEMPTY for accept

If the given protocol supports passing back whether or not we had more
pending accept post this one, pass back this information to userspace.
This is done by setting IORING_CQE_F_SOCK_NONEMPTY in the CQE flags,
just like we do for recv/recvmsg if there's more data available post
a receive operation.

We can also use this information to be smarter about multishot retry,
as we don't need to do a pointless retry if we know for a fact that
there aren't any more connections to accept.

Suggested-by: Norman Maurer <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
10 months agonet: pass back whether socket was empty post accept
Jens Axboe [Thu, 9 May 2024 15:35:01 +0000 (09:35 -0600)]
net: pass back whether socket was empty post accept

This adds an 'is_empty' argument to struct proto_accept_arg, which can
be used to pass back information on whether or not the given socket has
more connections to accept post the one just accepted.

To utilize this information, the caller should initialize the 'is_empty'
field to, eg, -1 and then check for 0/1 after the accept. If the field
has been set, the caller knows whether there are more pending connections
or not. If the field remains -1 after the accept call, the protocol
doesn't support passing back this information.

This patch wires it up for ipv4/6 TCP.

Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
10 months agonet: have do_accept() take a struct proto_accept_arg argument
Jens Axboe [Thu, 9 May 2024 15:31:05 +0000 (09:31 -0600)]
net: have do_accept() take a struct proto_accept_arg argument

In preparation for passing in more information via this API, change
do_accept() to take a proto_accept_arg struct pointer rather than just
the file flags separately.

No functional changes in this patch.

Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
10 months agonet: change proto and proto_ops accept type
Jens Axboe [Thu, 9 May 2024 15:20:08 +0000 (09:20 -0600)]
net: change proto and proto_ops accept type

Rather than pass in flags, error pointer, and whether this is a kernel
invocation or not, add a struct proto_accept_arg struct as the argument.
This then holds all of these arguments, and prepares accept for being
able to pass back more information.

No functional changes in this patch.

Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
10 months agoMerge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 00:18:51 +0000 (17:18 -0700)]
Merge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Add cpufreq pressure feedback for the scheduler

 - Rework misfit load-balancing wrt affinity restrictions

 - Clean up and simplify the code around ::overutilized and
   ::overload access.

 - Simplify sched_balance_newidle()

 - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
   handling that changed the output.

 - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()

 - Reorganize, clean up and unify most of the higher level
   scheduler balancing function names around the sched_balance_*()
   prefix

 - Simplify the balancing flag code (sched_balance_running)

 - Miscellaneous cleanups & fixes

* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/pelt: Remove shift of thermal clock
  sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
  thermal/cpufreq: Remove arch_update_thermal_pressure()
  sched/cpufreq: Take cpufreq feedback into account
  cpufreq: Add a cpufreq pressure feedback for the scheduler
  sched/fair: Fix update of rd->sg_overutilized
  sched/vtime: Do not include <asm/vtime.h> header
  s390/irq,nmi: Include <asm/vtime.h> header directly
  s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
  sched/vtime: Get rid of generic vtime_task_switch() implementation
  sched/vtime: Remove confusing arch_vtime_task_switch() declaration
  sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
  sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
  sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
  sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
  sched/fair: Rename root_domain::overload to ::overloaded
  sched/fair: Use helper functions to access root_domain::overload
  sched/fair: Check root_domain::overload value before update
  sched/fair: Combine EAS check with root_domain::overutilized access
  sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
  ...

10 months agonet: qede: flower: validate control flags
Asbjørn Sloth Tønnesen [Sat, 11 May 2024 07:37:03 +0000 (07:37 +0000)]
net: qede: flower: validate control flags

This driver currently doesn't support any control flags.

Use flow_rule_match_has_control_flags() to check for control flags,
such as can be set through `tc flower ... ip_flags frag`.

In case any control flags are masked, flow_rule_match_has_control_flags()
sets a NL extended error message, and we return -EOPNOTSUPP.

Only compile-tested.

Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 14 May 2024 00:13:47 +0000 (17:13 -0700)]
Merge tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:

 - Combine perf and BPF for fast evalution of HW breakpoint
   conditions

 - Add LBR capture support outside of hardware events

 - Trigger IO signals for watermark_wakeup

 - Add RAPL support for Intel Arrow Lake and Lunar Lake

 - Optimize frequency-throttling

 - Miscellaneous cleanups & fixes

* tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too
  selftests/perf_events: Test FASYNC with watermark wakeups
  perf/ring_buffer: Trigger IO signals for watermark_wakeup
  perf: Move perf_event_fasync() to perf_event.h
  perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines
  selftest/bpf: Test a perf BPF program that suppresses side effects
  perf/bpf: Allow a BPF program to suppress all sample side effects
  perf/bpf: Remove unneeded uses_default_overflow_handler()
  perf/bpf: Call BPF handler directly, not through overflow machinery
  perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members
  perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL
  perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow()
  perf/x86/rapl: Add support for Intel Lunar Lake
  perf/x86/rapl: Add support for Intel Arrow Lake
  perf/core: Reduce PMU access to adjust sample freq
  perf/core: Optimize perf_adjust_freq_unthr_context()
  perf/x86/amd: Don't reject non-sampling events with configured LBR
  perf/x86/amd: Support capturing LBR from software events
  perf/x86/amd: Avoid taking branches before disabling LBR
  perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
  ...

10 months agoMerge branch 'virtio_net-rx-enable-premapped-mode-by-default'
Jakub Kicinski [Tue, 14 May 2024 00:07:43 +0000 (17:07 -0700)]
Merge branch 'virtio_net-rx-enable-premapped-mode-by-default'

Xuan Zhuo says:

====================
virtio_net: rx enable premapped mode by default

Actually, for the virtio drivers, we can enable premapped mode whatever
the value of use_dma_api. Because we provide the virtio dma apis.
So the driver can enable premapped mode unconditionally.

This patch set makes the big mode of virtio-net to support premapped mode.
And enable premapped mode for rx by default.

Based on the following points, we do not use page pool to manage these
    pages:

    1. virtio-net uses the DMA APIs wrapped by virtio core. Therefore,
       we can only prevent the page pool from performing DMA operations, and
       let the driver perform DMA operations on the allocated pages.
    2. But when the page pool releases the page, we have no chance to
       execute dma unmap.
    3. A solution to #2 is to execute dma unmap every time before putting
       the page back to the page pool. (This is actually a waste, we don't
       execute unmap so frequently.)
    4. But there is another problem, we still need to use page.dma_addr to
       save the dma address. Using page.dma_addr while using page pool is
       unsafe behavior.
    5. And we need space the chain the pages submitted once to virtio core.

    More:
        https://lore.kernel.org/all/CACGkMEu=Aok9z2imB_c5qVuujSh=vjj1kx12fy9N7hqyi+M5Ow@mail.gmail.com/

Why we do not use the page space to store the dma?

    http://lore.kernel.org/all/CACGkMEuyeJ9mMgYnnB42=hw6umNuo=agn7VBqBqYPd7GN=+39Q@mail.gmail.com
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agovirtio_net: remove the misleading comment
Xuan Zhuo [Sat, 11 May 2024 03:14:04 +0000 (11:14 +0800)]
virtio_net: remove the misleading comment

We call the build_skb() actually without copying data.
The comment is misleading. So remove it.

Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agovirtio_net: rx remove premapped failover code
Xuan Zhuo [Sat, 11 May 2024 03:14:03 +0000 (11:14 +0800)]
virtio_net: rx remove premapped failover code

Now, the premapped mode can be enabled unconditionally.

So we can remove the failover code for merge and small mode.

Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Larysa Zaremba <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agovirtio_net: big mode skip the unmap check
Xuan Zhuo [Sat, 11 May 2024 03:14:02 +0000 (11:14 +0800)]
virtio_net: big mode skip the unmap check

The virtio-net big mode did not enable premapped mode,
so we did not need to check the unmap. And the subsequent
commit will remove the failover code for failing enable
premapped for merge and small mode. So we need to remove
the checking do_dma code in the big mode path.

Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agovirtio_ring: enable premapped mode whatever use_dma_api
Xuan Zhuo [Sat, 11 May 2024 03:14:01 +0000 (11:14 +0800)]
virtio_ring: enable premapped mode whatever use_dma_api

Now, we have virtio DMA APIs, the driver can be the premapped
mode whatever the virtio core uses dma api or not.

So remove the limit of checking use_dma_api from
virtqueue_set_dma_premapped().

Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 14 May 2024 00:01:28 +0000 (17:01 -0700)]
Merge tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Over a dozen code generation micro-optimizations for the atomic
   and spinlock code

 - Add more __ro_after_init attributes

 - Robustify the lockdevent_*() macros

* tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock/x86: Use _Q_LOCKED_VAL in PV_UNLOCK_ASM macro
  locking/qspinlock/x86: Micro-optimize virt_spin_lock()
  locking/atomic/x86: Merge __arch{,_try}_cmpxchg64_emu_local() with __arch{,_try}_cmpxchg64_emu()
  locking/atomic/x86: Introduce arch_try_cmpxchg64_local()
  locking/pvqspinlock/x86: Remove redundant CMP after CMPXCHG in __raw_callee_save___pv_queued_spin_unlock()
  locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h
  locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending()
  locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()
  locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() functions
  locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions
  locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32
  locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32
  locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
  locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
  locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
  x86/tsc: Make __use_tsc __ro_after_init
  x86/kvm: Make kvm_async_pf_enabled __ro_after_init
  context_tracking: Make context_tracking_key __ro_after_init
  jump_label,module: Don't alloc static_key_mod for __ro_after_init keys
  locking/qspinlock: Always evaluate lockevent* non-event parameter once

10 months agotracing: Improve benchmark test performance by using do_div()
Thorsten Blum [Fri, 29 Mar 2024 16:02:30 +0000 (17:02 +0100)]
tracing: Improve benchmark test performance by using do_div()

Partially revert commit d6cb38e10810 ("tracing: Use div64_u64() instead
of do_div()") and use do_div() again to utilize its faster 64-by-32
division compared to the 64-by-64 division done by div64_u64().

Explicitly cast the divisor bm_cnt to u32 to prevent a Coccinelle
warning reported by do_div.cocci. The warning was removed with commit
d6cb38e10810 ("tracing: Use div64_u64() instead of do_div()").

Using the faster 64-by-32 division and casting bm_cnt to u32 is safe
because we return early from trace_do_benchmark() if bm_cnt > UINT_MAX.
This approach is already used twice in trace_do_benchmark() when
calculating the standard deviation:

do_div(stddev, (u32)bm_cnt);
do_div(stddev, (u32)bm_cnt - 1);

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Thorsten Blum <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agoring-buffer: Have mmapped ring buffer keep track of missed events
Steven Rostedt (Google) [Tue, 12 Mar 2024 21:54:05 +0000 (17:54 -0400)]
ring-buffer: Have mmapped ring buffer keep track of missed events

While testing libtracefs on the mmapped ring buffer, the test that checks
if missed events are accounted for failed when using the mapped buffer.
This is because the mapped page does not update the missed events that
were dropped because the writer filled up the ring buffer before the
reader could catch it.

Add the missed events to the reader page/sub-buffer when the IOCTL is done
and a new reader page is acquired.

Note that all accesses to the reader_page via rb_page_commit() had to be
switched to rb_page_size(), and rb_page_size() which was just a copy of
rb_page_commit() but now it masks out the RB_MISSED bits. This is needed
as the mapped reader page is still active in the ring buffer code and
where it reads the commit field of the bpage for the size, it now must
mask it otherwise the missed bits that are now set will corrupt the size
returned.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Vincent Donnefort <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agodpll: fix return value check for kmemdup
Chen Ni [Mon, 13 May 2024 03:28:24 +0000 (11:28 +0800)]
dpll: fix return value check for kmemdup

The return value of kmemdup() is dst->freq_supported, not
src->freq_supported. Update the check accordingly.

Fixes: 830ead5fb0c5 ("dpll: fix pin dump crash for rebound module")
Signed-off-by: Chen Ni <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm...
Linus Torvalds [Mon, 13 May 2024 23:48:15 +0000 (16:48 -0700)]
Merge tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform firmware updates from Tzung-Bi Shih:

 - Set driver owner in the core registration so that coreboot drivers
   don't need to set it individually

* tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  firmware: google: cbmem: drop driver owner initialization
  firmware: coreboot: store owner from modules with coreboot_driver_register()

10 months agoMerge tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Mon, 13 May 2024 23:44:47 +0000 (16:44 -0700)]
Merge tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform updates from Tzung-Bi Shih:
 "New:
   - Support Framework Laptop 13 and 16 (AMD Ryzen)

  Improvements:
   - Use sysfs_emit() instead of sprintf() for sysfs' show()

  Fixes:
   - Fix flex-array-member-not-at-end compiler warnings by using
     DEFINE_RAW_FLEX()
   - Add HAS_IOPORT dependencies
   - Fix long pending events during suspend after resume

  Misc cleanups:
   - Provide ID tables for avoiding fallback match
   - Replace deprecated UNIVERSAL_DEV_PM_OPS()"

* tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: (22 commits)
  platform/chrome: cros_ec: Handle events during suspend after resume completion
  platform/chrome: cros_ec_lpc: add quirks for the Framework Laptop (AMD)
  platform/chrome: cros_ec_lpc: add a "quirks" system
  platform/chrome: cros_ec_lpc: pass driver_data from DMI to the device
  platform/chrome: cros_ec_lpc: introduce a priv struct for the lpc device
  platform/chrome: add HAS_IOPORT dependencies
  platform/chrome: cros_hps_i2c: Replace deprecated UNIVERSAL_DEV_PM_OPS()
  platform/chrome: cros_kbd_led_backlight: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: core: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: event: remove redundant MODULE_ALIAS
  platform/chrome: wilco_ec: debugfs: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: telemetry: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_vbc: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_lightbar: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_sysfs: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_debugfs: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_chardev: provide ID table for avoiding fallback match
  platform/chrome: cros_usbpd_notify: provide ID table for avoiding fallback match
  platform/chrome: cros_usbpd_logger: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_sensorhub: provide ID table for avoiding fallback match
  ...

10 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Mon, 13 May 2024 23:40:22 +0000 (16:40 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-13

We've added 119 non-merge commits during the last 14 day(s) which contain
a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).

The main changes are:

1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.

2) Add BPF range computation improvements to the verifier in particular
   around XOR and OR operators, refactoring of checks for range computation
   and relaxing MUL range computation so that src_reg can also be an unknown
   scalar, from Cupertino Miranda.

3) Add support to attach kprobe BPF programs through kprobe_multi link in
   a session mode, meaning, a BPF program is attached to both function entry
   and return, the entry program can decide if the return program gets
   executed and the entry program can share u64 cookie value with return
   program. Session mode is a common use-case for tetragon and bpftrace,
   from Jiri Olsa.

4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf
   as well as BPF selftest's struct_ops handling, from Andrii Nakryiko.

5) Improvements to BPF selftests in context of BPF gcc backend,
   from Jose E. Marchesi & David Faust.

6) Migrate remaining BPF selftest tests from test_sock_addr.c to prog_test-
   -style in order to retire the old test, run it in BPF CI and additionally
   expand test coverage, from Jordan Rife.

7) Big batch for BPF selftest refactoring in order to remove duplicate code
   around common network helpers, from Geliang Tang.

8) Another batch of improvements to BPF selftests to retire obsolete
   bpf_tcp_helpers.h as everything is available vmlinux.h,
   from Martin KaFai Lau.

9) Fix BPF map tear-down to not walk the map twice on free when both timer
   and wq is used, from Benjamin Tissoires.

10) Fix BPF verifier assumptions about socket->sk that it can be non-NULL,
    from Alexei Starovoitov.

11) Change BTF build scripts to using --btf_features for pahole v1.26+,
    from Alan Maguire.

12) Small improvements to BPF reusing struct_size() and krealloc_array(),
    from Andy Shevchenko.

13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions,
    from Ilya Leoshkevich.

14) Extend TCP ->cong_control() callback in order to feed in ack and
    flag parameters and allow write-access to tp->snd_cwnd_stamp
    from BPF program, from Miao Xu.

15) Add support for internal-only per-CPU instructions to inline
    bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs,
    from Puranjay Mohan.

16) Follow-up to remove the redundant ethtool.h from tooling infrastructure,
    from Tushar Vyavahare.

17) Extend libbpf to support "module:<function>" syntax for tracing
    programs, from Viktor Malik.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
  bpf: make list_for_each_entry portable
  bpf: ignore expected GCC warning in test_global_func10.c
  bpf: disable strict aliasing in test_global_func9.c
  selftests/bpf: Free strdup memory in xdp_hw_metadata
  selftests/bpf: Fix a few tests for GCC related warnings.
  bpf: avoid gcc overflow warning in test_xdp_vlan.c
  tools: remove redundant ethtool.h from tooling infra
  selftests/bpf: Expand ATTACH_REJECT tests
  selftests/bpf: Expand getsockname and getpeername tests
  sefltests/bpf: Expand sockaddr hook deny tests
  selftests/bpf: Expand sockaddr program return value tests
  selftests/bpf: Retire test_sock_addr.(c|sh)
  selftests/bpf: Remove redundant sendmsg test cases
  selftests/bpf: Migrate ATTACH_REJECT test cases
  selftests/bpf: Migrate expected_attach_type tests
  selftests/bpf: Migrate wildcard destination rewrite test
  selftests/bpf: Migrate sendmsg6 v4 mapped address tests
  selftests/bpf: Migrate sendmsg deny test cases
  selftests/bpf: Migrate WILDCARD_IP test
  selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
  ...
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet: pcs: lynx: no need to read LPA in lynx_pcs_get_state_2500basex()
Vladimir Oltean [Mon, 13 May 2024 11:53:45 +0000 (14:53 +0300)]
net: pcs: lynx: no need to read LPA in lynx_pcs_get_state_2500basex()

Nothing useful is done with the LPA variable in lynx_pcs_get_state_2500basex(),
we can just remove the read.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoftrace: Use asynchronous grace period for register_ftrace_direct()
Paul E. McKenney [Wed, 1 May 2024 23:12:37 +0000 (16:12 -0700)]
ftrace: Use asynchronous grace period for register_ftrace_direct()

When running heavy test workloads with KASAN enabled, RCU Tasks grace
periods can extend for many tens of seconds, significantly slowing
trace registration.  Therefore, make the registration-side RCU Tasks
grace period be asynchronous via call_rcu_tasks().

Link: https://lore.kernel.org/linux-trace-kernel/ac05be77-2972-475b-9b57-56bef15aa00a@paulmck-laptop
Reported-by: Jakub Kicinski <[email protected]>
Reported-by: Alexei Starovoitov <[email protected]>
Reported-by: Chris Mason <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agoftrace: Replaces simple_strtoul in ftrace
Yuran Pereira [Mon, 20 Nov 2023 00:16:13 +0000 (05:46 +0530)]
ftrace: Replaces simple_strtoul in ftrace

The function simple_strtoul performs no error checking in scenarios
where the input value overflows the intended output variable.
This results in this function successfully returning, even when the
output does not match the input string (aka the function returns
successfully even when the result is wrong).

Or as it was mentioned [1], "...simple_strtol(), simple_strtoll(),
simple_strtoul(), and simple_strtoull() functions explicitly ignore
overflows, which may lead to unexpected results in callers."
Hence, the use of those functions is discouraged.

This patch replaces all uses of the simple_strtoul with the safer
alternatives kstrtoul and kstruint.

Callers affected:
- add_rec_by_index
- set_graph_max_depth_function

Side effects of this patch:
- Since `fgraph_max_depth` is an `unsigned int`, this patch uses
  kstrtouint instead of kstrtoul to avoid any compiler warnings
  that could originate from calling the latter.
- This patch ensures that the callers of kstrtou* return accordingly
  when kstrtoul and kstruint fail for some reason.
  In this case, both callers this patch is addressing return 0 on error.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#simple-strtol-simple-strtoll-simple-strtoul-simple-strtoull

Link: https://lore.kernel.org/linux-trace-kernel/GV1PR10MB656333529A8D7B8AFB28D238E8B4A@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
Signed-off-by: Yuran Pereira <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
10 months agoMerge branch 'mlx5-misc-patches'
Jakub Kicinski [Mon, 13 May 2024 23:35:49 +0000 (16:35 -0700)]
Merge branch 'mlx5-misc-patches'

Tariq Toukan says:

====================
mlx5 misc patches

This series includes patches for the mlx5 driver.

Patch 1 by Shay enables LAG with HCAs of 8 ports.

Patch 2 by Carolina optimizes the safe switch channels operation for the
TX-only changes.

Patch 3 by Parav cleans up some unused code.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet/mlx5: Remove unused msix related exported APIs
Parav Pandit [Sun, 12 May 2024 12:43:05 +0000 (15:43 +0300)]
net/mlx5: Remove unused msix related exported APIs

MSIX irq allocation and free APIs are no longer
in use. Hence, remove the dead code.

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Dragos Tatulea <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet/mlx5e: Modifying channels number and updating TX queues
Carolina Jubran [Sun, 12 May 2024 12:43:04 +0000 (15:43 +0300)]
net/mlx5e: Modifying channels number and updating TX queues

It is not appropriate for the mlx5e_num_channels_changed
function to be called solely for updating the TX queues,
even if the channels number has not been changed.

Move the code responsible for updating the TC and TX queues
from mlx5e_num_channels_changed and produce a new function
called mlx5e_update_tc_and_tx_queues. This new function should
only be called when the channels number remains unchanged.

Signed-off-by: Carolina Jubran <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agonet/mlx5: Enable 8 ports LAG
Shay Drory [Sun, 12 May 2024 12:43:03 +0000 (15:43 +0300)]
net/mlx5: Enable 8 ports LAG

This patch adds to mlx5 drivers support for 8 ports HCAs.
Starting with ConnectX-8 HCAs with 8 ports are possible.

As most driver parts aren't affected by such configuration most driver
code is unchanged.

Specially the only affected areas are:
- Lag
- Multiport E-Switch
- Single FDB E-Switch

All of the above are already factored in generic way, and LAG and VF LAG
are tested, so all that left is to change a #define and remove checks
which are no longer needed.
However, Multiport E-Switch is not tested yet, so it is left untouched.

This patch will allow to create hardware LAG/VF LAG when all 8 ports are
added to the same bond device.

for example, In order to activate the hardware lag a user can execute
the following:

ip link add bond0 type bond
ip link set bond0 type bond miimon 100 mode 2
ip link set eth2 master bond0
ip link set eth3 master bond0
ip link set eth4 master bond0
ip link set eth5 master bond0
ip link set eth6 master bond0
ip link set eth7 master bond0
ip link set eth8 master bond0
ip link set eth9 master bond0

Where eth2, eth3, eth4, eth5, eth6, eth7, eth8 and eth9 are the PFs of
the same HCA.

Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoMerge branch 'ax25-fix-issues-of-ax25_dev-and-net_device'
Jakub Kicinski [Mon, 13 May 2024 23:09:40 +0000 (16:09 -0700)]
Merge branch 'ax25-fix-issues-of-ax25_dev-and-net_device'

Duoming Zhou says:

====================
ax25: Fix issues of ax25_dev and net_device

The first patch uses kernel universal linked list to implement
ax25_dev_list, which makes the operation of the list easier.
The second and third patch fix reference count leak issues of
the object "ax25_dev" and "net_device".
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoax25: Fix reference count leak issue of net_device
Duoming Zhou [Thu, 9 May 2024 09:37:02 +0000 (17:37 +0800)]
ax25: Fix reference count leak issue of net_device

There is a reference count leak issue of the object "net_device" in
ax25_dev_device_down(). When the ax25 device is shutting down, the
ax25_dev_device_down() drops the reference count of net_device one
or zero times depending on if we goto unlock_put or not, which will
cause memory leak.

In order to solve the above issue, decrease the reference count of
net_device after dev->ax25_ptr is set to null.

Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Suggested-by: Dan Carpenter <[email protected]>
Signed-off-by: Duoming Zhou <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/7ce3b23a40d9084657ba1125432f0ecc380cbc80.1715247018.git.duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <[email protected]>
10 months agoax25: Fix reference count leak issues of ax25_dev
Duoming Zhou [Thu, 9 May 2024 09:36:47 +0000 (17:36 +0800)]
ax25: Fix reference count leak issues of ax25_dev

The ax25_addr_ax25dev() and ax25_dev_device_down() exist a reference
count leak issue of the object "ax25_dev".

Memory leak issue in ax25_addr_ax25dev():

The reference count of the object "ax25_dev" can be increased multiple
times in ax25_addr_ax25dev(). This will cause a memory leak.

Memory leak issues in ax25_dev_device_down():

The reference count of ax25_dev is set to 1 in ax25_dev_device_up() and
then increase the reference count when ax25_dev is added to ax25_dev_list.
As a result, the reference count of ax25_dev is 2. But when the device is
shutting down. The ax25_dev_device_down() drops the reference count once
or twice depending on if we goto unlock_put or not, which will cause
memory leak.

As for the issue of ax25_addr_ax25dev(), it is impossible for one pointer
to be on a list twice. So add a break in ax25_addr_ax25dev(). As for the
issue of ax25_dev_device_down(), increase the reference count of ax25_dev
once in ax25_dev_device_up() and decrease the reference count of ax25_dev
after it is removed from the ax25_dev_list.

Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Suggested-by: Dan Carpenter <[email protected]>
Signed-off-by: Duoming Zhou <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/361bbf2a4b091e120006279ec3b382d73c4a0c17.1715247018.git.duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <[email protected]>
This page took 0.148487 seconds and 4 git commands to generate.