linux.git
2 years agoMerge tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 23:08:48 +0000 (16:08 -0700)]
Merge tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull fscache fixes from David Howells:

 - Fix a check in fscache_wait_on_volume_collision() in which the
   polarity is reversed. It should complain if a volume is still marked
   acquisition-pending after 20s, but instead complains if the mark has
   been cleared (ie. the condition has cleared).

   Also switch an open-coded test of the ACQUIRE_PENDING volume flag to
   use the helper function for consistency.

 - Not a fix per se, but neaten the code by using a helper to check for
   the DROPPED state.

 - Fix cachefiles's support for erofs to only flush requests associated
   with a released control file, not all requests.

 - Fix a race between one process invalidating an object in the cache
   and another process trying to look it up.

* tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: Fix invalidation/lookup race
  cachefiles: narrow the scope of flushed requests when releasing fd
  fscache: Introduce fscache_cookie_is_dropped()
  fscache: Fix if condition in fscache_wait_on_volume_collision()

2 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Fri, 8 Jul 2022 22:24:16 +0000 (15:24 -0700)]
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
bpf 2022-07-08

We've added 3 non-merge commits during the last 2 day(s) which contain
a total of 7 files changed, 40 insertions(+), 24 deletions(-).

The main changes are:

1) Fix cBPF splat triggered by skb not having a mac header, from Eric Dumazet.

2) Fix spurious packet loss in generic XDP when pushing packets out (note
   that native XDP is not affected by the issue), from Johan Almbladh.

3) Fix bpf_dynptr_{read,write}() helper signatures with flag argument before
   its set in stone as UAPI, from Joanne Koong.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs
  bpf: Make sure mac_header was set before using it
  xdp: Fix spurious packet loss in generic XDP TX path
====================

Link: https://lore.kernel.org/r/20220708213418.19626-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()
Sven Schnelle [Wed, 6 Jul 2022 10:16:25 +0000 (12:16 +0200)]
ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()

CI reported the following splat while running the strace testsuite:

[ 3976.640309] WARNING: CPU: 1 PID: 3570031 at kernel/ptrace.c:272 ptrace_check_attach+0x12e/0x178
[ 3976.640391] CPU: 1 PID: 3570031 Comm: strace Tainted: G           OE     5.19.0-20220624.rc3.git0.ee819a77d4e7.300.fc36.s390x #1
[ 3976.640410] Hardware name: IBM 3906 M04 704 (z/VM 7.1.0)
[ 3976.640452] Call Trace:
[ 3976.640454]  [<00000000ab4b645a>] ptrace_check_attach+0x132/0x178
[ 3976.640457] ([<00000000ab4b6450>] ptrace_check_attach+0x128/0x178)
[ 3976.640460]  [<00000000ab4b6cde>] __s390x_sys_ptrace+0x86/0x160
[ 3976.640463]  [<00000000ac03fcec>] __do_syscall+0x1d4/0x200
[ 3976.640468]  [<00000000ac04e312>] system_call+0x82/0xb0
[ 3976.640470] Last Breaking-Event-Address:
[ 3976.640471]  [<00000000ab4ea3c8>] wait_task_inactive+0x98/0x190

This is because JOBCTL_TRACED is set, but the task is not in TASK_TRACED
state. Caused by ptrace_unfreeze_traced() which does:

task->jobctl &= ~TASK_TRACED

but it should be:

task->jobctl &= ~JOBCTL_TRACED

Fixes: 31cae1eaae4f ("sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Link: https://lkml.kernel.org/r/20220706101625.2100298-1-svens@linux.ibm.com
Link: https://lkml.kernel.org/r/YrHA5UkJLornOdCz@li-4a3a4a4c-28e5-11b2-a85c-a8d192c6f089.ibm.com
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2101641
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Tested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2 years agoMerge tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 8 Jul 2022 20:05:56 +0000 (13:05 -0700)]
Merge tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix two recent regressions related to CPPC support.

  Specifics:

   - Prevent _CPC from being used if the platform firmware does not
     confirm CPPC v2 support via _OSC (Mario Limonciello)

   - Allow systems with X86_FEATURE_CPPC set to use _CPC even if CPPC
     support cannot be agreed on via _OSC (Mario Limonciello)"

* tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported
  ACPI: CPPC: Only probe for _CPC if CPPC v2 is acked

2 years agoMerge tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 8 Jul 2022 20:01:04 +0000 (13:01 -0700)]
Merge tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a NULL pointer dereference in a devfreq driver and a runtime
  PM framework issue that may cause a supplier device to be suspended
  before its consumer.

  Specifics:

   - Fix NULL pointer dereference related to printing a diagnostic
     message in the exynos-bus devfreq driver (Christian Marangi)

   - Fix race condition in the runtime PM framework which in some cases
     may cause a supplier device to be suspended when its consumer is
     still active (Rafael Wysocki)"

* tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / devfreq: exynos-bus: Fix NULL pointer dereference
  PM: runtime: Fix supplier device management during consumer probe
  PM: runtime: Redefine pm_runtime_release_supplier()

2 years agoMerge tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 19:55:25 +0000 (12:55 -0700)]
Merge tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl fixes from Vishal Verma:

 - Update MAINTAINERS for Ben's email

 - Fix cleanup of port devices on failure to probe driver

 - Fix endianness in get/set LSA mailbox command structures

 - Fix memregion_free() fallback definition

 - Fix missing variable payload checks in CXL cmd size validation

* tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/mbox: Fix missing variable payload checks in cmd size validation
  memregion: Fix memregion_free() fallback definition
  cxl/mbox: Use __le32 in get,set_lsa mailbox structures
  cxl/core: Use is_endpoint_decoder
  cxl: Fix cleanup of port devices on failure to probe driver.
  MAINTAINERS: Update Ben's email address

2 years agoMerge tag 'iommu-fixes-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 19:49:00 +0000 (12:49 -0700)]
Merge tag 'iommu-fixes-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - fix device setup failures in the Intel VT-d driver when the PASID
   table is shared

 - fix Intel VT-d device hot-add failure due to wrong device notifier
   order

 - remove the old IOMMU mailing list from the MAINTAINERS file now that
   it has been retired

* tag 'iommu-fixes-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  MAINTAINERS: Remove iommu@lists.linux-foundation.org
  iommu/vt-d: Fix RID2PASID setup/teardown failure
  iommu/vt-d: Fix PCI bus rescan device hot add

2 years agoMerge tag 'gpio-fixes-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 8 Jul 2022 19:39:52 +0000 (12:39 -0700)]
Merge tag 'gpio-fixes-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a build error in gpio-vf610

 - fix a null-pointer dereference in the GPIO character device code

* tag 'gpio-fixes-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: cdev: fix null pointer dereference in linereq_free()
  gpio: vf610: fix compilation error

2 years agoMerge branch 'pm-core'
Rafael J. Wysocki [Fri, 8 Jul 2022 18:38:51 +0000 (20:38 +0200)]
Merge branch 'pm-core'

Merge a runtime PM framework cleanup and fix related to device links.

* pm-core:
  PM: runtime: Fix supplier device management during consumer probe
  PM: runtime: Redefine pm_runtime_release_supplier()

2 years agoMerge tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 8 Jul 2022 18:32:23 +0000 (11:32 -0700)]
Merge tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "NVMe pull request with another id quirk addition, and a tracing fix"

* tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
  nvme: use struct group for generic command dwords
  nvme-pci: phison e16 has bogus namespace ids

2 years agoMerge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 8 Jul 2022 18:25:01 +0000 (11:25 -0700)]
Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block

Pull io_uring tweak from Jens Axboe:
 "Just a minor tweak to an addition made in this release cycle: padding
  a 32-bit value that's in a 64-bit union to avoid any potential
  funkiness from that"

* tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
  io_uring: explicit sqe padding for ioctl commands

2 years agoMerge tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Fri, 8 Jul 2022 18:03:26 +0000 (11:03 -0700)]
Merge tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:

 - fbcon now prevents switching to screen resolutions which are smaller
   than the font size, and prevents enabling a font which is bigger than
   the current screen resolution. This fixes vmalloc-out-of-bounds
   accesses found by KASAN.

 - Guiling Deng fixed a bug where the centered fbdev logo wasn't
   displayed correctly if the screen size matched the logo size.

 - Hsin-Yi Wang provided a patch to include errno.h to fix build when
   CONFIG_OF isn't enabled.

* tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
  fbmem: Check virtual screen sizes in fb_set_var()
  fbcon: Prevent that screen size is smaller than font size
  fbcon: Disallow setting font bigger than screen size
  video: of_display_timing.h: include errno.h
  fbdev: fbmem: Fix logo center image dx issue

2 years agobtrfs: zoned: drop optimization of zone finish
Naohiro Aota [Wed, 29 Jun 2022 02:00:38 +0000 (11:00 +0900)]
btrfs: zoned: drop optimization of zone finish

We have an optimization in do_zone_finish() to send REQ_OP_ZONE_FINISH only
when necessary, i.e. we don't send REQ_OP_ZONE_FINISH when we assume we
wrote fully into the zone.

The assumption is determined by "alloc_offset == capacity". This condition
won't work if the last ordered extent is canceled due to some errors. In
that case, we consider the zone is deactivated without sending the finish
command while it's still active.

This inconstancy results in activating another block group while we cannot
really activate the underlying zone, which causes the active zone exceeds
errors like below.

    BTRFS error (device nvme3n2): allocation failed flags 1, wanted 520192 tree-log 0, relocation: 0
    nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
    active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
    nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
    active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0

Fix the issue by removing the optimization for now.

Fixes: 8376d9e1ed8f ("btrfs: zoned: finish superblock zone once no space left for new SB")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agobtrfs: zoned: fix a leaked bioc in read_zone_info
Christoph Hellwig [Thu, 30 Jun 2022 16:03:19 +0000 (18:03 +0200)]
btrfs: zoned: fix a leaked bioc in read_zone_info

The bioc would leak on the normal completion path and also on the RAID56
check (but that one won't happen in practice due to the invalid
combination with zoned mode).

Fixes: 7db1c5d14dcd ("btrfs: zoned: support dev-replace in zoned filesystems")
CC: stable@vger.kernel.org # 5.16+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ update changelog ]
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agobtrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
Filipe Manana [Mon, 4 Jul 2022 11:42:03 +0000 (12:42 +0100)]
btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents

When doing a direct IO read or write, we always return -ENOTBLK when we
find a compressed extent (or an inline extent) so that we fallback to
buffered IO. This however is not ideal in case we are in a NOWAIT context
(io_uring for example), because buffered IO can block and we currently
have no support for NOWAIT semantics for buffered IO, so if we need to
fallback to buffered IO we should first signal the caller that we may
need to block by returning -EAGAIN instead.

This behaviour can also result in short reads being returned to user
space, which although it's not incorrect and user space should be able
to deal with partial reads, it's somewhat surprising and even some popular
applications like QEMU (Link tag #1) and MariaDB (Link tag #2) don't
deal with short reads properly (or at all).

The short read case happens when we try to read from a range that has a
non-compressed and non-inline extent followed by a compressed extent.
After having read the first extent, when we find the compressed extent we
return -ENOTBLK from btrfs_dio_iomap_begin(), which results in iomap to
treat the request as a short read, returning 0 (success) and waiting for
previously submitted bios to complete (this happens at
fs/iomap/direct-io.c:__iomap_dio_rw()). After that, and while at
btrfs_file_read_iter(), we call filemap_read() to use buffered IO to
read the remaining data, and pass it the number of bytes we were able to
read with direct IO. Than at filemap_read() if we get a page fault error
when accessing the read buffer, we return a partial read instead of an
-EFAULT error, because the number of bytes previously read is greater
than zero.

So fix this by returning -EAGAIN for NOWAIT direct IO when we find a
compressed or an inline extent.

Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Link: https://lore.kernel.org/linux-btrfs/YrrFGO4A1jS0GI0G@atmark-techno.com/
Link: https://jira.mariadb.org/browse/MDEV-27900?focusedCommentId=216582&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-216582
Tested-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agoovl: turn of SB_POSIXACL with idmapped layers temporarily
Christian Brauner [Wed, 6 Jul 2022 13:56:11 +0000 (15:56 +0200)]
ovl: turn of SB_POSIXACL with idmapped layers temporarily

This cycle we added support for mounting overlayfs on top of idmapped
mounts.  Recently I've started looking into potential corner cases when
trying to add additional tests and I noticed that reporting for POSIX ACLs
is currently wrong when using idmapped layers with overlayfs mounted on top
of it.

I have sent out an patch that fixes this and makes POSIX ACLs work
correctly but the patch is a bit bigger and we're already at -rc5 so I
recommend we simply don't raise SB_POSIXACL when idmapped layers are
used. Then we can fix the VFS part described below for the next merge
window so we can have good exposure in -next.

I'm going to give a rather detailed explanation to both the origin of the
problem and mention the solution so people know what's going on.

Let's assume the user creates the following directory layout and they have
a rootfs /var/lib/lxc/c1/rootfs. The files in this rootfs are owned as you
would expect files on your host system to be owned. For example, ~/.bashrc
for your regular user would be owned by 1000:1000 and /root/.bashrc would
be owned by 0:0. IOW, this is just regular boring filesystem tree on an
ext4 or xfs filesystem.

The user chooses to set POSIX ACLs using the setfacl binary granting the
user with uid 4 read, write, and execute permissions for their .bashrc
file:

        setfacl -m u:4:rwx /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc

Now they to expose the whole rootfs to a container using an idmapped
mount. So they first create:

        mkdir -pv /vol/contpool/{ctrover,merge,lowermap,overmap}
        mkdir -pv /vol/contpool/ctrover/{over,work}
        chown 10000000:10000000 /vol/contpool/ctrover/{over,work}

The user now creates an idmapped mount for the rootfs:

        mount-idmapped/mount-idmapped --map-mount=b:0:10000000:65536 \
                                      /var/lib/lxc/c2/rootfs \
                                      /vol/contpool/lowermap

This for example makes it so that
/var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc which is owned by uid and gid
1000 as being owned by uid and gid 10001000 at
/vol/contpool/lowermap/home/ubuntu/.bashrc.

Assume the user wants to expose these idmapped mounts through an overlayfs
mount to a container.

       mount -t overlay overlay                      \
             -o lowerdir=/vol/contpool/lowermap,     \
                upperdir=/vol/contpool/overmap/over, \
                workdir=/vol/contpool/overmap/work   \
             /vol/contpool/merge

The user can do this in two ways:

(1) Mount overlayfs in the initial user namespace and expose it to the
    container.

(2) Mount overlayfs on top of the idmapped mounts inside of the container's
    user namespace.

Let's assume the user chooses the (1) option and mounts overlayfs on the
host and then changes into a container which uses the idmapping
0:10000000:65536 which is the same used for the two idmapped mounts.

Now the user tries to retrieve the POSIX ACLs using the getfacl command

        getfacl -n /vol/contpool/lowermap/home/ubuntu/.bashrc

and to their surprise they see:

        # file: vol/contpool/merge/home/ubuntu/.bashrc
        # owner: 1000
        # group: 1000
        user::rw-
        user:4294967295:rwx
        group::r--
        mask::rwx
        other::r--

indicating the uid wasn't correctly translated according to the idmapped
mount. The problem is how we currently translate POSIX ACLs. Let's inspect
the callchain in this example:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  -> __vfs_getxattr()
                  |     -> handler->get == ovl_posix_acl_xattr_get()
                  |        -> ovl_xattr_get()
                  |           -> vfs_getxattr()
                  |              -> __vfs_getxattr()
                  |                 -> handler->get() /* lower filesystem callback */
                  |> posix_acl_fix_xattr_to_user()
                     {
                              4 = make_kuid(&init_user_ns, 4);
                              4 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 4);
                              /* FAILURE */
                             -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
                     }

If the user chooses to use option (2) and mounts overlayfs on top of
idmapped mounts inside the container things don't look that much better:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  -> __vfs_getxattr()
                  |     -> handler->get == ovl_posix_acl_xattr_get()
                  |        -> ovl_xattr_get()
                  |           -> vfs_getxattr()
                  |              -> __vfs_getxattr()
                  |                 -> handler->get() /* lower filesystem callback */
                  |> posix_acl_fix_xattr_to_user()
                     {
                              4 = make_kuid(&init_user_ns, 4);
                              4 = mapped_kuid_fs(&init_user_ns, 4);
                              /* FAILURE */
                             -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
                     }

As is easily seen the problem arises because the idmapping of the lower
mount isn't taken into account as all of this happens in do_gexattr(). But
do_getxattr() is always called on an overlayfs mount and inode and thus
cannot possible take the idmapping of the lower layers into account.

This problem is similar for fscaps but there the translation happens as
part of vfs_getxattr() already. Let's walk through an fscaps overlayfs
callchain:

        setcap 'cap_net_raw+ep' /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc

The expected outcome here is that we'll receive the cap_net_raw capability
as we are able to map the uid associated with the fscap to 0 within our
container.  IOW, we want to see 0 as the result of the idmapping
translations.

If the user chooses option (1) we get the following callchain for fscaps:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                   -> vfs_getxattr()
                      -> xattr_getsecurity()
                         -> security_inode_getsecurity()                                       ________________________________
                            -> cap_inode_getsecurity()                                         |                              |
                               {                                                               V                              |
                                        10000000 = make_kuid(0:0:4k /* overlayfs idmapping */, 10000000);                     |
                                        10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000);                  |
                                               /* Expected result is 0 and thus that we own the fscap. */                     |
                                               0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000);            |
                               }                                                                                              |
                               -> vfs_getxattr_alloc()                                                                        |
                                  -> handler->get == ovl_other_xattr_get()                                                    |
                                     -> vfs_getxattr()                                                                        |
                                        -> xattr_getsecurity()                                                                |
                                           -> security_inode_getsecurity()                                                    |
                                              -> cap_inode_getsecurity()                                                      |
                                                 {                                                                            |
                                                                0 = make_kuid(0:0:4k /* lower s_user_ns */, 0);               |
                                                         10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); |
                                                         10000000 = from_kuid(0:0:4k /* overlayfs idmapping */, 10000000);    |
                                                         |____________________________________________________________________|
                                                 }
                                                 -> vfs_getxattr_alloc()
                                                    -> handler->get == /* lower filesystem callback */

And if the user chooses option (2) we get:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                   -> vfs_getxattr()
                      -> xattr_getsecurity()
                         -> security_inode_getsecurity()                                                _______________________________
                            -> cap_inode_getsecurity()                                                  |                             |
                               {                                                                        V                             |
                                       10000000 = make_kuid(0:10000000:65536 /* overlayfs idmapping */, 0);                           |
                                       10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000);                           |
                                               /* Expected result is 0 and thus that we own the fscap. */                             |
                                              0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000);                     |
                               }                                                                                                      |
                               -> vfs_getxattr_alloc()                                                                                |
                                  -> handler->get == ovl_other_xattr_get()                                                            |
                                    |-> vfs_getxattr()                                                                                |
                                        -> xattr_getsecurity()                                                                        |
                                           -> security_inode_getsecurity()                                                            |
                                              -> cap_inode_getsecurity()                                                              |
                                                 {                                                                                    |
                                                                 0 = make_kuid(0:0:4k /* lower s_user_ns */, 0);                      |
                                                          10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0);        |
                                                                 0 = from_kuid(0:10000000:65536 /* overlayfs idmapping */, 10000000); |
                                                                 |____________________________________________________________________|
                                                 }
                                                 -> vfs_getxattr_alloc()
                                                    -> handler->get == /* lower filesystem callback */

We can see how the translation happens correctly in those cases as the
conversion happens within the vfs_getxattr() helper.

For POSIX ACLs we need to do something similar. However, in contrast to
fscaps we cannot apply the fix directly to the kernel internal posix acl
data structure as this would alter the cached values and would also require
a rework of how we currently deal with POSIX ACLs in general which almost
never take the filesystem idmapping into account (the noteable exception
being FUSE but even there the implementation is special) and instead
retrieve the raw values based on the initial idmapping.

The correct values are then generated right before returning to
userspace. The fix for this is to move taking the mount's idmapping into
account directly in vfs_getxattr() instead of having it be part of
posix_acl_fix_xattr_to_user().

To this end we simply move the idmapped mount translation into a separate
step performed in vfs_{g,s}etxattr() instead of in
posix_acl_fix_xattr_{from,to}_user().

To see how this fixes things let's go back to the original example. Assume
the user chose option (1) and mounted overlayfs on top of idmapped mounts
on the host:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  |> __vfs_getxattr()
                  |  |  -> handler->get == ovl_posix_acl_xattr_get()
                  |  |     -> ovl_xattr_get()
                  |  |        -> vfs_getxattr()
                  |  |           |> __vfs_getxattr()
                  |  |           |  -> handler->get() /* lower filesystem callback */
                  |  |           |> posix_acl_getxattr_idmapped_mnt()
                  |  |              {
                  |  |                              4 = make_kuid(&init_user_ns, 4);
                  |  |                       10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
                  |  |                       10000004 = from_kuid(&init_user_ns, 10000004);
                  |  |                       |_______________________
                  |  |              }                               |
                  |  |                                              |
                  |  |> posix_acl_getxattr_idmapped_mnt()           |
                  |     {                                           |
                  |                                                 V
                  |             10000004 = make_kuid(&init_user_ns, 10000004);
                  |             10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
                  |             10000004 = from_kuid(&init_user_ns, 10000004);
                  |     }       |_________________________________________________
                  |                                                              |
                  |                                                              |
                  |> posix_acl_fix_xattr_to_user()                               |
                     {                                                           V
                                 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
                                        /* SUCCESS */
                                        4 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000004);
                     }

And similarly if the user chooses option (1) and mounted overayfs on top of
idmapped mounts inside the container:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  |> __vfs_getxattr()
                  |  |  -> handler->get == ovl_posix_acl_xattr_get()
                  |  |     -> ovl_xattr_get()
                  |  |        -> vfs_getxattr()
                  |  |           |> __vfs_getxattr()
                  |  |           |  -> handler->get() /* lower filesystem callback */
                  |  |           |> posix_acl_getxattr_idmapped_mnt()
                  |  |              {
                  |  |                              4 = make_kuid(&init_user_ns, 4);
                  |  |                       10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
                  |  |                       10000004 = from_kuid(&init_user_ns, 10000004);
                  |  |                       |_______________________
                  |  |              }                               |
                  |  |                                              |
                  |  |> posix_acl_getxattr_idmapped_mnt()           |
                  |     {                                           V
                  |             10000004 = make_kuid(&init_user_ns, 10000004);
                  |             10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
                  |             10000004 = from_kuid(0(&init_user_ns, 10000004);
                  |             |_________________________________________________
                  |     }                                                        |
                  |                                                              |
                  |> posix_acl_fix_xattr_to_user()                               |
                     {                                                           V
                                 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
                                        /* SUCCESS */
                                        4 = from_kuid(0:10000000:65536 /* caller's idmappings */, 10000004);
                     }

The last remaining problem we need to fix here is ovl_get_acl(). During
ovl_permission() overlayfs will call:

        ovl_permission()
        -> generic_permission()
           -> acl_permission_check()
              -> check_acl()
                 -> get_acl()
                    -> inode->i_op->get_acl() == ovl_get_acl()
                        > get_acl() /* on the underlying filesystem)
                          ->inode->i_op->get_acl() == /*lower filesystem callback */
                 -> posix_acl_permission()

passing through the get_acl request to the underlying filesystem. This will
retrieve the acls stored in the lower filesystem without taking the
idmapping of the underlying mount into account as this would mean altering
the cached values for the lower filesystem. The simple solution is to have
ovl_get_acl() simply duplicate the ACLs, update the values according to the
idmapped mount and return it to acl_permission_check() so it can be used in
posix_acl_permission(). Since overlayfs doesn't cache ACLs they'll be
released right after.

Link: https://github.com/brauner/mount-idmapped/issues/9
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: linux-unionfs@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Fixes: bc70682a497c ("ovl: support idmapped layers")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2 years agoMerge branch 'sysctl-data-races'
David S. Miller [Fri, 8 Jul 2022 11:10:34 +0000 (12:10 +0100)]
Merge branch 'sysctl-data-races'

Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_table.

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

The first half of this series changes some proc handlers used in ipv4_table
to use READ_ONCE() and WRITE_ONCE() internally to fix data-races on the
sysctl side.  Then, the second half adds READ_ONCE() to the other readers
of ipv4_table.

Changes:
  v2:
    * Drop some changes that makes backporting difficult
      * First cleanup patch
      * Lockless helpers and .proc_handler changes
    * Drop the tracing part for .sysctl_mem
      * Steve already posted a fix
    * Drop int-to-bool change for cipso
      * Should be posted to net-next later
    * Drop proc_dobool() change
      * Can be included in another series

  v1: https://lore.kernel.org/netdev/20220706052130.16368-1-kuniyu@amazon.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv4: Fix a data-race around sysctl_fib_sync_mem.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:40:03 +0000 (16:40 -0700)]
ipv4: Fix a data-race around sysctl_fib_sync_mem.

While reading sysctl_fib_sync_mem, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid a data-race.

Fixes: 9ab948a91b2c ("ipv4: Allow amount of dirty memory from fib resizing to be controllable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix data-races around sysctl.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:40:02 +0000 (16:40 -0700)]
icmp: Fix data-races around sysctl.

While reading icmp sysctl variables, they can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.

Fixes: 4cdf507d5452 ("icmp: add a global rate limitation")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agocipso: Fix data-races around sysctl.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:40:01 +0000 (16:40 -0700)]
cipso: Fix data-races around sysctl.

While reading cipso sysctl variables, they can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.

Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: Fix data-races around sysctl_mem.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:40:00 +0000 (16:40 -0700)]
net: Fix data-races around sysctl_mem.

While reading .sysctl_mem, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoinetpeer: Fix data-races around sysctl.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:59 +0000 (16:39 -0700)]
inetpeer: Fix data-races around sysctl.

While reading inetpeer sysctl variables, they can be changed
concurrently.  So, we need to add READ_ONCE() to avoid data-races.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_max_orphans.
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:58 +0000 (16:39 -0700)]
tcp: Fix a data-race around sysctl_tcp_max_orphans.

While reading sysctl_tcp_max_orphans, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid a data-race.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_dointvec_jiffies().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:57 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_dointvec_jiffies().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_dointvec_jiffies() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_dointvec_jiffies() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_doulongvec_minmax().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:56 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_doulongvec_minmax().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_doulongvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_doulongvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_douintvec_minmax().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:55 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_douintvec_minmax().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_douintvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_douintvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: 61d9b56a8920 ("sysctl: add unsigned int range support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_dointvec_minmax().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:54 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_dointvec_minmax().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_dointvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_dointvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_douintvec().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:53 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_douintvec().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_douintvec() to use READ_ONCE() and WRITE_ONCE()
internally to fix data-races on the sysctl side.  For now, proc_douintvec()
itself is tolerant to a data-race, but we still need to add annotations on
the other subsystem's side.

Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data races in proc_dointvec().
Kuniyuki Iwashima [Wed, 6 Jul 2022 23:39:52 +0000 (16:39 -0700)]
sysctl: Fix data races in proc_dointvec().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_dointvec() to use READ_ONCE() and WRITE_ONCE()
internally to fix data-races on the sysctl side.  For now, proc_dointvec()
itself is tolerant to a data-race, but we still need to add annotations on
the other subsystem's side.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
Steven Rostedt (Google) [Wed, 6 Jul 2022 14:50:40 +0000 (10:50 -0400)]
net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer

The trace event sock_exceed_buf_limit saves the prot->sysctl_mem pointer
and then dereferences it in the TP_printk() portion. This is unsafe as the
TP_printk() portion is executed at the time the buffer is read. That is,
it can be seconds, minutes, days, months, even years later. If the proto
is freed, then this dereference will can also lead to a kernel crash.

Instead, save the sysctl_mem array into the ring buffer and have the
TP_printk() reference that instead. This is the proper and safe way to
read pointers in trace events.

Link: https://lore.kernel.org/all/20220706052130.16368-12-kuniyu@amazon.com/
Cc: stable@vger.kernel.org
Fixes: 3847ce32aea9f ("core: add tracepoints for queueing skb to rcvbuf")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agox86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
Thadeu Lima de Souza Cascardo [Thu, 7 Jul 2022 16:41:52 +0000 (13:41 -0300)]
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported

There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.

If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.

Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agobpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs
Joanne Koong [Wed, 6 Jul 2022 23:25:47 +0000 (16:25 -0700)]
bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs

Commit 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
added the bpf_dynptr_write() and bpf_dynptr_read() APIs.

However, it will be needed for some dynptr types to pass in flags as
well (e.g. when writing to a skb, the user may like to invalidate the
hash or recompute the checksum).

This patch adds a "u64 flags" arg to the bpf_dynptr_read() and
bpf_dynptr_write() APIs before their UAPI signature freezes where
we then cannot change them anymore with a 5.19.x released kernel.

Fixes: 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220706232547.4016651-1-joannelkoong@gmail.com
2 years agoMAINTAINERS: Remove iommu@lists.linux-foundation.org
Joerg Roedel [Wed, 6 Jul 2022 10:33:31 +0000 (12:33 +0200)]
MAINTAINERS: Remove iommu@lists.linux-foundation.org

The IOMMU mailing list has moved to iommu@lists.linux.dev
and the old list should bounce by now. Remove it from the
MAINTAINERS file.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220706103331.10215-1-joro@8bytes.org
2 years agonet: ocelot: fix wrong time_after usage
Pavel Skripkin [Wed, 6 Jul 2022 13:28:45 +0000 (16:28 +0300)]
net: ocelot: fix wrong time_after usage

Accidentally noticed, that this driver is the only user of
while (time_after(jiffies...)).

It looks like typo, because likely this while loop will finish after 1st
iteration, because time_after() returns true when 1st argument _is after_
2nd one.

There is one possible problem with this poll loop: the scheduler could put
the thread to sleep, and it does not get woken up for
OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
its thing, but you exit the while loop and return -ETIMEDOUT.

Fix it by using sane poll API that avoids all problems described above

Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220706132845.27968-1-paskripkin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'mlx5-fixes-2022-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Fri, 8 Jul 2022 00:44:45 +0000 (17:44 -0700)]
Merge tag 'mlx5-fixes-2022-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-07-06

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Ring the TX doorbell on DMA errors
  net/mlx5e: Fix capability check for updating vnic env counters
  net/mlx5e: CT: Use own workqueue instead of mlx5e priv
  net/mlx5: Lag, correct get the port select mode str
  net/mlx5e: Fix enabling sriov while tc nic rules are offloaded
  net/mlx5e: kTLS, Fix build time constant test in RX
  net/mlx5e: kTLS, Fix build time constant test in TX
  net/mlx5: Lag, decouple FDB selection and shared FDB
  net/mlx5: TC, allow offload from uplink to other PF's VF
====================

Link: https://lore.kernel.org/r/20220706231309.38579-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: ti: am65-cpsw: Fix devlink port register sequence
Siddharth Vadapalli [Wed, 6 Jul 2022 07:02:08 +0000 (12:32 +0530)]
net: ethernet: ti: am65-cpsw: Fix devlink port register sequence

Renaming interfaces using udevd depends on the interface being registered
before its netdev is registered. Otherwise, udevd reads an empty
phys_port_name value, resulting in the interface not being renamed.

Fix this by registering the interface before registering its netdev
by invoking am65_cpsw_nuss_register_devlink() before invoking
register_netdev() for the interface.

Move the function call to devlink_port_type_eth_set(), invoking it after
register_netdev() is invoked, to ensure that netlink notification for the
port state change is generated after the netdev is completely initialized.

Fixes: 58356eb31d60 ("net: ti: am65-cpsw-nuss: Add devlink support")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://lore.kernel.org/r/20220706070208.12207-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: dwc-qos: Disable split header for Tegra194
Jon Hunter [Wed, 6 Jul 2022 08:39:13 +0000 (09:39 +0100)]
net: stmmac: dwc-qos: Disable split header for Tegra194

There is a long-standing issue with the Synopsys DWC Ethernet driver
for Tegra194 where random system crashes have been observed [0]. The
problem occurs when the split header feature is enabled in the stmmac
driver. In the bad case, a larger than expected buffer length is
received and causes the calculation of the total buffer length to
overflow. This results in a very large buffer length that causes the
kernel to crash. Why this larger buffer length is received is not clear,
however, the feedback from the NVIDIA design team is that the split
header feature is not supported for Tegra194. Therefore, disable split
header support for Tegra194 to prevent these random crashes from
occurring.

[0] https://lore.kernel.org/linux-tegra/b0b17697-f23e-8fa5-3757-604a86f3a095@nvidia.com/

Fixes: 67afd6d1cfdf ("net: stmmac: Add Split Header support and enable it in XGMAC cores")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20220706083913.13750-1-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme into block-5.19
Jens Axboe [Thu, 7 Jul 2022 23:38:19 +0000 (17:38 -0600)]
Merge tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme into block-5.19

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.19

 - another bogus identifier quirk (Keith Busch)
 - use struct group in the tracer to avoid a gcc warning (Keith Busch)"

* tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme:
  nvme: use struct group for generic command dwords
  nvme-pci: phison e16 has bogus namespace ids

2 years agoio_uring: explicit sqe padding for ioctl commands
Pavel Begunkov [Thu, 7 Jul 2022 14:00:38 +0000 (15:00 +0100)]
io_uring: explicit sqe padding for ioctl commands

32 bit sqe->cmd_op is an union with 64 bit values. It's always a good
idea to do padding explicitly. Also zero check it in prep, so it can be
used in the future if needed without compatibility concerns.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e6b95a05e970af79000435166185e85b196b2ba2.1657202417.git.asml.silence@gmail.com
[axboe: turn bitwise OR into logical variant]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoi2c: cadence: Unregister the clk notifier in error path
Satish Nagireddy [Tue, 28 Jun 2022 19:12:16 +0000 (12:12 -0700)]
i2c: cadence: Unregister the clk notifier in error path

This patch ensures that the clock notifier is unregistered
when driver probe is returning error.

Fixes: df8eb5691c48 ("i2c: Add driver for Cadence I2C controller")
Signed-off-by: Satish Nagireddy <satish.nagireddy@getcruise.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2 years agoMerge tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Rafael J. Wysocki [Thu, 7 Jul 2022 19:46:05 +0000 (21:46 +0200)]
Merge tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull a devfreq fix for 5.19-rc6 from Chanwoo Choi:

"- Fix exynos-bus NULL pointer dereference by correctly using the local
   generated freq_table to output the debug values instead of using the
   profile freq_table that is not used in the driver."

* tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: exynos-bus: Fix NULL pointer dereference

2 years agoPM / devfreq: exynos-bus: Fix NULL pointer dereference
Christian Marangi [Fri, 1 Jul 2022 13:31:26 +0000 (15:31 +0200)]
PM / devfreq: exynos-bus: Fix NULL pointer dereference

Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: b5d281f6c16d ("PM / devfreq: Rework freq_table to be local to devfreq struct")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2 years agonetfilter: conntrack: fix crash due to confirmed bit load reordering
Florian Westphal [Wed, 6 Jul 2022 14:50:04 +0000 (16:50 +0200)]
netfilter: conntrack: fix crash due to confirmed bit load reordering

Kajetan Puchalski reports crash on ARM, with backtrace of:

__nf_ct_delete_from_lists
nf_ct_delete
early_drop
__nf_conntrack_alloc

Unlike atomic_inc_not_zero, refcount_inc_not_zero is not a full barrier.
conntrack uses SLAB_TYPESAFE_BY_RCU, i.e. it is possible that a 'newly'
allocated object is still in use on another CPU:

CPU1 CPU2
encounter 'ct' during hlist walk
 delete_from_lists
 refcount drops to 0
 kmem_cache_free(ct);
 __nf_conntrack_alloc() // returns same object
refcount_inc_not_zero(ct); /* might fail */

/* If set, ct is public/in the hash table */
test_bit(IPS_CONFIRMED_BIT, &ct->status);

In case CPU1 already set refcount back to 1, refcount_inc_not_zero()
will succeed.

The expected possibilities for a CPU that obtained the object 'ct'
(but no reference so far) are:

1. refcount_inc_not_zero() fails.  CPU2 ignores the object and moves to
   the next entry in the list.  This happens for objects that are about
   to be free'd, that have been free'd, or that have been reallocated
   by __nf_conntrack_alloc(), but where the refcount has not been
   increased back to 1 yet.

2. refcount_inc_not_zero() succeeds. CPU2 checks the CONFIRMED bit
   in ct->status.  If set, the object is public/in the table.

   If not, the object must be skipped; CPU2 calls nf_ct_put() to
   un-do the refcount increment and moves to the next object.

Parallel deletion from the hlists is prevented by a
'test_and_set_bit(IPS_DYING_BIT, &ct->status);' check, i.e. only one
cpu will do the unlink, the other one will only drop its reference count.

Because refcount_inc_not_zero is not a full barrier, CPU2 may try to
delete an object that is not on any list:

1. refcount_inc_not_zero() successful (refcount inited to 1 on other CPU)
2. CONFIRMED test also successful (load was reordered or zeroing
   of ct->status not yet visible)
3. delete_from_lists unlinks entry not on the hlist, because
   IPS_DYING_BIT is 0 (already cleared).

2) is already wrong: CPU2 will handle a partially initited object
that is supposed to be private to CPU1.

Add needed barriers when refcount_inc_not_zero() is successful.

It also inserts a smp_wmb() before the refcount is set to 1 during
allocation.

Because other CPU might still see the object, refcount_set(1)
"resurrects" it, so we need to make sure that other CPUs will also observe
the right content.  In particular, the CONFIRMED bit test must only pass
once the object is fully initialised and either in the hash or about to be
inserted (with locks held to delay possible unlink from early_drop or
gc worker).

I did not change flow_offload_alloc(), as far as I can see it should call
refcount_inc(), not refcount_inc_not_zero(): the ct object is attached to
the skb so its refcount should be >= 1 in all cases.

v2: prefer smp_acquire__after_ctrl_dep to smp_rmb (Will Deacon).
v3: keep smp_acquire__after_ctrl_dep close to refcount_inc_not_zero call
    add comment in nf_conntrack_netlink, no control dependency there
    due to locks.

Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/all/Yr7WTfd6AVTQkLjI@e126311.manchester.arm.com/
Reported-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Diagnosed-by: Will Deacon <will@kernel.org>
Fixes: 719774377622 ("netfilter: conntrack: convert to refcount_t api")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Will Deacon <will@kernel.org>
2 years agobpf: Make sure mac_header was set before using it
Eric Dumazet [Thu, 7 Jul 2022 12:39:00 +0000 (12:39 +0000)]
bpf: Make sure mac_header was set before using it

Classic BPF has a way to load bytes starting from the mac header.

Some skbs do not have a mac header, and skb_mac_header()
in this case is returning a pointer that 65535 bytes after
skb->head.

Existing range check in bpf_internal_load_pointer_neg_helper()
was properly kicking and no illegal access was happening.

New sanity check in skb_mac_header() is firing, so we need
to avoid it.

WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline]
WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
Modules linked in:
CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb9484be #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline]
RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41
RSP: 0018:ffffc9000309f668 EFLAGS: 00010216
RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000
RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003
RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004
R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000
FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
____bpf_skb_load_helper_32 net/core/filter.c:276 [inline]
bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264

Fixes: f9aefd6b2aa3 ("net: warn if mac header was not set")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com
2 years agoMerge tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 7 Jul 2022 17:41:27 +0000 (10:41 -0700)]
Merge tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "A fix for tinyconfig build error, a fix for section mismatch warning,
  and two cleanups of obsolete code"

* tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix section mismatch warning
  LoongArch: Fix build errors for tinyconfig
  LoongArch: Remove obsolete mentions of vcsr
  LoongArch: Drop these obsolete selects in Kconfig

2 years agoMerge tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 7 Jul 2022 17:08:20 +0000 (10:08 -0700)]
Merge tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf, netfilter, can, and bluetooth.

  Current release - regressions:

   - bluetooth: fix deadlock on hci_power_on_sync

  Previous releases - regressions:

   - sched: act_police: allow 'continue' action offload

   - eth: usbnet: fix memory leak in error case

   - eth: ibmvnic: properly dispose of all skbs during a failover

  Previous releases - always broken:

   - bpf:
       - fix insufficient bounds propagation from
         adjust_scalar_min_max_vals
       - clear page contiguity bit when unmapping pool

   - netfilter: nft_set_pipapo: release elements in clone from
     abort path

   - mptcp: netlink: issue MP_PRIO signals from userspace PMs

   - can:
       - rcar_canfd: fix data transmission failed on R-Car V3U
       - gs_usb: gs_usb_open/close(): fix memory leak

  Misc:

   - add Wenjia as SMC maintainer"

* tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  wireguard: Kconfig: select CRYPTO_CHACHA_S390
  crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
  wireguard: selftests: use microvm on x86
  wireguard: selftests: always call kernel makefile
  wireguard: selftests: use virt machine on m68k
  wireguard: selftests: set fake real time in init
  r8169: fix accessing unset transport header
  net: rose: fix UAF bug caused by rose_t0timer_expiry
  usbnet: fix memory leak in error case
  Revert "tls: rx: move counting TlsDecryptErrors for sync"
  mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
  mptcp: fix local endpoint accounting
  selftests: mptcp: userspace PM support for MP_PRIO signals
  mptcp: netlink: issue MP_PRIO signals from userspace PMs
  mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
  mptcp: Avoid acquiring PM lock for subflow priority changes
  mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
  net/mlx5e: Fix matchall police parameters validation
  net/sched: act_police: allow 'continue' action offload
  net: lan966x: hardcode the number of external ports
  ...

2 years agoMerge tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 7 Jul 2022 17:02:38 +0000 (10:02 -0700)]
Merge tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

 - Tag Intel pin control as supported in MAINTAINERS

 - Fix a NULL pointer exception in the Aspeed driver

 - Correct some NAND functions in the Sunxi A83T driver

 - Use the right offset for some Sunxi pins

 - Fix a zero base offset in the Freescale (NXP) i.MX93

 - Fix the IRQ support in the STM32 driver

* tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: stm32: fix optional IRQ support to gpios
  pinctrl: imx: Add the zero base flag for imx93
  pinctrl: sunxi: sunxi_pconf_set: use correct offset
  pinctrl: sunxi: a83t: Fix NAND function name for some pins
  pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
  MAINTAINERS: Update Intel pin control to Supported

2 years agosignal handling: don't use BUG_ON() for debugging
Linus Torvalds [Wed, 6 Jul 2022 19:20:59 +0000 (12:20 -0700)]
signal handling: don't use BUG_ON() for debugging

These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.

It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".

Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running.  Instead of making the system basically unusuable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoASoC: Intel: Skylake: Correct the handling of fmt_config flexible array
Peter Ujfalusi [Thu, 30 Jun 2022 06:56:38 +0000 (09:56 +0300)]
ASoC: Intel: Skylake: Correct the handling of fmt_config flexible array

The struct nhlt_format's fmt_config is a flexible array, it must not be
used as normal array.
When moving to the next nhlt_fmt_cfg we need to take into account the data
behind the ->config.caps (indicated by ->config.size).

The logic of the code also changed: it is no longer saves the _last_
fmt_cfg for all found rates.

Fixes: bc2bd45b1f7f3 ("ASoC: Intel: Skylake: Parse nhlt and register clock device")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20220630065638.11183-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: Intel: Skylake: Correct the ssp rate discovery in skl_get_ssp_clks()
Peter Ujfalusi [Thu, 30 Jun 2022 06:56:37 +0000 (09:56 +0300)]
ASoC: Intel: Skylake: Correct the ssp rate discovery in skl_get_ssp_clks()

The present flag is only set once when one rate has been found to be saved.
This will effectively going to ignore any rate discovered at later time and
based on the code, this is not the intention.

Fixes: bc2bd45b1f7f3 ("ASoC: Intel: Skylake: Parse nhlt and register clock device")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20220630065638.11183-2-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: rt5640: Fix the wrong state of JD1 and JD2
Oder Chiou [Tue, 5 Jul 2022 10:11:33 +0000 (18:11 +0800)]
ASoC: rt5640: Fix the wrong state of JD1 and JD2

The patch fixes the wrong state of JD1 and JD2 while the bst1 or bst2 is
power on in the HDA JD using.

Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
Reported-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/20220705101134.16792-1-oder_chiou@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: Intel: sof_rt5682: fix out-of-bounds array access
Brent Lu [Fri, 1 Jul 2022 14:15:17 +0000 (22:15 +0800)]
ASoC: Intel: sof_rt5682: fix out-of-bounds array access

Starting from ADL platform we have four HDMI PCM devices which exceeds
the size of sof_hdmi array. Since each sof_hdmi_pcm structure
represents one HDMI PCM device, we remove the sof_hdmi array and add a
new member hdmi_jack to the sof_hdmi_pcm structure to fix the
out-of-bounds problem.

Signed-off-by: Brent Lu <brent.lu@intel.com>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20220701141517.264070-1-brent.lu@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: qdsp6: fix potential memory leak in q6apm_get_audioreach_graph()
Jianglei Nie [Wed, 29 Jun 2022 18:25:20 +0000 (02:25 +0800)]
ASoC: qdsp6: fix potential memory leak in q6apm_get_audioreach_graph()

q6apm_get_audioreach_graph() allocates a memory chunk for graph->graph
with audioreach_alloc_graph_pkt(). When idr_alloc() fails, graph->graph
is not released, which will lead to a memory leak.

We can release the graph->graph with kfree() when idr_alloc() fails to
fix the memory leak.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220629182520.2164409-1-niejianglei2021@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: tas2764: Fix amp gain register offset & default
Hector Martin [Thu, 30 Jun 2022 07:51:35 +0000 (09:51 +0200)]
ASoC: tas2764: Fix amp gain register offset & default

The register default is 0x28 per the datasheet, and the amp gain field
is supposed to be shifted left by one. With the wrong default, the ALSA
controls lie about the power-up state. With the wrong shift, we get only
half the gain we expect.

Signed-off-by: Hector Martin <marcan@marcan.st>
Fixes: 827ed8a0fa50 ("ASoC: tas2764: Add the driver for the TAS2764")
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Link: https://lore.kernel.org/r/20220630075135.2221-4-povik+lin@cutebit.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: tas2764: Correct playback volume range
Hector Martin [Thu, 30 Jun 2022 07:51:34 +0000 (09:51 +0200)]
ASoC: tas2764: Correct playback volume range

DVC value 0xc8 is -100dB and 0xc9 is mute; this needs to map to
-100.5dB as far as the dB scale is concerned. Fix that and enable
the mute flag, so alsamixer correctly shows the control as
<0 dB .. -100 dB, mute>.

Signed-off-by: Hector Martin <marcan@marcan.st>
Fixes: 827ed8a0fa50 ("ASoC: tas2764: Add the driver for the TAS2764")
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Link: https://lore.kernel.org/r/20220630075135.2221-3-povik+lin@cutebit.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: tas2764: Fix and extend FSYNC polarity handling
Martin Povišer [Thu, 30 Jun 2022 07:51:33 +0000 (09:51 +0200)]
ASoC: tas2764: Fix and extend FSYNC polarity handling

Fix setting of FSYNC polarity in case of LEFT_J and DSP_A/B formats.
Do NOT set the SCFG field as was previously done, because that is not
correct and is also in conflict with the "ASI1 Source" control which
sets the same SCFG field!

Also add support for explicit polarity inversion.

Fixes: 827ed8a0fa50 ("ASoC: tas2764: Add the driver for the TAS2764")
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Link: https://lore.kernel.org/r/20220630075135.2221-2-povik+lin@cutebit.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: tas2764: Add post reset delays
Martin Povišer [Thu, 30 Jun 2022 07:51:32 +0000 (09:51 +0200)]
ASoC: tas2764: Add post reset delays

Make sure there is at least 1 ms delay from reset to first command as
is specified in the datasheet. This is a fix similar to commit
307f31452078 ("ASoC: tas2770: Insert post reset delay").

Fixes: 827ed8a0fa50 ("ASoC: tas2764: Add the driver for the TAS2764")
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Link: https://lore.kernel.org/r/20220630075135.2221-1-povik+lin@cutebit.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: dt-bindings: Fix description for msm8916
Bryan O'Donoghue [Wed, 29 Jun 2022 11:40:12 +0000 (12:40 +0100)]
ASoC: dt-bindings: Fix description for msm8916

For the existing msm8916 bindings the minimum reg/reg-names is 1 not 2.
Similarly the minimum interrupt/interrupt-names is 1 not 2.

Fixes: f3fc4fbfa2d2 ("ASoC: dt-bindings: Add SC7280 lpass cpu bindings")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220629114012.3282945-1-bryan.odonoghue@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: doc: Capitalize RESET line name
Marek Vasut [Tue, 28 Jun 2022 16:58:40 +0000 (18:58 +0200)]
ASoC: doc: Capitalize RESET line name

Make sure all AC97 interface lines are spelled in capitals,
to avoid confusing readers about where the 5th line is.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220628165840.152235-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: arizona: Update arizona_aif_cfg_changed to use RX_BCLK_RATE
Charles Keepax [Tue, 28 Jun 2022 15:34:09 +0000 (16:34 +0100)]
ASoC: arizona: Update arizona_aif_cfg_changed to use RX_BCLK_RATE

Currently the function arizona_aif_cfg_changed uses the TX_BCLK_RATE,
however this register is not used on wm8998. This was not noticed as
previously snd_soc_component_read did not print an error message.
However, now the log gets filled with error messages, further more the
test for if the LRCLK changed will return spurious results.

Update the code to use the RX_BCLK_RATE register, the LRCLK parameters
are written to both registers and the RX_BCLK_RATE register is used
across all Arizona devices.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220628153409.3266932-4-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: cs47l92: Fix event generation for OUT1 demux
Charles Keepax [Tue, 28 Jun 2022 15:34:08 +0000 (16:34 +0100)]
ASoC: cs47l92: Fix event generation for OUT1 demux

cs47l92_put_demux returns the value of snd_soc_dapm_mux_update_power,
which returns a 1 if a path was found for the kcontrol. This is
obviously different to the expected return a 1 if the control
was updated value. This results in spurious notifications to
user-space. Update the handling to only return a 1 when the value is
changed.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220628153409.3266932-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: wm8998: Fix event generation for input mux
Charles Keepax [Tue, 28 Jun 2022 15:34:07 +0000 (16:34 +0100)]
ASoC: wm8998: Fix event generation for input mux

wm8998_inmux_put returns the value of snd_soc_dapm_mux_update_power,
which returns a 1 if a path was found for the kcontrol. This is
obviously different to the expected return a 1 if the control
was updated value. This results in spurious notifications to
user-space. Update the handling to only return a 1 when the value is
changed.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220628153409.3266932-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: wm5102: Fix event generation for output compensation
Charles Keepax [Tue, 28 Jun 2022 15:34:06 +0000 (16:34 +0100)]
ASoC: wm5102: Fix event generation for output compensation

The output compensation controls always returns zero regardless of if
the control value was updated. This results in missing notifications
to user-space of the control change. Update the handling to return 1
when the value is changed.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220628153409.3266932-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: wcd9335: Use int array instead of bitmask for TX mixers
Yassine Oudjana [Wed, 22 Jun 2022 06:17:45 +0000 (10:17 +0400)]
ASoC: wcd9335: Use int array instead of bitmask for TX mixers

Currently slim_tx_mixer_get reports all TX mixers as enabled when
at least one is, due to it reading the entire tx_port_value bitmask
without testing the specific bit corresponding to a TX port.
Furthermore, using the same bitmask for all capture DAIs makes
setting one mixer affect them all. To prevent this, and since
the SLIM TX muxes effectively only connect to one of the mixers
at a time, turn tx_port_value into an int array storing the DAI
index each of the ports is connected to.

Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com>
Link: https://lore.kernel.org/r/20220622061745.35399-1-y.oudjana@protonmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: tlv320adcx140: Fix tx_mask check
Sascha Hauer [Fri, 24 Jun 2022 10:57:16 +0000 (12:57 +0200)]
ASoC: tlv320adcx140: Fix tx_mask check

The tx_mask check doesn't reflect what the driver and the chip support.

The check currently checks for exactly two slots being enabled. The
tlv320adcx140 supports anything between one and eight channels, so relax
the check accordingly.

The tlv320adcx140 supports arbitrary tx_mask settings, but the driver
currently only supports adjacent slots beginning with the first slot,
so extend the check to check that the first slot is being used and that
there are no holes in the tx_mask.

Leave a comment to make it's the driver that limits the tx_mask
settings, not the chip itself.

While at it remove the set-but-unused struct adcx140p_priv::tdm_delay
field.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20220624105716.2579539-1-s.hauer@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: max98396: Fix register access for PCM format settings
Daniel Mack [Fri, 24 Jun 2022 10:47:10 +0000 (12:47 +0200)]
ASoC: max98396: Fix register access for PCM format settings

max98396_dai_set_fmt() modifes register 2041 and touches bits in the mask
0x3a. Make sure to use the right mask for that operation.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Link: https://lore.kernel.org/r/20220624104712.1934484-7-daniel@zonque.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: ti: omap-mcbsp: duplicate sysfs error
David Owens [Mon, 20 Jun 2022 18:37:43 +0000 (13:37 -0500)]
ASoC: ti: omap-mcbsp: duplicate sysfs error

Convert to managed versions of sysfs and clk allocation to simplify
unbinding and error handling in probe.  Managed sysfs node
creation specifically addresses the following error seen the second time
probe is attempted after sdma_pcm_platform_register() previously requsted
probe deferral:

sysfs: cannot create duplicate filename '/devices/platform/68000000.ocp/49022000.mcbsp/max_tx_thres'

Signed-off-by: David Owens <dowens@precisionplanting.com>
Link: https://lore.kernel.org/r/20220620183744.3176557-1-dowens@precisionplanting.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: audio_graph_card2: Fix port numbers in example
Sascha Hauer [Fri, 24 Jun 2022 09:26:01 +0000 (11:26 +0200)]
ASoC: audio_graph_card2: Fix port numbers in example

The example in audio-graph-card2.c has multiple nodes with the same name
in it. Change the port numbers to get different names.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20220624092601.2445224-1-s.hauer@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoASoC: sgtl5000: Fix noise on shutdown/remove
Francesco Dolcini [Fri, 24 Jun 2022 10:13:01 +0000 (12:13 +0200)]
ASoC: sgtl5000: Fix noise on shutdown/remove

Put the SGTL5000 in a silent/safe state on shutdown/remove, this is
required since the SGTL5000 produces a constant noise on its output
after it is configured and its clock is removed. Without this change
this is happening every time the module is unbound/removed or from
reboot till the clock is enabled again.

The issue was experienced on both a Toradex Colibri/Apalis iMX6, but can
be easily reproduced everywhere just playing something on the codec and
after that removing/unbinding the driver.

Fixes: 9b34e6cc3bc2 ("ASoC: Add Freescale SGTL5000 codec support")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Fabio Estevam <festevam@denx.de>
Link: https://lore.kernel.org/r/20220624101301.441314-1-francesco.dolcini@toradex.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 years agoima: Fix a potential integer overflow in ima_appraise_measurement
Huaxin Lu [Tue, 5 Jul 2022 05:14:17 +0000 (13:14 +0800)]
ima: Fix a potential integer overflow in ima_appraise_measurement

When the ima-modsig is enabled, the rc passed to evm_verifyxattr() may be
negative, which may cause the integer overflow problem.

Fixes: 39b07096364a ("ima: Implement support for module-style appended signatures")
Signed-off-by: Huaxin Lu <luhuaxin1@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2 years agox86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
Peter Zijlstra [Wed, 6 Jul 2022 13:33:30 +0000 (15:33 +0200)]
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry

Commit

  ee774dac0da1 ("x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()")

moved PUSH_AND_CLEAR_REGS out of error_entry, into its own function, in
part to avoid calling error_entry() for XenPV.

However, commit

  7c81c0c9210c ("x86/entry: Avoid very early RET")

had to change that because the 'ret' was too early and moved it into
idtentry, bloating the text size, since idtentry is expanded for every
exception vector.

However, with the advent of xen_error_entry() in commit

  d147553b64bad ("x86/xen: Add UNTRAIN_RET")

it became possible to remove PUSH_AND_CLEAR_REGS from idtentry, back
into *error_entry().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agox86/ibt, objtool: Don't discard text references from tracepoint section
Peter Zijlstra [Tue, 28 Jun 2022 10:57:42 +0000 (12:57 +0200)]
x86/ibt, objtool: Don't discard text references from tracepoint section

On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:

> # ./ftracetest
> === Ftrace unit tests ===
> [1] Basic trace file check      [PASS]
> [2] Basic test for tracers      [PASS]
> [3] Basic trace clock test      [PASS]
> [4] Basic event tracing check   [PASS]
> [5] Change the ringbuffer size  [PASS]
> [6] Snapshot and tracing setting        [PASS]
> [7] trace_pipe and trace_marker [PASS]
> [8] Test ftrace direct functions against tracers        [UNRESOLVED]
> [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
> [10] Generic dynamic event - add/remove eprobe events   [FAIL]
> [11] Generic dynamic event - add/remove kprobe events
>
> It 100% reproduced in step 11 and then missing ENDBR BUG generated:
> "
> [ 9332.752836] mmiotrace: enabled CPU7.
> [ 9332.788612] mmiotrace: disabled.
> [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0

It turns out that while syscall_regfunc() does have an ENDBR when
generated, it gets sealed by objtool's .ibt_endbr_seal list.

Since the only text references to this function:

  $ git grep syscall_regfunc
  include/linux/tracepoint.h:extern int syscall_regfunc(void);
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  kernel/tracepoint.c:int syscall_regfunc(void)

appear in the __tracepoint section which is excluded by objtool.

Fixes: 3c6f9f77e618 ("objtool: Rework ibt and extricate from stack validation")
Reported-by: Pengfei Xu <pengfei.xu@intel.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yrrepdaow4F5kqG0@hirez.programming.kicks-ass.net
2 years agox86/bugs: Add Cannon lake to RETBleed affected CPU list
Pawan Gupta [Wed, 6 Jul 2022 22:01:15 +0000 (15:01 -0700)]
x86/bugs: Add Cannon lake to RETBleed affected CPU list

Cannon lake is also affected by RETBleed, add it to the list.

Fixes: 6ad0ad2bf8a6 ("x86/bugs: Report Intel retbleed vulnerability")
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agogpiolib: cdev: fix null pointer dereference in linereq_free()
Kent Gibson [Wed, 6 Jul 2022 08:45:07 +0000 (16:45 +0800)]
gpiolib: cdev: fix null pointer dereference in linereq_free()

Fix a kernel NULL pointer dereference reported by gpio kselftests.

linereq_free() can be called as part of the cleanup of a failed request,
at which time the desc for a line may not have been determined, so it
is unsafe to dereference without a check.

Add a check prior to dereferencing the line desc.

Fixes: 2068339a6c35 ("gpiolib: cdev: Add hardware timestamp clock type")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agoLoongArch: Fix section mismatch warning
Tiezhu Yang [Mon, 27 Jun 2022 06:57:35 +0000 (14:57 +0800)]
LoongArch: Fix section mismatch warning

init_numa_memory() is annotated __init and not used by any module,
thus don't export it.

Remove not needed EXPORT_SYMBOL for init_numa_memory() to fix the
following section mismatch warning:

  MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference
from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()
The symbol init_numa_memory is exported and annotated __init
Fix this by removing the __init annotation of init_numa_memory or drop the export.

This is build on Linux 5.19-rc4.

Fixes: d4b6f1562a3c ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix build errors for tinyconfig
Huacai Chen [Wed, 6 Jul 2022 03:03:09 +0000 (11:03 +0800)]
LoongArch: Fix build errors for tinyconfig

Building loongarch:tinyconfig fails with the following error.

./arch/loongarch/include/asm/page.h: In function 'pfn_valid':
./arch/loongarch/include/asm/page.h:42:32: error: 'PHYS_OFFSET' undeclared

Add the missing include file and fix succeeding vdso errors.

Fixes: 09cfefb7fa70 ("LoongArch: Add memory management")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Remove obsolete mentions of vcsr
Qi Hu [Wed, 6 Jul 2022 11:29:37 +0000 (19:29 +0800)]
LoongArch: Remove obsolete mentions of vcsr

The `vcsr` only exists in the old hardware design, it isn't used in any
shipped hardware from Loongson-3A5000 on. Both scalar FP and LSX/LASX
instructions use the `fcsr` as their control and status registers now.
For example, the RM control bit in fcsr0 is shared by FP, LSX and LASX
instructions.

Particularly, fcsr16 to fcsr31 are reserved for LSX/LASX now, access to
these registers has no visible effect if LSX/LASX is enabled, and will
cause SXD/ASXD exceptions if LSX/LASX is not enabled.

So, mentions of vcsr are obsolete in the first place (it was just used
for debugging), let's remove them.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Drop these obsolete selects in Kconfig
Lukas Bulwahn [Tue, 5 Jul 2022 07:34:05 +0000 (09:34 +0200)]
LoongArch: Drop these obsolete selects in Kconfig

Commit fa96b57c1490 ("LoongArch: Add build infrastructure") adds the new
file arch/loongarch/Kconfig.

As the work on LoongArch was probably quite some time under development,
various config symbols have changed and disappeared from the time of
initial writing of the Kconfig file and its inclusion in the repository.

The following four commits:

  commit c126a53c2760 ("arch: remove GENERIC_FIND_FIRST_BIT entirely")
  commit 140c8180eb7c ("arch: remove HAVE_COPY_THREAD_TLS")
  commit aca52c398389 ("mm: remove CONFIG_HAVE_MEMBLOCK")
  commit 3f08a302f533 ("mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option")

remove the mentioned config symbol, and enable the intended setup by
default without configuration.

Drop these obsolete selects in loongarch's Kconfig.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agofbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()

Use the fbcon_info_from_console() wrapper which was added to kernel
v5.19 with commit 409d6c95f9c6 ("fbcon: Introduce wrapper for console->fb_info lookup").

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2 years agofbmem: Check virtual screen sizes in fb_set_var()
Helge Deller [Wed, 29 Jun 2022 13:53:55 +0000 (15:53 +0200)]
fbmem: Check virtual screen sizes in fb_set_var()

Verify that the fbdev or drm driver correctly adjusted the virtual
screen sizes. On failure report the failing driver and reject the screen
size change.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v5.4+
2 years agodrm/ssd130x: Fix pre-charge period setting
Ezequiel Garcia [Wed, 6 Jul 2022 18:41:33 +0000 (15:41 -0300)]
drm/ssd130x: Fix pre-charge period setting

Fix small typo which causes the mask for the 'precharge1' setting
to be used with the 'precharge2' value.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220706184133.210888-1-ezequiel@vanguardiasur.com.ar
2 years agofbcon: Prevent that screen size is smaller than font size
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Prevent that screen size is smaller than font size

We need to prevent that users configure a screen size which is smaller than the
currently selected font size. Otherwise rendering chars on the screen will
access memory outside the graphics memory region.

This patch adds a new function fbcon_modechange_possible() which
implements this check and which later may be extended with other checks
if necessary.  The new function is called from the FBIOPUT_VSCREENINFO
ioctl handler in fbmem.c, which will return -EINVAL if userspace asked
for a too small screen size.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v5.4+
2 years agofbcon: Disallow setting font bigger than screen size
Helge Deller [Sat, 25 Jun 2022 10:56:49 +0000 (12:56 +0200)]
fbcon: Disallow setting font bigger than screen size

Prevent that users set a font size which is bigger than the physical screen.
It's unlikely this may happen (because screens are usually much larger than the
fonts and each font char is limited to 32x32 pixels), but it may happen on
smaller screens/LCD displays.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v4.14+
2 years agodma-buf: Fix one use-after-free of fence
xinhui pan [Thu, 7 Jul 2022 08:02:41 +0000 (16:02 +0800)]
dma-buf: Fix one use-after-free of fence

Need get the new fence when we replace the old one.

Fixes: 047a1b877ed48 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707080241.20060-1-xinhui.pan@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2 years agodrm/i915: Fix vm use-after-free in vma destruction
Thomas Hellström [Mon, 20 Jun 2022 12:36:59 +0000 (14:36 +0200)]
drm/i915: Fix vm use-after-free in vma destruction

In vma destruction, the following race may occur:

Thread 1:        Thread 2:
i915_vma_destroy();

  ...
  list_del_init(vma->vm_link);
  ...
  mutex_unlock(vma->vm->mutex);
  __i915_vm_release();
release_references();

And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.

However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agodrm/i915/guc: ADL-N should use the same GuC FW as ADL-S
Daniele Ceraolo Spurio [Tue, 21 Jun 2022 23:30:05 +0000 (16:30 -0700)]
drm/i915/guc: ADL-N should use the same GuC FW as ADL-S

The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.

Fixes: 7e28d0b26759 ("drm/i915/adl-n: Enable ADL-N platform")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 971e4a9781742aaad1587e25fd5582b2dd595ef8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agodrm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
Hangyu Hua [Fri, 24 Jun 2022 13:04:06 +0000 (06:04 -0700)]
drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()

If drm_connector_init fails, intel_connector_free will be called to take
care of proper free. So it is necessary to drop the refcount of port
before intel_connector_free.

Fixes: 091a4f91942a ("drm/i915: Handle drm-layer errors in intel_dp_add_mst_connector")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624130406.17996-1-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit cea9ed611e85d36a05db52b6457bf584b7d969e2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agoMerge branch 'wireguard-patches-for-5-19-rc6'
Jakub Kicinski [Thu, 7 Jul 2022 03:04:09 +0000 (20:04 -0700)]
Merge branch 'wireguard-patches-for-5-19-rc6'

Jason A. Donenfeld says:

====================
wireguard patches for 5.19-rc6

1) A few small fixups to the selftests, per usual. Of particular note is
   a fix for a test flake that occurred on especially fast systems that
   boot in less than a second.

2) An addition during this cycle of some s390 crypto interacted with the
   way wireguard selects dependencies, resulting in linker errors
   reported by the kernel test robot. So Vladis sent in a patch for
   that, which also required a small preparatory fix moving some Kconfig
   symbols around.
====================

Link: https://lore.kernel.org/r/20220707003157.526645-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agowireguard: Kconfig: select CRYPTO_CHACHA_S390
Vladis Dronov [Thu, 7 Jul 2022 00:31:57 +0000 (02:31 +0200)]
wireguard: Kconfig: select CRYPTO_CHACHA_S390

Select the new implementation of CHACHA20 for S390 when available.
It is faster than the generic software implementation, but also prevents
some linker errors in certain situations.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-kernel/202207030630.6SZVkrWf-lkp@intel.com/
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocrypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
Jason A. Donenfeld [Thu, 7 Jul 2022 00:31:56 +0000 (02:31 +0200)]
crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations

Various accelerated software implementation Kconfig values for S390 were
mistakenly placed into drivers/crypto/Kconfig, even though they're
mainly just SIMD code and live in arch/s390/crypto/ like usual. This
gives them the very unusual dependency on CRYPTO_HW, which leads to
problems elsewhere.

This patch fixes the issue by moving the Kconfig values for non-hardware
drivers into the usual place in crypto/Kconfig.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agowireguard: selftests: use microvm on x86
Jason A. Donenfeld [Thu, 7 Jul 2022 00:31:55 +0000 (02:31 +0200)]
wireguard: selftests: use microvm on x86

This makes for faster tests, faster compile time, and allows us to ditch
ACPI finally.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agowireguard: selftests: always call kernel makefile
Jason A. Donenfeld [Thu, 7 Jul 2022 00:31:54 +0000 (02:31 +0200)]
wireguard: selftests: always call kernel makefile

These selftests are used for much more extensive changes than just the
wireguard source files. So always call the kernel's build file, which
will do something or nothing after checking the whole tree, per usual.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agowireguard: selftests: use virt machine on m68k
Jason A. Donenfeld [Thu, 7 Jul 2022 00:31:53 +0000 (02:31 +0200)]
wireguard: selftests: use virt machine on m68k

This should be a bit more stable hopefully.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agowireguard: selftests: set fake real time in init
Jason A. Donenfeld [Thu, 7 Jul 2022 00:31:52 +0000 (02:31 +0200)]
wireguard: selftests: set fake real time in init

Not all platforms have an RTC, and rather than trying to force one into
each, it's much easier to just set a fixed time. This is necessary
because WireGuard's latest handshakes parameter is returned in wallclock
time, and if the system time isn't set, and the system is really fast,
then this returns 0, which trips the test.

Turning this on requires setting CONFIG_COMPAT_32BIT_TIME=y, as musl
doesn't support settimeofday without it.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agor8169: fix accessing unset transport header
Heiner Kallweit [Tue, 5 Jul 2022 19:15:22 +0000 (21:15 +0200)]
r8169: fix accessing unset transport header

66e4c8d95008 ("net: warn if transport header was not set") added
a check that triggers a warning in r8169, see [0].

The commit referenced in the Fixes tag refers to the change from
which the patch applies cleanly, there's nothing wrong with this
commit. It seems the actual issue (not bug, because the warning
is harmless here) was introduced with bdfa4ed68187
("r8169: use Giant Send").

[0] https://bugzilla.kernel.org/show_bug.cgi?id=216157

Fixes: 8d520b4de3ed ("r8169: work around RTL8125 UDP hw bug")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Tested-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/1b2c2b29-3dc0-f7b6-5694-97ec526d51a0@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: rose: fix UAF bug caused by rose_t0timer_expiry
Duoming Zhou [Tue, 5 Jul 2022 12:56:10 +0000 (20:56 +0800)]
net: rose: fix UAF bug caused by rose_t0timer_expiry

There are UAF bugs caused by rose_t0timer_expiry(). The
root cause is that del_timer() could not stop the timer
handler that is running and there is no synchronization.
One of the race conditions is shown below:

    (thread 1)             |        (thread 2)
                           | rose_device_event
                           |   rose_rt_device_down
                           |     rose_remove_neigh
rose_t0timer_expiry        |       rose_stop_t0timer(rose_neigh)
  ...                      |         del_timer(&neigh->t0timer)
                           |         kfree(rose_neigh) //[1]FREE
  neigh->dce_mode //[2]USE |

The rose_neigh is deallocated in position [1] and use in
position [2].

The crash trace triggered by POC is like below:

BUG: KASAN: use-after-free in expire_timers+0x144/0x320
Write of size 8 at addr ffff888009b19658 by task swapper/0/0
...
Call Trace:
 <IRQ>
 dump_stack_lvl+0xbf/0xee
 print_address_description+0x7b/0x440
 print_report+0x101/0x230
 ? expire_timers+0x144/0x320
 kasan_report+0xed/0x120
 ? expire_timers+0x144/0x320
 expire_timers+0x144/0x320
 __run_timers+0x3ff/0x4d0
 run_timer_softirq+0x41/0x80
 __do_softirq+0x233/0x544
 ...

This patch changes rose_stop_ftimer() and rose_stop_t0timer()
in rose_remove_neigh() to del_timer_sync() in order that the
timer handler could be finished before the resources such as
rose_neigh and so on are deallocated. As a result, the UAF
bugs could be mitigated.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220705125610.77971-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoima: fix violation measurement list record
Mimi Zohar [Thu, 30 Jun 2022 15:23:38 +0000 (11:23 -0400)]
ima: fix violation measurement list record

Although the violation digest in the IMA measurement list is always
zeroes, the size of the digest should be based on the hash algorithm.
Until recently the hash algorithm was hard coded to sha1.  Fix the
violation digest size included in the IMA measurement list.

This is just a cosmetic change which should not affect attestation.

Reported-by: Stefan Berger <stefanb@linux.ibm.com>
Fixes: 09091c44cb73 ("ima: use IMA default hash algorithm for integrity violations")
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2 years agodrm/amdgpu/display: disable prefer_shadow for generic fb helpers
Alex Deucher [Tue, 21 Jun 2022 14:10:37 +0000 (10:10 -0400)]
drm/amdgpu/display: disable prefer_shadow for generic fb helpers

Seems to break hibernation.  Disable for now until we can root
cause it.

Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216119
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 years agodrm/amdgpu: keep fbdev buffers pinned during suspend
Alex Deucher [Tue, 21 Jun 2022 14:04:55 +0000 (10:04 -0400)]
drm/amdgpu: keep fbdev buffers pinned during suspend

Was dropped when we converted to the generic helpers.

Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 years agousbnet: fix memory leak in error case
Oliver Neukum [Tue, 5 Jul 2022 12:53:51 +0000 (14:53 +0200)]
usbnet: fix memory leak in error case

usbnet_write_cmd_async() mixed up which buffers
need to be freed in which error case.

v2: add Fixes tag
v3: fix uninitialized buf pointer

Fixes: 877bd862f32b8 ("usbnet: introduce usbnet 3 command helpers")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20220705125351.17309-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This page took 0.14143 seconds and 4 git commands to generate.