Git Repo - linux.git/log

ice: Ignore error message when setting same promiscuous mode

Commit 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
introduced new checks when setting/clearing promiscuous mode. But if the
requested promiscuous mode setting already exists, an -EEXIST error
message would be printed. This is incorrect because promiscuous mode is
either on/off and shouldn't print an error when the requested
configuration is already set.

This can happen when removing a bridge with two bonded interfaces and
promiscuous most isn't fully cleared from VLAN VSI in hardware.

Fix this by ignoring cases where requested promiscuous mode exists.

Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
Signed-off-by: Benjamin Mikailenko <[email protected]>
Signed-off-by: Grzegorz Siwik <[email protected]>
Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: Fix clearing of promisc mode with bridge over bond

When at least two interfaces are bonded and a bridge is enabled on the
bond, an error can occur when the bridge is removed and re-added. The
reason for the error is because promiscuous mode was not fully cleared from
the VLAN VSI in the hardware. With this change, promiscuous mode is
properly removed when the bridge disconnects from bonding.

[ 1033.676359] bond1: link status definitely down for interface enp95s0f0, disabling it
[ 1033.676366] bond1: making interface enp175s0f0 the new active one
[ 1033.676369] device enp95s0f0 left promiscuous mode
[ 1033.676522] device enp175s0f0 entered promiscuous mode
[ 1033.676901] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6
[ 1041.795662] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6
[ 1041.944826] bond1: link status definitely down for interface enp175s0f0, disabling it
[ 1041.944874] device enp175s0f0 left promiscuous mode
[ 1041.944918] bond1: now running without any active interface!

Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations")
Co-developed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Grzegorz Siwik <[email protected]>
Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
Tested-by: Jaroslav Pulchart <[email protected]>
Tested-by: Igor Raits <[email protected]>
Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: Ignore EEXIST when setting promisc mode

Ignore EEXIST error when setting promiscuous mode.
This fix is needed because the driver could set promiscuous mode
when it still has not cleared properly.
Promiscuous mode could be set only once, so setting it second
time will be rejected.

Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode")
Signed-off-by: Grzegorz Siwik <[email protected]>
Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
Tested-by: Jaroslav Pulchart <[email protected]>
Tested-by: Igor Raits <[email protected]>
Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: Fix double VLAN error when entering promisc mode

Avoid enabling or disabling VLAN 0 when trying to set promiscuous
VLAN mode if double VLAN mode is enabled. This fix is needed
because the driver tries to add the VLAN 0 filter twice (once for
inner and once for outer) when double VLAN mode is enabled. The
filter program is rejected by the firmware when double VLAN is
enabled, because the promiscuous filter only needs to be set once.

This issue was missed in the initial implementation of double VLAN
mode.

Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode")
Signed-off-by: Grzegorz Siwik <[email protected]>
Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/
Tested-by: Jaroslav Pulchart <[email protected]>
Tested-by: Igor Raits <[email protected]>
Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ALSA: hda/realtek: Add quirk for Clevo NS50PU, NS70PU

Fixes headset microphone detection on Clevo NS50PU and NS70PU.

Signed-off-by: Christoffer Sandberg <[email protected]>
Signed-off-by: Werner Sembach <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"Most notably this drops the commits that trip up google cloud (turns
  out, any legacy device).

  Plus a kerneldoc patch"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: kerneldocs fixes and enhancements
  virtio: Revert "virtio: find_vqs() add arg sizes"
  virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"
  virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"
  virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"
  virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()"
  virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"

btrfs: tree-checker: check for overlapping extent items

We're seeing a weird problem in production where we have overlapping
extent items in the extent tree. It's unclear where these are coming
from, and in debugging we realized there's no check in the tree checker
for this sort of problem. Add a check to the tree-checker to make sure
that the extents do not overlap each other.

Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: fix warning during log replay when bumping inode link count

During log replay, at add_link(), we may increment the link count of
another inode that has a reference that conflicts with a new reference
for the inode currently being processed.

During log replay, at add_link(), we may drop (unlink) a reference from
some inode in the subvolume tree if that reference conflicts with a new
reference found in the log for the inode we are currently processing.

After the unlink, If the link count has decreased from 1 to 0, then we
increment the link count to prevent the inode from being deleted if it's
evicted by an iput() call, because we may have references to add to that
inode later on (and we will fixup its link count later during log replay).

However incrementing the link count from 0 to 1 triggers a warning:

  $ cat fs/inode.c
  (...)
  void inc_nlink(struct inode *inode)
  {
        if (unlikely(inode->i_nlink == 0)) {
                 WARN_ON(!(inode->i_state & I_LINKABLE));
                 atomic_long_dec(&inode->i_sb->s_remove_count);
        }
  (...)

The I_LINKABLE flag is only set when creating an O_TMPFILE file, so it's
never set during log replay.

Most of the time, the warning isn't triggered even if we dropped the last
reference of the conflicting inode, and this is because:

1) The conflicting inode was previously marked for fixup, through a call
   to link_to_fixup_dir(), which increments the inode's link count;

2) And the last iput() on the inode has not triggered eviction of the
   inode, nor was eviction triggered after the iput(). So at add_link(),
   even if we unlink the last reference of the inode, its link count ends
   up being 1 and not 0.

So this means that if eviction is triggered after link_to_fixup_dir() is
called, at add_link() we will read the inode back from the subvolume tree
and have it with a correct link count, matching the number of references
it has on the subvolume tree. So if when we are at add_link() the inode
has exactly one reference only, its link count is 1, and after the unlink
its link count becomes 0.

So fix this by using set_nlink() instead of inc_nlink(), as the former
accepts a transition from 0 to 1 and it's what we use in other similar
contexts (like at link_to_fixup_dir().

Also make add_inode_ref() use set_nlink() instead of inc_nlink() to
bump the link count from 0 to 1.

The warning is actually harmless, but it may scare users. Josef also ran
into it recently.

CC: [email protected] # 5.1+
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: fix lost error handling when looking up extended ref on log replay

During log replay, when processing inode references, if we get an error
when looking up for an extended reference at __add_inode_ref(), we ignore
it and proceed, returning success (0) if no other error happens after the
lookup. This is obviously wrong because in case an extended reference
exists and it encodes some name not in the log, we need to unlink it,
otherwise the filesystem state will not match the state it had after the
last fsync.

So just make __add_inode_ref() return an error it gets from the extended
reference lookup.

Fixes: f186373fef005c ("btrfs: extended inode refs")
CC: [email protected] # 4.9+
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: fix lockdep splat with reloc root extent buffers

We have been hitting the following lockdep splat with btrfs/187 recently

  WARNING: possible circular locking dependency detected
  5.19.0-rc8+ #775 Not tainted
  ------------------------------------------------------
  btrfs/752500 is trying to acquire lock:
  ffff97e1875a97b8 (btrfs-treloc-02#2){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  but task is already holding lock:
  ffff97e1875a9278 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-tree-01/1){+.+.}-{3:3}:
down_write_nested+0x41/0x80
__btrfs_tree_lock+0x24/0x110
btrfs_init_new_buffer+0x7d/0x2c0
btrfs_alloc_tree_block+0x120/0x3b0
__btrfs_cow_block+0x136/0x600
btrfs_cow_block+0x10b/0x230
btrfs_search_slot+0x53b/0xb70
btrfs_lookup_inode+0x2a/0xa0
__btrfs_update_delayed_inode+0x5f/0x280
btrfs_async_run_delayed_root+0x24c/0x290
btrfs_work_helper+0xf2/0x3e0
process_one_work+0x271/0x590
worker_thread+0x52/0x3b0
kthread+0xf0/0x120
ret_from_fork+0x1f/0x30

  -> #1 (btrfs-tree-01){++++}-{3:3}:
down_write_nested+0x41/0x80
__btrfs_tree_lock+0x24/0x110
btrfs_search_slot+0x3c3/0xb70
do_relocation+0x10c/0x6b0
relocate_tree_blocks+0x317/0x6d0
relocate_block_group+0x1f1/0x560
btrfs_relocate_block_group+0x23e/0x400
btrfs_relocate_chunk+0x4c/0x140
btrfs_balance+0x755/0xe40
btrfs_ioctl+0x1ea2/0x2c90
__x64_sys_ioctl+0x88/0xc0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd

  -> #0 (btrfs-treloc-02#2){+.+.}-{3:3}:
__lock_acquire+0x1122/0x1e10
lock_acquire+0xc2/0x2d0
down_write_nested+0x41/0x80
__btrfs_tree_lock+0x24/0x110
btrfs_lock_root_node+0x31/0x50
btrfs_search_slot+0x1cb/0xb70
replace_path+0x541/0x9f0
merge_reloc_root+0x1d6/0x610
merge_reloc_roots+0xe2/0x260
relocate_block_group+0x2c8/0x560
btrfs_relocate_block_group+0x23e/0x400
btrfs_relocate_chunk+0x4c/0x140
btrfs_balance+0x755/0xe40
btrfs_ioctl+0x1ea2/0x2c90
__x64_sys_ioctl+0x88/0xc0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd

  other info that might help us debug this:

  Chain exists of:
    btrfs-treloc-02#2 --> btrfs-tree-01 --> btrfs-tree-01/1

   Possible unsafe locking scenario:

CPU0                    CPU1
----                    ----
    lock(btrfs-tree-01/1);
lock(btrfs-tree-01);
lock(btrfs-tree-01/1);
    lock(btrfs-treloc-02#2);

   *** DEADLOCK ***

  7 locks held by btrfs/752500:
   #0: ffff97e292fdf460 (sb_writers#12){.+.+}-{0:0}, at: btrfs_ioctl+0x208/0x2c90
   #1: ffff97e284c02050 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: btrfs_balance+0x55f/0xe40
   #2: ffff97e284c00878 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x236/0x400
   #3: ffff97e292fdf650 (sb_internal#2){.+.+}-{0:0}, at: merge_reloc_root+0xef/0x610
   #4: ffff97e284c02378 (btrfs_trans_num_writers){++++}-{0:0}, at: join_transaction+0x1a8/0x5a0
   #5: ffff97e284c023a0 (btrfs_trans_num_extwriters){++++}-{0:0}, at: join_transaction+0x1a8/0x5a0
   #6: ffff97e1875a9278 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x24/0x110

  stack backtrace:
  CPU: 1 PID: 752500 Comm: btrfs Not tainted 5.19.0-rc8+ #775
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:

   dump_stack_lvl+0x56/0x73
   check_noncircular+0xd6/0x100
   ? lock_is_held_type+0xe2/0x140
   __lock_acquire+0x1122/0x1e10
   lock_acquire+0xc2/0x2d0
   ? __btrfs_tree_lock+0x24/0x110
   down_write_nested+0x41/0x80
   ? __btrfs_tree_lock+0x24/0x110
   __btrfs_tree_lock+0x24/0x110
   btrfs_lock_root_node+0x31/0x50
   btrfs_search_slot+0x1cb/0xb70
   ? lock_release+0x137/0x2d0
   ? _raw_spin_unlock+0x29/0x50
   ? release_extent_buffer+0x128/0x180
   replace_path+0x541/0x9f0
   merge_reloc_root+0x1d6/0x610
   merge_reloc_roots+0xe2/0x260
   relocate_block_group+0x2c8/0x560
   btrfs_relocate_block_group+0x23e/0x400
   btrfs_relocate_chunk+0x4c/0x140
   btrfs_balance+0x755/0xe40
   btrfs_ioctl+0x1ea2/0x2c90
   ? lock_is_held_type+0xe2/0x140
   ? lock_is_held_type+0xe2/0x140
   ? __x64_sys_ioctl+0x88/0xc0
   __x64_sys_ioctl+0x88/0xc0
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

This isn't necessarily new, it's just tricky to hit in practice.  There
are two competing things going on here.  With relocation we create a
snapshot of every fs tree with a reloc tree.  Any extent buffers that
get initialized here are initialized with the reloc root lockdep key.
However since it is a snapshot, any blocks that are currently in cache
that originally belonged to the fs tree will have the normal tree
lockdep key set.  This creates the lock dependency of

  reloc tree -> normal tree

for the extent buffer locking during the first phase of the relocation
as we walk down the reloc root to relocate blocks.

However this is problematic because the final phase of the relocation is
merging the reloc root into the original fs root.  This involves
searching down to any keys that exist in the original fs root and then
swapping the relocated block and the original fs root block.  We have to
search down to the fs root first, and then go search the reloc root for
the block we need to replace.  This creates the dependency of

  normal tree -> reloc tree

which is why lockdep complains.

Additionally even if we were to fix this particular mismatch with a
different nesting for the merge case, we're still slotting in a block
that has a owner of the reloc root objectid into a normal tree, so that
block will have its lockdep key set to the tree reloc root, and create a
lockdep splat later on when we wander into that block from the fs root.

Unfortunately the only solution here is to make sure we do not set the
lockdep key to the reloc tree lockdep key normally, and then reset any
blocks we wander into from the reloc root when we're doing the merged.

This solves the problem of having mixed tree reloc keys intermixed with
normal tree keys, and then allows us to make sure in the merge case we
maintain the lock order of

  normal tree -> reloc tree

We handle this by setting a bit on the reloc root when we do the search
for the block we want to relocate, and any block we search into or COW
at that point gets set to the reloc tree key.  This works correctly
because we only ever COW down to the parent node, so we aren't resetting
the key for the block we're linking into the fs root.

With this patch we no longer have the lockdep splat in btrfs/187.

Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: move lockdep class helpers to locking.c

These definitions exist in disk-io.c, which is not related to the
locking. Move this over to locking.h/c where it makes more sense.

Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: unset reloc control if transaction commit fails in prepare_to_relocate()

In btrfs_relocate_block_group(), the rc is allocated.  Then
btrfs_relocate_block_group() calls

relocate_block_group()
  prepare_to_relocate()
    set_reloc_control()

that assigns rc to the variable fs_info->reloc_ctl. When
prepare_to_relocate() returns, it calls

btrfs_commit_transaction()
  btrfs_start_dirty_block_groups()
    btrfs_alloc_path()
      kmem_cache_zalloc()

which may fail for example (or other errors could happen). When the
failure occurs, btrfs_relocate_block_group() detects the error and frees
rc and doesn't set fs_info->reloc_ctl to NULL. After that, in
btrfs_init_reloc_root(), rc is retrieved from fs_info->reloc_ctl and
then used, which may cause a use-after-free bug.

This possible bug can be triggered by calling btrfs_ioctl_balance()
before calling btrfs_ioctl_defrag().

To fix this possible bug, in prepare_to_relocate(), check if
btrfs_commit_transaction() fails. If the failure occurs,
unset_reloc_control() is called to set fs_info->reloc_ctl to NULL.

The error log in our fault-injection testing is shown as follows:

  [   58.751070] BUG: KASAN: use-after-free in btrfs_init_reloc_root+0x7ca/0x920 [btrfs]
  ...
  [   58.753577] Call Trace:
  ...
  [   58.755800]  kasan_report+0x45/0x60
  [   58.756066]  btrfs_init_reloc_root+0x7ca/0x920 [btrfs]
  [   58.757304]  record_root_in_trans+0x792/0xa10 [btrfs]
  [   58.757748]  btrfs_record_root_in_trans+0x463/0x4f0 [btrfs]
  [   58.758231]  start_transaction+0x896/0x2950 [btrfs]
  [   58.758661]  btrfs_defrag_root+0x250/0xc00 [btrfs]
  [   58.759083]  btrfs_ioctl_defrag+0x467/0xa00 [btrfs]
  [   58.759513]  btrfs_ioctl+0x3c95/0x114e0 [btrfs]
  ...
  [   58.768510] Allocated by task 23683:
  [   58.768777]  ____kasan_kmalloc+0xb5/0xf0
  [   58.769069]  __kmalloc+0x227/0x3d0
  [   58.769325]  alloc_reloc_control+0x10a/0x3d0 [btrfs]
  [   58.769755]  btrfs_relocate_block_group+0x7aa/0x1e20 [btrfs]
  [   58.770228]  btrfs_relocate_chunk+0xf1/0x760 [btrfs]
  [   58.770655]  __btrfs_balance+0x1326/0x1f10 [btrfs]
  [   58.771071]  btrfs_balance+0x3150/0x3d30 [btrfs]
  [   58.771472]  btrfs_ioctl_balance+0xd84/0x1410 [btrfs]
  [   58.771902]  btrfs_ioctl+0x4caa/0x114e0 [btrfs]
  ...
  [   58.773337] Freed by task 23683:
  ...
  [   58.774815]  kfree+0xda/0x2b0
  [   58.775038]  free_reloc_control+0x1d6/0x220 [btrfs]
  [   58.775465]  btrfs_relocate_block_group+0x115c/0x1e20 [btrfs]
  [   58.775944]  btrfs_relocate_chunk+0xf1/0x760 [btrfs]
  [   58.776369]  __btrfs_balance+0x1326/0x1f10 [btrfs]
  [   58.776784]  btrfs_balance+0x3150/0x3d30 [btrfs]
  [   58.777185]  btrfs_ioctl_balance+0xd84/0x1410 [btrfs]
  [   58.777621]  btrfs_ioctl+0x4caa/0x114e0 [btrfs]
  ...

Reported-by: TOTE Robot <[email protected]>
CC: [email protected] # 5.15+
Reviewed-by: Sweet Tea Dorminy <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>
Signed-off-by: Zixuan Fu <[email protected]>
Signed-off-by: David Sterba <[email protected]>

ALSA: info: Fix llseek return value when using callback

When using callback there was a flow of

ret = -EINVAL
if (callback) {
offset = callback();
goto out;
}
...
offset = some other value in case of no callback;
ret = offset;
out:
return ret;

which causes the snd_info_entry_llseek() to return -EINVAL when there is
callback handler. Fix this by setting "ret" directly to callback return
value before jumping to "out".

Fixes: 73029e0ff18d ("ALSA: info - Implement common llseek for binary mode")
Signed-off-by: Amadeusz Sławiński <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

testing: selftests: nft_flowtable.sh: rework test to detect offload failure

This test fails on current kernel releases because the flotwable path
now calls dst_check from packet path and will then remove the offload.

Test script has two purposes:
1. check that file (random content) can be sent to other netns (and vv)
2. check that the flow is offloaded (rather than handled by classic
forwarding path).

Since dst_check is in place, 2) fails because the nftables ruleset in
router namespace 1 intentionally blocks traffic under the assumption
that packets are not passed via classic path at all.

Rework this: Instead of blocking traffic, create two named counters, one
for original and one for reverse direction.

The first three test cases are handled by classic forwarding path
(path mtu discovery is disabled and packets exceed MTU).

But all other tests enable PMTUD, so the originator and responder are
expected to lower packet size and flowtable is expected to do the packet
forwarding.

For those tests, check that the packet counters (which are only
incremented for packets that are passed up to classic forward path)
are significantly lower than the file size transferred.

I've tested that the counter-checks fail as expected when the 'flow add'
statement is removed from the ruleset.

Signed-off-by: Florian Westphal <[email protected]>

ALSA: hda/cs8409: Support new Dolphin Variants

Add 4 new Dolphin Systems, same configuration as older systems.

Signed-off-by: Stefan Binding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

KVM: arm64: Reject 32bit user PSTATE on asymmetric systems

KVM does not support AArch32 EL0 on asymmetric systems. To that end,
prevent userspace from configuring a vCPU in such a state through
setting PSTATE.

It is already ABI that KVM rejects such a write on a system where
AArch32 EL0 is unsupported. Though the kernel's definition of a 32bit
system changed in commit 2122a833316f ("arm64: Allow mismatched
32-bit EL0 support"), KVM's did not.

Fixes: 2122a833316f ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems

KVM does not support AArch32 on asymmetric systems. To that end, enforce
AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system.

Fixes: 2122a833316f ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

====================
Intel Wired LAN Driver Updates 2022-08-16)

This series contains updates to i40e driver only.

Przemyslaw fixes issue with checksum offload on VXLAN tunnels.

Alan disables VSI for Tx timeout when all recovery methods have failed.
====================

Signed-off-by: David S. Miller <[email protected]>

tls: rx: react to strparser initialization errors

Even though the normal strparser's init function has a return
value we got away with ignoring errors until now, as it only
validates the parameters and we were passing correct parameters.

tls_strp can fail to init on memory allocation errors, which
syzbot duly induced and reported.

Reported-by: [email protected]
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

platform/x86: serial-multi-instantiate: Add CLSA0101 Laptop

The device CLSA0101 has two instances of CS35L41
connected by I2C.

Signed-off-by: Lucas Tanure <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

testing: selftests: nft_flowtable.sh: use random netns names

"ns1" is a too generic name, use a random suffix to avoid
errors when such a netns exists. Also allows to run multiple
instances of the script in parallel.

Signed-off-by: Florian Westphal <[email protected]>

netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y

NF_CONNTRACK_PROCFS was marked obsolete in commit 54b07dca68557b09
("netfilter: provide config option to disable ancient procfs parts") in
v3.3.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>

net: sched: fix misuse of qcpu->backlog in gnet_stats_add_queue_cpu

In the gnet_stats_add_queue_cpu function, the qstats->qlen statistics
are incorrectly set to qcpu->backlog.

Fixes: 448e163f8b9b ("gen_stats: Add gnet_stats_add_queue()")
Signed-off-by: Zhengchao Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

riscv: Ensure isa-ext static keys are writable

riscv_isa_ext_keys[] is an array of static keys used in the unified
ISA extension framework. The keys added to this array may be used
anywhere, including in modules. Ensure the keys remain writable by
placing them in the data section.

The need to change riscv_isa_ext_keys[]'s section was found when the
kvm module started failing to load. Commit 8eb060e10185 ("arch/riscv:
add Zihintpause support") adds a static branch check for a newly
added isa-ext key to cpu_relax(), which kvm uses.

Fixes: c360cbec3511 ("riscv: introduce unified static key mechanism for ISA extensions")
Signed-off-by: Andrew Jones <[email protected]>
Cc: [email protected]
Reported-by: Ron Economos <[email protected]>
Reported-by: Anup Patel <[email protected]>
Reported-by: Conor Dooley <[email protected]>
Tested-by: Atish Patra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

Revert "drm/amd/amdgpu: add pipe1 hardware support"

This reverts commit 4c7631800e6bf0eced08dd7b4f793fcd972f597d.

Triggered GFX hangs with GNOME Wayland on Navi 21.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2117
Signed-off-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix use-after-free on amdgpu_bo_list mutex

If amdgpu_cs_vm_handling returns r != 0, then it will unlock the
bo_list_mutex inside the function amdgpu_cs_vm_handling and again on
amdgpu_cs_parser_fini. This problem results in the following
use-after-free problem:

[ 220.280990] ------------[ cut here ]------------
[ 220.281000] refcount_t: underflow; use-after-free.
[ 220.281019] WARNING: CPU: 1 PID: 3746 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 220.281029] ------------[ cut here ]------------
[ 220.281415] CPU: 1 PID: 3746 Comm: chrome:cs0 Tainted: G W L ------- --- 5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1
[ 220.281421] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
[ 220.281426] RIP: 0010:refcount_warn_saturate+0xba/0x110
[ 220.281431] Code: 01 01 e8 79 4a 6f 00 0f 0b e9 42 47 a5 00 80 3d de
7e be 01 00 75 85 48 c7 c7 f8 98 8e 98 c6 05 ce 7e be 01 01 e8 56 4a
6f 00 <0f> 0b e9 1f 47 a5 00 80 3d b9 7e be 01 00 0f 85 5e ff ff ff 48
c7
[ 220.281437] RSP: 0018:ffffb4b0d18d7a80 EFLAGS: 00010282
[ 220.281443] RAX: 0000000000000026 RBX: 0000000000000003 RCX: 0000000000000000
[ 220.281448] RDX: 0000000000000001 RSI: ffffffff988d06dc RDI: 00000000ffffffff
[ 220.281452] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffb4b0d18d7930
[ 220.281457] R10: 0000000000000003 R11: ffffa0672e2fffe8 R12: ffffa058ca360400
[ 220.281461] R13: ffffa05846c50a18 R14: 00000000fffffe00 R15: 0000000000000003
[ 220.281465] FS: 00007f82683e06c0(0000) GS:ffffa066e2e00000(0000) knlGS:0000000000000000
[ 220.281470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 220.281475] CR2: 00003590005cc000 CR3: 00000001fca46000 CR4: 0000000000350ee0
[ 220.281480] Call Trace:
[ 220.281485] <TASK>
[ 220.281490] amdgpu_cs_ioctl+0x4e2/0x2070 [amdgpu]
[ 220.281806] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 220.282028] drm_ioctl_kernel+0xa4/0x150
[ 220.282043] drm_ioctl+0x21f/0x420
[ 220.282053] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 220.282275] ? lock_release+0x14f/0x460
[ 220.282282] ? _raw_spin_unlock_irqrestore+0x30/0x60
[ 220.282290] ? _raw_spin_unlock_irqrestore+0x30/0x60
[ 220.282297] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282305] ? _raw_spin_unlock_irqrestore+0x40/0x60
[ 220.282317] amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[ 220.282534] __x64_sys_ioctl+0x90/0xd0
[ 220.282545] do_syscall_64+0x5b/0x80
[ 220.282551] ? futex_wake+0x6c/0x150
[ 220.282568] ? lock_is_held_type+0xe8/0x140
[ 220.282580] ? do_syscall_64+0x67/0x80
[ 220.282585] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282592] ? do_syscall_64+0x67/0x80
[ 220.282597] ? do_syscall_64+0x67/0x80
[ 220.282602] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282609] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 220.282616] RIP: 0033:0x7f8282a4f8bf
[ 220.282639] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10
00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00
0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00
00
[ 220.282644] RSP: 002b:00007f82683df410 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 220.282651] RAX: ffffffffffffffda RBX: 00007f82683df588 RCX: 00007f8282a4f8bf
[ 220.282655] RDX: 00007f82683df4d0 RSI: 00000000c0186444 RDI: 0000000000000018
[ 220.282659] RBP: 00007f82683df4d0 R08: 00007f82683df5e0 R09: 00007f82683df4b0
[ 220.282663] R10: 00001d04000a0600 R11: 0000000000000246 R12: 00000000c0186444
[ 220.282667] R13: 0000000000000018 R14: 00007f82683df588 R15: 0000000000000003
[ 220.282689] </TASK>
[ 220.282693] irq event stamp: 6232311
[ 220.282697] hardirqs last enabled at (6232319): [<ffffffff9718cd7e>] __up_console_sem+0x5e/0x70
[ 220.282704] hardirqs last disabled at (6232326): [<ffffffff9718cd63>] __up_console_sem+0x43/0x70
[ 220.282709] softirqs last enabled at (6232072): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170
[ 220.282716] softirqs last disabled at (6232061): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170
[ 220.282722] ---[ end trace 0000000000000000 ]---

Therefore, remove the mutex_unlock from the amdgpu_cs_vm_handling
function, so that amdgpu_cs_submit and amdgpu_cs_parser_fini can handle
the unlock.

Fixes: 90af0ca047f3 ("drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2")
Reported-by: Mikhail Gavrilov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Melissa Wen <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix interrupt handling on ih_soft ring

There are no backing hardware registers for ih_soft ring.
As a result, don't try to access hardware registers for read
and write pointers when processing interrupts on the IH soft
ring.

Signed-off-by: Mukul Joshi <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Add secure display TA load for Renoir

Add secure display TA load for Renoir

Signed-off-by: Shane Xiao <[email protected]>
Reviewed-by: Aaron Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Include scaling factor for SubVP command

[Description]
For SubVP scaling cases, we must include the scaling
info as part of the cmd. This is required when converting
OTG line to HUBP line for the MALL_START_LINE programming.

Reviewed-by: Jun Lei <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/vcn: Return void from the stop_dbg_mode

There is no point in returning an int here. It only returns 0 which
the caller never uses. Therefore return void and remove the unnecessary
assignment.

Addresses-Coverity: 1504988 ("Unused value")
Fixes: 8da1170a16e4 ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Ruijing Dong <[email protected]>
Suggested-by: Ruijing Dong <[email protected]>
Suggested-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Khalid Masum <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: remove useless condition in amdgpu_job_stop_all_jobs_on_sched()

Local variable 'rq' is initialized by an address
of field of drm_sched_job, so it does not make
sense to compare 'rq' with NULL.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Reviewed-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Andrey Strachuk <[email protected]>
Fixes: 7c6e68c777f1 ("drm/amdgpu: Avoid HW GPU reset for RAS.")
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Add decode_iv_ts helper for ih_v6 block

Was missing. Add it.

Signed-off-by: Harish Kasiviswanathan <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: add chip revision to DCN32

[Why & How]
Add GC_11_0_3_A0 as a chip revision to the DCN32 family

Reviewed-by: Rodrigo Siqueira <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Samson Tam <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: avoid doing vm_init multiple time

[why]
this is to ensure that driver will not reprogram hvm_prefetch_req again if
it is done.

Reviewed-by: Martin Leung <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Charlene Liu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Use pitch when calculating size to cache in MALL

[Description]
Use pitch when calculating size to cache in MALL

Reviewed-by: Samson Tam <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Don't set DSC for phantom pipes

[Description]
Don't set DSC bit for phantom pipes, not
required since phantom pipe don't have
any actual output

Reviewed-by: Jun Lei <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Update clock table policy for DCN314

[Why & How]
Depending on how the clock table is constructed from PMFW we can run
into issues where we don't think we have enough bandwidth available
due to FCLK too low - eg. when the FCLK table contains invalid entries
or a single entry.

We should always pick up the maximum clocks for each state as a final
state in this case to prevent validation from failing if the table is
malformed.

We should also contain sensible defaults in the case where values
are invalid.

Redfine the clock table structures by adding a 314 prefix to make
debugging these issues easier by avoiding symbol name clashes.

Overall this policy more closely aligns to how we did things for 315,
but because of how the voltage rail is setup we should favor keeping
DCFCLK low rather than DISPCLK or DPPCLK - so use the max for those
in every entry.

Reviewed-by: Daniel Miess <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Modify header inclusion pattern

[Why]
Recent backport from opensource broke the Nightly tool build
that tests DC and DML for bugs and regressions. This was
because the backport had a header inclusion that was not
consistent with the AMD style of including headers was allowed
to be merged back in DML code that caused tool compilation
failures.

[How]
Modify the way in which the header file in included so that it
is consistent with AMD style of including headers. This then
automatically fixes the tool compilation process and also
helps maintain the code quality and consistency.

Reviewed-by: Alvin Lee <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Chaitanya Dhere <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix plug/unplug external monitor will hang while playback MPO video

[Why]
Pipes for MPO primary and overlay will be power down and power up during
plug/unplug external monitor while MPO video playback.
But the pipes were the same after plug/unplug and should not need to be
power down and power up or it will make page flip interrupt disabled and
cause hang issue.

[How]
Add pipe split change condition that not only check the top pipe pointer
but also check the index of top pipe if both top pipes are available.

Reviewed-by: Sun peng Li <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Tom Chung <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Add debug parameter to retain default clock table

[Why]
Need a way to retain default clock table to aid
the investigation into why 8k@30 display not
lighting up on dcn314

[How]
Use flag to prevent execution of bw_params helper
function and function for updating bw_bounding_box

Reviewed-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Daniel Miess <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Increase tlb flush timeout for sriov

[Why]
During multi-vf executing benchmark (Luxmark) observed kiq error timeout.
It happenes because all of VFs do the tlb invalidation at the same time.
Although each VF has the invalidate register set, from hardware side
the invalidate requests are queue to execute.

[How]
In case of 12 VF increase timeout on 12*100ms

Signed-off-by: Dusica Milinkovic <[email protected]>
Acked-by: Shaoyun Liu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: do not compare integers of different widths

[Why & How]
Increase width of some variables to avoid comparing integers of
different widths.

Reviewed-by: Alvin Lee <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Josip Pavic <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Add reserved dc_log_type.

Reviewed-by: Anthony Koo <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Ian Chen <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix pixel clock programming

[Why]
Some pixel clock values could cause HDMI TMDS SSCPs to be misaligned
between different HDMI lanes when using YCbCr420 10-bit pixel format.

BIOS functions for transmitter/encoder control take pixel clock in kHz
increments, whereas the function for setting the pixel clock is in 100Hz
increments. Setting pixel clock to a value that is not on a kHz boundary
will cause the issue.

[How]
Round pixel clock down to nearest kHz in 10/12-bpc cases.

Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Ilya Bakoulin <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: 3.2.198

This version brings along following fixes:

-Fix edp panel missing event
-Set ARGB16161616 pixel format to 26
-Fix dcn32 interger issue
-Clear optc underflow bit after ODM clock off
-Fix issue with stereo3D
-Fix DML2 lightup issue
-Correct DTBCLK for dcn314
-Revert for a regression
-Fix clocks and bugs in DML2
-Enable SubVP by defalut on DCN32 & DCN321
-Corret boundary condition for engin ID on DCN303
-Fix FRL encoder override registry key
-Fix VPG for dcn314 HPO
-Fix Linux compile-time warning
-Add new prefetch modes in DML for DCN32

Acked-by: Brian Chang <[email protected]>
Signed-off-by: Aric Cyr <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: reverted limiting vscsdp_for_colorimetry and ARGB16161616 pixel format addition

[WHY]
Limiting vscsdp_for_colorimetry for YCbCr420/BT2020 resulted in red/green
point failures in HDR10 DTN tests. The re-implementation of ARGB16161616
was to fix this however it did not actually fix this issue but a side effect of the
issue.

[HOW]
Change ARGB16161616 pixel format to 26.

Reviewed-by: Martin Leung <[email protected]>
Acked-by: Brian Chang <[email protected]>
Signed-off-by: Ethan Wellenreiter <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: Enable GFXOFF feature for SMU IP v13.0.4

The driver needs to set EnableGfxImu message parameter to tell the PMFW
to set the flag that enables the GFXOFF feature.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: enable IH Clock Gating for OSS IP v6.0.1

Enable AMD_CG_SUPPORT_IH_CG support.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: potential crash in kfd_create_indirect_link_prop()

This code has two bugs.  If kfd_topology_device_by_proximity_domain()
failed on the first iteration through the loop then "cpu_link" is
uninitialized and should not be dereferenced.

The second bug is that we cannot dereference a list iterator when it
points to the list head.  In other words, if we exit the
list_for_each_entry() loop exits without hitting a break then "cpu_link"
is not a valid pointer and should not be dereferenced.

Fix both of these problems by setting "cpu_link" to NULL when it is invalid
and non-NULL when it is valid.  That makes it easier to test for
valid vs invalid.

Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: reserve 2 queues for sdma 6.0.1 in bitmap

There is only one engine in sdma 6.0.1, the total number of
reserved queues should be 2, reflect this number in bitmap as well.

Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Tim Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: enable ATHUB IP v3.0.1 Clock Gating

Enable AMD_CG_SUPPORT_ATHUB_MGCG and AMD_CG_SUPPORT_ATHUB_LS support.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: enable HDP IP v5.2.1 Clock Gating

Enable AMD_CG_SUPPORT_HDP_MGCG and AMD_CG_SUPPORT_HDP_LS support.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: enable MMHUB IP v3.0.1 Clock Gating

Enable AMD_CG_SUPPORT_MC_MGCG and AMD_CG_SUPPORT_MC_LS support.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add ATHUB IP v3.0.1 Clock Gating support

Add ATHUB IP v3.0.1 in athub_v3_0_set_clockgating.

The regATHUB_MISC_CNTL has different offset for ATHUB IP v3.0.1,
so need to add IP version checking to use the right REG offset.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add HDP IP v5.2.1 Clock Gating support

Add set/get_clockgating for HDP IP v5.2.1.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add MMHUB IP v3.0.1 Clock Gating support

Add set/get_clockgating for MMHUB IP v3.0.1.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: update the smu driver interface version for SMU IP v13.0.4

The pmfw has changed the driver interface version, so keep same with the
fw.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: Fix mm reference in SVM eviction worker

Use the mm reference from the fence. This allows removing the
svm_bo->svms pointer, which was problematic because we cannot assume
that the struct kfd_process containing the svms is still allocated
without holding a refcount on the process.

Use mmget_not_zero to ensure the mm is still valid, and drop the svm_bo
reference if it isn't.

Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: add mode1 support on smu_v13_0_7

add mode1 support since it's missing on smu_v13_0_7

Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: add ih cg and hdp sd on smu_v13_0_7

add ih cg and hdp sd on smu_v13_0_7

Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: add missing ->fini_xxxx interfaces for some SMU13 asics

Without these, potential memory leak may be induced.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: add missing ->fini_microcode interface for Sienna Cichlid

To avoid any potential memory leak.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: disable 3DCGCG/CGLS temporarily due to stability issue

Some stability issues were reported with these features.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file

Commit 36d4b36b6959 ("lib/nodemask: inline next_node_in() and
node_random()") refactored some code by moving node_random() from
lib/nodemask.c to include/linux/nodemask.h, thus requiring nodemask.h to
include random.h, which conditionally defines add_latent_entropy()
depending on whether the macro LATENT_ENTROPY_PLUGIN is defined.

This broke the build on powerpc, where nodemask.h is indirectly included
in arch/powerpc/kernel/prom_init.c, part of the early boot machinery that
is excluded from the latent entropy plugin using
DISABLE_LATENT_ENTROPY_PLUGIN. It turns out that while we add a gcc flag
to disable the actual plugin, we don't undefine LATENT_ENTROPY_PLUGIN.

This leads to the following:

    CC      arch/powerpc/kernel/prom_init.o
  In file included from ./include/linux/nodemask.h:97,
                   from ./include/linux/mmzone.h:17,
                   from ./include/linux/gfp.h:7,
                   from ./include/linux/xarray.h:15,
                   from ./include/linux/radix-tree.h:21,
                   from ./include/linux/idr.h:15,
                   from ./include/linux/kernfs.h:12,
                   from ./include/linux/sysfs.h:16,
                   from ./include/linux/kobject.h:20,
                   from ./include/linux/pci.h:35,
                   from arch/powerpc/kernel/prom_init.c:24:
  ./include/linux/random.h: In function 'add_latent_entropy':
  ./include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first use in this function); did you mean 'add_latent_entropy'?
     25 |         add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy));
        |                                              ^~~~~~~~~~~~~~
        |                                              add_latent_entropy
  ./include/linux/random.h:25:46: note: each undeclared identifier is reported only once for each function it appears in
  make[2]: *** [scripts/Makefile.build:249: arch/powerpc/kernel/prom_init.o] Fehler 1
  make[1]: *** [scripts/Makefile.build:465: arch/powerpc/kernel] Fehler 2
  make: *** [Makefile:1855: arch/powerpc] Error 2

Change the DISABLE_LATENT_ENTROPY_PLUGIN flags to undefine
LATENT_ENTROPY_PLUGIN for files where the plugin is disabled.

Cc: Yury Norov <[email protected]>
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216367
Link: https://lore.kernel.org/linuxppc-dev/[email protected]/
Reported-by: Erhard Furtner <[email protected]>
Signed-off-by: Andrew Donnellan <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

LoadPin: Return EFAULT on copy_from_user() failures

The copy_from_user() function returns the number of bytes remaining to
be copied on a failure. Such failures should return -EFAULT to high
levels.

Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Fixes: 3f805f8cc23b ("LoadPin: Enable loading from trusted dm-verity devices")
Cc: Matthias Kaehlcke <[email protected]>
Cc: James Morris <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>

exec: Replace kmap{,_atomic}() with kmap_local_page()

The use of kmap() and kmap_atomic() are being deprecated in favor of
kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since the use of kmap_local_page() in exec.c is safe, it should be
preferred everywhere in exec.c.

As said, since kmap_local_page() can be also called from atomic context,
and since remove_arg_zero() doesn't (and shouldn't ever) rely on an
implicit preempt_disable(), this function can also safely replace
kmap_atomic().

Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
fs/exec.c.

Tested with xfstests on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.

Cc: Eric W. Biederman <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge tag 'nios2_fixes_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux

Pull NIOS2 fixes from Dinh Nguyen:

- Security fixes from Al Viro

* tag 'nios2_fixes_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: add force_successful_syscall_return()
  nios2: restarts apply only to the first sigframe we build...
  nios2: fix syscall restart checks
  nios2: traced syscall does need to check the syscall number
  nios2: don't leave NULLs in sys_call_table[]
  nios2: page fault et.al. are *not* restartable syscalls...

Merge tag 'spi-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few fixes that came in since my pull request, the Meson fix is a
  little large since it's fixing all possible cases of the problem that
  was observed with the driver and clock API trying to share
  configuration by integrating the device clocking fully with the clock
  API rather than spot fixing the one instance that was observed"

* tag 'spi-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dt-bindings: Drop Pratyush Yadav
  spi: meson-spicc: add local pow2 clock ops to preserve rate between messages
  MAINTAINERS: rectify entry for ARM/HPE GXP ARCHITECTURE
  spi: spi.c: Add missing __percpu annotations in users of spi_statistics

Merge tag 'regulator-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of small fixes that came in since my pull request, nothing
  major here"

* tag 'regulator-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Fix missing error return from regulator_bulk_get()
  regulator: pca9450: Remove restrictions for regulator-name

x86: simplify load_unaligned_zeropad() implementation

The exception for the "unaligned access at the end of the page, next
page not mapped" never happens, but the fixup code ends up causing
trouble for compilers to optimize well.

clang in particular ends up seeing it being in the middle of a loop, and
tries desperately to optimize the exception fixup code that is never
really reached.

The simple solution is to just move all the fixups into the exception
handler itself, which moves it all out of the hot case code, and means
that the compiler never sees it or needs to worry about it.

Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

locking/atomic: Make test_and_*_bit() ordered on failure

These operations are documented as always ordered in
include/asm-generic/bitops/instrumented-atomic.h, and producer-consumer
type use cases where one side needs to ensure a flag is left pending
after some shared data was updated rely on this ordering, even in the
failure case.

This is the case with the workqueue code, which currently suffers from a
reproducible ordering violation on Apple M1 platforms (which are
notoriously out-of-order) that ends up causing the TTY layer to fail to
deliver data to userspace properly under the right conditions. This
change fixes that bug.

Change the documentation to restrict the "no order on failure" story to
the _lock() variant (for which it makes sense), and remove the
early-exit from the generic implementation, which is what causes the
missing barrier semantics in that case. Without this, the remaining
atomic op is fully ordered (including on ARM64 LSE, as of recent
versions of the architecture spec).

Suggested-by: Linus Torvalds <[email protected]>
Cc: [email protected]
Fixes: e986a0d6cb36 ("locking/atomics, asm-generic/bitops/atomic.h: Rewrite using atomic_*() APIs")
Fixes: 61e02392d3c7 ("locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()")
Signed-off-by: Hector Martin <[email protected]>
Acked-by: Will Deacon <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

i40e: Fix to stop tx_timeout recovery if GLOBR fails

When a tx_timeout fires, the PF attempts to recover by incrementally
resetting. First we try a PFR, then CORER and finally a GLOBR. If the
GLOBR fails, then we keep hitting the tx_timeout and incrementing the
recovery level and issuing dmesgs, which is both annoying to the user
and accomplishes nothing.

If the GLOBR fails, then we're pretty much totally hosed, and there's
not much else we can do to recover, so this makes it such that we just
kill the VSI and stop hitting the tx_timeout in such a case.

Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Alan Brady <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

i40e: Fix tunnel checksum offload with fragmented traffic

Fix checksum offload on VXLAN tunnels.
In case, when mpls protocol is not used, set l4 header to transport
header of skb. This fixes case, when user tries to offload checksums
of VXLAN tunneled traffic.

Steps for reproduction (requires link partner with tunnels):
ip l s enp130s0f0 up
ip a f enp130s0f0
ip a a 10.10.110.2/24 dev enp130s0f0
ip l s enp130s0f0 mtu 1600
ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev \
enp130s0f0 dstport 4789
ip l s vxlan12_sut up
ip a a 20.10.110.2/24 dev vxlan12_sut
iperf3 -c 20.10.110.1 #should connect

Without this patch, TX descriptor was using wrong data, due to
l4 header pointing wrong address. NIC would then drop those packets
internally, due to incorrect TX descriptor data, which increased
GLV_TEPC register.

Fixes: b4fb2d33514a ("i40e: Add support for MPLS + TSO")
Signed-off-by: Przemyslaw Patynowski <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Marek Szlosek <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

RDMA: Handle the return code from dma_resv_wait_timeout() properly

ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure.

dma_resv_wait_timeout() uses a different scheme:

* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
* greater than zero on success.

This results in ib_umem_dmabuf_map_pages() being non-functional as a
positive return will be understood to be an error by drivers.

Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence")
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Maor Gottlieb <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

RDMA/erdma: Correct the max_qp and max_cq capacities of the device

QP0 in HW is used for CMDQ, and the rest is for RDMA QPs. So the actual
max_qp capacity reported to core should be max_qp (reported by HW) - 1.
So does max_cq.

Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation")
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Cheng Xu <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

RDMA/erdma: Using the key in FMR WR instead of MR structure

RDMA driver should get the MR key from FMR WR, not the MR structure passed
in.

Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Cheng Xu <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

ALSA: hda/realtek: Add quirk for Lenovo Yoga7 14IAL7

Lenovo Yoga7 14IAL7 requires the same quirk as Lenovo Yoga9 14IAP7 for
fixing the bass speaker problems.

Reported-by: Pascal Gross <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

RDMA/cxgb4: fix accept failure due to increased cpl_t5_pass_accept_rpl size

Commit 'c2ed5611afd7' has increased the cpl_t5_pass_accept_rpl{} structure
size by 8B to avoid roundup. cpl_t5_pass_accept_rpl{} is a HW specific
structure and increasing its size will lead to unwanted adapter errors.
Current commit reverts the cpl_t5_pass_accept_rpl{} back to its original
and allocates zeroed skb buffer there by avoiding the memset for iss field.
Reorder code to minimize chip type checks.

Fixes: c2ed5611afd7 ("iw_cxgb4: Use memset_startat() for cpl_t5_pass_accept_rpl")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Potnuri Bharat Teja <[email protected]>
Signed-off-by: Rahul Lakkireddy <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

RDMA/mlx5: Use the proper number of ports

The cited commit allowed the driver to operate over HCAs that have
4 physical ports. Use the number of ports of the RDMA device in the for
loop instead of using the struct size.

Fixes: 4cd14d44b11d ("net/mlx5: Support devices with more than 2 ports")
Link: https://lore.kernel.org/r/a54a56c2ede16044a29d119209b35189c662ac72.1659944855.git.leonro@nvidia.com
Signed-off-by: Mark Bloch <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

ALSA: hda: cs35l41: Clarify support for CSC3551 without _DSD Properties

For devices which use HID CSC3551, correct ACPI _DSD properties are
required to be able support those systems.
Add error message to clarify this.

Signed-off-by: Stefan Binding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'asoc-fix-v6.0-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.0

A relatively large batch of fixes that came in since my pull request,
none of them too major and mostly device specific apart from a series of
security/robustness improvements from Takashi.

IB/iser: Fix login with authentication

The iSER Initiator uses two types of receive buffers:

- one big login buffer posted by iser_post_recvl();
- several small message buffers posted by iser_post_recvm().

The login buffer is used at the login phase and full feature phase in
the discovery session. It may take a few requests and responses to
complete the login phase. The message buffers are only used in the
normal operational session at the full feature phase.

After the commit referred in the fixes line, the login operation fails
if the authentication is enabled. That happens because the Initiator
posts a small receive buffer after the first response from Target. So,
the next send operation fails because Target's second response does not
fit into the small receive buffer.

This commit adds additional checks to prevent posting small receive
buffers until the full feature phase.

Fixes: 39b169ea0d36 ("IB/iser: Fix RNR errors")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sergey Gorenko <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>

ublk_drv: do not add a re-issued request aborted previously to ioucmd's task_work

In ublk_queue_rq(), Assume current request is a re-issued request aborted
previously in monitor_work because the ubq_daemon(ioucmd's task) is
PF_EXITING. For this request, we cannot call
io_uring_cmd_complete_in_task() anymore because at that moment io_uring
context may be freed in case that no inflight ioucmd exists. Otherwise,
we may cause null-deref in ctx->fallback_work.

Add a check on UBLK_IO_FLAG_ABORTED to prevent the above situation. This
check is safe and makes sense.

Note: monitor_work sets UBLK_IO_FLAG_ABORTED and ends this request
(releasing the tag). Then the request is restarted(allocating the tag)
and we are here. Since releasing/allocating a tag implies smp_mb(),
finding UBLK_IO_FLAG_ABORTED guarantees that here is a re-issued request
aborted previously.

Suggested-by: Ming Lei <[email protected]>
Signed-off-by: ZiyangZhang <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

ublk_drv: update comment for __ublk_fail_req()

Since __ublk_rq_task_work always fails requests immediately during
exiting, __ublk_fail_req() is only called from abort context during
exiting. So lock is unnecessary.

Signed-off-by: ZiyangZhang <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

ublk_drv: check ubq_daemon_is_dying() in __ublk_rq_task_work()

Replace direct check on PF_EXITING in __ublk_rq_task_work() by the
existing wrapper. Also inline ubq_daemon_is_dying().

Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: ZiyangZhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

virtio: kerneldocs fixes and enhancements

Fix variable names in some kerneldocs, naming in others.
Add kerneldocs for struct vring_desc and vring_interrupt.

Signed-off-by: Ricardo Cañuelo <[email protected]>
Message-Id: <20220810094004 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>

virtio: Revert "virtio: find_vqs() add arg sizes"

This reverts commit a10fba0377145fccefea4dc4dd5915b7ed87e546: the
proposed API isn't supported on all transports but no
effort was made to address this.

It might not be hard to fix if we want to: maybe just
rename size to size_hint and make sure legacy
transports ignore the hint.

But it's not sure what the benefit is in any case, so
let's drop it.

Fixes: a10fba037714 ("virtio: find_vqs() add arg sizes")
Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20220816053602 [email protected]>

virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"

This reverts commit 99e8927d8a4da8eb8a8a5904dc13a3156be8e7c0:
proposed API isn't supported on all transports but no
effort was made to address this.

It might not be hard to fix if we want to: maybe just rename size to
size_hint and make sure legacy transports ignore the hint.

But it's not sure what the benefit is in any case, so let's drop it.

Fixes: 99e8927d8a4d ("virtio_vdpa: support the arg sizes of find_vqs()")
Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20220816053602 [email protected]>

virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"

This reverts commit cdb44806fca2d0ad29ca644cbf1505433902ee0c: the legacy
path is wrong and in fact can not support the proposed API since for a
legacy device we never communicate the vq size to the hypervisor.

Reported-by: Andres Freund <[email protected]>
Fixes: cdb44806fca2 ("virtio_pci: support the arg sizes of find_vqs()")
Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20220816053602 [email protected]>

virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"

This reverts commit fbed86abba6e0472d98079790e58060e4332608a.
The API is now unused, let's not carry dead code around.

Fixes: fbed86abba6e ("virtio_mmio: support the arg sizes of find_vqs()")
Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20220816053602 [email protected]>

virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()"

This reverts commit fe3dc04e31aa51f91dc7f741a5f76cc4817eb5b4: the
API is now unused and in fact can't be implemented on top of a legacy
device.

Fixes: fe3dc04e31aa ("virtio: add helper virtio_find_vqs_ctx_size()")
Cc: "Xuan Zhuo" <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20220816053602 [email protected]>

virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"

This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7.

This has been reported to trip up guests on GCP (Google Cloud).
The reason is that virtio_find_vqs_ctx_size is broken on legacy
devices. We can in theory fix virtio_find_vqs_ctx_size but
in fact the patch itself has several other issues:

- It treats unknown speed as < 10G
- It leaves userspace no way to find out the ring size set by hypervisor
- It tests speed when link is down
- It ignores the virtio spec advice:
        Both \field{speed} and \field{duplex} can change, thus the driver
        is expected to re-read these values after receiving a
        configuration change notification.
- It is not clear the performance impact has been tested properly

Revert the patch for now.

Reported-by: Andres Freund <[email protected]>
Link: https://lore.kernel.org/r/20220814212610.GA3690074%40roeck-us.net
Link: https://lore.kernel.org/r/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de
Link: https://lore.kernel.org/r/3df6bb82-1951-455d-a768-e9e1513eb667%40www.fastmail.com
Link: https://lore.kernel.org/r/FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE%40anarazel.de
Fixes: 762faee5a267 ("virtio_net: set the default max ring size by find_vqs()")
Cc: Xuan Zhuo <[email protected]>
Cc: Jason Wang <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Tested-by: Andres Freund <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Message-Id: <20220816053602 [email protected]>

io_uring/notif: raise limit on notification slots

1024 notification slots is rather an arbitrary value, raise it up,
everything is accounted to memcg.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/eb78a0a5f2fa5941f8e845cdae5fb399bf7ba0be.1660566179.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring/net: improve zc addr import error handling

We may account memory to a memcg of a request that didn't even got to
the network layer. It's not a bug as it'll be routinely cleaned up on
flush, but it might be confusing for the userspace.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b8aae61f4c3ddc4da97c1da876bb73871f352d50.1660566179.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring/net: use right helpers for async recycle

We have a helper that checks for whether a request contains anything in
->async_data or not, namely req_has_async_data(). It's better to use it
as it might have some extra considerations.

Fixes: 43e0bbbd0b0e3 ("io_uring: add netmsg cache")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b7414da4e7c3c32c31fc02dfd1355af4ccf4ca5f.1660566179.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-08-12 (iavf)

This series contains updates to iavf driver only.

Przemyslaw frees memory for admin queues in initialization error paths,
prevents freeing of vf_res which is causing null pointer dereference,
and adjusts calls in error path of reset to avoid iavf_close() which
could cause deadlock.

Ivan Vecera avoids deadlock that can occur when driver if part of
failover.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix deadlock in initialization
  iavf: Fix reset error handling
  iavf: Fix NULL pointer dereference in iavf_get_link_ksettings
  iavf: Fix adminq error handling
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: rtnetlink: fix module reference count leak issue in rtnetlink_rcv_msg

When bulk delete command is received in the rtnetlink_rcv_msg function,
if bulk delete is not supported, module_put is not called to release
the reference counting. As a result, module reference count is leaked.

Fixes: a6cec0bcd342 ("net: rtnetlink: add bulk delete support flag")
Signed-off-by: Zhengchao Shao <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: moxa: pass pdev instead of ndev to DMA functions

dma_map_single() calls fail in moxart_mac_setup_desc_ring() and
moxart_mac_start_xmit() which leads to an incessant output of this:

[   16.043925] moxart-ethernet 92000000.mac eth0: DMA mapping error
[   16.050957] moxart-ethernet 92000000.mac eth0: DMA mapping error
[   16.058229] moxart-ethernet 92000000.mac eth0: DMA mapping error

Passing pdev to DMA is a common approach among net drivers.

Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver")
Signed-off-by: Sergei Antonov <[email protected]>
Suggested-by: Andrew Lunn <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ksmbd: don't remove dos attribute xattr on O_TRUNC open

When smb client open file in ksmbd share with O_TRUNC, dos attribute
xattr is removed as well as data in file. This cause the FSCTL_SET_SPARSE
request from the client fails because ksmbd can't update the dos attribute
after setting ATTR_SPARSE_FILE. And this patch fix xfstests generic/469
test also.

Signed-off-by: Namjae Jeon <[email protected]>
Reviewed-by: Hyunchul Lee <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: remove unnecessary generic_fillattr in smb2_open

Remove unnecessary generic_fillattr to fix wrong
AllocationSize of SMB2_CREATE response, And
Move the call of ksmbd_vfs_getattr above the place
where stat is needed because of truncate.

This patch fixes wrong AllocationSize of SMB2_CREATE
response. Because ext4 updates inode->i_blocks only
when disk space is allocated, generic_fillattr does
not set stat.blocks properly for delayed allocation.
But ext4 returns the blocks that include the delayed
allocation blocks when getattr is called.

The issue can be reproduced with commands below:

touch ${FILENAME}
xfs_io -c "pwrite -S 0xAB 0 40k" ${FILENAME}
xfs_io -c "stat" ${FILENAME}

40KB are written, but the count of blocks is 8.

Signed-off-by: Hyunchul Lee <[email protected]>
Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>