Git Repo - linux.git/log

drm/amd/display: fix pow() crashing when given base 0

[Why&How]
pow(a,x) is implemented as exp(x*log(a)). log(0) will crash.
So return 0^x = 0, unless x=0, convention seems to be 0^0 = 1.

Cc: [email protected]
Signed-off-by: Krunoslav Kovac <[email protected]>
Reviewed-by: Anthony Koo <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Reset scrambling on Test Pattern

[Why]
Programming is missing the sequence where for eDP the scrambling is
reset when testing for eye diagram test pattern.

[How]
Include the required register in the definition

Signed-off-by: Chris Park <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: fix dcn3 wide timing dsc validation

Wide timing DSC requires odm. Since spreadsheet is missing this dsc
validation we have to modify DML vba code ourselves.

Signed-off-by: Dmytro Laktyushkin <[email protected]>
Reviewed-by: Wesley Chalmers <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix DFPstate hang due to view port changed

[Why]
Place the cursor in the center of screen between two pipes then
adjusting the viewport but cursour doesn't update cause DFPstate hang.

[How]
If viewport changed, update cursor as well.

Cc: [email protected]
Signed-off-by: Paul Hsieh <[email protected]>
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Assign correct left shift

[Why]
Reading for DP alt registers return incorrect values due to LE_SF
definition missing.

[How]
Define correct LE_SF or DP alt registers.

Signed-off-by: Chris Park <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Call DMUB for eDP power control

[Why]
If DMUB is used, LVTMA VBIOS call can be used to control eDP instead of
tranditional transmitter control. Interface is agreed with VBIOS for
eDP to use this new path to program LVTMA registers.

[How]
Create DAL interface to send DMUB command for LVTMA as currently
implemented in VBIOS.

Signed-off-by: Chris Park <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: fix the wrong sdma instance query for renoir

Renoir only has one sdma instance, it will get failed once query the
sdma1 registers. So use switch-case instead of static register array.

Signed-off-by: Huang Rui <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: parse ta firmware for navy_flounder

Use the same case as sienna_cichlid

Signed-off-by: Bhawanpreet Lakha <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Merge tag 'spi-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A bunch of fixes that came in for SPI during the merge window.

  Some from ST and others for their controller, one from Lukas for a
  race between device addition and controller unregistration and one
  from fix from Geert for the DT bindings which unbreaks validation"

* tag 'spi-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  dt-bindings: lpspi: Add missing boolean type for fsl,spi-only-use-cs1-sel
  spi: stm32: always perform registers configuration prior to transfer
  spi: stm32: fixes suspend/resume management
  spi: stm32: fix stm32_spi_prepare_mbr in case of odd clk_rate
  spi: stm32: fix fifo threshold level in case of short transfer
  spi: stm32h7: fix race condition at end of transfer
  spi: stm32: clear only asserted irq flags on interrupt
  spi: Prevent adding devices below an unregistering controller

io_uring: cleanup io_import_iovec() of pre-mapped request

io_rw_prep_async() goes through a dance of clearing req->io, calling
the iovec import, then re-setting req->io. Provide an internal helper
that does the right thing without needing state tweaked to get there.

This enables further cleanups in io_read, io_write, and
io_resubmit_prep(), but that's left for another time.

Signed-off-by: Jens Axboe <[email protected]>

drm/amdgpu: fix NULL pointer access issue when unloading driver

When unloading driver by "modprobe -r amdgpu", one NULL pointer
dereference bug occurs in ras debugfs releasing. The cause is the
duplicated debugfs_remove, as drm debugfs_root dir has been cleaned
up already by drm_minor_unregister.

BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
RIP: 0010:down_write+0x15/0x40
Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
simple_recursive_removal+0x63/0x370
? debugfs_remove+0x60/0x60
debugfs_remove+0x40/0x60
amdgpu_ras_fini+0x82/0x230 [amdgpu]
? __kernfs_remove.part.17+0x101/0x1f0
? kernfs_name_hash+0x12/0x80
amdgpu_device_fini+0x1c0/0x580 [amdgpu]
amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
amdgpu_pci_remove+0x36/0x60 [amdgpu]
pci_device_remove+0x3b/0xb0
device_release_driver_internal+0xe5/0x1c0
driver_detach+0x46/0x90
bus_remove_driver+0x58/0xd0
pci_unregister_driver+0x29/0x90
amdgpu_exit+0x11/0x25 [amdgpu]
__x64_sys_delete_module+0x13d/0x210
do_syscall_64+0x5f/0x250
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix uninit-value in arcturus_log_thermal_throttling_event()

when function arcturus_get_smu_metrics_data() call failed,
it will cause the variable "throttler_status" isn't initialized before use.

warning:
powerplay/arcturus_ppt.c:2268:24: warning: ‘throttler_status’ may be used uninitialized in this function [-Wmaybe-uninitialized]
2268 | if (throttler_status & logging_label[throttler_idx].feature_mask) {

Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: disable gfxoff for navy_flounder

gfxoff is temporarily disabled for navy_flounder,
since at present the feature has broken some basic
amdgpu test.

Signed-off-by: Jiansong Chen <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

net: gianfar: Add of_node_put() before goto statement

Every iteration of for_each_available_child_of_node() decrements
reference count of the previous node, however when control
is transferred from the middle of the loop, as in the case of
a return or break or goto, there is no decrement thus ultimately
resulting in a memory leak.

Fix a potential memory leak in gianfar.c by inserting of_node_put()
before the goto statement.

Issue found with Coccinelle.

Signed-off-by: Sumera Priyadarsini <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'cxgb4-Fix-ethtool-selftest-flits-calculation'

Ganji Aravind says:

====================
cxgb4: Fix ethtool selftest flits calculation

Patch 1 will fix work request size calculation for loopback selftest.

Patch 2 will fix race between loopback selftest and normal Tx handler.
====================

Signed-off-by: David S. Miller <[email protected]>

cxgb4: Fix race between loopback and normal Tx path

Even after Tx queues are marked stopped, there exists a
small window where the current packet in the normal Tx
path is still being sent out and loopback selftest ends
up corrupting the same Tx ring. So, ensure selftest takes
the Tx lock to synchronize access the Tx ring.

Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test")
Signed-off-by: Ganji Aravind <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: Fix work request size calculation for loopback test

Work request used for sending loopback packet needs to add
the firmware work request only once. So, fix by using
correct structure size.

Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test")
Signed-off-by: Ganji Aravind <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'sfc-more-EF100-fixes'

Edward Cree says:

====================
sfc: more EF100 fixes

Fix up some bugs in the initial EF100 submission, and re-fix
the hash_valid fix which was incomplete.

The reset bugs are currently hard to trigger; they were found
with an in-progress patch adding ethtool support, whereby
ethtool --reset reliably reproduces them.
====================

Signed-off-by: David S. Miller <[email protected]>

sfc: don't free_irq()s if they were never requested

If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
needed and will cause error messages and stack traces.
So instead, only do this if efx_nic_init_interrupt successfully completed,
as indicated by the new efx->irqs_hooked flag.

Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: null out channel->rps_flow_id after freeing it

If an ef100_net_open() fails, ef100_net_stop() may be called without
channel->rps_flow_id having been written; thus it may hold the address
freed by a previous ef100_net_stop()'s call to efx_remove_filters().
This then causes a double-free when efx_remove_filters() is called
again, leading to a panic.
To prevent this, after freeing it, overwrite it with NULL.

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: take correct lock in ef100_reset()

When downing and upping the ef100 filter table, we need to take a write
lock on efx->filter_sem, not just a read lock, because we may kfree()
the table pointers.
Without this, resets cause a WARN_ON from efx_rwsem_assert_write_locked().

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: really check hash is valid before using it

Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.

Fixes: 068885434ccb ("sfc: check hash is valid before using it")
Reported-by: Martin Habets <[email protected]>
Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macvlan: validate setting of multiple remote source MAC addresses

Remote source MAC addresses can be set on a 'source mode' macvlan
interface via the IFLA_MACVLAN_MACADDR_DATA attribute. This commit
tightens the validation of these MAC addresses to match the validation
already performed when setting or adding a single MAC address via the
IFLA_MACVLAN_MACADDR attribute.

iproute2 uses IFLA_MACVLAN_MACADDR_DATA for its 'macvlan macaddr set'
command, and IFLA_MACVLAN_MACADDR for its 'macvlan macaddr add' command,
which demonstrates the inconsistent behaviour that this commit
addresses:

# ip link add link eth0 name macvlan0 type macvlan mode source
# ip link set link dev macvlan0 type macvlan macaddr add 01:00:00:00:00:00
RTNETLINK answers: Cannot assign requested address
# ip link set link dev macvlan0 type macvlan macaddr set 01:00:00:00:00:00
# ip -d link show macvlan0
5: macvlan0@eth0: <BROADCAST,MULTICAST,DYNAMIC,UP,LOWER_UP> mtu 1500 ...
link/ether 2e:ac:fd:2d:69:f8 brd ff:ff:ff:ff:ff:ff promiscuity 0
macvlan mode source remotes (1) 01:00:00:00:00:00 numtxqueues 1 ...

With this change, the 'set' command will (rightly) fail in the same way
as the 'add' command.

Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'fixes-2020-08-18' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull ia64 page table fix from Mike Rapoport:
"Fix regression in IA-64 caused by page table allocation refactoring

  The refactoring and consolidation of <asm/pgalloc.h> caused regression
  on parisc and ia64. The fix for parisc made it into v5.9-rc1 while the
  fix ia64 got delayed a bit and here it is"

* tag 'fixes-2020-08-18' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  arch/ia64: Restore arch-specific pgd_offset_k implementation

mm/memory.c: skip spurious TLB flush for retried page fault

Recently we found regression when running will_it_scale/page_fault3 test
on ARM64.  Over 70% down for the multi processes cases and over 20% down
for the multi threads cases.  It turns out the regression is caused by
commit 89b15332af7c ("mm: drop mmap_sem before calling
balance_dirty_pages() in write fault").

The test mmaps a memory size file then write to the mapping, this would
make all memory dirty and trigger dirty pages throttle, that upstream
commit would release mmap_sem then retry the page fault.  The retried
page fault would see correct PTEs installed then just fall through to
spurious TLB flush.  The regression is caused by the excessive spurious
TLB flush.  It is fine on x86 since x86's spurious TLB flush is no-op.

We could just skip the spurious TLB flush to mitigate the regression.

Suggested-by: Linus Torvalds <[email protected]>
Reported-by: Xu Yu <[email protected]>
Debugged-by: Xu Yu <[email protected]>
Tested-by: Xu Yu <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ext4: change to use fallthrough macro

Change to use fallthrough macro in switch case.

Signed-off-by: Shijie Luo <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: remove unused parameter of ext4_generic_delete_entry function

The ext4_generic_delete_entry function does not use the parameter
handle, so it can be removed.

Signed-off-by: Kyoungho Koo <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Link: https://lore.kernel.org/r/20200810080701.GA14160@koo-Z370-HD3
Signed-off-by: Theodore Ts'o <[email protected]>

mballoc: replace seq_printf with seq_puts

seq_puts is a lot cheaper than seq_printf, so use that to print
literal strings.

Signed-off-by: Xu Wang <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: optimize the implementation of ext4_mb_good_group()

It might be better to adjust the code in two places:
1. Determine whether grp is currupt or not should be placed first.
2. (cr<=2 && free <ac->ac_g_ex.fe_len)should may belong to the crx
   strategy, and it may be more appropriate to put it in the
   subsequent switch statement block. For cr1, cr2, the conditions
   in switch potentially realize the above judgment. For cr0, we
   should add (free <ac->ac_g_ex.fe_len) judgment, and then delete
   (free / fragments) >= ac->ac_g_ex.fe_len), because cr0 returns
   true by default.

Signed-off-by: Chunguang Xu <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: delete invalid comments near ext4_mb_check_limits()

These comments do not seem to be related to ext4_mb_check_limits(),
it may be invalid.

Signed-off-by: Chunguang Xu <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: fix typos in ext4_mb_regular_allocator() comment

Fix typos in ext4_mb_regular_allocator() comment

Signed-off-by: Chunguang Xu <[email protected]>
Reviewed-by: Ritesh Harjani <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

Revert "HID: usbhid: do not sleep when opening device"

This reverts commit d3132792285859253c466354fd8d54d1fe0ba786.

This patch causes a regression with quite a few devices, as probing fails
because of the race where the first IRQ is dropped on the floor (after
hid_device_io_start() happens, but before the 50ms timeout passess), and
report descriptor never gets parsed and populated.

As this is just a boot time micro-optimization, let's revert the
patch for 5.9 now, and fix this properly eventually for next merge
window.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=208935
Reported-by: Johannes Hirte <[email protected]>
Reported-by: Marius Zachmann <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

libbpf: Fix build on ppc64le architecture

On ppc64le we get the following warning:

  In file included from btf_dump.c:16:0:
  btf_dump.c: In function ‘btf_dump_emit_struct_def’:
  ../include/linux/kernel.h:20:17: error: comparison of distinct pointer types lacks a cast [-Werror]
    (void) (&_max1 == &_max2);  \
                   ^
  btf_dump.c:882:11: note: in expansion of macro ‘max’
      m_sz = max(0LL, btf__resolve_size(d->btf, m->type));
             ^~~

Fix by explicitly casting to __s64, which is a return type from
btf__resolve_size().

Fixes: 702eddc77a90 ("libbpf: Handle GCC built-in types for Arm NEON")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

dt-bindings: Use Shawn Guo's preferred e-mail for i.MX bindings

Use Shawn Guo's kernel.org address for the i.MX related bindings
as per the MAINTAINERS entries.

Signed-off-by: Fabio Estevam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>

RDMA/usnic: Fix spelling mistake "transistion" -> "transition"

There is a spelling mistake in a usnic_err error message. Fix it.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

RDMA/hns: Fix spelling mistake "epmty" -> "empty"

There is a spelling mistake in a dev_dbg message. Fix it.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

drm/msm: add shutdown support for display platform_driver

Define shutdown callback for display drm driver,
so as to disable all the CRTCS when shutdown
notification is received by the driver.

This change will turn off the timing engine so
that no display transactions are requested
while mmu translations are getting disabled
during reboot sequence.

Signed-off-by: Krishna Manikandan <[email protected]>
Changes in v2:
- Remove NULL check from msm_pdev_shutdown (Stephen Boyd)
- Change commit text to reflect when this issue
was uncovered (Sai Prakash Ranjan)

Signed-off-by: Rob Clark <[email protected]>

bfq: fix blkio cgroup leakage v4

Changes from v1:
    - update commit description with proper ref-accounting justification

commit db37a34c563b ("block, bfq: get a ref to a group when adding it to a service tree")
introduce leak forbfq_group and blkcg_gq objects because of get/put
imbalance.
In fact whole idea of original commit is wrong because bfq_group entity
can not dissapear under us because it is referenced by child bfq_queue's
entities from here:
-> bfq_init_entity()
    ->bfqg_and_blkg_get(bfqg);
    ->entity->parent = bfqg->my_entity

-> bfq_put_queue(bfqq)
    FINAL_PUT
    ->bfqg_and_blkg_put(bfqq_group(bfqq))
    ->kmem_cache_free(bfq_pool, bfqq);

So parent entity can not disappear while child entity is in tree,
and child entities already has proper protection.
This patch revert commit db37a34c563b ("block, bfq: get a ref to a group when adding it to a service tree")

bfq_group leak trace caused by bad commit:
-> blkg_alloc
   -> bfq_pq_alloc
     -> bfqg_get (+1)
->bfq_activate_bfqq
  ->bfq_activate_requeue_entity
    -> __bfq_activate_entity
       ->bfq_get_entity
         ->bfqg_and_blkg_get (+1)  <==== : Note1
->bfq_del_bfqq_busy
  ->bfq_deactivate_entity+0x53/0xc0 [bfq]
    ->__bfq_deactivate_entity+0x1b8/0x210 [bfq]
      -> bfq_forget_entity(is_in_service = true)
entity->on_st_or_in_serv = false   <=== :Note2
if (is_in_service)
     return;  ==> do not touch reference
-> blkcg_css_offline
-> blkcg_destroy_blkgs
  -> blkg_destroy
   -> bfq_pd_offline
    -> __bfq_deactivate_entity
         if (!entity->on_st_or_in_serv) /* true, because (Note2)
return false;
-> bfq_pd_free
    -> bfqg_put() (-1, byt bfqg->ref == 2) because of (Note2)
So bfq_group and blkcg_gq  will leak forever, see test-case below.

##TESTCASE_BEGIN:
#!/bin/bash

max_iters=${1:-100}
#prep cgroup mounts
mount -t tmpfs cgroup_root /sys/fs/cgroup
mkdir /sys/fs/cgroup/blkio
mount -t cgroup -o blkio none /sys/fs/cgroup/blkio

# Prepare blkdev
grep blkio /proc/cgroups
truncate -s 1M img
losetup /dev/loop0 img
echo bfq > /sys/block/loop0/queue/scheduler

grep blkio /proc/cgroups
for ((i=0;i<max_iters;i++))
do
    mkdir -p /sys/fs/cgroup/blkio/a
    echo 0 > /sys/fs/cgroup/blkio/a/cgroup.procs
    dd if=/dev/loop0 bs=4k count=1 of=/dev/null iflag=direct 2> /dev/null
    echo 0 > /sys/fs/cgroup/blkio/cgroup.procs
    rmdir /sys/fs/cgroup/blkio/a
    grep blkio /proc/cgroups
done
##TESTCASE_END:

Fixes: db37a34c563b ("block, bfq: get a ref to a group when adding it to a service tree")
Tested-by: Oleksandr Natalenko <[email protected]>
Signed-off-by: Dmitry Monakhov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

EDAC/{i7core,sb,pnd2,skx}: Fix error event severity

IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto
the stack as part of machine check delivery is valid or not.

Various drivers copied a code fragment that uses the RIPV bit to
determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED
or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where
RIPV is set as "FATAL").

Reverse the tests so that the error is marked fatal when RIPV is not set.

Reported-by: Gabriele Paoloni <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

mei: hdcp: fix mei_hdcp_verify_mprime() input parameter

wired_cmd_repeater_auth_stream_req_in has a variable
length array at the end. we use struct_size() overflow
macro to determine the size for the allocation and sending
size.
This also fixes bug in case number of streams is > 0 in the original
submission. This bug was not triggered as the number of streams is
always one.

Fixes: c56967d674e3 (mei: hdcp: Replace one-element array with flexible-array member)
Fixes: 0a1af1b5c18d (misc/mei/hdcp: Verify M_prime)
Cc: <[email protected]> # v5.1+: c56967d674e3 (mei: hdcp: Replace one-element array with flexible-array member)
Signed-off-by: Tomas Winkler <[email protected]>
Reviewed-by: Gustavo A. R. Silva <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

tty: serial: imx: add dependence and build for earlycon

Add the earlycon dependence and add earlycon Makefile support
to allow to build the driver.

Fixes: 699cc4dfd140 ("tty: serial: imx: add imx earlycon driver")
Signed-off-by: Fugang Duan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: samsung: Removes the IRQ not found warning

In few older Samsung SoCs like s3c2410, s3c2412
and s3c2440, UART IP is having 2 interrupt lines.
However, in other SoCs like s3c6400, s5pv210,
exynos5433, and exynos4210 UART is having only 1
interrupt line. Due to this, "platform_get_irq(platdev, 1)"
call in the driver gives the following false-positive error:
"IRQ index 1 not found" on newer SoC's.

This patch adds the condition to check for Tx interrupt
only for the those SoC's which have 2 interrupt lines.

Tested-by: Alim Akhtar <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Alim Akhtar <[email protected]>
Signed-off-by: Tamseel Shams <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: 8250: change lock order in serial8250_do_startup()

We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
lockdep reports coming from 8250 driver; this causes a bit of trouble
to people, so let's fix it.

The problem is reverse lock order in two different call paths:

chain #1:

serial8250_do_startup()
  spin_lock_irqsave(&port->lock);
   disable_irq_nosync(port->irq);
    raw_spin_lock_irqsave(&desc->lock)

chain #2:

  __report_bad_irq()
   raw_spin_lock_irqsave(&desc->lock)
    for_each_action_of_desc()
     printk()
      spin_lock_irqsave(&port->lock);

Fix this by changing the order of locks in serial8250_do_startup():
do disable_irq_nosync() first, which grabs desc->lock, and grab
uart->port after that, so that chain #1 and chain #2 have same lock
order.

Full lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
5.4.39 #55 Not tainted
======================================================

swapper/0/0 is trying to acquire lock:
ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57

but task is already holding lock:
ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&irq_desc_lock_class){-.-.}:
        _raw_spin_lock_irqsave+0x61/0x8d
        __irq_get_desc_lock+0x65/0x89
        __disable_irq_nosync+0x3b/0x93
        serial8250_do_startup+0x451/0x75c
        uart_startup+0x1b4/0x2ff
        uart_port_activate+0x73/0xa0
        tty_port_open+0xae/0x10a
        uart_open+0x1b/0x26
        tty_open+0x24d/0x3a0
        chrdev_open+0xd5/0x1cc
        do_dentry_open+0x299/0x3c8
        path_openat+0x434/0x1100
        do_filp_open+0x9b/0x10a
        do_sys_open+0x15f/0x3d7
        kernel_init_freeable+0x157/0x1dd
        kernel_init+0xe/0x105
        ret_from_fork+0x27/0x50

-> #1 (&port_lock_key){-.-.}:
        _raw_spin_lock_irqsave+0x61/0x8d
        serial8250_console_write+0xa7/0x2a0
        console_unlock+0x3b7/0x528
        vprintk_emit+0x111/0x17f
        printk+0x59/0x73
        register_console+0x336/0x3a4
        uart_add_one_port+0x51b/0x5be
        serial8250_register_8250_port+0x454/0x55e
        dw8250_probe+0x4dc/0x5b9
        platform_drv_probe+0x67/0x8b
        really_probe+0x14a/0x422
        driver_probe_device+0x66/0x130
        device_driver_attach+0x42/0x5b
        __driver_attach+0xca/0x139
        bus_for_each_dev+0x97/0xc9
        bus_add_driver+0x12b/0x228
        driver_register+0x64/0xed
        do_one_initcall+0x20c/0x4a6
        do_initcall_level+0xb5/0xc5
        do_basic_setup+0x4c/0x58
        kernel_init_freeable+0x13f/0x1dd
        kernel_init+0xe/0x105
        ret_from_fork+0x27/0x50

-> #0 (console_owner){-...}:
        __lock_acquire+0x118d/0x2714
        lock_acquire+0x203/0x258
        console_lock_spinning_enable+0x51/0x57
        console_unlock+0x25d/0x528
        vprintk_emit+0x111/0x17f
        printk+0x59/0x73
        __report_bad_irq+0xa3/0xba
        note_interrupt+0x19a/0x1d6
        handle_irq_event_percpu+0x57/0x79
        handle_irq_event+0x36/0x55
        handle_fasteoi_irq+0xc2/0x18a
        do_IRQ+0xb3/0x157
        ret_from_intr+0x0/0x1d
        cpuidle_enter_state+0x12f/0x1fd
        cpuidle_enter+0x2e/0x3d
        do_idle+0x1ce/0x2ce
        cpu_startup_entry+0x1d/0x1f
        start_kernel+0x406/0x46a
        secondary_startup_64+0xa4/0xb0

other info that might help us debug this:

Chain exists of:
   console_owner --> &port_lock_key --> &irq_desc_lock_class

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&irq_desc_lock_class);
                                lock(&port_lock_key);
                                lock(&irq_desc_lock_class);
   lock(console_owner);

  *** DEADLOCK ***

2 locks held by swapper/0/0:
  #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
  #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
Hardware name: XXXXXX
Call Trace:
  <IRQ>
  dump_stack+0xbf/0x133
  ? print_circular_bug+0xd6/0xe9
  check_noncircular+0x1b9/0x1c3
  __lock_acquire+0x118d/0x2714
  lock_acquire+0x203/0x258
  ? console_lock_spinning_enable+0x31/0x57
  console_lock_spinning_enable+0x51/0x57
  ? console_lock_spinning_enable+0x31/0x57
  console_unlock+0x25d/0x528
  ? console_trylock+0x18/0x4e
  vprintk_emit+0x111/0x17f
  ? lock_acquire+0x203/0x258
  printk+0x59/0x73
  __report_bad_irq+0xa3/0xba
  note_interrupt+0x19a/0x1d6
  handle_irq_event_percpu+0x57/0x79
  handle_irq_event+0x36/0x55
  handle_fasteoi_irq+0xc2/0x18a
  do_IRQ+0xb3/0x157
  common_interrupt+0xf/0xf
  </IRQ>

Signed-off-by: Sergey Senozhatsky <[email protected]>
Fixes: 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
Reported-by: Guenter Roeck <[email protected]>
Reported-by: Raul Rangel <[email protected]>
BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
Reviewed-by: Andy Shevchenko <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: stm32: avoid kernel warning on absence of optional IRQ

stm32_init_port() of the stm32-usart may trigger a warning in
platform_get_irq() when the device tree specifies no wakeup interrupt.

The wakeup interrupt is usually a board-specific GPIO and the driver
functions correctly in its absence. The mainline stm32mp151.dtsi does
not specify it, so all mainline device trees trigger an unnecessary
kernel warning. Use of platform_get_irq_optional() avoids this.

Fixes: 2c58e56096dd ("serial: stm32: fix the get_irq error case")
Signed-off-by: Holger Assmann <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: pl011: Fix oops on -EPROBE_DEFER

If probing of a pl011 gets deferred until after free_initmem(), an oops
ensues because pl011_console_match() is called which has been freed.

Fix by removing the __init attribute from the function and those it
calls.

Commit 10879ae5f12e ("serial: pl011: add console matching function")
introduced pl011_console_match() not just for early consoles but
regular preferred consoles, such as those added by acpi_parse_spcr().
Regular consoles may be registered after free_initmem() for various
reasons, one being deferred probing, another being dynamic enablement
of serial ports using a DeviceTree overlay.

Thus, pl011_console_match() must not be declared __init and the
functions it calls mustn't either.

Stack trace for posterity:

Unable to handle kernel paging request at virtual address 80c38b58
Internal error: Oops: 8000000d [#1] PREEMPT SMP ARM
PC is at pl011_console_match+0x0/0xfc
LR is at register_console+0x150/0x468
[<80187004>] (register_console)
[<805a8184>] (uart_add_one_port)
[<805b2b68>] (pl011_register_port)
[<805b3ce4>] (pl011_probe)
[<80569214>] (amba_probe)
[<805ca088>] (really_probe)
[<805ca2ec>] (driver_probe_device)
[<805ca5b0>] (__device_attach_driver)
[<805c8060>] (bus_for_each_drv)
[<805c9dfc>] (__device_attach)
[<805ca630>] (device_initial_probe)
[<805c90a8>] (bus_probe_device)
[<805c95a8>] (deferred_probe_work_func)

Fixes: 10879ae5f12e ("serial: pl011: add console matching function")
Signed-off-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v4.10+
Cc: Aleksey Makarov <[email protected]>
Cc: Peter Hurley <[email protected]>
Cc: Russell King <[email protected]>
Cc: Christopher Covington <[email protected]>
Link: https://lore.kernel.org/r/f827ff09da55b8c57d316a1b008a137677b58921.1597315557.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: pl011: Don't leak amba_ports entry on driver register error

pl011_probe() calls pl011_setup_port() to reserve an amba_ports[] entry,
then calls pl011_register_port() to register the uart driver with the
tty layer.

If registration of the uart driver fails, the amba_ports[] entry is not
released.  If this happens 14 times (value of UART_NR macro), then all
amba_ports[] entries will have been leaked and driver probing is no
longer possible.  (To be fair, that can only happen if the DeviceTree
doesn't contain alias IDs since they cause the same entry to be used for
a given port.)   Fix it.

Fixes: ef2889f7ffee ("serial: pl011: Move uart_register_driver call to device")
Signed-off-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v3.15+
Cc: Tushar Behera <[email protected]>
Link: https://lore.kernel.org/r/138f8c15afb2f184d8102583f8301575566064a6.1597316167.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: 8250_exar: Fix number of ports for Commtech PCIe cards

The following in 8250_exar.c line 589 is used to determine the number
of ports for each Exar board:

nr_ports = board->num_ports ? board->num_ports : pcidev->device & 0x0f;

If the number of ports a card has is not explicitly specified, it defaults
to the rightmost 4 bits of the PCI device ID. This is prone to error since
not all PCI device IDs contain a number which corresponds to the number of
ports that card provides.

This particular case involves COMMTECH_4222PCIE, COMMTECH_4224PCIE and
COMMTECH_4228PCIE cards with device IDs 0x0022, 0x0020 and 0x0021.
Currently the multiport cards receive 2, 0 and 1 port instead of 2, 4 and
8 ports respectively.

To fix this, each Commtech Fastcom PCIe card is given a struct where the
number of ports is explicitly specified. This ensures 'board->num_ports'
is used instead of the default 'pcidev->device & 0x0f'.

Fixes: d0aeaa83f0b0 ("serial: exar: split out the exar code from 8250_pci")
Signed-off-by: Valmer Huhn <[email protected]>
Tested-by: Valmer Huhn <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

tty: serial: qcom_geni_serial: Drop __init from qcom_geni_console_setup

When booting with heavily modularized config, the serial console
may not be able to load until after init when modules that
satisfy needed dependencies have time to load.

Unfortunately, as qcom_geni_console_setup is marked as __init,
the function may have been freed before we get to run it,
causing boot time crashes such as:

[    6.469057] Unable to handle kernel paging request at virtual address ffffffe645d4e6cc
[    6.481623] Mem abort info:
[    6.484466]   ESR = 0x86000007
[    6.487557]   EC = 0x21: IABT (current EL), IL = 32 bits
[    6.492929]   SET = 0, FnV = 0g
[    6.496016]   EA = 0, S1PTW = 0
[    6.499202] swapper pgtable: 4k pages, 39-bit VAs, pgdp=000000008151e000
[    6.501286] ufshcd-qcom 1d84000.ufshc: ufshcd_print_pwr_info:[RX, TX]: gear=[3, 3], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
[    6.505977] [ffffffe645d4e6cc] pgd=000000017df9f003, p4d=000000017df9f003, pud=000000017df9f003, pmd=000000017df9c003, pte=0000000000000000
[    6.505990] Internal error: Oops: 86000007 [#1] PREEMPT SMP
[    6.505995] Modules linked in: zl10353 zl10039 zl10036 zd1301_demod xc5000 xc4000 ves1x93 ves1820 tuner_xc2028 tuner_simple tuner_types tua9001 tua6100 1
[    6.506152]  isl6405
[    6.518104] ufshcd-qcom 1d84000.ufshc: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0
[    6.530549]  horus3a helene fc2580 fc0013 fc0012 fc0011 ec100 e4000 dvb_pll ds3000 drxk drxd drx39xyj dib9000 dib8000 dib7000p dib7000m dib3000mc dibx003
[    6.624271] CPU: 7 PID: 148 Comm: kworker/7:2 Tainted: G        W       5.8.0-mainline-12021-g6defd37ba1cd #3455
[    6.624273] Hardware name: Thundercomm Dragonboard 845c (DT)
[    6.624290] Workqueue: events deferred_probe_work_func
[    6.624296] pstate: 40c00005 (nZcv daif +PAN +UAO BTYPE=--)
[    6.624307] pc : qcom_geni_console_setup+0x0/0x110
[    6.624316] lr : try_enable_new_console+0xa0/0x140
[    6.624318] sp : ffffffc010843a30
[    6.624320] x29: ffffffc010843a30 x28: ffffffe645c3e7d0
[    6.624325] x27: ffffff80f8022180 x26: ffffffc010843b28
[    6.637937] x25: 0000000000000000 x24: ffffffe6462a2000
[    6.637941] x23: ffffffe646398000 x22: 0000000000000000
[    6.637945] x21: 0000000000000000 x20: ffffffe6462a5ce8
[    6.637952] x19: ffffffe646398e38 x18: ffffffffffffffff
[    6.680296] x17: 0000000000000000 x16: ffffffe64492b900
[    6.680300] x15: ffffffe6461e9d08 x14: 69202930203d2064
[    6.680305] x13: 7561625f65736162 x12: 202c363331203d20
[    6.696434] x11: 0000000000000030 x10: 0101010101010101
[    6.696438] x9 : 4d4d20746120304d x8 : 7f7f7f7f7f7f7f7f
[    6.707249] x7 : feff4c524c787373 x6 : 0000000000008080
[    6.707253] x5 : 0000000000000000 x4 : 8080000000000000
[    6.707257] x3 : 0000000000000000 x2 : ffffffe645d4e6cc
[    6.744223] qcom_geni_serial 898000.serial: dev_pm_opp_set_rate: failed to find OPP for freq 102400000 (-34)
[    6.744966] x1 : fffffffefe74e174 x0 : ffffffe6462a5ce8
[    6.753580] qcom_geni_serial 898000.serial: dev_pm_opp_set_rate: failed to find OPP for freq 102400000 (-34)
[    6.761634] Call trace:
[    6.761639]  qcom_geni_console_setup+0x0/0x110
[    6.761645]  register_console+0x29c/0x2f8
[    6.767981] Bluetooth: hci0: Frame reassembly failed (-84)
[    6.775252]  uart_add_one_port+0x438/0x500
[    6.775258]  qcom_geni_serial_probe+0x2c4/0x4a8
[    6.775266]  platform_drv_probe+0x58/0xa8
[    6.855359]  really_probe+0xec/0x398
[    6.855362]  driver_probe_device+0x5c/0xb8
[    6.855367]  __device_attach_driver+0x98/0xb8
[    7.184945]  bus_for_each_drv+0x74/0xd8
[    7.188825]  __device_attach+0xec/0x148
[    7.192705]  device_initial_probe+0x24/0x30
[    7.196937]  bus_probe_device+0x9c/0xa8
[    7.200816]  deferred_probe_work_func+0x7c/0xb8
[    7.205398]  process_one_work+0x20c/0x4b0
[    7.209456]  worker_thread+0x48/0x460
[    7.213157]  kthread+0x14c/0x158
[    7.216432]  ret_from_fork+0x10/0x18
[    7.220049] Code: bad PC value
[    7.223139] ---[ end trace 73f3b21e251d5a70 ]---

Thus this patch removes the __init avoiding crash in such
configs.

Cc: Andy Gross <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Saravana Kannan <[email protected]>
Cc: Todd Kjos <[email protected]>
Cc: Amit Pundir <[email protected]>
Cc: [email protected]
Cc: [email protected]
Suggested-by: Saravana Kannan <[email protected]>
Signed-off-by: John Stultz <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: qcom_geni_serial: Fix recent kdb hang

The commit e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work
even if UART isn't console") worked pretty well and I've been doing a
lot of debugging with it.  However, recently I typed "dmesg" in kdb
and then held the space key down to scroll through the pagination.  My
device hung.  This was repeatable and I found that it was introduced
with the aforementioned commit.

It turns out that there are some strange boundary cases in geni where
in some weird situations it will signal RX_LAST but then will put 0 in
RX_LAST_BYTE.  This means that the entire last FIFO entry is valid.
This weird corner case is handled in qcom_geni_serial_handle_rx()
where you can see that we only honor RX_LAST_BYTE if RX_LAST is set
_and_ RX_LAST_BYTE is non-zero.  If either of these is not true we use
BYTES_PER_FIFO_WORD (4) for the size of the last FIFO word.

Let's fix kgdb.  While at it, also use the proper #define for 4.

Fixes: e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console")
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Akash Asthana <[email protected]>
Link: https://lore.kernel.org/r/20200806221904.1.I4455ff86f0ef5281c2a0cd0a4712db614548a5ca@changeid
Signed-off-by: Greg Kroah-Hartman <[email protected]>

vt_ioctl: change VT_RESIZEX ioctl to check for error return from vc_resize()

vc_resize() can return with an error after failure. Change VT_RESIZEX ioctl
to save struct vc_data values that are modified and restore the original
values in case of error.

Signed-off-by: George Kennedy <[email protected]>
Cc: stable <[email protected]>
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

fbcon: prevent user font height or width change from causing potential out-of-bounds access

Add a check to fbcon_resize() to ensure that a possible change to user font
height or user font width will not allow a font data out-of-bounds access.
NOTE: must use original charcount in calculation as font charcount can
change and cannot be used to determine the font data allocated size.

Signed-off-by: George Kennedy <[email protected]>
Cc: stable <[email protected]>
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

vt: defer kfree() of vc_screenbuf in vc_do_resize()

syzbot is reporting UAF bug in set_origin() from vc_do_resize() [1], for
vc_do_resize() calls kfree(vc->vc_screenbuf) before calling set_origin().

Unfortunately, in set_origin(), vc->vc_sw->con_set_origin() might access
vc->vc_pos when scroll is involved in order to manipulate cursor, but
vc->vc_pos refers already released vc->vc_screenbuf until vc->vc_pos gets
updated based on the result of vc->vc_sw->con_set_origin().

Preserving old buffer and tolerating outdated vc members until set_origin()
completes would be easier than preventing vc->vc_sw->con_set_origin() from
accessing outdated vc members.

[1] https://syzkaller.appspot.com/bug?id=6649da2081e2ebdc65c0642c214b27fe91099db3

Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/1596034621-4714-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <[email protected]>

kconfig: qconf: remove unused colNr

This is not used at all.

Signed-off-by: Masahiro Yamada <[email protected]>

kconfig: qconf: fix the popup menu in the ConfigInfoView window

I do not know when ConfigInfoView::createStandardContextMenu() is
called.

Because QTextEdit::createStandardContextMenu() is not virtual,
ConfigInfoView::createStandardContextMenu() cannot override it.
Even if right-click the ConfigInfoView window, the "Show Debug Info"
menu does not show up.

Build up the menu in the constructor, and invoke it from the
contextMenuEvent().

Signed-off-by: Masahiro Yamada <[email protected]>

kconfig: qconf: fix signal connection to invalid slots

If you right-click in the ConfigList window, you will see the following
messages in the console:

QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:888
QObject::connect:  (sender name:   'config')
QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:897
QObject::connect:  (sender name:   'config')
QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:906
QObject::connect:  (sender name:   'config')

Right, there is no such slot in QAction. I think this is a typo of
setChecked.

Due to this bug, when you toggled the menu "Option->Show Name/Range/Data"
the state of the context menu was not previously updated. Fix this.

Fixes: d5d973c3f8a9 ("Port xconfig to Qt5 - Put back some of the old implementation(part 2)")
Signed-off-by: Masahiro Yamada <[email protected]>

genksyms: keywords: Use __restrict not _restrict

Use the proper form of the RESTRICT keyword.

Quote the comments properly too.

Signed-off-by: Joe Perches <[email protected]>
Acked-by: Nick Desaulniers <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove redundant patterns in filter/filter-out

The '%' in filter/filter-out matches to any number of any characters,
including empty string.

So, '%config' matches to 'config', and '%install' to 'install'.

Drop the redundant patterns.

Signed-off-by: Masahiro Yamada <[email protected]>

extract-cert: add static to local data

Fix the following warning from sparse:

scripts/extract-cert.c:74:5: warning: symbol 'kbuild_verbose' was not declared. Should it be static?

Signed-off-by: Masahiro Yamada <[email protected]>

speakup: only build serialio when ISA is enabled

Drivers using serialio were already made available in Kconfig only under
the ISA condition.

Signed-off-by: Samuel Thibault <[email protected]>
Link: https://lore.kernel.org/r/20200804160659.7y76sdseow43lfms@function
Signed-off-by: Greg Kroah-Hartman <[email protected]>

speakup: Fix wait_for_xmitr for ttyio case

This was missed while introducing the tty-based serial access.

The only remaining use of wait_for_xmitr with tty-based access is in
spk_synth_is_alive_restart to check whether the synth can be restarted.
With tty-based this is up to the tty layer to cope with the buffering
etc. so we can just say yes.

Signed-off-by: Samuel Thibault <[email protected]>
Link: https://lore.kernel.org/r/20200804160637.x3iycau5izywbgzl@function
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: Fix device driver race

When a new device with a specialised device driver is plugged in, the
new driver will be modprobe()'d but the driver core will attach the
"generic" driver to the device.

After that, nothing will trigger a reprobe when the modprobe()'d device
driver has finished initialising, as the device has the "generic"
driver attached to it.

Trigger a reprobe ourselves when new specialised drivers get registered.

Fixes: 88b7381a939d ("USB: Select better matching USB drivers when available")
Signed-off-by: Bastien Nocera <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: Also match device drivers using the ->match vfunc

We only ever used the ID table matching before, but we should also support
open-coded match functions.

Fixes: 88b7381a939de ("USB: Select better matching USB drivers when available")
Signed-off-by: Bastien Nocera <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: host: xhci-tegra: fix tegra_xusb_get_phy()

tegra_xusb_get_phy() should take input argument "name".

Signed-off-by: JC Kuo <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: host: xhci-tegra: otg usb2/usb3 port init

tegra_xusb_init_usb_phy() should initialize "otg_usb2_port" and
"otg_usb3_port" with -EINVAL because "0" is a valid value
represents usb2 port 0 or usb3 port 0.

Signed-off-by: JC Kuo <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: hcd: Fix use after free in usb_hcd_pci_remove()

On the removal stage we put a reference to the controller structure and
if it's not used anymore it gets freed, but later we try to dereference
a pointer to a member of that structure.

Copy necessary field to a temporary variable to avoid use after free.

Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
Reported-by: John Garry <[email protected]>
Link: https://lore.kernel.org/linux-usb/[email protected]/
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: ucsi: Hold con->lock for the entire duration of ucsi_register_port()

Commit 081da1325d35 ("usb: typec: ucsi: displayport: Fix a potential race
during registration") made the ucsi code hold con->lock in
ucsi_register_displayport(). But we really don't want any interactions
with the connector to run before the port-registration process is fully
complete.

This commit moves the taking of con->lock from ucsi_register_displayport()
into ucsi_register_port() to achieve this.

Cc: [email protected]
Fixes: 081da1325d35 ("usb: typec: ucsi: displayport: Fix a potential race during registration")
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Heikki Krogerus <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: ucsi: Rework ppm_lock handling

The ppm_lock really only needs to be hold during 2 functions:
ucsi_reset_ppm() and ucsi_run_command().

Push the taking of the lock down into these 2 functions, renaming
ucsi_run_command() to ucsi_send_command() which was an existing
wrapper already taking the lock for its callers.

This simplifies things for the callers and removes the difference
between ucsi_send_command() and ucsi_run_command() which has led
to various locking bugs in the past.

Cc: [email protected]
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Heikki Krogerus <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: ucsi: Fix 2 unlocked ucsi_run_command calls

Fix 2 unlocked ucsi_run_command calls:

1. ucsi_handle_connector_change() contains one ucsi_send_command() call,
which takes the ppm_lock for it; and one ucsi_run_command() call which
relies on the caller have taking the ppm_lock.
ucsi_handle_connector_change() does not take the lock, so the
second (ucsi_run_command) calls should also be ucsi_send_command().

2. ucsi_get_pdos() gets called from ucsi_handle_connector_change() which
does not hold the ppm_lock, so it also must use ucsi_send_command().

This commit also adds a WARN_ON(!mutex_is_locked(&ucsi->ppm_lock)); to
ucsi_run_command() to avoid similar problems getting re-introduced in
the future.

Cc: [email protected]
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Heikki Krogerus <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: ucsi: Fix AB BA lock inversion

Lockdep reports an AB BA lock inversion between ucsi_init() and
ucsi_handle_connector_change():

AB order:

1. ucsi_init takes ucsi->ppm_lock (it runs with that locked for the
   duration of the function)
2. usci_init eventually end up calling ucsi_register_displayport,
   which takes ucsi_connector->lock

BA order:

1. ucsi_handle_connector_change work is started, takes ucsi_connector->lock
2. ucsi_handle_connector_change calls ucsi_send_command which takes
   ucsi->ppm_lock

The ppm_lock really only needs to be hold during 2 functions:
ucsi_reset_ppm() and ucsi_run_command().

This commit fixes the AB BA lock inversion by making ucsi_init drop the
ucsi->ppm_lock before it starts registering ports; and replacing any
ucsi_run_command() calls after this point with ucsi_send_command()
(which is a wrapper around run_command taking the lock while handling
the command).

Some of the replacing of ucsi_run_command with ucsi_send_command
in the helpers used during port registration also fixes a number of
code paths after registration which call ucsi_run_command() without
holding the ppm_lock:
1. ucsi_altmode_update_active() call in ucsi/displayport.c
2. ucsi_register_altmodes() call from ucsi_handle_connector_change()
   (through ucsi_partner_change())

Cc: [email protected]
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Heikki Krogerus <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usbip: Implement a match function to fix usbip

Commit 88b7381a939d ("USB: Select better matching USB drivers when
available") introduced the use of a "match" function to select a
non-generic/better driver for a particular USB device. This
unfortunately breaks the operation of usbip in general, as reported in
the kernel bugzilla with bug 208267 (linked below).

Upon inspecting the aforementioned commit, one can observe that the
original code in the usb_device_match function used to return 1
unconditionally, but the aforementioned commit makes the usb_device_match
function use identifier tables and "match" virtual functions, if either of
them are available.

Hence, this commit implements a match function for usbip that
unconditionally returns true to ensure that usbip is functional again.

This change has been verified to restore usbip functionality, with a
v5.7.y kernel on an up-to-date version of Qubes OS 4.0, which uses
usbip to redirect USB devices between VMs.

Thanks to Jonathan Dieter for the effort in bisecting this issue down
to the aforementioned commit.

Fixes: 88b7381a939d ("USB: Select better matching USB drivers when available")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208267
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1856443
Link: https://github.com/QubesOS/qubes-issues/issues/5905
Signed-off-by: M. Vefa Bicakci <[email protected]>
Cc: <[email protected]> # 5.7
Cc: Valentina Manea <[email protected]>
Cc: Alan Stern <[email protected]>
Reviewed-by: Bastien Nocera <[email protected]>
Reviewed-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: renesas-xhci: remove version check

Some devices in wild are reporting bunch of firmware versions, so remove
the check for versions in driver

Reported by: Anastasios Vacharakis <[email protected]>
Reported by: Glen Journeay <[email protected]>
Fixes: 2478be82de44 ("usb: renesas-xhci: Add ROM loader for uPD720201")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208911
Signed-off-by: Vinod Koul <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: lvtest: return proper error code in probe

lvs_rh_probe() can return some nonnegative value from usb_control_msg()
when it is less than "USB_DT_HUB_NONVAR_SIZE + 2" that is considered as
a failure. Make lvs_rh_probe() return -EINVAL in this case.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Evgeny Novikov <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: cdc-acm: rework notification_buffer resizing

Clang static analysis reports this error

cdc-acm.c:409:3: warning: Use of memory after it is freed
        acm_process_notification(acm, (unsigned char *)dr);

There are three problems, the first one is that dr is not reset

The variable dr is set with

if (acm->nb_index)
dr = (struct usb_cdc_notification *)acm->notification_buffer;

But if the notification_buffer is too small it is resized with

if (acm->nb_size) {
kfree(acm->notification_buffer);
acm->nb_size = 0;
}
alloc_size = roundup_pow_of_two(expected_size);
/*
* kmalloc ensures a valid notification_buffer after a
* use of kfree in case the previous allocation was too
* small. Final freeing is done on disconnect.
*/
acm->notification_buffer =
kmalloc(alloc_size, GFP_ATOMIC);

dr should point to the new acm->notification_buffer.

The second problem is any data in the notification_buffer is lost
when the pointer is freed.  In the normal case, the current data
is accumulated in the notification_buffer here.

memcpy(&acm->notification_buffer[acm->nb_index],
       urb->transfer_buffer, copy_size);

When a resize happens, anything before
notification_buffer[acm->nb_index] is garbage.

The third problem is the acm->nb_index is not reset on a
resizing buffer error.

So switch resizing to using krealloc and reassign dr and
reset nb_index.

Fixes: ea2583529cd1 ("cdc-acm: reassemble fragmented notifications")
Signed-off-by: Tom Rix <[email protected]>
Cc: stable <[email protected]>
Acked-by: Oliver Neukum <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: quirks: Add no-lpm quirk for another Raydium touchscreen

There's another Raydium touchscreen needs the no-lpm quirk:
[    1.339149] usb 1-9: New USB device found, idVendor=2386, idProduct=350e, bcdDevice= 0.00
[    1.339150] usb 1-9: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    1.339151] usb 1-9: Product: Raydium Touch System
[    1.339152] usb 1-9: Manufacturer: Raydium Corporation
...
[    6.450497] usb 1-9: can't set config #1, error -110

BugLink: https://bugs.launchpad.net/bugs/1889446
Signed-off-by: Kai-Heng Feng <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: yurex: Fix bad gfp argument

The syzbot fuzzer identified a bug in the yurex driver: It passes
GFP_KERNEL as a memory-allocation flag to usb_submit_urb() at a time
when its state is TASK_INTERRUPTIBLE, not TASK_RUNNING:

do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000370c7c68>] prepare_to_wait+0xb1/0x2a0 kernel/sched/wait.c:247
WARNING: CPU: 1 PID: 340 at kernel/sched/core.c:7253 __might_sleep+0x135/0x190
kernel/sched/core.c:7253
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 340 Comm: syz-executor677 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
panic+0x2aa/0x6e1 kernel/panic.c:231
__warn.cold+0x20/0x50 kernel/panic.c:600
report_bug+0x1bd/0x210 lib/bug.c:198
handle_bug+0x41/0x80 arch/x86/kernel/traps.c:234
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254
asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
RIP: 0010:__might_sleep+0x135/0x190 kernel/sched/core.c:7253
Code: 65 48 8b 1c 25 40 ef 01 00 48 8d 7b 10 48 89 fe 48 c1 ee 03 80 3c 06 00 75
2b 48 8b 73 10 48 c7 c7 e0 9e 06 86 e8 ed 12 f6 ff <0f> 0b e9 46 ff ff ff e8 1f
b2 4b 00 e9 29 ff ff ff e8 15 b2 4b 00
RSP: 0018:ffff8881cdb77a28 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8881c6458000 RCX: 0000000000000000
RDX: ffff8881c6458000 RSI: ffffffff8129ec93 RDI: ffffed1039b6ef37
RBP: ffffffff86fdade2 R08: 0000000000000001 R09: ffff8881db32f54f
R10: 0000000000000000 R11: 0000000030343354 R12: 00000000000001f2
R13: 0000000000000000 R14: 0000000000000068 R15: ffffffff83c1b1aa
slab_pre_alloc_hook.constprop.0+0xea/0x200 mm/slab.h:498
slab_alloc_node mm/slub.c:2816 [inline]
slab_alloc mm/slub.c:2900 [inline]
kmem_cache_alloc_trace+0x46/0x220 mm/slub.c:2917
kmalloc include/linux/slab.h:554 [inline]
dummy_urb_enqueue+0x7a/0x880 drivers/usb/gadget/udc/dummy_hcd.c:1251
usb_hcd_submit_urb+0x2b2/0x22d0 drivers/usb/core/hcd.c:1547
usb_submit_urb+0xb4e/0x13e0 drivers/usb/core/urb.c:570
yurex_write+0x3ea/0x820 drivers/usb/misc/yurex.c:495

This patch changes the call to use GFP_ATOMIC instead of GFP_KERNEL.

Reported-and-tested-by: [email protected]
Signed-off-by: Alan Stern <[email protected]>
CC: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death

For a power9 KVM guest with XIVE enabled, running a test loop
where we hotplug 384 vcpus and then unplug them, the following traces
can be seen (generally within a few loops) either from the unplugged
vcpu:

  cpu 65 (hwid 65) Ready to die...
  Querying DEAD? cpu 66 (66) shows 2
  list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:56!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
  CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
  NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
  REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
  MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000842  XER: 20040000
  ...
  NIP __list_del_entry_valid+0xac/0x100
  LR  __list_del_entry_valid+0xa8/0x100
  Call Trace:
    __list_del_entry_valid+0xa8/0x100 (unreliable)
    free_pcppages_bulk+0x1f8/0x940
    free_unref_page+0xd0/0x100
    xive_spapr_cleanup_queue+0x148/0x1b0
    xive_teardown_cpu+0x1bc/0x240
    pseries_mach_cpu_die+0x78/0x2f0
    cpu_die+0x48/0x70
    arch_cpu_idle_dead+0x20/0x40
    do_idle+0x2f4/0x4c0
    cpu_startup_entry+0x38/0x40
    start_secondary+0x7bc/0x8f0
    start_secondary_prolog+0x10/0x14

or on the worker thread handling the unplug:

  pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2
  BUG: Bad page state in process kworker/u768:3  pfn:95de1
  cpu 314 (hwid 314) Ready to die...
  page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
  flags: 0x5ffffc00000000()
  raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
  page dumped because: nonzero mapcount
  Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
  CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
  Workqueue: pseries hotplug workque pseries_hp_work_fn
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    bad_page+0x12c/0x1b0
    free_pcppages_bulk+0x5bc/0x940
    page_alloc_cpu_dead+0x118/0x120
    cpuhp_invoke_callback.constprop.5+0xb8/0x760
    _cpu_down+0x188/0x340
    cpu_down+0x5c/0xa0
    cpu_subsys_offline+0x24/0x40
    device_offline+0xf0/0x130
    dlpar_offline_cpu+0x1c4/0x2a0
    dlpar_cpu_remove+0xb8/0x190
    dlpar_cpu_remove_by_index+0x12c/0x150
    dlpar_cpu+0x94/0x800
    pseries_hp_work_fn+0x128/0x1e0
    process_one_work+0x304/0x5d0
    worker_thread+0xcc/0x7a0
    kthread+0x1ac/0x1c0
    ret_from_kernel_thread+0x5c/0x80

The latter trace is due to the following sequence:

  page_alloc_cpu_dead
    drain_pages
      drain_pages_zone
        free_pcppages_bulk

where drain_pages() in this case is called under the assumption that
the unplugged cpu is no longer executing. To ensure that is the case,
and early call is made to __cpu_die()->pseries_cpu_die(), which runs a
loop that waits for the cpu to reach a halted state by polling its
status via query-cpu-stopped-state RTAS calls. It only polls for 25
iterations before giving up, however, and in the trace above this
results in the following being printed only .1 seconds after the
hotplug worker thread begins processing the unplug request:

  pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2

At that point the worker thread assumes the unplugged CPU is in some
unknown/dead state and procedes with the cleanup, causing the race
with the XIVE cleanup code executed by the unplugged CPU.

Fix this by waiting indefinitely, but also making an effort to avoid
spurious lockup messages by allowing for rescheduling after polling
the CPU status and printing a warning if we wait for longer than 120s.

Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Tested-by: Greg Kurz <[email protected]>
Reviewed-by: Thiago Jung Bauermann <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
[mpe: Trim oopses in change log slightly for readability]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined

When MODULES_VADDR is defined, is_module_segment() shall check the
address against it instead of checking agains VMALLOC_START.

Fixes: 6ca055322da8 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu

powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32

On BOOK3S_32, when we have modules and strict kernel RWX, modules
are not in vmalloc space but in a dedicated segment that is
below PAGE_OFFSET.

So KASAN_SHADOW_START must take it into account.

MODULES_VADDR can't be used because it is not defined yet
in kasan.h

Fixes: 6ca055322da8 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu

Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe"

FCP T10-PI and NVMe features are independent of each other. This patch
allows both features to co-exist.

This reverts commit 5da05a26b8305a625bc9d537671b981795b46dab.

Link: https://lore.kernel.org/r/[email protected]
Fixes: 5da05a26b830 ("scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe")
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command"

FCoE adapter initialization failed for ISP8021 with the following patch
applied. In addition, reproduction of the issue the patch originally tried
to address has been unsuccessful.

This reverts commit 3cb182b3fa8b7a61f05c671525494697cba39c6a.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Fix null pointer access during disconnect from subsystem

NVMEAsync command is being submitted to QLA while the same NVMe controller
is in the middle of reset. The reset path has deleted the association and
freed aen_op->fcp_req.private. Add a check for this private pointer before
issuing the command.

...
6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e
    [exception RIP: qla_nvme_post_cmd+394]
    RIP: ffffffffc0d012ba  RSP: ffffb656ca11fd98  RFLAGS: 00010206
    RAX: ffff8fb039eda228  RBX: ffff8fb039eda200  RCX: 00000000000da161
    RDX: ffffffffc0d4d0f0  RSI: ffffffffc0d26c9b  RDI: ffff8fb039eda220
    RBP: 0000000000000013   R8: ffff8fb47ff6aa80   R9: 0000000000000002
    R10: 0000000000000000  R11: ffffb656ca11fdc8  R12: ffff8fb27d04a3b0
    R13: ffff8fc46dd98a58  R14: 0000000000000000  R15: ffff8fc4540f0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc]
8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc]
9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core]
10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437
11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef
12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402
13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255

--
PID: 37824  TASK: ffff8fb033063d80  CPU: 20  COMMAND: "kworker/u97:451"
0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3
1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8
2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed
3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf
4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5
5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core]
6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc]
7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437
8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50
9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402
10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Check if FW supports MQ before enabling

OS boot during Boot from SAN was stuck at dracut emergency shell after
enabling NVMe driver parameter. For non-MQ support the driver was enabling
MQ. Add a check to confirm if FW supports MQ.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba

qla_nvme_register_hba() puts out a warning when there are not enough queue
pairs available for FC-NVME. Just fail the NVME registration rather than a
WARNING + call Trace.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Arun Easi <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set anytime

ql2xextended_error_logging can now be set to 1 to get the default mask
value, as opposed to at module load time only.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Arun Easi <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Reduce noisy debug message

Update debug level and message for ELS IOCB done.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Fix login timeout

Multipath errors were seen during failback due to login timeout.  The
remote device sent LOGO, the local host tore down the session and did
relogin. The RSCN arrived indicates remote device is going through failover
after which the relogin is in a 20s timeout phase.  At this point the
driver is stuck in the relogin process.  Add a fix to delete the session as
part of abort/flush the login.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Indicate correct supported speeds for Mezz card

Correct the supported speeds for 16G Mezz card.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Flush I/O on zone disable

Perform implicit logout to flush I/O on zone disable.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: qla2xxx: Flush all sessions on zone disable

On Zone Disable, certain switches would ignore all commands. This causes
timeout for both switch scan command and abort of that command. On
detection of this condition, all sessions will be shutdown.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

block: Fix page_is_mergeable() for compound pages

If we pass in an offset which is larger than PAGE_SIZE, then
page_is_mergeable() thinks it's not mergeable with the previous bio_vec,
leading to a large number of bio_vecs being used. Use a slightly more
obvious test that the two pages are compatible with each other.

Fixes: 52d52d1c98a9 ("block: only allow contiguous page structs in a bio_vec")
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values

Improves readability of qla_mbx.c.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Reviewed-by: Roman Bolshakov <[email protected]>
Signed-off-by: Enzo Matsumiya <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: scsi_debug: Fix scp is NULL errors

John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were
mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The
problem was tracked down to a missing critical section on a "short circuit"
path. Namely, the time to process the current command so far has already
exceeded the requested command duration (i.e. the number of nanoseconds in
the ndelay parameter).

The random=1 parameter setting was pivotal in finding this error. The
failure scenario involved first taking that "short circuit" path (due to a
very short command duration) and then taking the more likely
hrtimer_start() path (due to a longer command duration). With random=1 each
command's duration is taken from the uniformly distributed [0..ndelay)
interval. The fio utility also helped by reliably generating the error
scenario at about once per minute on a RPi 4 (64 bit OS).

Link: https://lore.kernel.org/r/[email protected]
Reported-by: John Garry <[email protected]>
Reviewed-by: Lee Duncan <[email protected]>
Signed-off-by: Douglas Gilbert <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: zfcp: Fix use-after-free in request timeout handlers

Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
timer_setup()"), we intentionally only passed zfcp_adapter as context
argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
adapter recovery, it was unnecessary to sync against races between timeout
and (late) completion.  Likewise, we only passed zfcp_erp_action as context
argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action,
it was unnecessary to sync against races between timeout and (late)
completion.

Meanwhile the timeout handlers get timer_list as context argument and do a
timer-specific container-of to zfcp_fsf_req which can have been freed.

Fix it by making sure that any request timeout handlers, that might just
have started before del_timer(), are completed by using del_timer_sync()
instead. This ensures the request free happens afterwards.

Space time diagram of potential use-after-free:

Basic idea is to have 2 or more pending requests whose timeouts run out at
almost the same time.

req 1 timeout     ERP thread        req 2 timeout
----------------  ----------------  ---------------------------------------
zfcp_fsf_request_timeout_handler
fsf_req = from_timer(fsf_req, t, timer)
adapter = fsf_req->adapter
zfcp_qdio_siosl(adapter)
zfcp_erp_adapter_reopen(adapter,...)
                  zfcp_erp_strategy
                  ...
                  zfcp_fsf_req_dismiss_all
                  list_for_each_entry_safe
                    zfcp_fsf_req_complete 1
                    del_timer 1
                    zfcp_fsf_req_free 1
                    zfcp_fsf_req_complete 2
                                    zfcp_fsf_request_timeout_handler
                    del_timer 2
                                    fsf_req = from_timer(fsf_req, t, timer)
                    zfcp_fsf_req_free 2
                                    adapter = fsf_req->adapter
                                              ^^^^^^^ already freed

Link: https://lore.kernel.org/r/[email protected]
Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
Cc: <[email protected]> #4.15+
Suggested-by: Julian Wiedmann <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Signed-off-by: Steffen Maier <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs: No need to send Abort Task if the task in DB was cleared

If the bit corresponding to a task in the Doorbell register has been
cleared, no need to poll the status of the task on the device side and to
send an Abort Task TM. Instead, let it directly goto cleanup.

In addition, to keep original debug output, move the goto below the debug
print.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Stanley Chu <[email protected]>
Reviewed-by: Can Guo <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs: Clean up completed request without interrupt notification

If somehow no interrupt notification is raised for a completed request and
its doorbell bit is cleared by host, UFS driver needs to cleanup its
outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally
in the following scenario:

After ufshcd_abort() returns, this request will be requeued by SCSI layer
with its outstanding bit set. Any future completed request will trigger
ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At
this time the "abnormal outstanding bit" will be detected and the "requeued
request" will be chosen to execute request post-processing flow. This is
wrong because this request is still "alive".

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Can Guo <[email protected]>
Acked-by: Avri Altman <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Bean Huo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs: Improve interrupt handling for shared interrupts

For shared interrupts, the interrupt status might be zero, so check that
first.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs: Fix interrupt error message for shared interrupts

The interrupt might be shared, in which case it is not an error for the
interrupt handler to be called when the interrupt status is zero, so don't
print the message unless there was enabled interrupt status.

Link: https://lore.kernel.org/r/[email protected]
Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code")
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL

Intel EHL UFS host controller advertises auto-hibernate capability but it
does not work correctly. Add a quirk for that.

[mkp: checkpatch fix]

Link: https://lore.kernel.org/r/[email protected]
Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL")
Acked-by: Stanley Chu <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs-mediatek: Fix incorrect time to wait link status

Fix incorrect calculation of "ms" based waiting time in function
ufs_mtk_setup_clocks().

Link: https://lore.kernel.org/r/[email protected]
Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet")
Reviewed-by: Avri Altman <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufs: Fix possible infinite loop in ufshcd_hold

In ufshcd_suspend(), after clk-gating is suspended and link is set
as Hibern8 state, ufshcd_hold() is still possibly invoked before
ufshcd_suspend() returns. For example, MediaTek's suspend vops may
issue UIC commands which would call ufshcd_hold() during the command
issuing flow.

Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
then ufshcd_hold() may enter infinite loops because there is no
clk-ungating work scheduled or pending. In this case, ufshcd_hold()
shall just bypass, and keep the link as Hibern8 state.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Avri Altman <[email protected]>
Co-developed-by: Andy Teng <[email protected]>
Signed-off-by: Andy Teng <[email protected]>
Signed-off-by: Stanley Chu <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>