Linus Torvalds [Fri, 20 Jan 2023 17:58:44 +0000 (09:58 -0800)]
Merge tag 'net-6.2-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless, bluetooth, bpf and netfilter.
Current release - regressions:
- Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6
addrconf", fix nsna_ping mode of team
- wifi: mt76: fix bugs in Rx queue handling and DMA mapping
- eth: mlx5:
- add missing mutex_unlock in error reporter
- protect global IPsec ASO with a lock
Current release - new code bugs:
- rxrpc: fix wrong error return in rxrpc_connect_call()
Previous releases - regressions:
- bluetooth: hci_sync: fix use of HCI_OP_LE_READ_BUFFER_SIZE_V2
- wifi:
- mac80211: fix crashes on Rx due to incorrect initialization of
rx->link and rx->link_sta
- mac80211: fix bugs in iTXQ conversion - Tx stalls, incorrect
aggregation handling, crashes
- brcmfmac: fix regression for Broadcom PCIe wifi devices
- rndis_wlan: prevent buffer overflow in rndis_query_oid
- netfilter: conntrack: handle tcp challenge acks during connection
reuse
- sched: avoid grafting on htb_destroy_class_offload when destroying
- virtio-net: correctly enable callback during start_xmit, fix stalls
- tcp: avoid the lookup process failing to get sk in ehash table
- ipa: disable ipa interrupt during suspend
- eth: stmmac: enable all safety features by default
Previous releases - always broken:
- bpf:
- fix pointer-leak due to insufficient speculative store bypass
mitigation (Spectre v4)
- skip task with pid=1 in send_signal_common() to avoid a splat
- fix BPF program ID information in BPF_AUDIT_UNLOAD as well as
PERF_BPF_EVENT_PROG_UNLOAD events
- fix potential deadlock in htab_lock_bucket from same bucket
index but different map_locked index
- bluetooth:
- fix a buffer overflow in mgmt_mesh_add()
- hci_qca: fix driver shutdown on closed serdev
- ISO: fix possible circular locking dependency
- CIS: hci_event: fix invalid wait context
- wifi: brcmfmac: fixes for survey dump handling
- mptcp: explicitly specify sock family at subflow creation time
- netfilter: nft_payload: incorrect arithmetics when fetching VLAN
header bits
- tcp: fix rate_app_limited to default to 1
- l2tp: close all race conditions in l2tp_tunnel_register()
- eth: mlx5: fixes for QoS config and eswitch configuration
- eth: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
- eth: stmmac: fix invalid call to mdiobus_get_phy()
Misc:
- ethtool: add netlink attr in rss get reply only if the value is not
empty"
* tag 'net-6.2-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
Revert "Merge branch 'octeontx2-af-CPT'"
tcp: fix rate_app_limited to default to 1
bnxt: Do not read past the end of test names
net: stmmac: enable all safety features by default
octeontx2-af: add mbox to return CPT_AF_FLT_INT info
octeontx2-af: update cpt lf alloc mailbox
octeontx2-af: restore rxc conf after teardown sequence
octeontx2-af: optimize cpt pf identification
octeontx2-af: modify FLR sequence for CPT
octeontx2-af: add mbox for CPT LF reset
octeontx2-af: recover CPT engine when it gets fault
net: dsa: microchip: ksz9477: port map correction in ALU table entry register
selftests/net: toeplitz: fix race on tpacket_v3 block close
net/ulp: use consistent error code when blocking ULP
octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt
tcp: avoid the lookup process failing to get sk in ehash table
Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf"
MAINTAINERS: add networking entries for Willem
net: sched: gred: prevent races when adding offloads to stats
l2tp: prevent lockdep issue in l2tp_tunnel_register()
...
Stefan Assmann [Tue, 10 Jan 2023 08:00:18 +0000 (09:00 +0100)]
iavf: schedule watchdog immediately when changing primary MAC
iavf_replace_primary_mac() utilizes queue_work() to schedule the
watchdog task but that only ensures that the watchdog task is queued
to run. To make sure the watchdog is executed asap use
mod_delayed_work().
Without this patch it may take up to 2s until the watchdog task gets
executed, which may cause long delays when setting the MAC address.
Marcin Szycik [Tue, 3 Jan 2023 16:42:27 +0000 (17:42 +0100)]
iavf: Move netdev_update_features() into watchdog task
Remove netdev_update_features() from iavf_adminq_task(), as it can cause
deadlocks due to needing rtnl_lock. Instead use the
IAVF_FLAG_SETUP_NETDEV_FEATURES flag to indicate that netdev features need
to be updated in the watchdog task. iavf_set_vlan_offload_features()
and iavf_set_queue_vlan_tag_loc() can be called directly from
iavf_virtchnl_completion().
Michal Schmidt [Thu, 15 Dec 2022 22:50:48 +0000 (23:50 +0100)]
iavf: fix temporary deadlock and failure to set MAC address
We are seeing an issue where setting the MAC address on iavf fails with
EAGAIN after the 2.5s timeout expires in iavf_set_mac().
There is the following deadlock scenario:
iavf_set_mac(), holding rtnl_lock, waits on:
iavf_watchdog_task (within iavf_wq) to send a message to the PF,
and
iavf_adminq_task (within iavf_wq) to receive a response from the PF.
In this adapter state (>=__IAVF_DOWN), these tasks do not need to take
rtnl_lock, but iavf_wq is a global single-threaded workqueue, so they
may get stuck waiting for another adapter's iavf_watchdog_task to run
iavf_init_config_adapter(), which does take rtnl_lock.
The deadlock resolves itself by the timeout in iavf_set_mac(),
which results in EAGAIN returned to userspace.
Let's break the deadlock loop by changing iavf_wq into a per-adapter
workqueue, so that one adapter's tasks are not blocked by another's.
Pavel Begunkov [Fri, 20 Jan 2023 16:38:06 +0000 (16:38 +0000)]
io_uring/msg_ring: fix remote queue to disabled ring
IORING_SETUP_R_DISABLED rings don't have the submitter task set, so
it's not always safe to use ->submitter_task. Disallow posting msg_ring
messaged to disabled rings. Also add task NULL check for loosy sync
around testing for IORING_SETUP_R_DISABLED.
Pavel Begunkov [Fri, 20 Jan 2023 16:38:05 +0000 (16:38 +0000)]
io_uring/msg_ring: fix flagging remote execution
There is a couple of problems with queueing a tw in io_msg_ring_data()
for remote execution. First, once we queue it the target ring can
go away and so setting IORING_SQ_TASKRUN there is not safe. Secondly,
the userspace might not expect IORING_SQ_TASKRUN.
Extract a helper and uniformly use TWA_SIGNAL without TWA_SIGNAL_NO_IPI
tricks for now, just as it was done in the original patch.
Yi Liu [Fri, 20 Jan 2023 15:05:28 +0000 (07:05 -0800)]
kvm/vfio: Fix potential deadlock on vfio group_lock
Currently it is possible that the final put of a KVM reference comes from
vfio during its device close operation. This occurs while the vfio group
lock is held; however, if the vfio device is still in the kvm device list,
then the following call chain could result in a deadlock:
VFIO holds group->group_lock/group_rwsem
-> kvm_put_kvm
-> kvm_destroy_vm
-> kvm_destroy_devices
-> kvm_vfio_destroy
-> kvm_vfio_file_set_kvm
-> vfio_file_set_kvm
-> try to hold group->group_lock/group_rwsem
The key function is the kvm_destroy_devices() which triggers destroy cb
of kvm_device_ops. It calls back to vfio and try to hold group_lock. So
if this path doesn't call back to vfio, this dead lock would be fixed.
Actually, there is a way for it. KVM provides another point to free the
kvm-vfio device which is the point when the device file descriptor is
closed. This can be achieved by providing the release cb instead of the
destroy cb. Also rename kvm_vfio_destroy() to be kvm_vfio_release().
/*
* Destroy is responsible for freeing dev.
*
* Destroy may be called before or after destructors are called
* on emulated I/O regions, depending on whether a reference is
* held by a vcpu or other kvm component that gets destroyed
* after the emulated I/O.
*/
void (*destroy)(struct kvm_device *dev);
/*
* Release is an alternative method to free the device. It is
* called when the device file descriptor is closed. Once
* release is called, the destroy method will not be called
* anymore as the device is removed from the device list of
* the VM. kvm->lock is held.
*/
void (*release)(struct kvm_device *dev);
Jens Axboe [Fri, 20 Jan 2023 15:08:29 +0000 (08:08 -0700)]
Merge tag 'nvme-6.2-2023-01-20' of git://git.infradead.org/nvme into block-6.2
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- fix controller shutdown regression in nvme-apple (Janne Grunau)
- fix a polling on timeout regression in nvme-pci (Keith Busch)"
* tag 'nvme-6.2-2023-01-20' of git://git.infradead.org/nvme:
nvme-pci: fix timeout request state check
nvme-apple: only reset the controller when RTKit is running
nvme-apple: reset controller during shutdown
Alexander Stein [Fri, 20 Jan 2023 12:27:14 +0000 (13:27 +0100)]
usb: host: ehci-fsl: Fix module alias
Commit ca07e1c1e4a6 ("drivers:usb:fsl:Make fsl ehci drv an independent
driver module") changed DRV_NAME which was used for MODULE_ALIAS as well.
Starting from this the module alias didn't match the platform device
name created in fsl-mph-dr-of.c
Change DRV_NAME to match the driver name for host mode in fsl-mph-dr-of.
This is needed for module autoloading on ls1021a.
David Morley [Thu, 19 Jan 2023 19:00:28 +0000 (19:00 +0000)]
tcp: fix rate_app_limited to default to 1
The initial default value of 0 for tp->rate_app_limited was incorrect,
since a flow is indeed application-limited until it first sends
data. Fixing the default to be 1 is generally correct but also
specifically will help user-space applications avoid using the initial
tcpi_delivery_rate value of 0 that persists until the connection has
some non-zero bandwidth sample.
Kees Cook [Wed, 18 Jan 2023 20:35:01 +0000 (12:35 -0800)]
bnxt: Do not read past the end of test names
Test names were being concatenated based on a offset beyond the end of
the first name, which tripped the buffer overflow detection logic:
detected buffer overflow in strnlen
[...]
Call Trace:
bnxt_ethtool_init.cold+0x18/0x18
Refactor struct hwrm_selftest_qlist_output to use an actual array,
and adjust the concatenation to use snprintf() rather than a series of
strncat() calls.
Matthew Howell [Thu, 19 Jan 2023 19:40:29 +0000 (14:40 -0500)]
serial: exar: Add support for Sealevel 7xxxC serial cards
Add support for Sealevel 7xxxC serial cards.
This patch:
* Adds IDs to recognize 7xxxC cards from Sealevel Systems.
* Updates exar_pci_probe() to set nr_ports to last two bytes of primary
dev ID for these cards.
Vishnu Dasa [Wed, 30 Nov 2022 07:05:11 +0000 (23:05 -0800)]
VMCI: Use threaded irqs instead of tasklets
The vmci_dispatch_dgs() tasklet function calls vmci_read_data()
which uses wait_event() resulting in invalid sleep in an atomic
context (and therefore potentially in a deadlock).
Use threaded irqs to fix this issue and completely remove usage
of tasklets.
Elliot Berman [Thu, 12 Jan 2023 18:23:12 +0000 (10:23 -0800)]
misc: fastrpc: Pass bitfield into qcom_scm_assign_mem
The srcvm parameter of qcom_scm_assign_mem is a pointer to a bitfield of
VMIDs. The bitfield is updated with which VMIDs have permissions
after the qcom_scm_assign_mem call. This makes it simpler for clients to
make qcom_scm_assign_mem calls later, they always pass in same srcvm
bitfield and do not need to closely track whether memory was originally
shared.
When restoring permissions to HLOS, fastrpc is incorrectly using the
first VMID directly -- neither the BIT nor the other possible VMIDs the
memory was already assigned to. We already have a field intended for
this purpose: "perms" in the struct fastrpc_channel_ctx, but it was
never used. Start using the perms field.
Cc: Abel Vesa <[email protected]> Cc: Vamsi Krishna Gattupalli <[email protected]> Cc: Srinivas Kandagatla <[email protected]> Fixes: e90d91190619 ("misc: fastrpc: Add support to secure memory map") Fixes: 0871561055e6 ("misc: fastrpc: Add support for audiopd") Fixes: 532ad70c6d44 ("misc: fastrpc: Add mmap request assigning for static PD pool") Tested-by: Srinivas Kandagatla <[email protected]> Signed-off-by: Elliot Berman <[email protected]>
drivers/misc/fastrpc.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
We can get EFI variables without fetching the attribute, so we must
allow for that in gsmi.
commit 859748255b43 ("efi: pstore: Omit efivars caching EFI varstore
access layer") added a new get_variable call with attr=NULL, which
triggers panic in gsmi.
Ola Jeppsson [Thu, 24 Nov 2022 17:49:41 +0000 (17:49 +0000)]
misc: fastrpc: Fix use-after-free race condition for maps
It is possible that in between calling fastrpc_map_get() until
map->fl->lock is taken in fastrpc_free_map(), another thread can call
fastrpc_map_lookup() and get a reference to a map that is about to be
deleted.
Rewrite fastrpc_map_get() to only increase the reference count of a map
if it's non-zero. Propagate this to callers so they can know if a map is
about to be deleted.
Fixes this warning:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 5 PID: 10100 at lib/refcount.c:25 refcount_warn_saturate
...
Call trace:
refcount_warn_saturate
[fastrpc_map_get inlined]
[fastrpc_map_lookup inlined]
fastrpc_map_create
fastrpc_internal_invoke
fastrpc_device_ioctl
__arm64_sys_ioctl
invoke_syscall
Abel Vesa [Thu, 24 Nov 2022 17:49:40 +0000 (17:49 +0000)]
misc: fastrpc: Don't remove map on creater_process and device_release
Do not remove the map from the list on error path in
fastrpc_init_create_process, instead call fastrpc_map_put, to avoid
use-after-free. Do not remove it on fastrpc_device_release either,
call fastrpc_map_put instead.
The fastrpc_free_map is the only proper place to remove the map.
This is called only after the reference count is 0.
Abel Vesa [Thu, 24 Nov 2022 17:49:39 +0000 (17:49 +0000)]
misc: fastrpc: Fix use-after-free and race in fastrpc_map_find
Currently, there is a race window between the point when the mutex is
unlocked in fastrpc_map_lookup and the reference count increasing
(fastrpc_map_get) in fastrpc_map_find, which can also lead to
use-after-free.
So lets merge fastrpc_map_find into fastrpc_map_lookup which allows us
to both protect the maps list by also taking the &fl->lock spinlock and
the reference count, since the spinlock will be released only after.
Add take_ref argument to make this suitable for all callers.
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Andrew Halaney [Wed, 18 Jan 2023 16:56:38 +0000 (10:56 -0600)]
net: stmmac: enable all safety features by default
In the original implementation of dwmac5
commit 8bf993a5877e ("net: stmmac: Add support for DWMAC5 and implement Safety Features")
all safety features were enabled by default.
Later it seems some implementations didn't have support for all the
features, so in
commit 5ac712dcdfef ("net: stmmac: enable platform specific safety features")
the safety_feat_cfg structure was added to the callback and defined for
some platforms to selectively enable these safety features.
The problem is that only certain platforms were given that software
support. If the automotive safety package bit is set in the hardware
features register the safety feature callback is called for the platform,
and for platforms that didn't get a safety_feat_cfg defined this results
in the following NULL pointer dereference:
Go back to the original behavior, if the automotive safety package
is found to be supported in hardware enable all the features unless
safety_feat_cfg is passed in saying this particular platform only
supports a subset of the features.
Fixes: 5ac712dcdfef ("net: stmmac: enable platform specific safety features") Reported-by: Ning Cai <[email protected]> Signed-off-by: Andrew Halaney <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Randy Dunlap [Fri, 13 Jan 2023 06:38:05 +0000 (22:38 -0800)]
i2c: rk3x: fix a bunch of kernel-doc warnings
Fix multiple W=1 kernel-doc warnings in i2c-rk3x.c:
drivers/i2c/busses/i2c-rk3x.c:83: warning: missing initial short description on line:
* struct i2c_spec_values:
drivers/i2c/busses/i2c-rk3x.c:139: warning: missing initial short description on line:
* struct rk3x_i2c_calced_timings:
drivers/i2c/busses/i2c-rk3x.c:162: warning: missing initial short description on line:
* struct rk3x_i2c_soc_data:
drivers/i2c/busses/i2c-rk3x.c:242: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Generate a START condition, which triggers a REG_INT_START interrupt.
drivers/i2c/busses/i2c-rk3x.c:261: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Generate a STOP condition, which triggers a REG_INT_STOP interrupt.
drivers/i2c/busses/i2c-rk3x.c:304: warning: expecting prototype for Setup a read according to i2c(). Prototype was for rk3x_i2c_prepare_read() instead
drivers/i2c/busses/i2c-rk3x.c:335: warning: expecting prototype for Fill the transmit buffer with data from i2c(). Prototype was for rk3x_i2c_fill_transmit_buf() instead
drivers/i2c/busses/i2c-rk3x.c:535: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Get timing values of I2C specification
drivers/i2c/busses/i2c-rk3x.c:552: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Calculate divider values for desired SCL frequency
drivers/i2c/busses/i2c-rk3x.c:713: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Calculate timing values for desired SCL frequency
drivers/i2c/busses/i2c-rk3x.c:963: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Setup I2C registers for an I2C operation specified by msgs, num.
David S. Miller [Fri, 20 Jan 2023 09:00:08 +0000 (09:00 +0000)]
Merge branch 'octeontx2-af-CPT'
Srujana Challa says:
====================
octeontx2-af: Miscellaneous changes for CPT
This patchset consists of miscellaneous changes for CPT.
- Adds a new mailbox to reset the requested CPT LF.
- Modify FLR sequence as per HW team suggested.
- Adds support to recover CPT engines when they gets fault.
- Updates CPT inbound inline IPsec configuration mailbox,
as per new generation of the OcteonTX2 chips.
- Adds a new mailbox to return CPT FLT Interrupt info.
---
v2:
- Addressed a review comment.
v1:
- Dropped patch "octeontx2-af: Fix interrupt name strings completely"
to submit to net.
---
====================
Srujana Challa [Wed, 18 Jan 2023 12:03:54 +0000 (17:33 +0530)]
octeontx2-af: add mbox to return CPT_AF_FLT_INT info
CPT HW would trigger the CPT AF FLT interrupt when CPT engines
hits some uncorrectable errors and AF is the one which receives
the interrupt and recovers the engines.
This patch adds a mailbox for CPT VFs to request for CPT faulted
and recovered engines info.
Srujana Challa [Wed, 18 Jan 2023 12:03:53 +0000 (17:33 +0530)]
octeontx2-af: update cpt lf alloc mailbox
The CN10K CPT coprocessor contains a context processor
to accelerate updates to the IPsec security association
contexts. The context processor contains a context cache.
This patch updates CPT LF ALLOC mailbox to config ctx_ilen
requested by VFs. CPT_LF_ALLOC:ctx_ilen is the size of
initial context fetch.
octeontx2-af: restore rxc conf after teardown sequence
CN10K CPT coprocessor includes a component named RXC which
is responsible for reassembly of inner IP packets. RXC has
the feature to evict oldest entries based on age/threshold.
The age/threshold is being set to minimum values to evict
all entries at the time of teardown.
This patch adds code to restore timeout and threshold config
after teardown sequence is complete as it is global config.
Srujana Challa [Wed, 18 Jan 2023 12:03:50 +0000 (17:33 +0530)]
octeontx2-af: modify FLR sequence for CPT
On OcteonTX2 platform CPT instruction enqueue is only
possible via LMTST operations.
The existing FLR sequence mentioned in HRM requires
a dummy LMTST to CPT but LMTST can't be submitted from
AF driver. So, HW team provided a new sequence to avoid
dummy LMTST. This patch adds code for the same.
Srujana Challa [Wed, 18 Jan 2023 12:03:49 +0000 (17:33 +0530)]
octeontx2-af: add mbox for CPT LF reset
On OcteonTX2 SoC, the admin function (AF) is the only one with all
priviliges to configure HW and alloc resources, PFs and it's VFs
have to request AF via mailbox for all their needs.
This patch adds a new mailbox for CPT VFs to request for CPT LF
reset.
Fabrizio Castro [Tue, 17 Jan 2023 17:50:17 +0000 (17:50 +0000)]
dt-bindings: i2c: renesas,rzv2m: Fix SoC specific string
The preferred form for Renesas' compatible strings is:
"<vendor>,<family>-<module>"
Somehow the compatible string for the r9a09g011 I2C IP was upstreamed
as renesas,i2c-r9a09g011 instead of renesas,r9a09g011-i2c, which
is really confusing, especially considering the generic fallback
is renesas,rzv2m-i2c.
The first user of renesas,i2c-r9a09g011 in the kernel is not yet in
a kernel release, it will be in v6.1, therefore it can still be
fixed in v6.1.
Even if we don't fix it before v6.2, I don't think there is any
harm in making such a change.
s/renesas,i2c-r9a09g011/renesas,r9a09g011-i2c/g for consistency.
Dave Airlie [Fri, 20 Jan 2023 01:17:11 +0000 (11:17 +1000)]
Merge tag 'drm-misc-fixes-2023-01-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A fix for vc4 to address a memory leak when allocating a buffer, a
Kconfig fix for panfrost and two fixes for i915 and fb-helper to
address some bugs with vga-switcheroo.
Masahiro Yamada [Fri, 6 Jan 2023 16:12:13 +0000 (01:12 +0900)]
riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAIT
Since commit 80b6093b55e3 ("kbuild: add -Wundef to KBUILD_CPPFLAGS
for W=1 builds"), building with W=1 detects misuse of #if.
$ make W=1 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- arch/riscv/kernel/
[snip]
AS arch/riscv/kernel/head.o
arch/riscv/kernel/head.S:329:5: warning: "CONFIG_RISCV_BOOT_SPINWAIT" is not defined, evaluates to 0 [-Wundef]
329 | #if CONFIG_RISCV_BOOT_SPINWAIT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
CONFIG_RISCV_BOOT_SPINWAIT is a bool option. #ifdef should be used.
Conor Dooley [Fri, 6 Jan 2023 12:53:45 +0000 (12:53 +0000)]
MAINTAINERS: add an IRC entry for RISC-V
I remember being told "Just ping me on IRC" about patches, but googling
at the time was not helpful. #riscv on libera is not linux specific,
but a bunch of contributors etc do hang out there.
Add a link to the maintainers entry to help others find it in the future!
Heiko Stuebner [Thu, 5 Jan 2023 19:26:10 +0000 (20:26 +0100)]
RISC-V: fix compile error from deduplicated __ALTERNATIVE_CFG_2
On the non-assembler-side wrapping alternative-macros inside other macros
to prevent duplication of code works, as the end result will just be a
string that gets fed to the asm instruction.
In real assembler code, wrapping .macro blocks inside other .macro blocks
brings more restrictions on usage it seems and the optimization done by
commit 2ba8c7dc71c0 ("riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2")
results in a compile error like:
../arch/riscv/lib/strcmp.S: Assembler messages:
../arch/riscv/lib/strcmp.S:15: Error: too many positional arguments
../arch/riscv/lib/strcmp.S:15: Error: backward ref to unknown label "886:"
../arch/riscv/lib/strcmp.S:15: Error: backward ref to unknown label "887:"
../arch/riscv/lib/strcmp.S:15: Error: backward ref to unknown label "886:"
../arch/riscv/lib/strcmp.S:15: Error: backward ref to unknown label "887:"
../arch/riscv/lib/strcmp.S:15: Error: backward ref to unknown label "886:"
../arch/riscv/lib/strcmp.S:15: Error: attempt to move .org backwards
Wrapping the variables containing assembler code in quotes solves this issue,
compilation and the code in question still works and objdump also shows sane
decompiled results of the affected code.
Linus Torvalds [Thu, 19 Jan 2023 23:22:28 +0000 (15:22 -0800)]
Merge tag 'perf-tools-fixes-for-v6.2-3-2023-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Prevent reading into undefined memory in the expression lexer,
accounting for a trailer backslash followed by the null byte.
- Fix file mode when copying files to the build id cache, the problem
happens when the cache directory is in a different file system than
the file being cached, otherwise the mode was preserved as only a
hard link would be done to save space.
- Fix a related build-id 'perf test' entry that checked that permission
when caching PE (Portable Executable) files, used when profiling
Windows executables under wine.
- Sync the tools/ copies of kvm headers, build_bug.h, socket.h and
arm64's cputype.h with the kernel sources.
* tag 'perf-tools-fixes-for-v6.2-3-2023-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf test build-id: Fix test check for PE file
perf buildid-cache: Fix the file mode with copyfile() while adding file to build-id cache
perf expr: Prevent normalize() from reading into undefined memory in the expression lexer
tools headers: Syncronize linux/build_bug.h with the kernel sources
perf beauty: Update copy of linux/socket.h with the kernel sources
tools headers arm64: Sync arm64's cputype.h with the kernel sources
tools kvm headers arm64: Update KVM header from the kernel sources
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
Juergen Gross [Tue, 17 Jan 2023 15:57:23 +0000 (16:57 +0100)]
acpi: Fix suspend with Xen PV
Commit f1e525009493 ("x86/boot: Skip realmode init code when running as
Xen PV guest") missed one code path accessing real_mode_header, leading
to dereferencing NULL when suspending the system under Xen:
Fix that by adding an optional acpi callback allowing to skip setting
the wakeup address, as in the Xen PV case this will be handled by the
hypervisor anyway.
Dave Airlie [Thu, 19 Jan 2023 21:49:00 +0000 (07:49 +1000)]
Merge tag 'drm-msm-fixes-2023-01-16' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
msm-fixes for v6.3-rc5
Two GPU fixes which were meant to be part of the previous pull request,
but I'd forgotten to fetch from gitlab after the MR was merged so that
git tag was applied to the wrong commit.
Linus Torvalds [Thu, 19 Jan 2023 20:32:07 +0000 (12:32 -0800)]
Merge tag 'printk-for-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:
- Prevent a potential deadlock when configuring kgdb console
- Fix a kernel doc warning
* tag 'printk-for-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
kernel/printk/printk.c: Fix W=1 kernel-doc warning
tty: serial: kgdboc: fix mutex locking order for configure_kgdboc()
Linus Torvalds [Thu, 19 Jan 2023 20:28:53 +0000 (12:28 -0800)]
Merge tag 's390-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 build fix from Heiko Carstens:
- Workaround invalid gcc-11 out of bounds read warning caused by s390's
S390_lowcore definition. This happens only with gcc 11.1.0 and
11.2.0.
The code which causes this warning will be gone with the next merge
window. Therefore just replace the memcpy() with a for loop to get
rid of the warning.
* tag 's390-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: workaround invalid gcc-11 out of bounds read warning
Viresh Kumar [Wed, 18 Jan 2023 08:38:24 +0000 (14:08 +0530)]
thermal: core: call put_device() only after device_register() fails
put_device() shouldn't be called before a prior call to
device_register(). __thermal_cooling_device_register() doesn't follow
that properly and needs fixing. Also
thermal_cooling_device_destroy_sysfs() is getting called unnecessarily
on few error paths.
Fix all this by placing the calls at the right place.
Based on initial work done by Caleb Connolly.
Fixes: 4748f9687caa ("thermal: core: fix some possible name leaks in error paths") Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") Reported-by: Caleb Connolly <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Tested-by: Frank Rowand <[email protected]> Reviewed-by: Yang Yingliang <[email protected]> Tested-by: Caleb Connolly <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
Linus Torvalds [Thu, 19 Jan 2023 17:54:08 +0000 (09:54 -0800)]
Merge tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
- A single patch to fix sync write operations to detect and handle
errors due to external zone corruptions resulting in writes at
invalid location, from me.
* tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Detect append writes at invalid locations
Jens Axboe [Thu, 19 Jan 2023 16:04:40 +0000 (09:04 -0700)]
io_uring/msg_ring: fix missing lock on overflow for IOPOLL
If the target ring is configured with IOPOLL, then we always need to hold
the target ring uring_lock before posting CQEs. We could just grab it
unconditionally, but since we don't expect many target rings to be of this
type, make grabbing the uring_lock conditional on the ring type.
net: dsa: microchip: ksz9477: port map correction in ALU table entry register
ALU table entry 2 register in KSZ9477 have bit positions reserved for
forwarding port map. This field is referred in ksz9477_fdb_del() for
clearing forward port map and alu table.
But current fdb_del refer ALU table entry 3 register for accessing forward
port map. Update ksz9477_fdb_del() to get forward port map from correct
alu table entry register.
With this bug, issue can be observed while deleting static MAC entries.
Delete any specific MAC entry using "bridge fdb del" command. This should
clear all the specified MAC entries. But it is observed that entries with
self static alone are retained.
Tested on LAN9370 EVB since ksz9477_fdb_del() is used common across
LAN937x and KSZ series.
Willem de Bruijn [Wed, 18 Jan 2023 15:18:47 +0000 (10:18 -0500)]
selftests/net: toeplitz: fix race on tpacket_v3 block close
Avoid race between process wakeup and tpacket_v3 block timeout.
The test waits for cfg_timeout_msec for packets to arrive. Packets
arrive in tpacket_v3 rings, which pass packets ("frames") to the
process in batches ("blocks"). The sk waits for req3.tp_retire_blk_tov
msec to release a block.
Set the block timeout lower than the process waiting time, else
the process may find that no block has been released by the time it
scans the socket list. Convert to a ring of more than one, smaller,
blocks with shorter timeouts. Blocks must be page aligned, so >= 64KB.
Paolo Abeni [Wed, 18 Jan 2023 12:24:12 +0000 (13:24 +0100)]
net/ulp: use consistent error code when blocking ULP
The referenced commit changed the error code returned by the kernel
when preventing a non-established socket from attaching the ktls
ULP. Before to such a commit, the user-space got ENOTCONN instead
of EINVAL.
The existing self-tests depend on such error code, and the change
caused a failure:
RUN global.non_established ...
tls.c:1673:non_established:Expected errno (22) == ENOTCONN (107)
non_established: Test failed at step #3
FAIL global.non_established
In the unlikely event existing applications do the same, address
the issue by restoring the prior error code in the above scenario.
Note that the only other ULP performing similar checks at init
time - smc_ulp_ops - also fails with ENOTCONN when trying to attach
the ULP to a non-established socket.
x86/sev: Add SEV-SNP guest feature negotiation support
The hypervisor can enable various new features (SEV_FEATURES[1:63]) and start a
SNP guest. Some of these features need guest side implementation. If any of
these features are enabled without it, the behavior of the SNP guest will be
undefined. It may fail booting in a non-obvious way making it difficult to
debug.
Instead of allowing the guest to continue and have it fail randomly later,
detect this early and fail gracefully.
The SEV_STATUS MSR indicates features which the hypervisor has enabled. While
booting, SNP guests should ascertain that all the enabled features have guest
side implementation. In case a feature is not implemented in the guest, the
guest terminates booting with GHCB protocol Non-Automatic Exit(NAE) termination
request event, see "SEV-ES Guest-Hypervisor Communication Block Standardization"
document (currently at https://developer.amd.com/wp-content/resources/56421.pdf),
section "Termination Request".
Populate SW_EXITINFO2 with mask of unsupported features that the hypervisor can
easily report to the user.
More details in the AMD64 APM Vol 2, Section "SEV_STATUS MSR".
[ bp:
- Massage.
- Move snp_check_features() call to C code.
Note: the CC:stable@ aspect here is to be able to protect older, stable
kernels when running on newer hypervisors. Or not "running" but fail
reliably and in a well-defined manner instead of randomly. ]
Chen Zhongjin [Fri, 25 Nov 2022 06:35:41 +0000 (14:35 +0800)]
driver core: Fix test_async_probe_init saves device in wrong array
In test_async_probe_init, second set of asynchronous devices are saved
in sync_dev[sync_id], which should be async_dev[async_id].
This makes these devices not unregistered when exit.
> modprobe test_async_driver_probe && \
> modprobe -r test_async_driver_probe && \
> modprobe test_async_driver_probe
...
> sysfs: cannot create duplicate filename '/devices/platform/test_async_driver.4'
> kobject_add_internal failed for test_async_driver.4 with -EEXIST,
don't try to register things with the same name in the same directory.
Yang Yingliang [Mon, 5 Dec 2022 10:15:58 +0000 (18:15 +0800)]
w1: fix WARNING after calling w1_process()
I got the following WARNING message while removing driver(ds2482):
------------[ cut here ]------------
do not call blocking ops when !TASK_RUNNING; state=1 set at [<000000002d50bfb6>] w1_process+0x9e/0x1d0 [wire]
WARNING: CPU: 0 PID: 262 at kernel/sched/core.c:9817 __might_sleep+0x98/0xa0
CPU: 0 PID: 262 Comm: w1_bus_master1 Tainted: G N 6.1.0-rc3+ #307
RIP: 0010:__might_sleep+0x98/0xa0
Call Trace:
exit_signals+0x6c/0x550
do_exit+0x2b4/0x17e0
kthread_exit+0x52/0x60
kthread+0x16d/0x1e0
ret_from_fork+0x1f/0x30
The state of task is set to TASK_INTERRUPTIBLE in loop in w1_process(),
set it to TASK_RUNNING when it breaks out of the loop to avoid the
warning.
Yang Yingliang [Mon, 5 Dec 2022 08:04:34 +0000 (16:04 +0800)]
w1: fix deadloop in __w1_remove_master_device()
I got a deadloop report while doing device(ds2482) add/remove test:
[ 162.241881] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
[ 163.272251] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
[ 164.296157] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
...
__w1_remove_master_device() can't return, because the dev->refcnt is not zero.
w1_add_master_device() |
w1_alloc_dev() |
atomic_set(&dev->refcnt, 2) |
kthread_run() |
|__w1_remove_master_device()
| kthread_stop()
// KTHREAD_SHOULD_STOP is set, |
// threadfn(w1_process) won't be |
// called. |
kthread() |
| // refcnt will never be 0, it's deadloop.
| while (atomic_read(&dev->refcnt)) {...}
After calling w1_add_master_device(), w1_process() is not really
invoked, before w1_process() starting, if kthread_stop() is called
in __w1_remove_master_device(), w1_process() will never be called,
the refcnt can not be decreased, then it causes deadloop in remove
function because of non-zero refcnt.
We need to make sure w1_process() is really started, so move the
set refcnt into w1_process() to fix this problem.
Ian Abbott [Tue, 3 Jan 2023 14:37:54 +0000 (14:37 +0000)]
comedi: adv_pci1760: Fix PWM instruction handling
(Actually, this is fixing the "Read the Current Status" command sent to
the device's outgoing mailbox, but it is only currently used for the PWM
instructions.)
The PCI-1760 is operated mostly by sending commands to a set of Outgoing
Mailbox registers, waiting for the command to complete, and reading the
result from the Incoming Mailbox registers. One of these commands is
the "Read the Current Status" command. The number of this command is
0x07 (see the User's Manual for the PCI-1760 at
<https://advdownload.advantech.com/productfile/Downloadfile2/1-11P6653/PCI-1760.pdf>.
The `PCI1760_CMD_GET_STATUS` macro defined in the driver should expand
to this command number 0x07, but unfortunately it currently expands to
0x03. (Command number 0x03 is not defined in the User's Manual.)
Correct the definition of the `PCI1760_CMD_GET_STATUS` macro to fix it.
This is used by all the PWM subdevice related instructions handled by
`pci1760_pwm_insn_config()` which are probably all broken. The effect
of sending the undefined command number 0x03 is not known.
Open-code the initializers with the version that was already used,
and use the pm_sleep_ptr() method to deal with unused ones,
in place of the __maybe_unused annotation.
Tobias Schramm [Mon, 9 Jan 2023 07:29:40 +0000 (08:29 +0100)]
serial: atmel: fix incorrect baudrate setup
Commit ba47f97a18f2 ("serial: core: remove baud_rates when serial console
setup") changed uart_set_options to select the correct baudrate
configuration based on the absolute error between requested baudrate and
available standard baudrate settings.
Prior to that commit the baudrate was selected based on which predefined
standard baudrate did not exceed the requested baudrate.
This change of selection logic was never reflected in the atmel serial
driver. Thus the comment left in the atmel serial driver is no longer
accurate.
Additionally the manual rounding up described in that comment and applied
via (quot - 1) requests an incorrect baudrate. Since uart_set_options uses
tty_termios_encode_baud_rate to determine the appropriate baudrate flags
this can cause baudrate selection to fail entirely because
tty_termios_encode_baud_rate will only select a baudrate if relative error
between requested and selected baudrate does not exceed +/-2%.
Fix that by requesting actual, exact baudrate used by the serial.
Gaosheng Cui [Fri, 2 Dec 2022 06:06:33 +0000 (14:06 +0800)]
tty: fix possible null-ptr-defer in spk_ttyio_release
Run the following tests on the qemu platform:
syzkaller:~# modprobe speakup_audptr
input: Speakup as /devices/virtual/input/input4
initialized device: /dev/synth, node (MAJOR 10, MINOR 125)
speakup 3.1.6: initialized
synth name on entry is: (null)
synth probe
spk_ttyio_initialise_ldisc failed because tty_kopen_exclusive returned
failed (errno -16), then remove the module, we will get a null-ptr-defer
problem, as follow:
Paolo Abeni [Thu, 19 Jan 2023 14:39:37 +0000 (15:39 +0100)]
Merge tag 'mlx5-fixes-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net: mlx5: eliminate anonymous module_init & module_exit
net/mlx5: E-switch, Fix switchdev mode after devlink reload
net/mlx5e: Protect global IPsec ASO
net/mlx5e: Remove optimization which prevented update of ESN state
net/mlx5e: Set decap action based on attr for sample
net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT
net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT
net/mlx5e: Remove redundant xsk pointer check in mlx5e_mpwrq_validate_xsk
net/mlx5e: Avoid false lock dependency warning on tc_ht even more
net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work()
====================
Marek Vasut [Thu, 12 Jan 2023 18:04:17 +0000 (19:04 +0100)]
serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
Requesting an interrupt with IRQF_ONESHOT will run the primary handler
in the hard-IRQ context even in the force-threaded mode. The
force-threaded mode is used by PREEMPT_RT in order to avoid acquiring
sleeping locks (spinlock_t) in hard-IRQ context. This combination
makes it impossible and leads to "sleeping while atomic" warnings.
Use one interrupt handler for both handlers (primary and secondary)
and drop the IRQF_ONESHOT flag which is not needed.
Lino Sanfilippo [Sun, 8 Jan 2023 18:17:35 +0000 (19:17 +0100)]
serial: amba-pl011: fix high priority character transmission in rs486 mode
In RS485 mode the transmission of a high priority character fails since it
is written to the data register before the transmitter is enabled. Fix this
in pl011_tx_chars() by enabling RS485 transmission before writing the
character.
Ilpo Järvinen [Tue, 3 Jan 2023 09:34:35 +0000 (11:34 +0200)]
serial: pch_uart: Pass correct sg to dma_unmap_sg()
A local variable sg is used to store scatterlist pointer in
pch_dma_tx_complete(). The for loop doing Tx byte accounting before
dma_unmap_sg() alters sg in its increment statement. Therefore, the
pointer passed into dma_unmap_sg() won't match to the one given to
dma_map_sg().
To fix the problem, use priv->sg_tx_p directly in dma_unmap_sg()
instead of the local variable.
tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO buffer
Driver's probe allocates memory for RX FIFO (port->rx_fifo) based on
default RX FIFO depth, e.g. 16. Later during serial startup the
qcom_geni_serial_port_setup() updates the RX FIFO depth
(port->rx_fifo_depth) to match real device capabilities, e.g. to 32.
The RX UART handle code will read "port->rx_fifo_depth" number of words
into "port->rx_fifo" buffer, thus exceeding the bounds. This can be
observed in certain configurations with Qualcomm Bluetooth HCI UART
device and KASAN:
Bluetooth: hci0: QCA Product ID :0x00000010
Bluetooth: hci0: QCA SOC Version :0x400a0200
Bluetooth: hci0: QCA ROM Version :0x00000200
Bluetooth: hci0: QCA Patch Version:0x00000d2b
Bluetooth: hci0: QCA controller version 0x02000200
Bluetooth: hci0: QCA Downloading qca/htbtfw20.tlv
bluetooth hci0: Direct firmware load for qca/htbtfw20.tlv failed with error -2
Bluetooth: hci0: QCA Failed to request file: qca/htbtfw20.tlv (-2)
Bluetooth: hci0: QCA Failed to download patch (-2)
==================================================================
BUG: KASAN: slab-out-of-bounds in handle_rx_uart+0xa8/0x18c
Write of size 4 at addr ffff279347d578c0 by task swapper/0/0
Yang Yingliang [Wed, 23 Nov 2022 02:25:42 +0000 (10:25 +0800)]
device property: fix of node refcount leak in fwnode_graph_get_next_endpoint()
The 'parent' returned by fwnode_graph_get_port_parent()
with refcount incremented when 'prev' is not NULL, it
needs be put when finish using it.
Because the parent is const, introduce a new variable to
store the returned fwnode, then put it before returning
from fwnode_graph_get_next_endpoint().
Eric Pilmore [Thu, 19 Jan 2023 03:39:08 +0000 (19:39 -0800)]
ptdma: pt_core_execute_cmd() should use spinlock
The interrupt handler (pt_core_irq_handler()) of the ptdma
driver can be called from interrupt context. The code flow
in this function can lead down to pt_core_execute_cmd() which
will attempt to grab a mutex, which is not appropriate in
interrupt context and ultimately leads to a kernel panic.
The fix here changes this mutex to a spinlock, which has
been verified to resolve the issue.
Arnd Bergmann [Wed, 18 Jan 2023 09:01:41 +0000 (10:01 +0100)]
usb: dwc3: fix extcon dependency
The dwc3 core support now links against the extcon subsystem,
so it cannot be built-in when extcon is a loadable module:
arm-linux-gnueabi-ld: drivers/usb/dwc3/core.o: in function `dwc3_get_extcon':
core.c:(.text+0x572): undefined reference to `extcon_get_edev_by_phandle'
arm-linux-gnueabi-ld: core.c:(.text+0x596): undefined reference to `extcon_get_extcon_dev'
arm-linux-gnueabi-ld: core.c:(.text+0x5ea): undefined reference to `extcon_find_edev_by_node'
There was already a Kconfig dependency in the dual-role support,
but this is now needed for the entire dwc3 driver.
It is still possible to build dwc3 without extcon, but this
prevents it from being set to built-in when extcon is a loadable
module.
Nirmoy Das [Tue, 17 Jan 2023 17:52:36 +0000 (18:52 +0100)]
drm/i915: Fix a memory leak with reused mmap_offset
drm_vma_node_allow() and drm_vma_node_revoke() should be called in
balanced pairs. We call drm_vma_node_allow() once per-file everytime a
user calls mmap_offset, but only call drm_vma_node_revoke once per-file
on each mmap_offset. As the mmap_offset is reused by the client, the
per-file vm_count may remain non-zero and the rbtree leaked.
Call drm_vma_node_allow_once() instead to prevent that memory leak.
Currently there is no easy way for a drm driver to safely check and allow
drm_vma_offset_node for a drm file just once. Allow drm drivers to call
non-refcounted version of drm_vma_node_allow() so that a driver doesn't
need to keep track of each drm_vma_node_allow() to call subsequent
drm_vma_node_revoke() to prevent memory leak.
Kevin Hao [Wed, 18 Jan 2023 07:13:00 +0000 (15:13 +0800)]
octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt
The commit 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura
free") uses the get/put_cpu() to protect the usage of percpu pointer
in ->aura_freeptr() callback, but it also unnecessarily disable the
preemption for the blockable memory allocation. The commit 87b93b678e95
("octeontx2-pf: Avoid use of GFP_KERNEL in atomic context") tried to
fix these sleep inside atomic warnings. But it only fix the one for
the non-rt kernel. For the rt kernel, we still get the similar warnings
like below.
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by swapper/0/1:
#0: ffff800009fc5fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
#1: ffff000100c276c0 (&mbox->lock){+.+.}-{3:3}, at: otx2_init_hw_resources+0x8c/0x3a4
#2: ffffffbfef6537e0 (&cpu_rcache->lock){+.+.}-{2:2}, at: alloc_iova_fast+0x1ac/0x2ac
Preemption disabled at:
[<ffff800008b1908c>] otx2_rq_aura_pool_init+0x14c/0x284
CPU: 20 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc3-rt1-yocto-preempt-rt #1
Hardware name: Marvell OcteonTX CN96XX board (DT)
Call trace:
dump_backtrace.part.0+0xe8/0xf4
show_stack+0x20/0x30
dump_stack_lvl+0x9c/0xd8
dump_stack+0x18/0x34
__might_resched+0x188/0x224
rt_spin_lock+0x64/0x110
alloc_iova_fast+0x1ac/0x2ac
iommu_dma_alloc_iova+0xd4/0x110
__iommu_dma_map+0x80/0x144
iommu_dma_map_page+0xe8/0x260
dma_map_page_attrs+0xb4/0xc0
__otx2_alloc_rbuf+0x90/0x150
otx2_rq_aura_pool_init+0x1c8/0x284
otx2_init_hw_resources+0xe4/0x3a4
otx2_open+0xf0/0x610
__dev_open+0x104/0x224
__dev_change_flags+0x1e4/0x274
dev_change_flags+0x2c/0x7c
ic_open_devs+0x124/0x2f8
ip_auto_config+0x180/0x42c
do_one_initcall+0x90/0x4dc
do_basic_setup+0x10c/0x14c
kernel_init_freeable+0x10c/0x13c
kernel_init+0x2c/0x140
ret_from_fork+0x10/0x20
Of course, we can shuffle the get/put_cpu() to only wrap the invocation
of ->aura_freeptr() as what commit 87b93b678e95 does. But there are only
two ->aura_freeptr() callbacks, otx2_aura_freeptr() and
cn10k_aura_freeptr(). There is no usage of perpcu variable in the
otx2_aura_freeptr() at all, so the get/put_cpu() seems redundant to it.
We can move the get/put_cpu() into the corresponding callback which
really has the percpu variable usage and avoid the sprinkling of
get/put_cpu() in several places.
Jason Xing [Wed, 18 Jan 2023 01:59:41 +0000 (09:59 +0800)]
tcp: avoid the lookup process failing to get sk in ehash table
While one cpu is working on looking up the right socket from ehash
table, another cpu is done deleting the request socket and is about
to add (or is adding) the big socket from the table. It means that
we could miss both of them, even though it has little chance.
Let me draw a call trace map of the server side.
CPU 0 CPU 1
----- -----
tcp_v4_rcv() syn_recv_sock()
inet_ehash_insert()
-> sk_nulls_del_node_init_rcu(osk)
__inet_lookup_established()
-> __sk_nulls_add_node_rcu(sk, list)
Notice that the CPU 0 is receiving the data after the final ack
during 3-way shakehands and CPU 1 is still handling the final ack.
Why could this be a real problem?
This case is happening only when the final ack and the first data
receiving by different CPUs. Then the server receiving data with
ACK flag tries to search one proper established socket from ehash
table, but apparently it fails as my map shows above. After that,
the server fetches a listener socket and then sends a RST because
it finds a ACK flag in the skb (data), which obeys RST definition
in RFC 793.
Besides, Eric pointed out there's one more race condition where it
handles tw socket hashdance. Only by adding to the tail of the list
before deleting the old one can we avoid the race if the reader has
already begun the bucket traversal and it would possibly miss the head.
Many thanks to Eric for great help from beginning to end.
Keith Busch [Wed, 18 Jan 2023 16:44:16 +0000 (08:44 -0800)]
nvme-pci: fix timeout request state check
Polling the completion can progress the request state to IDLE, either
inline with the completion, or through softirq. Either way, the state
may not be COMPLETED, so don't check for that. We only care if the state
isn't IN_FLIGHT.
This is fixing an issue where the driver aborts an IO that we just
completed. Seeing the "aborting" message instead of "polled" is very
misleading as to where the timeout problem resides.
Fixes: bf392a5dc02a9b ("nvme-pci: Remove tag from process cq") Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
Janne Grunau [Tue, 17 Jan 2023 18:25:01 +0000 (19:25 +0100)]
nvme-apple: only reset the controller when RTKit is running
NVMe controller register access hangs indefinitely when the co-processor
is not running. A missed reset is preferable over a hanging thread since
it could be recoverable.
Janne Grunau [Tue, 17 Jan 2023 18:25:00 +0000 (19:25 +0100)]
nvme-apple: reset controller during shutdown
This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller
shutdown in apple_nvme_disable").
The commit broke suspend/resume since apple_nvme_reset_work() tries to
disable the controller on resume. This does not work for the apple NVMe
controller since register access only works while the co-processor
firmware is running.
Disabling the NVMe controller in the shutdown path is also required
for shutting the co-processor down. The original code was appropriate
for this hardware. Add a comment to prevent a similar breaking changes
in the future.
Currently IFF_NO_ADDRCONF is used to prevent all ipv6 addrconf for the
slave ports of team, bonding and failover devices and it means no ipv6
packets can be sent out through these slave ports. However, for team
device, "nsna_ping" link_watch requires ipv6 addrconf. Otherwise, the
link will be marked failure. This patch removes the IFF_NO_ADDRCONF
flag set for team port, and we will fix the original issue in another
patch, as Jakub suggested.
Jakub Kicinski [Tue, 17 Jan 2023 19:01:41 +0000 (11:01 -0800)]
MAINTAINERS: add networking entries for Willem
We often have to ping Willem asking for reviews of patches
because he doesn't get included in the CC list. Add MAINTAINERS
entries for some of the areas he covers so that ./scripts/ will
know to add him.
Jakub Kicinski [Fri, 13 Jan 2023 04:41:37 +0000 (20:41 -0800)]
net: sched: gred: prevent races when adding offloads to stats
Naresh reports seeing a warning that gred is calling
u64_stats_update_begin() with preemption enabled.
Arnd points out it's coming from _bstats_update().
We should be holding the qdisc lock when writing
to stats, they are also updated from the datapath.
Hamza Mahfooz [Tue, 17 Jan 2023 20:12:49 +0000 (15:12 -0500)]
drm/amd/display: fix issues with driver unload
Currently, we run into a number of WARN()s when attempting to unload the
amdgpu driver (e.g. using "modprobe -r amdgpu"). These all stem from
calling drm_encoder_cleanup() too early. So, to fix this we can stop
calling drm_encoder_cleanup() from amdgpu_dm_fini() and instead have it
be called from amdgpu_dm_encoder_destroy(). Also, we don't need to free
in amdgpu_dm_encoder_destroy() since mst_encoders[] isn't explicitly
allocated by the slab allocator.
Fixes: f74367e492ba ("drm/amdgpu/display: create fake mst encoders ahead of time (v4)") Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
The YCC conversion matrix for RGB -> COLOR_SPACE_YCBCR2020_TYPE is
missing the values for the fourth column of the matrix.
The fourth column of the matrix is essentially just a value that is
added given that the color is 3 components in size.
These values are needed to bias the chroma from the [-1, 1] -> [0, 1]
range.
This fixes color being very green when using Gamescope HDR on HDMI
output which prefers YCC 4:4:4.
Joshua Ashton [Tue, 10 Jan 2023 20:12:21 +0000 (20:12 +0000)]
drm/amd/display: Calculate output_color_space after pixel encoding adjustment
Code in get_output_color_space depends on knowing the pixel encoding to
determine whether to pick between eg. COLOR_SPACE_SRGB or
COLOR_SPACE_YCBCR709 for transparent RGB -> YCbCr 4:4:4 in the driver.
v2: Fixed patch being accidentally based on a personal feature branch, oops!