Frank Li [Fri, 7 Jul 2023 23:00:14 +0000 (19:00 -0400)]
usb: gadget: call usb_gadget_check_config() to verify UDC capability
The legacy gadget driver omitted calling usb_gadget_check_config()
to ensure that the USB device controller (UDC) has adequate resources,
including sufficient endpoint numbers and types, to support the given
configuration.
Previously, usb_add_config() was solely invoked by the legacy gadget
driver. Adds the necessary usb_gadget_check_config() after the bind()
operation to fix the issue.
Kyle Tso [Fri, 23 Jun 2023 15:10:35 +0000 (23:10 +0800)]
usb: typec: Iterate pds array when showing the pd list
The pointers of each usb_power_delivery handles are stored in "pds"
array returned from the pd_get ops but not in the adjacent memory
calculated from "pd". Get the handles from "pds" array directly instead
of deriving them from "pd".
Kyle Tso [Fri, 23 Jun 2023 15:10:34 +0000 (23:10 +0800)]
usb: typec: Set port->pd before adding device for typec_port
When calling device_add in the registration of typec_port, it will do
the NULL check on usb_power_delivery handle in typec_port for the
visibility of the device attributes. It is always NULL because port->pd
is set in typec_port_set_usb_power_delivery which is later than the
device_add call.
Set port->pd before device_add and only link the device after that.
The reverted commit was based on static analysis and a misunderstanding
of how PTR_ERR() and NULLs are supposed to work. When a function
returns both pointer errors and NULL then normally the NULL means
"continue operating without a feature because it was deliberately
turned off". The NULL should not be treated as a failure. If a driver
cannot work when that feature is disabled then the KConfig should
enforce that the function cannot return NULL. We should not need to
test for it.
In this driver, the bug means that probe cannot succeed when CONFIG_PM
is disabled.
The reverted commit was based on static analysis and a misunderstanding
of how PTR_ERR() and NULLs are supposed to work. When a function
returns both pointer errors and NULL then normally the NULL means
"continue operating without a feature because it was deliberately
turned off". The NULL should not be treated as a failure. If a driver
cannot work when that feature is disabled then the KConfig should
enforce that the function cannot return NULL. We should not need to
test for it.
In this code, the patch means that certain tegra_xusb_probe() will
fail if the firmware supports power-domains but CONFIG_PM is disabled.
USB: gadget: Fix the memory leak in raw_gadget driver
Currently, increasing raw_dev->count happens before invoke the
raw_queue_event(), if the raw_queue_event() return error, invoke
raw_release() will not trigger the dev_free() to be called.
[ 268.905865][ T5067] raw-gadget.0 gadget.0: failed to queue event
[ 268.912053][ T5067] udc dummy_udc.0: failed to start USB Raw Gadget: -12
[ 268.918885][ T5067] raw-gadget.0: probe of gadget.0 failed with error -12
[ 268.925956][ T5067] UDC core: USB Raw Gadget: couldn't find an available UDC or it's busy
[ 268.934657][ T5067] misc raw-gadget: fail, usb_gadget_register_driver returned -16
usb: gadget: core: remove unbalanced mutex_unlock in usb_gadget_activate
Commit 286d9975a838 ("usb: gadget: udc: core: Prevent soft_connect_store() race")
introduced one extra mutex_unlock of connect_lock in the usb_gadget_active function.
AutoRetry has been found to sometimes cause controller freezes when
communicating with buggy USB devices.
This controller feature allows the controller in host mode to send
non-terminating/burst retry ACKs instead of terminating retry ACKs
to devices when a transaction error (CRC error or overflow) occurs.
Unfortunately, if the USB device continues to respond with a CRC error,
the controller will not complete endpoint-related commands while it
keeps trying to auto-retry. [3] The xHCI driver will notice this once
it tries to abort the transfer using a Stop Endpoint command and
does not receive a completion in time. [1]
This situation is reported to dmesg:
[sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
xhci-hcd: xHCI host not responding to stop endpoint command
xhci-hcd: xHCI host controller not responding, assume dead
xhci-hcd: HC died; cleaning up
Some users observed this problem on an Odroid HC2 with the JMS578
USB3-to-SATA bridge. The issue can be triggered by starting
a read-heavy workload on an attached SSD. After a while, the host
controller would die and the SSD would disappear from the system. [1]
Further analysis by Synopsys determined that controller revisions
other than the one in Odroid HC2 are also affected by this.
The recommended solution was to disable AutoRetry altogether.
This change does not have a noticeable performance impact. [2]
Revert the enablement commit. This will keep the AutoRetry bit in
the default state configured during SoC design [2].
[ 0.754373] pci 0000:0b:00.0: xHCI HW did not halt within 32000 usec status = 0x1000
[ 0.754419] pci 0000:0b:00.0: quirk_usb_early_handoff+0x0/0x7a0 took 31459 usecs
[ 2.228048] xhci_hcd 0000:0b:00.0: xHCI Host Controller
[ 2.228053] xhci_hcd 0000:0b:00.0: new USB bus registered, assigned bus number 7
[ 2.260073] xhci_hcd 0000:0b:00.0: Host halt failed, -110
[ 2.260079] xhci_hcd 0000:0b:00.0: can't setup: -110
[ 2.260551] xhci_hcd 0000:0b:00.0: USB bus 7 deregistered
[ 2.260624] xhci_hcd 0000:0b:00.0: init 0000:0b:00.0 fail, -110
[ 2.260639] xhci_hcd: probe of 0000:0b:00.0 failed with error -110
The hardware in question is an external PCIe card. It looks to me like the quirk
needs to be narrowed down. But this needs information about the hardware showing
the issue this quirk is to fix. So for now a clean revert.
To fix the condition a user has to unplug and plug the device again.
With USB_QUIRK_RESET_RESUME applied ("usbcore.quirks=1235:8211:b")
for the Scarlett audio device the issue still reproduces.
Applying USB_QUIRK_DISCONNECT_SUSPEND ("usbcore.quirks=1235:8211:m")
fixed the issue and the Scarlett audio device didn't drop off the USB
bus for ~5000 suspend/resume cycles where originally issue reproduced in
~100 or less suspend/resume cycles.
Jisheng Zhang [Tue, 27 Jun 2023 16:20:18 +0000 (00:20 +0800)]
usb: dwc3: don't reset device side if dwc3 was configured as host-only
Commit c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on
system_suspend in host mode") replaces check for HOST only dr_mode with
current_dr_role. But during booting, the current_dr_role isn't
initialized, thus the device side reset is always issued even if dwc3
was configured as host-only. What's more, on some platforms with host
only dwc3, aways issuing device side reset by accessing device register
block can cause kernel panic.
Neil Armstrong [Mon, 26 Jun 2023 16:52:00 +0000 (18:52 +0200)]
usb: typec: ucsi: move typec_set_mode(TYPEC_STATE_SAFE) to ucsi_unregister_partner()
It's better to set TYPEC_STATE_SAFE mode from ucsi_unregister_partner()
instead of ucsi_partner_change(), ucsi_unregister_partner() is always
when the partner disconnects.
Guiting Shen [Mon, 26 Jun 2023 15:27:13 +0000 (23:27 +0800)]
usb: ohci-at91: Fix the unhandle interrupt when resume
The ohci_hcd_at91_drv_suspend() sets ohci->rh_state to OHCI_RH_HALTED when
suspend which will let the ohci_irq() skip the interrupt after resume. And
nobody to handle this interrupt.
According to the comment in ohci_hcd_at91_drv_suspend(), it need to reset
when resume from suspend(MEM) to fix by setting "hibernated" argument of
ohci_resume().
platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
Only the HW rfkill state is toggled on laptops with quirks->ec_read_only
(so far only MSI Wind U90/U100). There are, however, a few issues with
the implementation:
1. The initial HW state is always unblocked, regardless of the actual
state on boot, because msi_init_rfkill only sets the SW state,
regardless of ec_read_only.
2. The initial SW state corresponds to the actual state on boot, but it
can't be changed afterwards, because set_device_state returns
-EOPNOTSUPP. It confuses the userspace, making Wi-Fi and/or Bluetooth
unusable if it was blocked on boot, and breaking the airplane mode if
the rfkill was unblocked on boot.
Address the above issues by properly initializing the HW state on
ec_read_only laptops and by allowing the userspace to toggle the SW
state. Don't set the SW state ourselves and let the userspace fully
control it. Toggling the SW state is a no-op, however, it allows the
userspace to properly toggle the airplane mode. The actual SW radio
disablement is handled by the corresponding rtl818x_pci and btusb
drivers that have their own rfkills.
Tested on MSI Wind U100 Plus, BIOS ver 1.0G, EC ver 130.
platform/x86/intel/hid: Add HP Dragonfly G2 to VGBS DMI quirks
HP Elite Dragonfly G2 (a convertible laptop/tablet) has a reliable VGBS
method. If VGBS is not called on boot, the firmware sends an initial
0xcd event shortly after calling the BTNL method, but only if the device
is booted in the laptop mode. However, if the device is booted in the
tablet mode and VGBS is not called, there is no initial 0xcc event, and
the input device for SW_TABLET_MODE is not registered up until the user
turns the device into the laptop mode.
Call VGBS on boot on this device to get the initial state of
SW_TABLET_MODE in a reliable way.
On a HP Elite Dragonfly G2 the 0xcc and 0xcd events for SW_TABLET_MODE
are only send after the BTNL ACPI method has been called.
Likely more devices need this, so make the BTNL ACPI method unconditional
instead of only doing it on devices with a 5 button array.
Note this also makes the intel_button_array_enable() call in probe()
unconditional, that function does its own priv->array check. This makes
the intel_button_array_enable() call in probe() consistent with the calls
done on suspend/resume which also rely on the priv->array check inside
the function.
Shyam Sundar S K [Fri, 14 Jul 2023 14:44:35 +0000 (20:14 +0530)]
platform/x86/amd/pmf: Notify OS power slider update
APMF fn8 can notify EC about the OS slider position change. Add this
capability to the PMF driver so that it can call the APMF fn8 based on
the changes in the Platform profile events.
ALSA: usb-audio: Add quirk for Microsoft Modern Wireless Headset
Microsoft Modern Wireless Headset (appearing on the host as "Microsoft
USB Link") has a playback and a capture mixer volume/switch, but they
are fairly broken. The descriptor reports wrong dB ranges for
playback, and the capture volume/switch don't influence on the actual
recording at all. Moreover, there seem instabilities in the
connection, and at best, we should disable the runtime PM.
So this ended up with a quirk entry for:
- Correct the playback dB range;
I picked up some reasonable values but it's a guess work
- Disable the capture mixer;
it's completely useless and confuses PA/PW
- Suppress get-sample-rate, apply the delay for message handling,
and suppress the auto-suspend
The behavior of the wheel control on the headset is somehow flaky,
too, but it's an issue of HID.
ASoC: atmel: Fix the 8K sample parameter in I2SC master
The 8K sample parameter of 12.288Mhz main system bus clock doesn't work
because the I2SC_MR.IMCKDIV must not be 0 according to the sama5d2
series datasheet(I2SC Mode Register of Register Summary).
So use the 6.144Mhz instead of 12.288Mhz to support 8K sample.
Shuming Fan [Fri, 21 Jul 2023 09:07:11 +0000 (17:07 +0800)]
ASoC: rt711-sdca: fix for JD event handling in ClockStop Mode0
When the system suspends, peripheral SDCA interrupts are disabled.
When system level resume is invoked, the peripheral SDCA interrupts
should be enabled to handle JD events.
Enable SDCA interrupts in resume sequence when ClockStop Mode0 is applied.
Shuming Fan [Fri, 21 Jul 2023 09:06:54 +0000 (17:06 +0800)]
ASoC: rt711: fix for JD event handling in ClockStop Mode0
When the system suspends, peripheral Imp-defined interrupt is disabled.
When system level resume is invoked, the peripheral Imp-defined interrupts
should be enabled to handle JD events.
Shuming Fan [Fri, 21 Jul 2023 09:07:32 +0000 (17:07 +0800)]
ASoC: rt722-sdca: fix for JD event handling in ClockStop Mode0
When the system suspends, peripheral SDCA interrupts are disabled.
When system level resume is invoked, the peripheral SDCA interrupts
should be enabled to handle JD events.
Enable SDCA interrupts in resume sequence when ClockStop Mode0 is applied.
Shuming Fan [Fri, 21 Jul 2023 09:07:21 +0000 (17:07 +0800)]
ASoC: rt712-sdca: fix for JD event handling in ClockStop Mode0
When the system suspends, peripheral SDCA interrupts are disabled.
When system level resume is invoked, the peripheral SDCA interrupts
should be enabled to handle JD events.
Enable SDCA interrupts in resume sequence when ClockStop Mode0 is applied.
Shuming Fan [Fri, 21 Jul 2023 09:06:43 +0000 (17:06 +0800)]
ASoC: rt5682-sdw: fix for JD event handling in ClockStop Mode0
When the system suspends, peripheral Imp-defined interrupt is disabled.
When system level resume is invoked, the peripheral Imp-defined interrupts
should be enabled to handle JD events.
net: stmmac: Apply redundant write work around on 4.xx too
commit a3a57bf07de23fe1ff779e0fdf710aa581c3ff73 ("net: stmmac: work
around sporadic tx issue on link-up") worked around a problem with TX
sometimes not working after a link-up by avoiding a redundant write to
MAC_CTRL_REG (aka GMAC_CONFIG), since the IP appeared to have problems
with handling multiple writes to that register in some cases.
That commit however only added the work around to dwmac_lib.c (apart
from the common code in stmmac_main.c), but my systems with version
4.21a of the IP exhibit the same problem, so add the work around to
dwmac4_lib.c too.
As of today, hash extraction support is enabled for all the silicons.
Because of which we are facing initialization issues when the silicon
does not support hash extraction. During creation of the hardware
parsing table for IPv6 address, we need to consider if hash extraction
is enabled then extract only 32 bit, otherwise 128 bit needs to be
extracted. This patch fixes the issue and configures the hardware parser
based on the availability of the feature.
Dpt objects that are created from internal get evicted when there is
memory pressure and do not get restored when pinned during scanout. The
pinned page table entries look corrupted and programming the display
engine with the incorrect pte's result in DE throwing pipe faults.
Create DPT objects from shmem and mark the object as dirty when pinning so
that the object is restored when shrinker evicts an unpinned buffer object.
v2: Unconditionally mark the dpt objects dirty during pinning(Chris).
====================
Fix up dev flags when add P2P down link
When adding p2p interfaces to bond/team. The POINTOPOINT, NOARP flags are
not inherit to up devices. Which will trigger IPv6 DAD. Since there is
no ethernet MAC address for P2P devices. This will cause unexpected DAD
failures.
====================
Hangbin Liu [Fri, 21 Jul 2023 04:03:56 +0000 (12:03 +0800)]
team: reset team's flags when down link is P2P device
When adding a point to point downlink to team device, we neglected to reset
the team's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to team device. Fix this by
remove multicast/broadcast flags and add p2p and noarp flags.
After removing the none ethernet interface and adding an ethernet interface
to team, we need to reset team interface flags. Unlike bonding interface,
team do not need restore IFF_MASTER, IFF_SLAVE flags.
Reported-by: Liang Li <[email protected]> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 1d76efe1577b ("team: add support for non-ethernet devices") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
Hangbin Liu [Fri, 21 Jul 2023 04:03:55 +0000 (12:03 +0800)]
bonding: reset bond's flags when down link is P2P device
When adding a point to point downlink to the bond, we neglected to reset
the bond's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to the bonding.
To address this issue, let's reset the bond's flags for P2P interfaces.
Before fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188::
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/gre6 2006:70:10::1 brd 2006:70:10::2
inet6 fe80::200:ff:fe00:0/64 scope link
valid_lft forever preferred_lft forever
After fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9::
8: bond0: <POINTOPOINT,NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2
inet6 fe80::1/64 scope link
valid_lft forever preferred_lft forever
Reported-by: Liang Li <[email protected]> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
Steve French [Tue, 25 Jul 2023 06:05:23 +0000 (01:05 -0500)]
smb3: do not set NTLMSSP_VERSION flag for negotiate not auth request
The NTLMSSP_NEGOTIATE_VERSION flag only needs to be sent during
the NTLMSSP NEGOTIATE (not the AUTH) request, so filter it out for
NTLMSSP AUTH requests. See MS-NLMP 2.2.1.3
This fixes a problem found by the gssntlmssp server.
We need to specify charset, like "iocharset=utf-8", in mount options for
Chinese path if the nls_default don't support it, such as iso8859-1, the
default value for CONFIG_NLS_DEFAULT.
But now in reconnection the nls_default is used, instead of the one we
specified and used in mount, and this can lead to mount failure.
Jakub Kicinski [Tue, 25 Jul 2023 00:12:06 +0000 (17:12 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-07-21 (i40e, iavf)
This series contains updates to i40e and iavf drivers.
Wang Ming corrects an error check on i40e.
Jake unlocks crit_lock on allocation failure to prevent deadlock and
stops re-enabling of interrupts when it's not intended for iavf.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: check for removal state before IAVF_FLAG_PF_COMMS_FAILED
iavf: fix potential deadlock on allocation failure
i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
====================
Jakub Kicinski [Tue, 25 Jul 2023 00:10:10 +0000 (17:10 -0700)]
Merge tag 'linux-can-fixes-for-6.5-20230724' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2023-07-24
The first patch is by me and adds a missing set of CAN state to
CAN_STATE_STOPPED on close in the gs_usb driver.
The last patch is by Eric Dumazet and fixes a lockdep issue in the CAN
raw protocol.
* tag 'linux-can-fixes-for-6.5-20230724' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: raw: fix lockdep issue in raw_release()
can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
====================
Fix ethtool FDIR logic to not use memory after its release.
In the ice_ethtool_fdir.c file there are 2 spots where code can
refer to pointers which may be missing.
In the ice_cfg_fdir_xtrct_seq() function seg may be freed but
even then may be still used by memcpy(&tun_seg[1], seg, sizeof(*seg)).
In the ice_add_fdir_ethtool() function struct ice_fdir_fltr *input
may first fail to be added via ice_fdir_update_list_entry() but then
may be deleted by ice_fdir_update_list_entry.
Terminate in both cases when the returned value of the previous
operation is other than 0, free memory and don't use it anymore.
Stewart Smith [Fri, 21 Jul 2023 22:24:10 +0000 (15:24 -0700)]
tcp: Reduce chance of collisions in inet6_hashfn().
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash
table with a hash over the source & destination addresses and ports.
However, the IPv6 hash is insufficient and can lead to a high rate of
collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the
fast jenkins hash, meaning it is possible for an external entity to
ensure the hash collides, thus falling back to a linear search in the
bucket, which is slow.
We take the approach of hash the full length of IPv6 address in
__ipv6_addr_jhash() so that all users can benefit from a more secure
version.
While this may look like it adds overhead, the reality of modern CPUs
means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing
code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9
on Nehalem (base of ~173).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash")
netfilter switched from a jenkins hash to a siphash, but even the faster
hsiphash is a more significant overhead (~20-30%) in some preliminary
testing. So, in this patch, we keep to the more conservative approach to
ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the
connection buckets. In both testing and real-world scenarios, we have
not found any measurable performance impact.
net: fec: avoid tx queue timeout when XDP is enabled
According to the implementation of XDP of FEC driver, the XDP path
shares the transmit queues with the kernel network stack, so it is
possible to lead to a tx timeout event when XDP uses the tx queue
pretty much exclusively. And this event will cause the reset of the
FEC hardware.
To avoid timeout in this case, we use the txq_trans_cond_update()
interface to update txq->trans_start to jiffies so that watchdog
won't generate a transmit timeout warning.
ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
currently on 6.4 net/main:
# ip link add dummy1 type dummy
# echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr
# ip link set dummy1 up
# ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic
valid_lft 604800sec preferred_lft 86172sec
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1
(can wait a few seconds if you want to, the above delete isn't [directly] the problem)
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic
valid_lft 604797sec preferred_lft 86169sec
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
This patch prevents this new 'global temporary dynamic' address from being
created by the deletion of the related (same subnet prefix) 'mngtmpaddr'
(which is triggered by there already being no temporary addresses).
Mark Brown [Sat, 22 Jul 2023 23:27:22 +0000 (00:27 +0100)]
ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
The WM8904_ADC_TEST_0 register is modified as part of updating the OSR
controls but does not have a cache default, leading to errors when we try
to modify these controls in cache only mode with no prior read:
wm8904 3-001a: ASoC: error at snd_soc_component_update_bits on wm8904.3-001a for register: [0x000000c6] -16
Add a read of the register to probe() to fill the cache and avoid both the
error messages and the misconfiguration of the chip which will result.
This series includes 2 patches related to (but not fixing) the following
I2C failure which occurs sometimes during system suspend or resume and
indicates a problem with a spurious DA7219 interrupt:
This log shows that DA7219 AAD interrupt handler da7219_aad_irq_thread()
is unexpectedly running when DA7219 is suspended and should not generate
interrupts. As a result, the IRQ handler is trying to read AAD IRQ event
status over I2C and is hitting the I2C driver "Transfer while suspended"
failure.
Patch #1 adds synchronize_irq() when suspending DA7219, to prevent the
IRQ handler from running after suspending if there is a pending IRQ
generated before suspending. With this patch the above failure is still
reproducible, so this patch does not fix any real observed issue so far,
but at least is useful for confirming that the above issue is not caused
by a pending IRQ but rather looks like a DA7219 hardware issue with an
unexpectedly generated IRQ.
Patch #2 does not fix the above issue either, but it prevents its
potentially harmful side effects. With the existing code, if the issue
occurs and the IRQ handler fails to read the AAD IRQ events status over
I2C, it does not check that and tries to use the garbage uninitialized
value of the events status, potentially reporting bogus events. This
patch fixes that by adding missing error checking.
In fact I'm sending these patches not only to submit them for review but
also to ask Renesas folks for any hints on a possible cause of the
described DA7219 issue (AAD interrupts spuriously firing after jack
detection is already disabled) or how to debug it further.
Stefan Haberland [Fri, 21 Jul 2023 19:36:47 +0000 (21:36 +0200)]
s390/dasd: print copy pair message only for the correct error
The DASD driver has certain types of requests that might be rejected by
the storage server or z/VM because they are not supported. Since the
missing support of the command is not a real issue there is no user
visible kernel error message for this.
For copy pair setups there is a specific error that IO is not allowed on
secondary devices. This error case is explicitly handled and an error
message is printed.
The code checking for the error did use a bitwise 'and' that is used to
check for specific bits. But in this case the whole sense byte has to
match.
This leads to the problem that the copy pair related error message is
erroneously printed for other error cases that are usually not reported.
This might heavily confuse users and lead to follow on actions that might
disrupt application processing.
Fix by checking the sense byte for the exact value and not single bits.
Stefan Haberland [Fri, 21 Jul 2023 19:36:46 +0000 (21:36 +0200)]
s390/dasd: fix hanging device after request requeue
The DASD device driver has a function to requeue requests to the
blocklayer.
This function is used in various cases when basic settings for the device
have to be changed like High Performance Ficon related parameters or copy
pair settings.
The functions iterates over the device->ccw_queue and also removes the
requests from the block->ccw_queue.
In case the device is started on an alias device instead of the base
device it might be removed from the block->ccw_queue without having it
canceled properly before. This might lead to a hanging device since the
request is no longer on a queue and can not be handled properly.
Fix by iterating over the block->ccw_queue instead of the
device->ccw_queue. This will take care of all blocklayer related requests
and handle them on all associated DASD devices.
Stefan Haberland [Fri, 21 Jul 2023 19:36:45 +0000 (21:36 +0200)]
s390/dasd: use correct number of retries for ERP requests
If a DASD request fails an error recovery procedure (ERP) request might
be built as a copy of the original request to do error recovery.
The ERP request gets a number of retries assigned.
This number is always 256 no matter what other value might have been set
for the original request. This is not what is expected when a user
specifies a certain amount of retries for the device via sysfs.
Correctly use the number of retries of the original request for ERP
requests.
Stefan Haberland [Fri, 21 Jul 2023 19:36:44 +0000 (21:36 +0200)]
s390/dasd: fix hanging device after quiesce/resume
Quiesce and resume are functions that tell the DASD driver to stop/resume
issuing I/Os to a specific DASD.
On resume dasd_schedule_block_bh() is called to kick handling of IO
requests again. This does unfortunately not cover internal requests which
are used for path verification for example.
This could lead to a hanging device when a path event or anything else
that triggers internal requests occurs on a quiesced device.
Fix by also calling dasd_schedule_device_bh() which triggers handling of
internal requests on resume.
io_uring: gate iowait schedule on having pending requests
A previous commit made all cqring waits marked as iowait, as a way to
improve performance for short schedules with pending IO. However, for
use cases that have a special reaper thread that does nothing but
wait on events on the ring, this causes a cosmetic issue where we
know have one core marked as being "busy" with 100% iowait.
While this isn't a grave issue, it is confusing to users. Rather than
always mark us as being in iowait, gate setting of current->in_iowait
to 1 by whether or not the waiting task has pending requests.
The pidfd_getfd() system call allows a caller with ptrace_may_access()
abilities on another process to steal a file descriptor from this
process. This system call is used by debuggers, container runtimes,
system call supervisors, networking proxies etc. So while it is a
special interest system call it is used in common tools.
That ability ends up breaking our long-time optimization in fdget_pos(),
which "knew" that if we had exclusive access to the file descriptor
nobody else could access it, and we didn't need the lock for the file
position.
That check for file_count(file) was always fairly subtle - it depended
on __fdget() not incrementing the file count for single-threaded
processes and thus included that as part of the rule - but it did mean
that we didn't need to take the lock in all those traditional unix
process contexts.
So it's sad to see this go, and I'd love to have some way to re-instate
the optimization. At the same time, the lock obviously isn't ever
contended in the case we optimized, so all we were optimizing away is
the atomics and the cacheline dirtying. Let's see if anybody even
notices that the optimization is gone.
Filipe Manana [Fri, 21 Jul 2023 09:49:20 +0000 (10:49 +0100)]
btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
At btrfs_wait_for_commit() we wait for a transaction to finish and then
always return 0 (success) without checking if it was aborted, in which
case the transaction didn't happen due to some critical error. Fix this
by checking if the transaction was aborted.
Filipe Manana [Fri, 30 Jun 2023 15:03:44 +0000 (16:03 +0100)]
btrfs: remove BUG_ON()'s in add_new_free_space()
At add_new_free_space() we have these BUG_ON()'s that are there to deal
with any failure to add free space to the in memory free space cache.
Such failures are mostly -ENOMEM that should be very rare. However there's
no need to have these BUG_ON()'s, we can just return any error to the
caller and all callers and their upper call chain are already dealing with
errors.
So just make add_new_free_space() return any errors, while removing the
BUG_ON()'s, and returning the total amount of added free space to an
optional u64 pointer argument.
Merge tag 'x86_bugs_zenbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Zen 2 errata fix from Borislav Petkov:
"Fix an issue on AMD Zen2 processors called Zenbleed.
The bug manifests itself as a data corruption issue when executing
VZEROUPPER under certain microarchitectural conditions"
* tag 'x86_bugs_zenbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/amd: Add a Zenbleed fix
x86/cpu/amd: Move the errata checking functionality up
hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
Because of hex value 0x46 used instead of decimal 46, the temp6
(PECI1) temperature is always declared visible and then displayed
even if disabled in the chip
Ben Hutchings [Fri, 16 Jun 2023 15:36:10 +0000 (17:36 +0200)]
m68k: Fix invalid .section syntax
gas supports several different forms for .section for ELF targets,
including:
.section NAME [, "FLAGS"[, @TYPE[,FLAG_SPECIFIC_ARGUMENTS]]]
and:
.section "NAME"[, #FLAGS...]
In several places we use a mix of these two forms:
.section NAME, #FLAGS...
A current development snapshot of binutils (2.40.50.20230611) treats
this mixed syntax as an error.
Change to consistently use:
.section NAME, "FLAGS"
as is used elsewhere in the kernel.
phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
The size of array 'priv->ports[]' is INNO_PHY_PORT_NUM.
In the for loop, 'i' is used as the index for array 'priv->ports[]'
with a check (i > INNO_PHY_PORT_NUM) which indicates that
INNO_PHY_PORT_NUM is allowed value for 'i' in the same loop.
This > comparison needs to be changed to >=, otherwise it potentially leads
to an out of bounds write on the next iteration through the loop
David S. Miller [Mon, 24 Jul 2023 09:47:09 +0000 (10:47 +0100)]
Merge branch 'vxlan-gro-fixes'
Jiri Benc says:
====================
vxlan: fix GRO with VXLAN-GPE
The first patch generalizes code for the second patch, which is a fix for
broken VXLAN-GPE GRO. Thanks to Paolo for noticing the bug.
====================
In VXLAN-GPE, there may not be an Ethernet header following the VXLAN
header. But in GRO, the vxlan driver calls eth_gro_receive
unconditionally, which means the following header is incorrectly parsed
as Ethernet.
Introduce GPE specific GRO handling.
For better performance, do not check for GPE during GRO but rather
install a different set of functions at setup time.
vxlan: generalize vxlan_parse_gpe_hdr and remove unused args
The vxlan_parse_gpe_hdr function extracts the next protocol value from
the GPE header and marks GPE bits as parsed.
In order to be used in the next patch, split the function into protocol
extraction and bit marking. The bit marking is meaningful only in
vxlan_rcv; move it directly there.
Rename the function to vxlan_parse_gpe_proto to reflect what it now
does. Remove unused arguments skb and vxflags. Move the function earlier
in the file to allow it to be called from more places in the next patch.
VXLAN-GPE does not add an extra inner Ethernet header. Take that into
account when calculating header length.
This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is
cached.
In the collect_md mode (which is the only mode that VXLAN-GPE
supports), there's no magic auto-setting of the tunnel interface MTU.
It can't be, since the destination and thus the underlying interface
may be different for each packet.
So, the administrator is responsible for setting the correct tunnel
interface MTU. Apparently, the administrators are capable enough to
calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36).
They set the tunnel interface MTU to 1464. If you run a TCP stream over
such interface, it's then segmented according to the MTU 1464, i.e.
producing 1514 bytes frames. Which is okay, this still fits the lower
MTU.
However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as
the header size and thus incorrectly calculates the frame size to be
1528. This leads to ICMP too big message being generated (locally),
PMTU of 1450 to be cached and the TCP stream to be resegmented.
The fix is to use the correct actual header size, especially for
skb_tunnel_check_pmtu calculation.
net: hns3: fix wrong bw weight of disabled tc issue
In dwrr mode, the default bandwidth weight of disabled tc is set to 0.
If the bandwidth weight is 0, the mode will change to sp.
Therefore, disabled tc default bandwidth weight need changed to 1,
and 0 is returned when query the bandwidth weight of disabled tc.
In addition, driver need stop configure bandwidth weight if tc is disabled.
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Jie Wang <[email protected]> Signed-off-by: Jijie Shao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net: hns3: fix wrong tc bandwidth weight data issue
Currently, the weight saved by the driver is used as the query result,
which may be different from the actual weight in the register.
Therefore, the register value read from the firmware is used
as the query result
Fixes: 0e32038dc856 ("net: hns3: refactor dump tc of debugfs") Signed-off-by: Jijie Shao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Hao Lan [Thu, 20 Jul 2023 02:05:08 +0000 (10:05 +0800)]
net: hns3: add tm flush when setting tm
When the tm module is configured with traffic, traffic
may be abnormal. This patch fixes this problem.
Before the tm module is configured, traffic processing
should be stopped. After the tm module is configured,
traffic processing is enabled.
Hao Lan [Thu, 20 Jul 2023 02:05:07 +0000 (10:05 +0800)]
net: hns3: fix the imp capability bit cannot exceed 32 bits issue
Current only the first 32 bits of the capability flag bit are considered.
When the matching capability flag bit is greater than 31 bits,
it will get an error bit.This patch use bitmap to solve this issue.
It can handle each capability bit whitout bit width limit.
Fixes: da77aef9cc58 ("net: hns3: create common cmdq resource allocate/free/query APIs") Signed-off-by: Hao Lan <[email protected]> Signed-off-by: Jijie Shao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Thomas Gleixner [Mon, 24 Jul 2023 08:27:43 +0000 (10:27 +0200)]
Merge tag 'irqchip-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:
- Work around an erratum on GIC700, where a race between a CPU
handling a wake-up interrupt, a change of affinity, and another
CPU going to sleep can result in a lack of wake-up event on the
next interrupt.
- Fix the locking required on a VPE for GICv4
- Enable Rockchip 3588001 erratum workaround for RK3588S
- Fix the irq-bcm6345-l1 assumtions of the boot CPU always be
the first CPU in the system
Johan Hovold [Thu, 13 Jul 2023 14:57:41 +0000 (16:57 +0200)]
serial: qcom-geni: drop bogus runtime pm state update
The runtime PM state should not be changed by drivers that do not
implement runtime PM even if it happens to work around a bug in PM core.
With the wake irq arming now fixed, drop the bogus runtime PM state
update which left the device in active state (and could potentially
prevent a parent device from suspending).
Johan Hovold [Thu, 13 Jul 2023 14:57:40 +0000 (16:57 +0200)]
PM: sleep: wakeirq: drop unused enable helpers
Drop the wake-irq enable and disable helpers which have not been used
since commit bed570307ed7 ("PM / wakeirq: Fix dedicated wakeirq for
drivers not using autosuspend").
Note that these functions are essentially just leftovers from the first
iteration of the wake-irq implementation where device drivers were
supposed to call these functions themselves instead of PM core (as
is also indicated by the bogus kernel doc comments).
Johan Hovold [Thu, 13 Jul 2023 14:57:39 +0000 (16:57 +0200)]
PM: sleep: wakeirq: fix wake irq arming
The decision whether to enable a wake irq during suspend can not be done
based on the runtime PM state directly as a driver may use wake irqs
without implementing runtime PM. Such drivers specifically leave the
state set to the default 'suspended' and the wake irq is thus never
enabled at suspend.
Add a new wake irq flag to track whether a dedicated wake irq has been
enabled at runtime suspend and therefore must not be enabled at system
suspend.
Note that pm_runtime_enabled() can not be used as runtime PM is always
disabled during late suspend.
Ahmad Fatoum [Sat, 8 Jul 2023 11:27:20 +0000 (13:27 +0200)]
thermal: of: fix double-free on unregistration
Since commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal
zone parameters structure"), thermal_zone_device_register() allocates
a copy of the tzp argument and frees it when unregistering, so
thermal_of_zone_register() now ends up leaking its original tzp and
double-freeing the tzp copy. Fix this by locating tzp on stack instead.
Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure") Signed-off-by: Ahmad Fatoum <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Cc: 6.4+ <[email protected]> # 6.4+: 8bcbb18c61d6: thermal: core: constify params in thermal_zone_device_register Signed-off-by: Rafael J. Wysocki <[email protected]>
Ahmad Fatoum [Sat, 8 Jul 2023 11:27:19 +0000 (13:27 +0200)]
thermal: core: constify params in thermal_zone_device_register
Since commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone
parameters structure"), thermal_zone_device_register() allocates a copy
of the tzp argument and callers need not explicitly manage its lifetime.
This means the function no longer cares about the parameter being
mutable, so constify it.
mm,memblock: reset memblock.reserved to system init state to prevent UAF
The memblock_discard function frees the memblock.reserved.regions
array, which is good.
However, if a subsequent memblock_free (or memblock_phys_free) comes
in later, from for example ima_free_kexec_buffer, that will result in
a use after free bug in memblock_isolate_range.
When running a kernel with CONFIG_KASAN enabled, this will cause a
kernel panic very early in boot. Without CONFIG_KASAN, there is
a chance that memblock_isolate_range might scribble on memory
that is now in use by somebody else.
Avoid those issues by making sure that memblock_discard points
memblock.reserved.regions back at the static buffer.
If memblock_free is called after memblock memory is discarded, that will
print a warning in memblock_remove_region.
After syncing the enable status of VCN33_WIFI to VCN33_BT, the driver
will disable VCN33_WIFI. If it fails it will error out with a message.
However the error message incorrectly refers to VCN33_BT.
Fix the error message so that it correctly refers to VCN33_WIFI.
ASoC: da7219: Check for failure reading AAD IRQ events
When handling an AAD interrupt, if IRQ events read failed (for example,
due to i2c "Transfer while suspended" failure, i.e. when attempting to
read it while DA7219 is suspended, which may happen due to a spurious
AAD interrupt), the events array contains garbage uninitialized values.
So instead of trying to interprete those values and doing any actions
based on them (potentially resulting in misbehavior, e.g. reporting
bogus events), refuse to handle the interrupt.
ASoC: da7219: Flush pending AAD IRQ when suspending
da7219_aad_suspend() disables jack detection, which should prevent
generating new interrupts by DA7219 while suspended. However, there is a
theoretical possibility that there is a pending interrupt generated just
before suspending DA7219 and not handled yet, so the IRQ handler may
still run after DA7219 is suspended. To prevent that, wait until the
pending IRQ handling is done.
This patch arose as an attempt to fix the following I2C failure
occurring sometimes during system suspend or resume:
which indicates that the AAD IRQ handler is unexpectedly running when
DA7219 is suspended, and as a result, is trying to read data from DA7219
over I2C and is hitting the I2C driver "Transfer while suspended"
failure.
However, with this patch the above failure is still reproducible. So
this patch does not fix any real observed issue so far, but at least is
useful for confirming that the above issue is not caused by a pending
IRQ but rather looks like a DA7219 hardware issue with an IRQ
unexpectedly generated after jack detection is already disabled.
Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Swapping the ring buffer for snapshotting (for things like irqsoff)
can crash if the ring buffer is being resized. Disable swapping when
this happens. The missed swap will be reported to the tracer
- Report error if the histogram fails to be created due to an error in
adding a histogram variable, in event_hist_trigger_parse()
- Remove unused declaration of tracing_map_set_field_descr()
* tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
ring-buffer: Do not swap cpu_buffer during resize process
tracing: Remove unused extern declaration tracing_map_set_field_descr()
Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix stale help text in gconfig
- Support *.S files in compile_commands.json
- Flatten KBUILD_CFLAGS
- Fix external module builds with Rust so that temporary files are
created in the modules directories instead of the kernel tree
* tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: rust: avoid creating temporary files
kbuild: flatten KBUILD_CFLAGS
gen_compile_commands: add assembly files to compilation database
kconfig: gconfig: correct program name in help text
kconfig: gconfig: drop the Show Debug Info help text