Git Repo - linux.git/log

entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up

RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB). These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).

However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.

Fix this by fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.
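
As a rough userspace illustration of the ordering problem (this is not the
kernel code; every name prefixed with sketch_ is hypothetical), the sketch
below models a deferred wake-up that sets a reschedule flag. Sampling the
flags before the wake-up misses the flag, sampling after sees it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the reschedule thread flag. */
#define SKETCH_NEED_RESCHED 0x1u

struct sketch_task {
	unsigned int flags;	/* simulated thread flags */
	bool wakeup_pending;	/* simulated deferred kthread wake-up */
};

/* Simulates the delayed wake-up: waking the kthread requests a reschedule. */
static void sketch_flush_wakeup(struct sketch_task *t)
{
	assert(t != NULL);
	if (t->wakeup_pending) {
		t->wakeup_pending = false;
		t->flags |= SKETCH_NEED_RESCHED;
	}
}

/* Buggy order: the flags are sampled before the wake-up can set them. */
static bool sketch_exit_work_buggy(struct sketch_task *t)
{
	unsigned int flags = t->flags;

	sketch_flush_wakeup(t);
	return (flags & SKETCH_NEED_RESCHED) != 0;
}

/* Fixed order: flush the wake-up first, then sample the flags. */
static bool sketch_exit_work_fixed(struct sketch_task *t)
{
	sketch_flush_wakeup(t);
	return (t->flags & SKETCH_NEED_RESCHED) != 0;
}
```

In the buggy variant the flag ends up set on the task but the caller never
notices it, which mirrors the symptom described above.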

Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

perf/x86/amd/core: Always clear status for idx

The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:

WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270

This seems to happen because the loop continues before the status bit
is cleared when x86_perf_event_set_period() returns 0. It also causes
an inconsistency: the "handled" counter is incremented, but the status
bit is not cleared.

Move the bit clearing up, next to where the "handled" counter is
incremented.
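
A minimal sketch of the intended loop shape, in plain userspace C rather
than the driver code. The function name and the set_period_ok bitmask
(standing in for the per-counter result of x86_perf_event_set_period())
are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical overflow loop. A clear bit in set_period_ok means the
 * set-period step returned 0 for that counter.
 * Returns the status bits that were left unhandled.
 */
static uint64_t sketch_handle_overflow(uint64_t status, uint64_t set_period_ok,
				       int *handled)
{
	assert(handled != NULL);

	for (int idx = 0; idx < 64; idx++) {
		uint64_t mask = UINT64_C(1) << idx;

		if (!(status & mask))
			continue;

		/* Count and clear together, before any early continue. */
		(*handled)++;
		status &= ~mask;

		if (!(set_period_ok & mask))
			continue;

		/* A real handler would process the overflow sample here. */
	}

	return status;
}
```

With the clear placed next to the increment, the returned status is zero
whenever every set bit was visited, regardless of which counters took the
early continue.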

Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sandipan Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

sched/fair: Sanitize vruntime of entity being migrated

Commit 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
fixed an overflow bug, but ignored the case where se->exec_start is reset
after a migration.

To fix this case, delay the reset of se->exec_start until after the
entity has been placed, since placement uses se->exec_start to detect
long-sleeping tasks.

In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.
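
The message does not spell out how the threshold is derived. Assuming it
is 2^63 ns divided by a nice-0 load weight of 1024 (an assumption made
for this sketch, not something stated above), the arithmetic works out to
about 104 days:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed threshold: 2^63 ns scaled down by a load weight of 1024. */
static uint64_t sketch_threshold_ns(void)
{
	return (UINT64_C(1) << 63) / 1024;
}

/* Whole days contained in a nanosecond count. */
static uint64_t sketch_ns_to_days(uint64_t ns)
{
	const uint64_t ns_per_day = UINT64_C(86400) * UINT64_C(1000000000);

	return ns / ns_per_day;
}
```

Under that assumption the threshold is 2^53 ns, and
sketch_ns_to_days(sketch_threshold_ns()) evaluates to 104.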

Fixes: 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Zhang Qiao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

entry: Fix noinstr warning in __enter_from_user_mode()

__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().

The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().

Just use __ct_state() instead.

Fixes the following warnings:

  vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section

Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org

drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F

Like the Windows Lenovo Yoga Book X91F/L, the Android Lenovo Yoga Book
X90F/L has a portrait 1200x1920 screen used in landscape mode.
Add a quirk for this.

When the quirk for the X91F/L was initially added, it was written to
also apply to the X90F/L, but this does not work because the Android
version of the Yoga Book uses completely different DMI strings.
Also adjust the X91F/L quirk to reflect that it only applies to
the X91F/L models.

Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

octeontx2-vf: Add missing free for alloc_percpu

Add the free_percpu for the allocated "vf->hw.lmt_info" to avoid a
memory leak, same as for "pf->hw.lmt_info" in
`drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c`.

Fixes: 5c0512072f65 ("octeontx2-pf: cn10k: Use runtime allocated LMTLINE region")
Signed-off-by: Jiasheng Jiang <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Acked-by: Geethasowjanya Akula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

io_uring/net: avoid sending -ECONNABORTED on repeated connection requests

Since io_uring does nonblocking connect requests, if we do two repeated
ones without having a listener, the second will get -ECONNABORTED rather
than the expected -ECONNREFUSED. Treat -ECONNABORTED like a normal retry
condition if we're nonblocking, if we haven't already seen it.
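
The retry rule can be sketched as a small decision function. This is a
simplified model, not the io_uring code: the struct, its field name, and
the function are hypothetical, and only the -ECONNABORTED case described
above is modeled:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request state; the field name is an assumption. */
struct sketch_connect {
	bool seen_econnaborted;
};

/*
 * Returns true if the request should be retried instead of being
 * completed with ret.
 */
static bool sketch_should_retry(struct sketch_connect *c, int ret, bool nonblock)
{
	assert(c != NULL);

	if (!nonblock || ret != -ECONNABORTED)
		return false;
	if (c->seen_econnaborted)
		return false;	/* second occurrence: report it to the caller */

	c->seen_econnaborted = true;
	return true;
}
```

Remembering that the error was already seen once keeps the retry from
looping forever if the condition persists.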

Cc: [email protected]
Fixes: 3fb1bd688172 ("io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT")
Link: https://github.com/axboe/liburing/issues/828
Reported-by: Hui, Chunyang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

block/io_uring: pass in issue_flags for uring_cmd task_work handling

io_uring_cmd_done() currently assumes that the uring_lock is held
when invoked, and while it generally is, this is not guaranteed.
Pass in the issue_flags associated with it, so that we have
IO_URING_F_UNLOCKED available to be able to lock the CQ ring
appropriately when completing events.

Cc: [email protected]
Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

Pull fsverity fixes from Eric Biggers:
"Fix two significant performance issues with fsverity"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
fsverity: Remove WQ_UNBOUND from fsverity read workqueue

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt fix from Eric Biggers:
"Fix a bug where when a filesystem was being unmounted, the fscrypt
  keyring was destroyed before inodes have been released by the Landlock
  LSM.

  This bug was found by syzbot"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()
  fscrypt: improve fscrypt_destroy_keyring() documentation
  fscrypt: destroy keyring after security_sb_delete()

zonefs: Fix error message in zonefs_file_dio_append()

Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.

Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>

zonefs: Prevent uninitialized symbol 'size' warning

In zonefs_file_dio_append(), initialize the variable size to 0 to
prevent compiler and static code analyzer warnings such as:

New smatch warnings:
fs/zonefs/file.c:441 zonefs_file_dio_append() error: uninitialized
symbol 'size'.

The warning is a false positive as size is never actually used
uninitialized.

No functional change.

Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>

gpu: host1x: fix uninitialized variable use

The error handling for platform_get_irq() failing no longer works after
a recent change. Clang now points this out with a warning:

  drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized when used here [-Werror,-Wuninitialized]
          if (syncpt_irq < 0)
              ^~~~~~~~~~

Fix this by removing the variable and checking the correct error status.

Fixes: 625d4ffb438c ("gpu: host1x: Rewrite syncpoint interrupt handling")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Jon Hunter <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Mikko Perttunen <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG

The Acer Aspire 3830TG predates Windows 8, so it defaults to using
acpi_video# for backlight control, but this is non functional on
this model.

Add a DMI quirk to use the native backlight interface which does
work properly.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

- Fix /proc/PID/io read_bytes accounting

- Fix setting NLM file_lock start and end during decoding testargs

- Fix timing for setting access cache timestamps

* tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Correct timing for assigning access cache timestamp
  lockd: set file_lock start and end when decoding nlm4 testargs
  NFS: Fix /proc/PID/io read_bytes for buffered reads

thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit

cppcheck reports
drivers/thunderbolt/nhi.c:74:7: style: Local variable 'bit' shadows outer variable [shadowVariable]
  int bit;
      ^
drivers/thunderbolt/nhi.c:66:6: note: Shadowed declaration
int bit = ring_interrupt_index(ring) & 31;
     ^
drivers/thunderbolt/nhi.c:74:7: note: Shadow variable
  int bit;
      ^
For readability, rename the outer to interrupt_bit and the inner
to auto_clear_bit.

Fixes: 468c49f44759 ("thunderbolt: Disable interrupt auto clear for ring")
Cc: [email protected]
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

Revert "drm/i915/hwmon: Enable PL1 power limit"

This reverts commit ee892ea83d99610fa33bea612de058e0955eec3a.

It was accidentally picked up for backporting. Revert.

Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge branch 'ps3_gelic_net-fixes'

Geoff Levand says:

====================
net/ps3_gelic_net: DMA related fixes

v9: Make rx_skb_size local to gelic_descr_prepare_rx.
v8: Add more cpu_to_be32 calls.
v7: Remove all cleanups, sync to spider net.
v6: Reworked and cleaned up patches.
v5: Some additional patch cleanups.
v4: More patch cleanups.
v3: Cleaned up patches as requested.
====================

Signed-off-by: David S. Miller <[email protected]>

net/ps3_gelic_net: Use dma_mapping_error

The current Gelic Ethernet driver was checking the return value of its
dma_map_single call, and not using the dma_mapping_error() routine.

Fixes runtime problems like these:

DMA-API: ps3_gelic_driver sb_05: device driver failed to check map error
WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1027 .check_unmap+0x888/0x8dc

Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
Reviewed-by: Alexander Duyck <[email protected]>
Signed-off-by: Geoff Levand <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ps3_gelic_net: Fix RX sk_buff length

The Gelic Ethernet device needs to have the RX sk_buffs aligned to
GELIC_NET_RXBUF_ALIGN, and also the length of the RX sk_buffs must
be a multiple of GELIC_NET_RXBUF_ALIGN.

The current Gelic Ethernet driver was not allocating sk_buffs large
enough to allow for this alignment.
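
As an illustration of the sizing rule (a generic sketch, not the driver
code; the helper names are hypothetical and the alignment is passed in as
a parameter rather than taken from GELIC_NET_RXBUF_ALIGN), a length can be
rounded up to a multiple of a power-of-two alignment, with extra slack so
the start address can be aligned as well:

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align: power of two). */
static size_t sketch_align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/*
 * Worst-case allocation size: the rounded length plus align - 1 bytes of
 * slack, so the data pointer can be advanced to an aligned address.
 */
static size_t sketch_rx_alloc_size(size_t frame_len, size_t align)
{
	return sketch_align_up(frame_len, align) + align - 1;
}
```

For example, with an alignment of 128 (an example value), a 1522-byte
frame rounds up to 1536 bytes.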

Also, correct the maximum and minimum MTU sizes, and add a new
preprocessor macro for the maximum frame size, GELIC_NET_MAX_FRAME.

Fixes various randomly occurring runtime network errors.

Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Geoff Levand <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

usb: plusb: remove unused pl_clear_QuickLink_features function

clang with W=1 reports
drivers/net/usb/plusb.c:65:1: error:
unused function 'pl_clear_QuickLink_features' [-Werror,-Wunused-function]
pl_clear_QuickLink_features(struct usbnet *dev, int val)
^
This static function is not used, so remove it.

Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: usb: lan78xx: Limit packet length to skb->len

Packet length retrieved from descriptor may be larger than
the actual socket buffer length. In such case the cloned
skb passed up the network stack will leak kernel memory contents.

Additionally prevent integer underflow when size is less than
ETH_FCS_LEN.
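
Both checks can be sketched in a few lines of standalone C. The function
and macro names are hypothetical and the real driver works on skbs, but
the bounds logic is the same idea:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_FCS_LEN 4u	/* Ethernet frame check sequence length */

/*
 * Validates a descriptor-reported size against the bytes actually present
 * in the buffer, then strips the FCS. Returns false for a malformed frame.
 */
static bool sketch_rx_payload_len(uint32_t desc_size, uint32_t buf_len,
				  uint32_t *payload_len)
{
	assert(payload_len != NULL);

	if (desc_size > buf_len)
		return false;	/* would expose memory past the received data */
	if (desc_size < SKETCH_ETH_FCS_LEN)
		return false;	/* the subtraction below would wrap around */

	*payload_len = desc_size - SKETCH_ETH_FCS_LEN;
	return true;
}
```

Because the sizes are unsigned, skipping the second check would turn a
3-byte frame into a payload length close to 2^32.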

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Szymon Heidrich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: qcom/emac: Fix use after free bug in emac_remove due to race condition

In emac_probe, &adpt->work_thread is bound to emac_work_thread. The
work is then started by the timeout handler emac_tx_timeout or the
IRQ handler emac_isr.

If we remove the driver, which calls emac_remove to clean up, there
may be unfinished work.

The possible sequence is as follows:

CPU0                  CPU1

                    |emac_work_thread
emac_remove         |
free_netdev         |
kfree(netdev);      |
                    |emac_reinit_locked
                    |emac_mac_down
                    |//use netdev

Fix it by finishing the work before cleanup in emac_remove and
disabling the timeout response.

Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: Zheng Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: report rx_bytes unadjusted for ETH_HLEN

We collect the software statistics counters for RX bytes (reported to
/proc/net/dev and to ethtool -S $dev | grep 'rx_bytes: ') at a time when
skb->len has already been adjusted by the eth_type_trans() ->
skb_pull_inline(skb, ETH_HLEN) call to exclude the L2 header.

This means that when connecting 2 DSA interfaces back to back and
sending 1 packet with length 100, the sending interface will report
tx_bytes as incrementing by 100, and the receiving interface will report
rx_bytes as incrementing by 86.
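
The 100 versus 86 discrepancy is just the 14-byte Ethernet header. A
small sketch of the arithmetic (helper names are hypothetical):

```c
#include <assert.h>

#define SKETCH_ETH_HLEN 14u	/* two 6-byte MAC addresses plus EtherType */

/* Length as seen after the L2 header has been pulled off. */
static unsigned int sketch_len_after_pull(unsigned int wire_len)
{
	assert(wire_len >= SKETCH_ETH_HLEN);
	return wire_len - SKETCH_ETH_HLEN;
}

/* Byte count to report: add the pulled header back. */
static unsigned int sketch_rx_bytes(unsigned int pulled_len)
{
	return pulled_len + SKETCH_ETH_HLEN;
}
```

A 100-byte packet is seen as 86 bytes after the pull, and adding the
header length back yields 100 again.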

Since accounting for that in scripts is quirky and is something that
would be DSA-specific behavior (requiring users to know that they are
running on a DSA interface in the first place), the proposal is that we
treat it as a bug and fix it.

This design bug has always existed in DSA, according to my analysis:
commit 91da11f870f0 ("net: Distributed Switch Architecture protocol
support") also updates skb->dev->stats.rx_bytes += skb->len after the
eth_type_trans() call. Technically, prior to Florian's commit
a86d8becc3f0 ("net: dsa: Factor bottom tag receive functions"), each and
every vendor-specific tagging protocol driver open-coded the same bug,
until the buggy code was consolidated into something resembling what can
be seen now. So each and every driver should have its own Fixes: tag,
because of their different histories until the convergence point.
I'm not going to do that, for the sake of simplicity, but just blame the
oldest appearance of buggy code.

There are 2 ways to fix the problem. One is the obvious way, and the
other is how I ended up doing it. Obvious would have been to move
dev_sw_netstats_rx_add() one line above eth_type_trans(), and below
skb_push(skb, ETH_HLEN). But DSA processing is not as simple as that.
We count the bytes after removing everything DSA-related from the
packet, to emulate what the packet's length was, on the wire, when the
user port received it.

When eth_type_trans() executes, dsa_untag_bridge_pvid() has not run yet,
so in case the switch driver requests this behavior - commit
412a1526d067 ("net: dsa: untag the bridge pvid from rx skbs") has the
details - the obvious variant of the fix wouldn't have worked, because
the positioning there would have also counted the not-yet-stripped VLAN
header length, something which is absent from the packet as seen on the
wire (there it may be untagged, whereas software will see it as
PVID-tagged).

Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/i915: Update vblank timestamping stuff on seamless M/N change

When we change the M/N values seamlessly during a fastset we should
also update the vblank timestamping stuff to make sure the vblank
timestamp corrections/guesstimations come out exact.

Note that only crtc_clock and framedur_ns can actually end up
changing here during fastsets. Everything else we touch can
only change during full modesets.

Technically we should try to do this exactly at the start of
vblank, but that would require some kind of double buffering
scheme. Let's skip that for now and just update things right
after the commit has been submitted to the hardware. This
means the information will be properly up to date when the
vblank irq handler goes to work. Only if someone ends up
querying some vblanky stuff in between the commit and start
of vblank may we see a slight discrepancy.

Also this same problem really exists for the DRRS downclocking
stuff. But as that is supposed to be more or less transparent
to the user, and it only drops to low gear after a long delay
(1 sec currently) we probably don't have to worry about it.
Any time something is actively submitting updates DRRS will
remain in high gear and so the timestamping constants will
match the hardware state.

Reviewed-by: Jani Nikula <[email protected]>
Reviewed-by: Mitul Golani <[email protected]>
Fixes: e6f29923c048 ("drm/i915: Allow M/N change during fastset on bdw+")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8cb1f95cca68421b08333175719fdd3615372ca8)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Fix format for perf_limit_reasons

Use hex format so that it is easier to decode.

Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
Signed-off-by: Vinay Belgaumkar <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5e008ba67cb80084e99b40ccd46f9029ae421632)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/gt: perform uc late init after probe error injection

Probe pseudo errors should be injected only in places where real errors
can be encountered, otherwise unwinding code can be broken.
Placing intel_uc_init_late before i915_inject_probe_error violated
this rule, resulting in the following bug:
__intel_gt_disable:655 GEM_BUG_ON(intel_gt_pm_is_awake(gt))

Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit c4252a11131c7f27a158294241466e2a4e7ff94e)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/active: Fix missing debug object activation

debug_active_activate() expected ref->count to be zero
which is not true anymore as __i915_active_activate() calls
debug_active_activate() after incrementing the count.

v2: No need to check for "ref->count == 1" as __i915_active_activate()
already makes sure of that (Janusz).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6733
Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
Cc: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: [email protected]
Cc: Janusz Krzysztofik <[email protected]>
Cc: <[email protected]> # v5.10+
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Janusz Krzysztofik <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit bfad380c542438a9b642f8190b7fd37bc77e2723)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/guc: Fix missing ecodes

Error captures are tagged with an 'ecode'. This is a pseudo-unique magic
number that is meant to distinguish similar seeming bugs with
different underlying signatures. It is a combination of two ring state
registers. Unfortunately, the register state being used is only valid
in execlist mode. In GuC mode, the register state exists in a separate
list of arbitrary register address/value pairs rather than the named
entry structure. So, search through that list to find the two exciting
registers and copy them over to the structure's named members.
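
The search described above can be sketched generically. The register
offsets, struct layouts, and names here are all hypothetical and chosen
only for illustration; they are not the i915 definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, chosen only for this sketch. */
#define SKETCH_REG_A 0x10u
#define SKETCH_REG_B 0x20u

/* One entry of the arbitrary address/value list. */
struct sketch_reg_pair {
	uint32_t addr;
	uint32_t value;
};

/* The named-member structure that the ecode is computed from. */
struct sketch_engine_state {
	uint32_t reg_a;
	uint32_t reg_b;
};

/* Scan the list and copy the two registers of interest, if present. */
static void sketch_copy_named_regs(const struct sketch_reg_pair *regs, size_t n,
				   struct sketch_engine_state *ee)
{
	assert(regs != NULL && ee != NULL);

	for (size_t i = 0; i < n; i++) {
		if (regs[i].addr == SKETCH_REG_A)
			ee->reg_a = regs[i].value;
		else if (regs[i].addr == SKETCH_REG_B)
			ee->reg_b = regs[i].value;
	}
}
```

Entries with other addresses are skipped, and the order of the list does
not matter.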

v2: if else if instead of if if (Alan)

Signed-off-by: John Harrison <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Fixes: a6f0f9cf330a ("drm/i915/guc: Plumb GuC-capture into gpu_coredump")
Cc: Alan Previn <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Aravind Iddamsetty <[email protected]>
Cc: Michael Cheng <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Daniele Ceraolo Spurio <[email protected]>
Cc: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 9724ecdbb9ddd6da3260e4a442574b90fc75188a)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/mtl: Disable MC6 for MTL A step

Wa_14017073508 requires sending the Media Busy/Idle mailbox while
accessing the Media tile. For now this is handled in __gt_unpark and
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark, i.e. without sending the Busy mailbox, especially
during register reads. Taking forcewakes without the Busy mailbox leads
to a GPU hang. Bringing the mailbox calls under the forcewake calls is
not a feasible option, as forcewake calls are atomic and mailbox calls
are blocking. The issue is already fixed in B step, so disable MC6 on
A step and revert the previous commit that handles Wa_14017073508.

Fixes: 8f70f1ec587d ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 038a24835ab68f341eaa7a0e3bcc6ce0f9b22e17)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Preserve crtc_state->inherited during state clearing

intel_crtc_prepare_cleared_state() is unintentionally losing
the "inherited" flag. This will happen if intel_initial_commit()
is forced to go through the full modeset calculations for
whatever reason.

Afterwards the first real commit from userspace will not get
forced to the full modeset path, and thus eg. audio state may
not get recomputed properly. So if the monitor was already
enabled during boot audio will not work until userspace itself
does an explicit full modeset.

Cc: [email protected]
Tested-by: Lee Shawn C <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Uma Shankar <[email protected]>
(cherry picked from commit 2553bacaf953b48c59357f5a622282bc0c45adae)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/fbdev: lock the fbdev obj before vma pin

Lock the fbdev obj before calling into
i915_vma_pin_iomap(). This helps to resolve the warning below:

<7>[   93.563308] i915 0000:00:02.0: [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<4>[   93.581844] ------------[ cut here ]------------
<4>[   93.581855] WARNING: CPU: 12 PID: 625 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:424 i915_gem_object_pin_map+0x152/0x1c0 [i915]

Fixes: f0b6b01b3efe ("drm/i915: Add ww context to intel_dpt_pin, v2.")
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 561b31acfd65502a2cda2067513240fc57ccdbdc)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/mtl: Fix Wa_16015201720 implementation

The commit 2357f2b271ad ("drm/i915/mtl: Initial display workarounds")
extended the workaround Wa_16015201720 to MTL. However, the registers
that the original WA used have moved on MTL.

Implement the workaround with the correct register.

v3: Skip clock gating for pipe C, D DMC's and fix the title

Fixes: 2357f2b271ad ("drm/i915/mtl: Initial display workarounds")
Cc: Matt Atwood <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0188be507b973e36f637ba010a369057c8cb7282)
Signed-off-by: Jani Nikula <[email protected]>

thunderbolt: Disable interrupt auto clear for rings

When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts. If two interrupts have
come in before one can be serviced then this will cause lost interrupts.

On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control transfers such as reading the DROM via bit
banging.

Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.
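
A simulated register makes the difference visible. This is a userspace
model with hypothetical names, not the NHI driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated interrupt status register. */
struct sketch_nhi {
	uint32_t status;
};

/* Auto-clear behavior: any read returns the status and wipes every bit. */
static uint32_t sketch_read_autoclear(struct sketch_nhi *nhi)
{
	uint32_t val;

	assert(nhi != NULL);
	val = nhi->status;
	nhi->status = 0;
	return val;
}

/* Explicit behavior: acknowledge only the bit that belongs to one ring. */
static void sketch_clear_ring_bit(struct sketch_nhi *nhi, unsigned int bit)
{
	assert(nhi != NULL && bit < 32);
	nhi->status &= ~(UINT32_C(1) << bit);
}
```

With two interrupts pending, the auto-clear read leaves nothing behind
for the second ring, while the explicit clear keeps its bit set until
that ring is serviced.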

Fixes: 7a1808f82a37 ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Tested-by: Anson Tsao <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

thunderbolt: Use const qualifier for `ring_interrupt_index`

`ring_interrupt_index` doesn't change the data for `ring` so mark it as
const. This is needed by the following patch that disables interrupt
auto clear for rings.

Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

Linux 6.3-rc3

Merge tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix setting affinity of hwlat threads in containers

   Using sched_setaffinity() has unwanted side effects when being
   called within a container. Use set_cpus_allowed_ptr() instead

- Fix per cpu thread management of the hwlat tracer:
    - Do not start per_cpu threads if one is already running for the CPU
    - When starting per_cpu threads, do not clear the kthread variable
      as it may already be set to running per cpu threads

- Fix return value for test_gen_kprobe_cmd()

   On error the return value was overwritten by being set to the result
   of the call from kprobe_event_delete(), which would likely succeed,
   and thus have the function return success

- Fix splice() reads from the trace file that was broken by commit
   36e2c7421f02 ("fs: don't allow splice read/write without explicit
   ops")

- Remove obsolete and confusing comment in ring_buffer.c

   The original design of the ring buffer used struct page flags for
   optimization tricks, which were removed shortly afterwards because
   they were tricks. But a comment for those tricks remained

- Set local functions and variables to static

* tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
  ring-buffer: remove obsolete comment for free_buffer_page()
  tracing: Make splice_read available again
  ftrace: Set direct_ops storage-class-specifier to static
  trace/hwlat: Do not start per-cpu thread if it is already running
  trace/hwlat: Do not wipe the contents of per-cpu thread data
  tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static
  tracing: Fix wrong return in kprobe_event_gen_test.c

tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr

There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.

Test case:

  # cd /sys/kernel/tracing
  # echo 0 > tracing_on
  # echo round-robin > hwlat_detector/mode
  # echo hwlat > current_tracer
  # unshare --fork --pid bash -c 'echo 1 > tracing_on'
  # dmesg -c

Actual behavior:

[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Fixes: 0330f7aa8ee63 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

ring-buffer: remove obsolete comment for free_buffer_page()

The comment refers to mm/slob.c which is being removed. It comes from
commit ed56829cb319 ("ring_buffer: reset buffer page when freeing") and
according to Steven the borrowed code was a page mapcount and mapping
reset, which was later removed by commit e4c2ce82ca27 ("ring_buffer:
allocate buffer page pointer"). Thus the comment is not accurate anyway,
remove it.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Reported-by: Mike Rapoport <[email protected]>
Suggested-by: Steven Rostedt (Google) <[email protected]>
Fixes: e4c2ce82ca27 ("ring_buffer: allocate buffer page pointer")
Signed-off-by: Vlastimil Babka <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Make splice_read available again

Since commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops") was applied to the kernel, splice() and
sendfile() calls on the trace file
(/sys/kernel/debug/tracing/trace) return EINVAL.

This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionality for the read case.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

Merge tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.3-rc3 to resolve
  some reported issues.

  They include:

   - 8250 driver Kconfig issue pointed out by you that showed up in -rc1

   - qcom-geni serial driver fixes

   - various 8250 driver fixes for reported problems

   - fsl_lpuart driver fixes

   - serdev fix for regression in -rc1

   - vt.c bugfix

  All have been in linux-next for over a week with no reported problems"

* tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: vt: protect KD_FONT_OP_GET_TALL from unbound access
  serial: qcom-geni: drop bogus uart_write_wakeup()
  serial: qcom-geni: fix mapping of empty DMA buffer
  serial: qcom-geni: fix DMA mapping leak on shutdown
  serial: qcom-geni: fix console shutdown hang
  serdev: Set fwnode for serdev devices
  tty: serial: fsl_lpuart: fix race on RX DMA shutdown
  serial: 8250_pci1xxxx: Disable SERIAL_8250_PCI1XXXX config by default
  serial: 8250_fsl: fix handle_irq locking
  serial: 8250_em: Fix UART port type
  serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
  tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
  Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"

Merge tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are a few small char/misc/other driver subsystem patches to
  resolve reported problems for 6.3-rc3.

  Included in here are:

   - Interconnect driver fixes for reported problems

   - Memory driver fixes for reported problems

   - nvmem core fix

   - firmware driver fix for reported problem

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits)
  memory: tegra30-emc: fix interconnect registration race
  memory: tegra20-emc: fix interconnect registration race
  memory: tegra124-emc: fix interconnect registration race
  memory: tegra: fix interconnect registration race
  interconnect: exynos: drop redundant link destroy
  interconnect: exynos: fix registration race
  interconnect: exynos: fix node leak in probe PM QoS error path
  interconnect: qcom: msm8974: fix registration race
  interconnect: qcom: rpmh: fix registration race
  interconnect: qcom: rpmh: fix probe child-node error handling
  interconnect: qcom: rpm: fix registration race
  nvmem: core: return -ENOENT if nvmem cell is not found
  firmware: xilinx: don't make a sleepable memory allocation from an atomic context
  interconnect: qcom: rpm: fix probe child-node error handling
  interconnect: qcom: osm-l3: fix registration race
  interconnect: imx: fix registration race
  interconnect: fix provider registration API
  interconnect: fix icc_provider_del() error handling
  interconnect: fix mem leak when freeing nodes
  interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
  ...

pcpcntr: remove percpu_counter_sum_all()

percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.

Remove it.

This effectively reverts the changes made in f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

fork: remove use of percpu_counter_sum_all

This effectively reverts the change made in commit f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") as the
race condition percpu_counter_sum_all() was invented to avoid is
now handled directly in percpu_counter_sum() and nobody needs to
care about summing racing with cpu unplug anymore.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

pcpcntrs: fix dying cpu summation race

In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all
interface") a race condition between a cpu dying and
percpu_counter_sum() iterating online CPUs was identified. The
solution was to iterate all possible CPUs for summation via
percpu_counter_sum_all().

We recently had a percpu_counter_sum() call in XFS trip over this
same race condition and it fired a debug assert because the
filesystem was unmounting and the counter *should* be zero just
before we destroy it. That was reported here:

https://lore.kernel.org/linux-kernel/20230314090649 [email protected]/

likely as a result of running generic/648 which exercises
filesystems in the presence of CPU online/offline events.

The solution to use percpu_counter_sum_all() is an awful one. We
use percpu counters and percpu_counter_sum() for accurate and
reliable threshold detection for space management, so a summation
race condition during these operations can result in overcommit of
available space and that may result in filesystem shutdowns.

As percpu_counter_sum_all() iterates all possible CPUs rather than
just those online or even those present, the mask can include CPUs
that aren't even installed in the machine, or in the case of
machines that can hot-plug CPU capable nodes, even have physical
sockets present in the machine.

Fundamentally, this race condition is caused by the CPU being
offlined being removed from the cpu_online_mask before the notifier
that cleans up per-cpu state is run. Hence percpu_counter_sum() will
not sum the count for a cpu currently being taken offline,
regardless of whether the notifier has run or not. This is
the root cause of the bug.

The percpu counter notifier iterates all the registered counters,
locks the counter and moves the percpu count to the global sum.
This is serialised against other operations that move the percpu
counter to the global sum as well as percpu_counter_sum() operations
that sum the percpu counts while holding the counter lock.

Hence the notifier is safe to run concurrently with sum operations,
and the only thing we actually need to care about is that
percpu_counter_sum() iterates dying CPUs. That's trivial to do,
and when there are no CPUs dying, it has no additional overhead except
for a cpumask_or() operation.
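
As a rough illustration of the summation described above, here is a
minimal userspace sketch. The struct, the function names and the
plain bitmask standing in for a cpumask are all hypothetical, and the
counter lock is left out; it only shows why visiting dying CPUs
closes the gap.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 8

/* Hypothetical userspace model; not the kernel's struct percpu_counter. */
struct model_counter {
	int64_t count;                  /* global part of the sum */
	int32_t percpu[MODEL_NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/* Add the per-CPU delta of every CPU whose bit is set in mask. */
static int64_t model_sum_mask(const struct model_counter *c, uint32_t mask)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			sum += c->percpu[cpu];
	return sum;
}

/* Old behaviour: a CPU that has left the online mask is skipped. */
static int64_t model_sum_online(const struct model_counter *c, uint32_t online)
{
	return model_sum_mask(c, online);
}

/* Behaviour described above: also visit CPUs that are going offline. */
static int64_t model_sum_online_or_dying(const struct model_counter *c,
					 uint32_t online, uint32_t dying)
{
	return model_sum_mask(c, online | dying);
}
```

With CPU 2 removed from the online mask but its delta not yet folded
into the global count, model_sum_online() misses that delta while
model_sum_online_or_dying() still includes it.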

This change makes percpu_counter_sum() always do the right thing in
the presence of CPU hot unplug events and makes
percpu_counter_sum_all() unnecessary. This, in turn, means that
filesystems like XFS, ext4, and btrfs don't have to work out when
they should use percpu_counter_sum() vs percpu_counter_sum_all() in
their space accounting algorithms.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

cpumask: introduce for_each_cpu_or

Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.
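
The idea can be sketched in userspace with a 32-bit word standing in
for a cpumask. The macro and helper names here are hypothetical; the
kernel macro operates on struct cpumask, not on integers.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 32

/*
 * Hypothetical analogue of an "or" iterator: visit every cpu whose
 * bit is set in m1 or in m2. Being a for/if macro, it must be
 * followed by a single statement or a braced block.
 */
#define model_for_each_cpu_or(cpu, m1, m2)                      \
	for ((cpu) = 0; (cpu) < MODEL_NR_CPUS; (cpu)++)         \
		if ((((m1) | (m2)) >> (cpu)) & 1u)

/* Count the CPUs that appear in either mask. */
static int model_count_either(uint32_t m1, uint32_t m2)
{
	int cpu, n = 0;

	model_for_each_cpu_or(cpu, m1, m2)
		n++;
	return n;
}
```

A CPU present in both masks is visited once, since the two masks are
combined before the bit test.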

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

Merge tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS fix from Borislav Petkov:

- Flush out logged errors immediately after MCA banks configuration
   changes over sysfs have been done instead of waiting until something
   else triggers the workqueue later - another error or the polling
   interval cycle is reached

* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Make sure logged MCEs are processed after sysfs update

xfs: test dir/attr hash when loading module

Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in
ext4 against generic/454.  The cause of this test failure was the
unfortunate combination of setting an xattr name containing UTF8 encoded
emoji, an xattr hash function that accepted a char pointer with no
explicit signedness, signed type extension of those chars to an int, and
the 6.2 build tools maintainers deciding to mandate -funsigned-char
across the board.  As a result, the ondisk extended attribute structures
written out by 6.1 and 6.2 were not the same.

This discrepancy, in fact, had been noticeable if a filesystem with such
an xattr were moved between any two architectures that don't employ the
same signedness of a raw "char" declaration.  The only reason anyone
noticed is that x86 gcc defaults to signed, and no such -funsigned-char
update was made to e2fsprogs, so e2fsck immediately started reporting
data corruption.

After a day and a half of discussing how to handle this use case (xattrs
with bit 7 set anywhere in the name) without breaking existing users,
Linus merged his own patch and didn't tell the maintainer.  None of the
ext4 developers realized this until AUTOSEL announced that the commit
had been backported to stable.

In the end, this problem could have been detected much earlier if there
had been any useful tests of hash function(s) in use inside ext4 to make
sure that they always produce the same outputs given the same inputs.

The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's
vulnerable to this problem.  However, let's avoid all this drama by
adding our own self test to check that the da hash produces the same
outputs for a static pile of inputs on various platforms.  This enables
us to fix any breakage that may result in a controlled fashion.  The
buffer and test data are identical to the patches submitted to xfsprogs.
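
To make the signedness hazard concrete, here is a small sketch. It
uses a simple rotate-and-xor hash chosen for illustration; it is not
claimed to be the ext4 or XFS hash, and all names are hypothetical.
The only point is that a byte with bit 7 set is sign-extended in one
variant and zero-extended in the other.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate left; s must be in the range 1..31. */
static uint32_t model_rol32(uint32_t v, unsigned s)
{
	return (v << s) | (v >> (32 - s));
}

/* Bytes are read as signed char, so 0xf0 becomes -16 (0xfffffff0). */
static uint32_t model_hash_signed(const void *name, size_t len)
{
	const signed char *p = name;
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = (uint32_t)(int)p[i] ^ model_rol32(h, 7);
	return h;
}

/* Bytes are read as unsigned char, so 0xf0 stays 0x000000f0. */
static uint32_t model_hash_unsigned(const void *name, size_t len)
{
	const unsigned char *p = name;
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = (uint32_t)p[i] ^ model_rol32(h, 7);
	return h;
}
```

Pure ASCII names hash identically in both variants, which is why the
difference only shows up for names such as UTF8 encoded emoji. A
self test with fixed inputs and expected outputs catches this class
of change.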

Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/
Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

xfs: add tracepoints for each of the externally visible allocators

There are now five separate space allocator interfaces exposed to the
rest of XFS for five different strategies to find space. Add
tracepoints for each of them so that I can tell from a trace dump
exactly which ones got called and what happened underneath them. Add a
sixth so it's more obvious if an allocation actually happened.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags

Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
want us to perform a non-blocking scan of the AGs for free space.  There
are no ordering constraints for non-blocking AGF lock acquisition, so
the scan can freely start over at AG 0 even when minimum_agno > 0.

This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
pointer patchset applied and the realtime volume enabled.  I observed
the following sequence as part of an xfs_dir_createname call:

0. Fragment the free space, then allocate nearly all the free space in
   all AGs except AG 0.

1. Create a directory in AG 2 and let it grow for a while.

2. Try to allocate 2 blocks to expand the dirent part of a directory.
   The space will be allocated out of AG 0, but the allocation will not
   be contiguous.  This (I think) activates the LOWMODE allocator.

3. The bmapi call decides to convert from extents to bmbt format and
   tries to allocate 1 block.  This allocation request calls
   xfs_alloc_vextent_start_ag with the inode number, which starts the
   scan at AG 2.  We ignore AG 0 (with all its free space) and instead
   scrape AG 2 and 3 for more space.  We find one block, but this now
   kicks t_highest_agno to 3.

4. The createname call decides it needs to split the dabtree.  It tries
   to allocate even more space with xfs_alloc_vextent_start_ag, but now
   we're constrained to AG 3, and we don't find the space.  The
   createname returns ENOSPC and the filesystem shuts down.

This change fixes the problem by making the trylock scan wrap around to
AG 0 if it doesn't like the AGs that it finds.  Since the current
transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
we take space from the AG that has plenty.
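
The wrap-around can be sketched as follows. This is a simplified
userspace model with hypothetical names; the real iteration also
deals with AGF locking, allocation flags and transaction state, none
of which is modelled here.

```c
/*
 * Return the first AG with at least `want` free blocks, or -1.
 * The scan starts at start_agno. When it runs off the end it
 * restarts at AG 0 for a trylock scan, otherwise at minimum_agno.
 */
static int model_scan_ags(const unsigned *free_blocks, unsigned agcount,
			  unsigned start_agno, unsigned minimum_agno,
			  int trylock, unsigned want)
{
	unsigned restart = trylock ? 0 : minimum_agno;
	unsigned agno;

	if (agcount == 0 || start_agno >= agcount)
		return -1;
	/* Never start below the restart point, or the loop cannot end. */
	if (start_agno < restart)
		start_agno = restart;

	agno = start_agno;
	do {
		if (free_blocks[agno] >= want)
			return (int)agno;
		if (++agno == agcount)
			agno = restart;
	} while (agno != start_agno);

	return -1;
}
```

With all free space in AG 0 and the scan constrained to AG 3, the
blocking variant finds nothing, while the trylock variant wraps to
AG 0 and succeeds.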

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

Merge tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Check whether sibling events have been deactivated before adding them
   to groups

- Update the proper event time tracking variable depending on the event
   type

- Fix a memory overwrite issue due to using the wrong function argument
   when outputting perf events

* tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix check before add_event_to_groups() in perf_group_detach()
  perf: fix perf_event_context->time
  perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output

Merge tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
"There's a little bit more 'movement' in there for my taste but it
  needs to happen and should make the code better after it.

   - Check cmdline_find_option()'s return value before further
     processing

   - Clear temporary storage in the resctrl code to prevent access to a
     nonexistent MSR

   - Add a simple throttling mechanism to protect the hypervisor from
     potentially malicious SEV guests issuing requests in rapid
     succession.

     In order to not jeopardize the sanity of everyone involved in
     maintaining this code, the request issuing side has received a
     cleanup, split in more or less trivial, small and digestible
     pieces. Otherwise, the code was threatening to become an
     unmaintainable mess.

     Therefore, that cleanup is marked indirectly also for stable so
     that there are no differences between the upstream code and the
     stable variant when it comes down to backporting more there"

* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix use of uninitialized buffer in sme_enable()
  x86/resctrl: Clear staged_config[] before and after it is used
  virt/coco/sev-guest: Add throttling awareness
  virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
  virt/coco/sev-guest: Do some code style cleanups
  virt/coco/sev-guest: Carve out the request issuing logic into a helper
  virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
  virt/coco/sev-guest: Simplify extended guest request handling
  virt/coco/sev-guest: Check SEV_SNP attribute at probe time

Merge tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fix from Ted Ts'o:
"Fix a double unlock bug on an error path in ext4, found by smatch and
  syzkaller"

* tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix possible double unlock when moving a directory

ftrace: Set direct_ops storage-class-specifier to static

smatch reports this warning:
kernel/trace/ftrace.c:2594:19: warning:
symbol 'direct_ops' was not declared. Should it be static?

The variable direct_ops is only used in ftrace.c, so it should be static.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

trace/hwlat: Do not start per-cpu thread if it is already running

The hwlatd tracer will end up starting multiple per-cpu threads with
the following script:

    #!/bin/sh
    cd /sys/kernel/debug/tracing
    echo 0 > tracing_on
    echo hwlat > current_tracer
    echo per-cpu > hwlat_detector/mode
    echo 100000 > hwlat_detector/width
    echo 200000 > hwlat_detector/window
    echo 1 > tracing_on

To fix the issue, check if the hwlatd thread for the cpu is already
running, before starting a new one. Along with the previous patch, this
avoids running multiple instances of the same CPU thread on the system.
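
The check can be sketched like this. It is a userspace model with
hypothetical names and a boolean per CPU instead of a kthread
pointer; it assumes the per-CPU state survives across tracer starts,
which is what the previous patch ensures.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU bookkeeping; kept across tracer starts. */
static bool model_thread_running[MODEL_NR_CPUS];
static int model_threads_created;

/* Start the thread for one CPU unless it is already running. */
static int model_start_cpu_kthread(unsigned cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -1;
	if (model_thread_running[cpu])
		return 0;	/* already running, nothing to do */

	model_thread_running[cpu] = true;
	model_threads_created++;
	return 0;
}

/* Start threads on all CPUs; returns the total created so far. */
static int model_start_per_cpu_kthreads(void)
{
	for (unsigned cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_start_cpu_kthread(cpu);
	return model_threads_created;
}
```

Calling model_start_per_cpu_kthreads() a second time creates no
further threads, because each CPU is already marked as running.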

Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

trace/hwlat: Do not wipe the contents of per-cpu thread data

Do not wipe the contents of the per-cpu kthread data when starting the
tracer, as this will completely forget about already running instances
and can later start new additional per-cpu threads.

Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static

smatch reports several similar warnings
kernel/trace/trace_osnoise.c:220:1: warning:
  symbol '__pcpu_scope_per_cpu_osnoise_var' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:243:1: warning:
  symbol '__pcpu_scope_per_cpu_timerlat_var' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:335:14: warning:
  symbol 'interface_lock' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:2242:5: warning:
  symbol 'timerlat_min_period' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:2243:5: warning:
  symbol 'timerlat_max_period' was not declared. Should it be static?

These variables are only used in trace_osnoise.c, so they should be static.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Fix wrong return in kprobe_event_gen_test.c

Overwriting the error code with the deletion result may cause the
function to return 0 despite encountering an error. Commit b111545d26c0
("tracing: Remove the useless value assignment in
test_create_synth_event()") solves a similar issue by
returning the original error code, so this patch does the same.
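
The pattern can be sketched as below. The function names are
hypothetical and the cleanup helper is a stand-in that always
succeeds; this is not the code from kprobe_event_gen_test.c.

```c
#include <errno.h>

static int model_delete_calls;

/* Hypothetical cleanup helper that succeeds, like a clean delete. */
static int model_event_delete(void)
{
	model_delete_calls++;
	return 0;
}

/* Buggy: the cleanup result replaces the original error code. */
static int model_test_buggy(int gen_result)
{
	int ret = gen_result;

	if (ret) {
		ret = model_event_delete();	/* error is lost here */
		return ret;
	}
	return 0;
}

/* Fixed: clean up, but report the error that triggered the cleanup. */
static int model_test_fixed(int gen_result)
{
	int ret = gen_result;

	if (ret) {
		model_event_delete();	/* result intentionally ignored */
		return ret;
	}
	return 0;
}
```

When the generation step fails with -EINVAL and the delete succeeds,
the buggy variant returns 0 and the fixed variant returns -EINVAL.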

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Anton Gusev <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

mlxsw: core_thermal: Fix fan speed in maximum cooling state

The cooling levels array is supposed to prevent the system fans from
being configured below a 20% duty cycle as otherwise some of them get
stuck at 0 RPM.

Due to an off-by-one error, the last element in the array was not
initialized, causing it to be set to zero, which in turn led to fans
being configured with a 0% duty cycle in maximum cooling state.

Since commit 332fdf951df8 ("mlxsw: thermal: Fix out-of-bounds memory
accesses") the contents of the array are static. Therefore, instead of
fixing the initialization of the array, simply remove it and adjust
thermal_cooling_device_ops::set_cur_state() so that the configured duty
cycle is never set below 20%.
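
A minimal sketch of such a clamp follows. The constants and the
function name are assumptions made for illustration: ten cooling
states mapped linearly onto an 8-bit PWM value, which matches the
"echo 10" and "255" seen in the example output, with state 2 as the
20% floor.

```c
#define MODEL_MAX_STATE 10	/* assumed number of cooling states */
#define MODEL_MIN_STATE 2	/* 2 of 10 states = 20% duty cycle floor */
#define MODEL_MAX_DUTY  255	/* 8-bit PWM full scale */

/* Map a cooling state to a PWM duty value, never below the floor. */
static unsigned model_state_to_duty(unsigned long state)
{
	if (state > MODEL_MAX_STATE)
		state = MODEL_MAX_STATE;
	if (state < MODEL_MIN_STATE)
		state = MODEL_MIN_STATE;

	/* Round to the closest integer. */
	return (unsigned)((state * MODEL_MAX_DUTY + MODEL_MAX_STATE / 2) /
			  MODEL_MAX_STATE);
}
```

Clamping the state at the point of use removes the need for a lookup
array, so there is no array element left to be mis-initialized.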

Before:

# cat /sys/class/thermal/thermal_zone0/cdev0/type
mlxsw_fan
# echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
# cat /sys/class/hwmon/hwmon0/name
mlxsw
# cat /sys/class/hwmon/hwmon0/pwm1
0

After:

# cat /sys/class/thermal/thermal_zone0/cdev0/type
mlxsw_fan
# echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
# cat /sys/class/hwmon/hwmon0/name
mlxsw
# cat /sys/class/hwmon/hwmon0/pwm1
255

This bug was uncovered when the thermal subsystem repeatedly tried to
configure the cooling devices to their maximum state due to another
issue [1]. This resulted in the fans being stuck at 0 RPM, which
eventually led to the system undergoing thermal shutdown.

[1] https://lore.kernel.org/netdev/ZA3CFNhU4AbtsP4G@shredder/

Fixes: a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Vadim Pasternak <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sfp: fix state loss when updating state_hw_mask

Andrew reports that the SFF modules on one of the ZII platforms do not
indicate link up because the SFP code believes that LOS is indicating
that no signal is being received from the remote end, when in fact the
LOS signal is showing that there is a signal.

What makes SFF modules different from SFPs is they typically have an
inverted LOS, which uncovered this issue. When we read the hardware
state, we mask it with state_hw_mask so we ignore anything we're not
interested in. However, we don't re-read when state_hw_mask changes,
leading to sfp->state being stale.

Arrange for a software poll of the module state after we have parsed
the EEPROM in sfp_sm_mod_probe() and updated state_*_mask. This will
generate any necessary events for signal changes for the state
machine as well as updating sfp->state.
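
The stale-state problem can be sketched in userspace as follows. The
struct, the bit definition and the function names are hypothetical
and do not reflect the layout of the real sfp structure.

```c
#define MODEL_F_LOS (1u << 0)	/* hypothetical LOS bit */

struct model_sfp {
	unsigned hw_pins;	/* what the hardware currently reports */
	unsigned state_hw_mask;	/* which hardware bits we care about */
	unsigned state;		/* cached, masked state */
};

/* Read the hardware and drop the bits we are not interested in. */
static unsigned model_read_state(const struct model_sfp *sfp)
{
	return sfp->hw_pins & sfp->state_hw_mask;
}

/* Buggy: the mask changes but the cached state is left alone. */
static void model_set_mask_stale(struct model_sfp *sfp, unsigned mask)
{
	sfp->state_hw_mask = mask;
}

/*
 * Fixed: poll again after the mask changes. Returns the bits that
 * changed, which is what would generate state machine events.
 */
static unsigned model_set_mask_and_poll(struct model_sfp *sfp, unsigned mask)
{
	unsigned old = sfp->state;

	sfp->state_hw_mask = mask;
	sfp->state = model_read_state(sfp);
	return old ^ sfp->state;
}
```

In the stale variant the cached state keeps a bit that a fresh masked
read would no longer report; the polling variant both refreshes the
cache and reports which bits changed.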

Reported-by: Andrew Lunn <[email protected]>
Tested-by: Andrew Lunn <[email protected]>
Fixes: 8475c4b70b04 ("net: sfp: re-implement soft state polling setup")
Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Fix for mismatched host/device DMA address width

Currently DMA address width is either read from a RO device register
or force set from the platform data. This breaks DMA when the host DMA
address width is <=32bit but the device is >32bit.

Right now the driver may decide to use a 2nd DMA descriptor for
another buffer (happens in case of TSO xmit) assuming that 32bit
addressing is used due to platform configuration but the device will
still use both descriptor addresses as one address.

This can be observed with the Intel EHL platform driver that sets
32bit for addr64 but the MAC reports 40bit. The TX queue gets stuck in
case of TCP with iptables NAT configuration on TSO packets.

The logic should be like this: Whatever we do on the host side (memory
allocation GFP flags) should happen with the host DMA width; whenever
we decide how to set addresses on the device registers we must use the
device DMA address width.
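
The separation of the two widths can be sketched like this. The
struct and function names are hypothetical, and the descriptor rule
is a simplified assumption: a device wider than 32 bits is taken to
consume an extra high address word.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_dma_cfg {
	unsigned host_dma_width;	/* bits the host side may use */
	unsigned dev_dma_width;		/* bits the device itself reports */
};

/* Host side decisions (DMA mask, allocation zone) use the host width. */
static uint64_t model_host_dma_mask(const struct model_dma_cfg *cfg)
{
	if (cfg->host_dma_width >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << cfg->host_dma_width) - 1;
}

/* Device side decisions (descriptor layout) use the device width. */
static bool model_desc_has_high_word(const struct model_dma_cfg *cfg)
{
	return cfg->dev_dma_width > 32;
}
```

For a host limited to 32 bits and a device reporting 40 bits, memory
is still allocated below 4 GiB, but descriptors are laid out the way
the 40-bit device expects.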

This patch renames the platform address width field from addr64 (term
used in device datasheet) to host_addr and uses this value exclusively
for host side operations while all chip operations consider the device
DMA width as read from the device register.

Fixes: 7cfc4486e7ea ("stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing")
Signed-off-by: Jochen Henneberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mdiobus-module-owner'

Florian Fainelli says:

====================
ACPI/DT mdiobus module owner fixes

This patch series fixes wrong mdiobus module ownership for MDIO buses
registered from DT or ACPI.

Thanks Maxime for providing the first patch and making me see that ACPI
also had the same issue.

Changes in v2:

- fixed missing kdoc in the first patch
====================

Signed-off-by: David S. Miller <[email protected]>

net: mdio: fix owner field for mdio buses registered using ACPI

Bus ownership is wrong when using acpi_mdiobus_register() to register an
mdio bus. That function is not inline, so when it calls
mdiobus_register() the wrong THIS_MODULE value is captured.
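
Why a non-inline helper captures the wrong owner can be sketched in a
single file, with THIS_OWNER standing in for THIS_MODULE and
redefined to mimic two separate modules. All names are hypothetical;
the sketch only assumes that the registration entry point is a macro
that passes the owner at its point of use.

```c
#include <string.h>

struct model_bus {
	const char *owner;
};

static int model_bus_register_owner(struct model_bus *bus, const char *owner)
{
	bus->owner = owner;
	return 0;
}

/* THIS_OWNER is expanded wherever this macro is used. */
#define model_bus_register(bus) model_bus_register_owner((bus), THIS_OWNER)

/* ---- "core" module: contains the helper ---- */
#define THIS_OWNER "core"
/* Buggy helper: a real function, so the macro expands inside core. */
static int model_helper_register_buggy(struct model_bus *bus)
{
	return model_bus_register(bus);
}
/* Fixed helper body: takes the owner as an explicit argument. */
static int model_helper_register_owner(struct model_bus *bus,
				       const char *owner)
{
	return model_bus_register_owner(bus, owner);
}
#undef THIS_OWNER

/* Fixed entry point: a macro, so THIS_OWNER expands in the caller. */
#define model_helper_register(bus) \
	model_helper_register_owner((bus), THIS_OWNER)

/* ---- "driver" module: the caller ---- */
#define THIS_OWNER "driver"
static const char *model_driver_probe_buggy(void)
{
	struct model_bus bus = { 0 };

	model_helper_register_buggy(&bus);
	return bus.owner;
}

static const char *model_driver_probe_fixed(void)
{
	struct model_bus bus = { 0 };

	model_helper_register(&bus);
	return bus.owner;
}
#undef THIS_OWNER

static int model_same_owner(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

The buggy path records "core" as the owner even though the driver
registered the bus; the macro path records "driver".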

CC: Maxime Bizon <[email protected]>
Fixes: 803ca24d2f92 ("net: mdio: Add ACPI support code for mdio")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mdio: fix owner field for mdio buses registered using device-tree

Bus ownership is wrong when using of_mdiobus_register() to register an mdio
bus. That function is not inline, so when it calls mdiobus_register() the wrong
THIS_MODULE value is captured.

Signed-off-by: Maxime Bizon <[email protected]>
Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs")
[florian: fix kdoc, added Fixes tag]
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phy: Ensure state transitions are processed from phy_stop()

In the phy_disconnect() -> phy_stop() path, we will be forcibly setting
the PHY state machine to PHY_HALTED. This invalidates the old_state !=
phydev->state condition in phy_state_machine() such that we will neither
display the state change for debugging, nor will we invoke the
link_change_notify() callback.

Factor the code by introducing phy_process_state_change(), and ensure
that we process the state change from phy_stop() as well.
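
The factoring can be sketched as below. This is a userspace model
with hypothetical names and a counter in place of the debug print and
the link_change_notify() callback.

```c
enum model_phy_state { MODEL_PHY_RUNNING, MODEL_PHY_HALTED };

struct model_phy {
	enum model_phy_state state;
	int notify_calls;	/* stands in for print + notify callback */
};

/* Shared helper: act only if the state actually changed. */
static void model_process_state_change(struct model_phy *phy,
				       enum model_phy_state old_state)
{
	if (old_state != phy->state)
		phy->notify_calls++;
}

/* State machine pass with no transition of its own. */
static void model_state_machine(struct model_phy *phy)
{
	enum model_phy_state old_state = phy->state;

	/* ... per-state handling would go here ... */
	model_process_state_change(phy, old_state);
}

/* Buggy stop: forces the state, so the state machine sees no change. */
static void model_stop_buggy(struct model_phy *phy)
{
	phy->state = MODEL_PHY_HALTED;
}

/* Fixed stop: processes the forced transition itself. */
static void model_stop_fixed(struct model_phy *phy)
{
	enum model_phy_state old_state = phy->state;

	phy->state = MODEL_PHY_HALTED;
	model_process_state_change(phy, old_state);
}
```

After the buggy stop, a later state machine pass compares HALTED with
HALTED and reports nothing; the fixed stop reports the transition
exactly once.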

Fixes: 5c5f626bcace ("net: phy: improve handling link_change_notify callback")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-03-16 (igb, igbvf, igc)

This series contains updates to igb, igbvf, and igc drivers.

Lin Ma removes rtnl_lock() when disabling SRIOV on remove which was
causing deadlock on igb.

Akihiko Odaki delays enabling of SRIOV on igb to prevent early messages
that could get ignored and clears MAC address when PF returns nack on
reset, indicating no MAC address was assigned for igbvf.

Gaosheng Cui frees IRQs in error path for igbvf.

Akashi Takahiro fixes logic on checking TAPRIO gate support for igc.
====================

Signed-off-by: David S. Miller <[email protected]>

xirc2ps_cs: Fix use after free bug in xirc2ps_detach

In xirc2ps_probe, local->tx_timeout_task is bound to
xirc2ps_tx_timeout_task. When a timeout occurs, xirc_tx_timeout
calls schedule_work to start the work.

When we call xirc2ps_detach to remove the driver, the following
sequence may occur:

CPU0                  CPU1

                    |xirc2ps_tx_timeout_task
xirc2ps_detach      |
  free_netdev       |
    kfree(dev);     |
                    |
                    | do_reset
                    |   //use dev

Stop responding to timeout tasks and complete scheduled
tasks before cleanup in xirc2ps_detach, which will fix
the problem.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zheng Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed/qed_sriov: guard against NULL derefs from qed_iov_get_vf_info

We have to make sure that the info returned by the helper is valid
before using it.

Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.

Fixes: f990c82c385b ("qed*: Add support for ndo_set_vf_trust")
Fixes: 733def6a04bf ("qed*: IOV link control")
Signed-off-by: Daniil Tatianin <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()

It is a bug for fscrypt_put_master_key_activeref() to see a NULL
keyring. But it used to be possible due to the bug, now fixed, where
fscrypt_destroy_keyring() was called before security_sb_delete(). To be
consistent with how fscrypt_destroy_keyring() uses WARN_ON for the same
issue, WARN and leak the fscrypt_master_key if the keyring is NULL
instead of dereferencing the NULL pointer.

This is a robustness improvement, not a fix.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>

fscrypt: improve fscrypt_destroy_keyring() documentation

Document that fscrypt_destroy_keyring() must be called after all
potentially-encrypted inodes have been evicted.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>

Merge tag 'fbdev-for-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:
"The majority of lines changed is due to a code style cleanup in the
  pnmtologo helper program.

  Arnd removed the omap1 osk driver and the SIS fb driver is now
  orphaned.

  Other than that it's the usual bunch of small fixes and cleanups, e.g.
  prevent possible divide-by-zero in various fb drivers if the pixclock
  is zero and various conversions to devm_platform*() and of_property*()
  functions:

   - Drop omap1 osk driver

   - Various potential divide by zero pixclock fixes

   - Add pixelclock and fb_check_var() to stifb

   - Code style cleanups and indenting fixes"

* tag 'fbdev-for-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: Use of_property_present() for testing DT property presence
  fbdev: au1200fb: Fix potential divide by zero
  fbdev: lxfb: Fix potential divide by zero
  fbdev: intelfb: Fix potential divide by zero
  fbdev: nvidia: Fix potential divide by zero
  fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
  fbdev: omapfb: remove omap1 osk driver
  fbdev: xilinxfb: Use devm_platform_get_and_ioremap_resource()
  fbdev: wm8505fb: Use devm_platform_ioremap_resource()
  fbdev: pxa3xx-gcu: Use devm_platform_get_and_ioremap_resource()
  fbdev: Use of_property_read_bool() for boolean properties
  fbdev: clps711x-fb: Use devm_platform_get_and_ioremap_resource()
  fbdev: tgafb: Fix potential divide by zero
  MAINTAINERS: orphan SIS FRAMEBUFFER DRIVER
  fbdev: omapfb: cleanup inconsistent indentation
  drivers: video: logo: add SPDX comment, remove GPL notice in pnmtologo.c
  drivers: video: logo: fix code style issues in pnmtologo.c

Merge tag 'kbuild-fixes-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- Exclude kallsyms_seqs_of_names from kallsyms to fix build error

- Fix 'make kernelrelease' for external module builds

- Get the Debian source package compilable again

- Fix the wrong uname when Debian packages are built with the
   KDEB_PKGVERSION option

- Fix superfluous CROSS_COMPILE when building Debian packages

- Fix RPM package build error when KCONFIG_CONFIG is set

- Use 'git archive' for creating source tarballs

- Remove the scripts/list-gitignored tool

* tag 'kbuild-fixes-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: use git-archive for source package creation
  kbuild: rpm-pkg: move source components to rpmbuild/SOURCES
  kbuild: deb-pkg: use dh_listpackages to know enabled packages
  kbuild: deb-pkg: split image and debug objects staging out into functions
  kbuild: deb-pkg: set CROSS_COMPILE only when undefined
  kbuild: deb-pkg: do not take KERNELRELEASE from the source version
  kbuild: deb-pkg: make debian source package working again
  Makefile: Make kernelrelease target work with M=
  kconfig: Update config changed flag before calling callback
  kallsyms: add kallsyms_seqs_of_names to list of special symbols

Merge tag 'hwmon-for-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- ltc2992, adm1266: Set missing can_sleep flag

- tmp512/tmp513: Drop of_match_ptr for ID table to fix build with
   !CONFIG_OF

- ucd90320: Fix back-to-back access problem

- ina3221: Fix bad error return from probe function

- xgene: Fix use-after-free bug in remove function

- adt7475: Fix hysteresis register bit masks, and fix association of
   'smoothing' attributes

* tag 'hwmon-for-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (ltc2992) Set `can_sleep` flag for GPIO chip
  hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
  hwmon: tmp512: drop of_match_ptr for ID table
  hwmon: (ucd90320) Add minimum delay between bus accesses
  hwmon: (ina3221) return prober error code
  hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
  hwmon: (adt7475) Fix masking of hysteresis registers
  hwmon: (adt7475) Display smoothing attributes in correct order

Merge tag 'ata-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ata fixes from Damien Le Moal:

- Two fixes from Ondrej for the pata_parport driver to address an issue
   with error handling during drive connection and to fix memory leaks
   in case of errors during initialization and when disconnecting a
   device.

* tag 'ata-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: pata_parport: fix memory leaks
  ata: pata_parport: fix parport release without claim

media: m5mols: fix off-by-one loop termination error

The __find_restype() function loops over the m5mols_default_ffmt[]
array, and the termination condition ends up being wrong: instead of
stopping when the iterator becomes the size of the array it traverses,
it stops after it has already overshot the array.

Now, in practice this doesn't likely matter, because the code will
always find the entry it looks for, and will thus return early and never
hit that last extra iteration.

But it turns out that clang will unroll the loop fully, because it has
only two iterations (well, three due to the off-by-one bug), and then
clang will end up just giving up in the middle of the loop unrolling
when it notices that the code walks past the end of the array.

And that made 'objtool' very unhappy indeed, because the generated code
just falls off the edge of the universe, and ends up falling through to
the next function, causing this warning:

drivers/media/i2c/m5mols/m5mols.o: warning: objtool: m5mols_set_fmt() falls through to next function m5mols_get_frame_desc()

Fix the loop ending condition.
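
As an illustration only (this is not the driver code), the difference between
the two termination conditions can be sketched with a hypothetical two-entry
table, demo_codes[], standing in for m5mols_default_ffmt[]. With a
post-increment test, `i++ != ARRAY_SIZE(...)`, the body would run a third time
with i == 2 and read past the array; the pre-increment form below stops as soon
as the iterator reaches the array size.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table; the names and values are made up for this sketch. */
static const unsigned int demo_codes[] = { 0x2008, 0x4001 };

/*
 * Return the index of 'code' in demo_codes[], or -1 if it is absent.
 * '*visited' counts how many elements were inspected, so the bound is
 * easy to check from the outside.
 */
static int demo_find_index(unsigned int code, size_t *visited)
{
	size_t i = 0;

	*visited = 0;
	do {
		(*visited)++;
		if (demo_codes[i] == code)
			return (int)i;
		/* Pre-increment: compare the *next* index against the size. */
	} while (++i != ARRAY_SIZE(demo_codes));

	return -1;
}
```

Looking up a code that is not in the table inspects exactly two elements here;
the post-increment variant would have inspected three.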

Reported-by: Jens Axboe <[email protected]>
Analyzed-by: Miguel Ojeda <[email protected]>
Analyzed-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/linux-block/CAHk-=wgTSdKYbmB1JYM5vmHMcD9J9UZr0mn7BOYM_LudrP+Xvw@mail.gmail.com/
Fixes: bc125106f8af ("[media] Add support for M-5MOLS 8 Mega Pixel camera ISP")
Cc: HeungJun, Kim <[email protected]>
Cc: Sylwester Nawrocki <[email protected]>
Cc: Kyungmin Park <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

block: ublk_drv: mark device as LIVE before adding disk

IO can be started before add_disk() returns, such as reading the partition
table, and the monitor work needs to be operating by then to make forward
progress.

So mark the device as LIVE before adding the disk, and change it to
DEAD if add_disk() fails.

Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: Ziyang Zhang <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L

Another Lenovo convertible which reports a landscape resolution of
1920x1200 with a pitch of (1920 * 4) bytes, while the actual framebuffer
has a resolution of 1200x1920 with a pitch of (1200 * 4) bytes.

Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>

efi: sysfb_efi: Fix DMI quirks not working for simpledrm

Commit 8633ef82f101 ("drivers/firmware: consolidate EFI framebuffer setup
for all arches") moved the sysfb_apply_efi_quirks() call in sysfb_init()
from before the [sysfb_]parse_mode() call to after it.
But sysfb_apply_efi_quirks() modifies the global screen_info struct which
[sysfb_]parse_mode() parses, so doing it later is too late.

This has broken all DMI based quirks for correcting wrong firmware efifb
settings when simpledrm is used.

To fix this move the sysfb_apply_efi_quirks() call back to its old place
and split the new setup of the efifb_fwnode (which requires
the platform_device) into its own function and call that at
the place of the moved sysfb_apply_efi_quirks(pd) calls.

Fixes: 8633ef82f101 ("drivers/firmware: consolidate EFI framebuffer setup for all arches")
Cc: [email protected]
Cc: Javier Martinez Canillas <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>

efi/libstub: smbios: Drop unused 'recsize' parameter

We no longer use the recsize argument for locating the string table in
an SMBIOS record, so we can drop it from the internal API.

Signed-off-by: Ard Biesheuvel <[email protected]>

arm64: efi: Use SMBIOS processor version to key off Ampere quirk

Instead of using the SMBIOS type 1 record 'family' field, which is often
modified by OEMs, use the type 4 'processor ID' and 'processor version'
fields, which are set to a small set of probe-able values on all known
Ampere EFI systems in the field.

Fixes: 550b33cfd4452968 ("arm64: efi: Force the use of ...")
Tested-by: Andrea Righi <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>

efi/libstub: smbios: Use length member instead of record struct size

The type 1 SMBIOS record happens to always be the same size, but there
are other record types which have been augmented over time, and so we
should really use the length field in the header to decide where the
string table starts.
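
A minimal sketch of the idea, not the libstub code: in the SMBIOS record
layout the second header byte holds the length of the formatted area, and the
NUL-terminated strings follow it, ending with an empty string. The helper name
demo_smbios_string() and its bounds-checking policy are assumptions made for
this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the 1-based string number 'index' of the SMBIOS record at 'rec',
 * or NULL if it does not exist. The string table is located via the
 * length byte in the header (rec[1]), not via sizeof() of a C struct.
 */
static const char *demo_smbios_string(const uint8_t *rec, size_t rec_len,
				      unsigned int index)
{
	const uint8_t *s, *end, *nul;

	/* Header is 4 bytes: type, length, 16-bit handle. Index 0 = no string. */
	if (index == 0 || rec_len < 4 || rec[1] < 4 || rec[1] >= rec_len)
		return NULL;

	s = rec + rec[1];	/* strings start right after the formatted area */
	end = rec + rec_len;

	for (;;) {
		if (*s == '\0')
			return NULL;	/* empty string terminates the table */
		nul = memchr(s, '\0', (size_t)(end - s));
		if (!nul)
			return NULL;	/* unterminated string, give up */
		if (--index == 0)
			return (const char *)s;
		s = nul + 1;
		if (s >= end)
			return NULL;
	}
}
```

Because the offset comes from the record itself, the same helper works for
record types whose formatted area has grown across SMBIOS revisions.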

Fixes: 550b33cfd4452968 ("arm64: efi: Force the use of ...")
Signed-off-by: Ard Biesheuvel <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-03-16 (iavf)

This series contains updates to iavf driver only.

Alex fixes incorrect check against Rx hash feature and corrects payload
value for IPv6 UDP packet.

Ahmed removes bookkeeping of VLAN 0 filter as it always exists and can
cause a false max filter error message.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: do not track VLAN 0 filters
  iavf: fix non-tunneled IPv6 UDP packet type and hashing
  iavf: fix inverted Rx hash condition leading to disabled hash
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ethernet: ti: am65-cpts: reset pps genf adj settings on enable

The CPTS PPS GENf adjustment settings are invalid after it has been
disabled for a while, so reset them.

Fixes: eb9233ce6751 ("net: ethernet: ti: am65-cpts: adjust pps following ptp changes")
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Siddharth Vadapalli <[email protected]>
Reviewed-by: Roger Quadros <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: usb: smsc95xx: Limit packet length to skb->len

Packet length retrieved from descriptor may be larger than
the actual socket buffer length. In such case the cloned
skb passed up the network stack will leak kernel memory contents.
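
The kind of check involved can be sketched as follows. This is a standalone
illustration, not the smsc95xx code: the 4-byte little-endian header with the
payload size in its low 16 bits is a hypothetical layout, and
demo_frame_len() is a made-up helper.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_HDR_LEN 4	/* hypothetical per-frame header size */

/*
 * Return the payload length claimed by the header at 'buf', or -1 if the
 * claim does not fit in the 'buf_len' bytes that were actually received.
 */
static long demo_frame_len(const uint8_t *buf, size_t buf_len)
{
	uint32_t hdr;
	size_t size;

	if (buf_len < DEMO_HDR_LEN)
		return -1;

	hdr = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
	      ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
	size = hdr & 0xffff;

	/* Never trust the device-supplied length beyond the real buffer. */
	if (size > buf_len - DEMO_HDR_LEN)
		return -1;

	return (long)size;
}
```

Rejecting the frame when the claimed size exceeds the remaining bytes keeps
any later copy or clone within the data that was really received.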

Fixes: 2f7ca802bdae ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
Signed-off-by: Szymon Heidrich <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: b53: mmap: fix device tree support

CPU port should also be enabled in order to get a working switch.

Fixes: a5538a777b73 ("net: dsa: b53: mmap: Add device tree support")
Signed-off-by: Álvaro Fernández Rojas <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ext4: fix possible double unlock when moving a directory

Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Dan Carpenter <[email protected]>
Reported-by: [email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

nfsd: don't replace page in rq_pages if it's a continuation of last page

The splice read calls nfsd_splice_actor to put the pages containing file
data into the svc_rqst->rq_pages array. It's possible however to get a
splice result that only has a partial page at the end, if (e.g.) the
filesystem hands back a short read that doesn't cover the whole page.

nfsd_splice_actor will plop the partial page into its rq_pages array and
return. Then later, when nfsd_splice_actor is called again, the
remainder of the page may end up being filled out. At this point,
nfsd_splice_actor will put the page into the array _again_ corrupting
the reply. If this is done enough times, rq_next_page will overrun the
array and corrupt the trailing fields -- the rq_respages and
rq_next_page pointers themselves.

If we've already added the page to the array in the last pass, don't add
it to the array a second time when dealing with a splice continuation.
This was originally handled properly in nfsd_splice_actor, but commit
91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check
for it.
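
The shape of such a check can be sketched in isolation. This is not the nfsd
code: struct demo_page_list and demo_add_page() are hypothetical stand-ins for
the rq_pages array and the splice actor, and the fixed capacity of 8 is
arbitrary.

```c
#include <stddef.h>

#define DEMO_MAX_PAGES 8

/* Hypothetical stand-in for the per-request page array. */
struct demo_page_list {
	const void *pages[DEMO_MAX_PAGES];
	size_t count;
};

/*
 * Record 'page' in the list. If it is the same page that was added by the
 * previous call (a continuation of a partially filled page), do not add it
 * again. Returns 0 on success, -1 if the list is full.
 */
static int demo_add_page(struct demo_page_list *l, const void *page)
{
	if (l->count > 0 && l->pages[l->count - 1] == page)
		return 0;	/* continuation: already recorded */

	if (l->count == DEMO_MAX_PAGES)
		return -1;	/* refuse to overrun the array */

	l->pages[l->count++] = page;
	return 0;
}
```

Adding the same page twice in a row leaves the count at one, so repeated
continuations cannot push the write position past the end of the array.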

Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()")
Cc: Al Viro <[email protected]>
Reported-by: Dario Lesca <[email protected]>
Tested-by: David Critch <[email protected]>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>

Merge tag 'net-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, wifi and ipsec.

  A few more changes than usual, but it's pretty normal for us that
  the rc3/rc4 PRs are oversized as people start testing in earnest.

  Possibly an extra boost from people deploying the 6.1 LTS but that's
  more of an unscientific hunch.

  Current release - regressions:

   - phy: mscc: fix deadlock in phy_ethtool_{get,set}_wol()

   - virtio: vsock: don't use skbuff state to account credit

   - virtio: vsock: don't drop skbuff on copy failure

   - virtio_net: fix page_to_skb() miscalculating the memory size

  Current release - new code bugs:

   - eth: correct xdp_features after device reconfig

   - wifi: nl80211: fix the puncturing bitmap policy

   - net/mlx5e: flower:
      - fix raw counter initialization
      - fix missing error code
      - fix cloned flow attribute

   - ipa:
      - fix some register validity checks
      - fix a surprising number of bad offsets
      - kill FILT_ROUT_CACHE_CFG IPA register

  Previous releases - regressions:

   - tcp: fix bind() conflict check for dual-stack wildcard address

   - veth: fix use after free in XDP_REDIRECT when skb headroom is small

   - ipv4: fix incorrect table ID in IOCTL path

   - ipvlan: make skb->skb_iif track skb->dev for l3s mode

   - mptcp:
      - fix possible deadlock in subflow_error_report
      - fix UaFs when destroying unaccepted and listening sockets

   - dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290

  Previous releases - always broken:

   - tcp: tcp_make_synack() can be called from process context, don't
     assume preemption is disabled when updating stats

   - netfilter: correct length for loading protocol registers

   - virtio_net: add checking sq is full inside xdp xmit

   - bonding: restore IFF_MASTER/SLAVE flags on bond enslave Ethertype
     change

   - phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bit number

   - eth: i40e: fix crash during reboot when adapter is in recovery mode

   - eth: ice: avoid deadlock on rtnl lock when auxiliary device
     plug/unplug meets bonding

   - dsa: mt7530:
      - remove now incorrect comment regarding port 5
      - set PLL frequency and trgmii only when trgmii is used

   - eth: mtk_eth_soc: reset PCS state when changing interface types

  Misc:

   - ynl: another license adjustment

   - move the TCA_EXT_WARN_MSG attribute for tc action"

* tag 'net-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
  selftests: bonding: add tests for ether type changes
  bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
  bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change
  net: renesas: rswitch: Fix GWTSDIE register handling
  net: renesas: rswitch: Fix the output value of quote from rswitch_rx()
  ethernet: sun: add check for the mdesc_grab()
  net: ipa: fix some register validity checks
  net: ipa: kill FILT_ROUT_CACHE_CFG IPA register
  net: ipa: add two missing declarations
  net: ipa: reg: include <linux/bug.h>
  net: xdp: don't call notifiers during driver init
  net/sched: act_api: add specific EXT_WARN_MSG for tc action
  Revert "net/sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy"
  net: dsa: microchip: fix RGMII delay configuration on KSZ8765/KSZ8794/KSZ8795
  ynl: make the tooling check the license
  ynl: broaden the license even more
  tools: ynl: make definitions optional again
  hsr: ratelimit only when errors are printed
  qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
  selftests: net: devlink_port_split.py: skip test if no suitable device available
  ...

cifs: check only tcon status on tcon related functions

We had a couple of checks for the session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
Those checks were also done under ses_lock; change that to tc_lock too.

Signed-off-by: Shyam Prasad N <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
"A bit bigger than usual, as the NVMe pull request missed last week's
  submission. In detail:

   - NVMe pull request via Christoph:
        - Avoid potential UAF in nvmet_req_complete (Damien Le Moal)
        - More quirks (Elmer Miroslav Mosher Golovin, Philipp Geulen)
        - Fix a memory leak in the nvme-pci probe teardown path
          (Irvin Cote)
        - Repair the MAINTAINERS entry (Lukas Bulwahn)
        - Fix handling single range discard request (Ming Lei)
        - Show more opcode names in trace events (Minwoo Im)
        - Fix nvme-tcp timeout reporting (Sagi Grimberg)

   - MD pull request via Song:
        - Two fixes for old issues (Neil)
        - Resource leak in device stopping (Xiao)

   - Bio based device stats fix (Yu)

   - Kill unused CONFIG_BLOCK_COMPAT (Lukas)

   - sunvdc missing mdesc_grab() failure check (Liang)

   - Fix for reversal of request ordering upon issue for certain cases
     (Jan)

   - null_blk timeout fixes (Damien)

   - Loop use-after-free fix (Bart)

   - blk-mq SRCU fix for BLK_MQ_F_BLOCKING devices (Chris)"

* tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linux:
  block: remove obsolete config BLOCK_COMPAT
  md: select BLOCK_LEGACY_AUTOLOAD
  block: count 'ios' and 'sectors' when io is done for bio-based device
  block: sunvdc: add check for mdesc_grab() returning NULL
  nvmet: avoid potential UAF in nvmet_req_complete()
  nvme-trace: show more opcode names
  nvme-tcp: add nvme-tcp pdu size build protection
  nvme-tcp: fix opcode reporting in the timeout handler
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
  nvme-pci: fixing memory leak in probe teardown path
  nvme: fix handling single range discard request
  MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS
  block: null_blk: cleanup null_queue_rq()
  block: null_blk: Fix handling of fake timeout request
  blk-mq: fix "bad unlock balance detected" on q->srcu in __blk_mq_run_dispatch_ops
  loop: Fix use-after-free issues
  block: do not reverse request order when flushing plug list
  md: avoid signed overflow in slot_store()
  md: Free resources in __md_stop

Merge tag 'io_uring-6.3-2023-03-16' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

- When PF_NO_SETAFFINITY was removed for io-wq threads, we kind of
   forgot about the SQPOLL thread. Remove it there as well, there's even
   less of a reason to set it there (Michal)

- Fixup a confusing 'ret' setting (Li)

- When MSG_RING is used to send a direct descriptor to another ring,
   it's possible to have it allocate it on the target ring rather than
   provide a specific index for it. If this is done, return the chosen
   value in the CQE, like we would've done locally (Pavel)

- Fix a regression in this series on huge page bvec collapsing (Pavel)

* tag 'io_uring-6.3-2023-03-16' of git://git.kernel.dk/linux:
  io_uring/rsrc: fix folio accounting
  io_uring/msg_ring: let target know allocated index
  io_uring: rsrc: Optimize return value variable 'ret'
  io_uring/sqpoll: Do not set PF_NO_SETAFFINITY on sqpoll threads

Merge tag 'pm-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix an error code path issue in a cpuidle driver and make the
  sleepgraph utility more robust against unexpected input.

  Specifics:

   - Fix the psci_pd_init_topology() failure path in the PSCI cpuidle
     driver (Shawn Guo)

   - Modify the sleepgraph utility so it does not crash on binary data
     in device names (Todd Brandt)"

* tag 'pm-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  pm-graph: sleepgraph: Avoid crashing on binary data in device names
  cpuidle: psci: Iterate backwards over list in psci_pd_remove()

Merge tag 'acpi-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"These add some new quirks, fix PPTT handling, fix an ACPI utility and
  correct a mistake in the ACPI documentation.

  Specifics:

   - Fix ACPI PPTT handling to avoid sleep in the atomic context when it
     is not present (Sudeep Holla)

   - Add 'backlight=native' DMI quirk for Dell Vostro 15 3535 to the
     ACPI video driver (Chia-Lin Kao)

   - Add ACPI quirks for I2C device enumeration on Lenovo Yoga Book X90
     and Acer Iconia One 7 B1-750 (Hans de Goede)

   - Fix handling of invalid command line option values in the ACPI
     pfrut utility (Chen Yu)

   - Fix references to I2C device data type in the ACPI documentation
     for device enumeration (Andy Shevchenko)"

* tag 'acpi-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
  ACPI: PPTT: Fix to avoid sleep in the atomic context when PPTT is absent
  ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90
  ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750
  ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper
  ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535
  ACPI: docs: enumeration: Correct reference to the I²C device data type

Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat tweaks and fixes from Len Brown:
"Leprechaun sized fixes and tweaks touching only turbostat.

  'Keeping happy users happy since 2010'"

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2023.03.17
  tools/power turbostat: fix decoding of HWP_STATUS
  tools/power turbostat: Introduce support for EMR
  tools/power turbostat: remove stray newlines from warn/warnx strings
  tools/power turbostat: Fix /dev/cpu_dma_latency warnings
  tools/power turbostat: Provide better debug messages for failed capabilities accesses
  tools/power turbostat: update dump of SECONDARY_TURBO_RATIO_LIMIT

Merge tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- cleanup for xen time handling

- enable the VGA console in a Xen PVH dom0

- cleanup in the xenfs driver

* tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: remove unnecessary (void*) conversions
  x86/PVH: obtain VGA console info in Dom0
  x86/xen/time: cleanup xen_tsc_safe_clocksource
  xen: update arch/x86/include/asm/xen/cpuid.h

Merge tag 'riscv-for-linus-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

- fixes to the ASID allocator to avoid leaking stale mappings between
   tasks

- fix the vmalloc fault handler to tolerate huge pages

* tag 'riscv-for-linus-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: mm: Support huge page in vmalloc_fault()
  riscv: asid: Fixup stale TLB entry cause application crash
  Revert "riscv: mm: notify remote harts about mmu cache updates"

Merge tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- Update defconfigs

- Fix early boot code by adding missing intersection check to prevent
   potential overwriting of the ipl report

- Fix a use-after-free issue in s390-specific code related to PCI
   resources being retained after hot-unplugging individual functions,
   by removing the resources from the PCI bus's resource list and using
   the zpci_bar_struct's resource pointer directly

* tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: update defconfigs
  PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
  s390/ipl: add missing intersection check to ipl_report handling

Merge tag 'powerpc-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

- Fix false detection of read faults, introduced by execute-only
   support

- Fix a build failure when GENERIC_ALLOCATOR is not selected

Thanks to Russell Currey, Randy Dunlap, Michal Suchánek, Nathan Lynch,
and Benjamin Gray.

* tag 'powerpc-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Fix false detection of read faults
  powerpc/pseries: RTAS work area requires GENERIC_ALLOCATOR

Merge tag 'mmc-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

- dw_mmc-starfive: Fix initialization of the prev_err variable

- sdhci_am654: Lower power-on failed message severity

* tag 'mmc-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: dw_mmc-starfive: Fix initialization of prev_err
  mmc: sdhci_am654: lower power-on failed message severity

Merge tag 'sound-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Nothing surprising, a collection of small device-specific fixes.

  The majority of changes are for ASoC Intel stuff, while a few other
  ASoC and HD-audio fixes are found"

* tag 'sound-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
  ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set()
  ALSA: asihpi: check pao in control_message()
  ASoC: hdmi-codec: only startup/shutdown on supported streams
  ASoC: da7219: Initialize jack_det_mutex
  ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
  ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
  ALSA: hda/realtek: fix speaker, mute/micmute LEDs not work on a HP platform
  ALSA: hda: intel-dsp-config: add MTL PCI id
  ASoC: SOF: IPC4: update gain ipc msg definition to align with fw
  ASoC: SOF: sof-audio: don't squelch errors in WIDGET_SETUP phase
  ASoC: SOF: Intel: hda-ctrl: re-add sleep after entering and exiting reset
  ASoC: SOF: Intel: hda-dsp: harden D0i3 programming sequence
  ASoC: SOF: ipc4-topology: set dmic dai index from copier
  ASoC: SOF: sof-audio: Fix broken early bclk feature for SSP
  ASoC: SOF: Intel: pci-tng: revert invalid bar size setting
  ASoC: SOF: topology: Fix error handling in sof_widget_ready()
  ASoC: Intel: soc-acpi: fix copy-paste issue in topology names
  ASoC: SOF: ipc4-topology: Fix incorrect sample rate print unit
  ASoC: SOF: ipc3: Check for upper size limit for the received message
  ASOC: SOF: Intel: pci-tgl: Fix device description
  ...