David Howells [Tue, 9 Jan 2024 17:51:08 +0000 (17:51 +0000)]
afs: Don't use certain unnecessary folio_*() functions
Filesystems should use folio->index and folio->mapping, instead of
folio_index(folio), folio_mapping() and folio_file_mapping() since
they know that it's in the pagecache.
David Howells [Tue, 9 Jan 2024 17:17:36 +0000 (17:17 +0000)]
netfs: Don't use certain unnecessary folio_*() functions
Filesystems should use folio->index and folio->mapping, instead of
folio_index(folio), folio_mapping() and folio_file_mapping() since
they know that it's in the pagecache.
fbcon: Fix incorrect printed function name in fbcon_prepare_logo()
If the boot logo does not fit, a message is printed, including a wrong
function name prefix. Instead of correcting the function name (or using
__func__), just use "fbcon", like is done in several other messages.
While at it, modernize the call by switching to pr_info().
Linus Torvalds [Mon, 22 Jan 2024 21:29:42 +0000 (13:29 -0800)]
Merge tag 'for-6.8-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- zoned mode fixes:
- fix slowdown when writing large file sequentially by looking up
block groups with enough space faster
- locking fixes when activating a zone
- new mount API fixes:
- preserve mount options for a ro/rw mount of the same subvolume
- scrub fixes:
- fix use-after-free in case the chunk length is not aligned to
64K, this does not happen normally but has been reported on
images converted from ext4
- similar alignment check was missing with raid-stripe-tree
- subvolume deletion fixes:
- prevent calling ioctl on already deleted subvolume
- properly track flag tracking a deleted subvolume
- in subpage mode, fix decompression of an inline extent (zlib, lzo,
zstd)
- fix crash when starting writeback on a folio, after integration with
recent MM changes this needs to be started conditionally
- reject unknown flags in defrag ioctl
- error handling, API fixes, minor warning fixes
* tag 'for-6.8-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: scrub: limit RST scrub to chunk boundary
btrfs: scrub: avoid use-after-free when chunk length is not 64K aligned
btrfs: don't unconditionally call folio_start_writeback in subpage
btrfs: use the original mount's mount options for the legacy reconfigure
btrfs: don't warn if discard range is not aligned to sector
btrfs: tree-checker: fix inline ref size in error messages
btrfs: zstd: fix and simplify the inline extent decompression
btrfs: lzo: fix and simplify the inline extent decompression
btrfs: zlib: fix and simplify the inline extent decompression
btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args
btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted
btrfs: don't abort filesystem when attempting to snapshot deleted subvolume
btrfs: zoned: fix lock ordering in btrfs_zone_activate()
btrfs: fix unbalanced unlock of mapping_tree_lock
btrfs: ref-verify: free ref cache before clearing mount opt
btrfs: fix kvcalloc() arguments order in btrfs_ioctl_send()
btrfs: zoned: optimize hint byte for zoned allocator
btrfs: zoned: factor out prepare_allocation_zoned()
Bernd Edlinger [Mon, 22 Jan 2024 18:34:21 +0000 (19:34 +0100)]
exec: Fix error handling in begin_new_exec()
If get_unused_fd_flags() fails, the error handling is incomplete because
bprm->cred is already set to NULL, and therefore free_bprm will not
unlock the cred_guard_mutex. Note there are two error conditions which
end up here, one before and one after bprm->cred is cleared.
Consolidate the calls to allow_write_access()/fput() into a single
place, since we repeat this code pattern. Add comments around the
callers for the details on it.
Linus Torvalds [Mon, 22 Jan 2024 17:47:24 +0000 (09:47 -0800)]
Merge tag 'Wstringop-overflow-for-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull stringop-overflow warning update from Gustavo A. R. Silva:
"Enable -Wstringop-overflow globally.
I waited for the release of -rc1 to run a final build-test on top of
it before sending this pull request. Fortunatelly, after building 358
kernels overnight (basically all supported archs with a wide variety
of configs), no more warnings have surfaced! :)
Thus, we are in a good position to enable this compiler option for all
versions of GCC that support it, with the exception of GCC-11, which
appears to have some issues with this option [1]"
Link: https://lore.kernel.org/lkml/[email protected]/
* tag 'Wstringop-overflow-for-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
init: Kconfig: Disable -Wstringop-overflow for GCC-11
Makefile: Enable -Wstringop-overflow globally
Linus Torvalds [Mon, 22 Jan 2024 17:40:05 +0000 (09:40 -0800)]
Merge tag 'xsa448-6.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen netback fix from Juergen Gross:
"Transmit requests in Xen's virtual network protocol can consist of
multiple parts. While not really useful, except for the initial part
any of them may be of zero length, i.e. carry no data at all.
Besides a certain initial portion of the to be transferred data, these
parts are directly translated into what Linux calls SKB fragments.
Such converted request parts can, when for a particular SKB they are
all of length zero, lead to a de-reference of NULL in core networking
code"
* tag 'xsa448-6.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-netback: don't produce zero-size SKB frags
net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv
Syzcaller UBSAN crash occurs in rds_cmsg_recv(),
which reads inc->i_rx_lat_trace[j + 1] with index 4 (3 + 1),
but with array size of 4 (RDS_RX_MAX_TRACES).
Here 'j' is assigned from rs->rs_rx_trace[i] and in-turn from
trace.rx_trace_pos[i] in rds_recv_track_latency(),
with both arrays sized 3 (RDS_MSG_RX_DGRAM_TRACE_MAX). So fix the
off-by-one bounds check in rds_recv_track_latency() to prevent
a potential crash in rds_cmsg_recv().
Found by syzcaller:
=================================================================
UBSAN: array-index-out-of-bounds in net/rds/recv.c:585:39
index 4 is out of range for type 'u64 [4]'
CPU: 1 PID: 8058 Comm: syz-executor228 Not tainted 6.6.0-gd2f51b3516da #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:217 [inline]
__ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:348
rds_cmsg_recv+0x60d/0x700 net/rds/recv.c:585
rds_recvmsg+0x3fb/0x1610 net/rds/recv.c:716
sock_recvmsg_nosec net/socket.c:1044 [inline]
sock_recvmsg+0xe2/0x160 net/socket.c:1066
__sys_recvfrom+0x1b6/0x2f0 net/socket.c:2246
__do_sys_recvfrom net/socket.c:2264 [inline]
__se_sys_recvfrom net/socket.c:2260 [inline]
__x64_sys_recvfrom+0xe0/0x1b0 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
==================================================================
Fixes: 3289025aedc0 ("RDS: add receive message trace used by application") Reported-by: Chenyuan Yang <[email protected]> Closes: https://lore.kernel.org/linux-rdma/CALGdzuoVdq-wtQ4Az9iottBqC5cv9ZhcE5q8N7LfYFvkRsOVcw@mail.gmail.com/ Signed-off-by: Sharath Srinivasan <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Horatiu Vultur [Fri, 19 Jan 2024 10:47:50 +0000 (11:47 +0100)]
net: micrel: Fix PTP frame parsing for lan8814
The HW has the capability to check each frame if it is a PTP frame,
which domain it is, which ptp frame type it is, different ip address in
the frame. And if one of these checks fail then the frame is not
timestamp. Most of these checks were disabled except checking the field
minorVersionPTP inside the PTP header. Meaning that once a partner sends
a frame compliant to 8021AS which has minorVersionPTP set to 1, then the
frame was not timestamp because the HW expected by default a value of 0
in minorVersionPTP. This is exactly the same issue as on lan8841.
Fix this issue by removing this check so the userspace can decide on this.
Fix issues when performing unordered unbind/bind of a kernel modules
which are using a dpll device with DPLL_PIN_TYPE_MUX pins.
Currently only serialized bind/unbind of such use case works, fix
the issues and allow for unserialized kernel module bind order.
dpll: fix register pin with unregistered parent pin
In case of multiple kernel module instances using the same dpll device:
if only one registers dpll device, then only that one can register
directly connected pins with a dpll device. When unregistered parent is
responsible for determining if the muxed pin can be registered with it
or not, the drivers need to be loaded in serialized order to work
correctly - first the driver instance which registers the direct pins
needs to be loaded, then the other instances could register muxed type
pins.
Allow registration of a pin with a parent even if the parent was not
yet registered, thus allow ability for unserialized driver instance
load order.
Do not WARN_ON notification for unregistered pin, which can be invoked
for described case, instead just return error.
Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions") Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Reviewed-by: Jan Glaza <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Arkadiusz Kubalewski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
If parent pin was unregistered but child pin was not, the userspace
would see the "zombie" pins - the ones that were registered with
a parent pin (dpll_pin_on_pin_register(..)).
Technically those are not available - as there is no dpll device in the
system. Do not dump those pins and prevent userspace from any
interaction with them. Provide a unified function to determine if the
pin is available and use it before acting/responding for user requests.
When a kernel module is unbound but the pin resources were not entirely
freed (other kernel module instance of the same PCI device have had kept
the reference to that pin), and kernel module is again bound, the pin
properties would not be updated (the properties are only assigned when
memory for the pin is allocated), prop pointer still points to the
kernel module memory of the kernel module which was deallocated on the
unbind.
If the pin dump is invoked in this state, the result is a kernel crash.
Prevent the crash by storing persistent pin properties in dpll subsystem,
copy the content from the kernel module when pin is allocated, instead of
using memory of the kernel module.
Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions") Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Reviewed-by: Jan Glaza <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Arkadiusz Kubalewski <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
If pin type is not expected, or pin properities failed to allocate
memory, the unwind error path shall not destroy pin's xarrays, which
were not yet initialized.
Add new goto label and use it to fix broken error path.
David S. Miller [Mon, 22 Jan 2024 10:58:05 +0000 (10:58 +0000)]
Merge branch 'tun-fixes'
Yunjian Wang says:
====================
fixes for tun
There are few places on the receive path where packet receives and
packet drops were not accounted for. This patchset fixes that issue.
====================
Yunjian Wang [Fri, 19 Jan 2024 10:22:56 +0000 (18:22 +0800)]
tun: add missing rx stats accounting in tun_xdp_act
The TUN can be used as vhost-net backend, and it is necessary to
count the packets transmitted from TUN to vhost-net/virtio-net.
However, there are some places in the receive path that were not
taken into account when using XDP. It would be beneficial to also
include new accounting for successfully received bytes using
dev_sw_netstats_rx_add.
Yunjian Wang [Fri, 19 Jan 2024 10:22:35 +0000 (18:22 +0800)]
tun: fix missing dropped counter in tun_xdp_act
The commit 8ae1aff0b331 ("tuntap: split out XDP logic") includes
dropped counter for XDP_DROP, XDP_ABORTED, and invalid XDP actions.
Unfortunately, that commit missed the dropped counter when error
occurs during XDP_TX and XDP_REDIRECT actions. This patch fixes
this issue.
Jakub Kicinski [Fri, 19 Jan 2024 00:58:59 +0000 (16:58 -0800)]
net: fix removing a namespace with conflicting altnames
Mark reports a BUG() when a net namespace is removed.
kernel BUG at net/core/dev.c:11520!
Physical interfaces moved outside of init_net get "refunded"
to init_net when that namespace disappears. The main interface
name may get overwritten in the process if it would have
conflicted. We need to also discard all conflicting altnames.
Recent fixes addressed ensuring that altnames get moved
with the main interface, which surfaced this problem.
init: Kconfig: Disable -Wstringop-overflow for GCC-11
-Wstringop-overflow is buggy in GCC-11. Therefore, we should disable
this option specifically for that compiler version. To achieve this,
we introduce a new configuration option: GCC11_NO_STRINGOP_OVERFLOW.
The compiler option related to string operation overflow is now managed
under configuration CC_STRINGOP_OVERFLOW. This option is enabled by
default for all other versions of GCC that support it.
It seems that we have finished addressing all the remaining
issues regarding -Wstringop-overflow. So, we are now in good
shape to enable this compiler option globally.
Ilya Dryomov [Wed, 17 Jan 2024 17:59:44 +0000 (18:59 +0100)]
rbd: don't move requests to the running list on errors
The running list is supposed to contain requests that are pinning the
exclusive lock, i.e. those that must be flushed before exclusive lock
is released. When wake_lock_waiters() is called to handle an error,
requests on the acquiring list are failed with that error and no
flushing takes place. Briefly moving them to the running list is not
only pointless but also harmful: if exclusive lock gets acquired
before all of their state machines are scheduled and go through
rbd_lock_del_request(), we trigger
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"Some fixes, Some refactoring, some minor features:
- Assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on
multithreaded workloads: fstests (in addition to ktest tests) are
now checking slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
bcachefs: Improve inode_to_text()
bcachefs: logged_ops_format.h
bcachefs: reflink_format.h
bcachefs; extents_format.h
bcachefs: ec_format.h
bcachefs: subvolume_format.h
bcachefs: snapshot_format.h
bcachefs: alloc_background_format.h
bcachefs: xattr_format.h
bcachefs: dirent_format.h
bcachefs: inode_format.h
bcachefs; quota_format.h
bcachefs: sb-counters_format.h
bcachefs: counters.c -> sb-counters.c
bcachefs: comment bch_subvolume
bcachefs: bch_snapshot::btime
bcachefs: add missing __GFP_NOWARN
bcachefs: opts->compression can now also be applied in the background
bcachefs: Prep work for variable size btree node buffers
bcachefs: grab s_umount only if snapshotting
...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
clocksource/drivers/ep93xx: Fix error handling during probe
clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
clocksource/timer-riscv: Add riscv_clock_shutdown callback
dt-bindings: timer: Add StarFive JH8100 clint
dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.
And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.
Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.
Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.
Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()
drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
bcachefs: Better journal tracepoints
Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
bcachefs: Avoid flushing the journal in the discard path
When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.
But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.
This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.
Michal Schmidt [Thu, 18 Jan 2024 20:50:40 +0000 (21:50 +0100)]
idpf: distinguish vports by the dev_port attribute
idpf registers multiple netdevs (virtual ports) for one PCI function,
but it does not provide a way for userspace to distinguish them with
sysfs attributes. Per Documentation/ABI/testing/sysfs-class-net, it is
a bug not to set dev_port for independent ports on the same PCI bus,
device and function.
Without dev_port set, systemd-udevd's default naming policy attempts
to assign the same name ("ens2f0") to all four idpf netdevs on my test
system and obviously fails, leaving three of them with the initial
eth<N> name.
With this patch, systemd-udevd is able to assign unique names to the
netdevs (e.g. "ens2f0", "ens2f0d1", "ens2f0d2", "ens2f0d3").
The Intel-provided out-of-tree idpf driver already sets dev_port. In
this patch I chose to do it in the same place in the idpf_cfg_netdev
function.
Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Michal Schmidt <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Eric Dumazet [Thu, 18 Jan 2024 20:17:49 +0000 (20:17 +0000)]
udp: fix busy polling
Generic sk_busy_loop_end() only looks at sk->sk_receive_queue
for presence of packets.
Problem is that for UDP sockets after blamed commit, some packets
could be present in another queue: udp_sk(sk)->reader_queue
In some cases, a busy poller could spin until timeout expiration,
even if some packets are available in udp_sk(sk)->reader_queue.
v3: - make sk_busy_loop_end() nicer (Willem)
v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats.
- add a sk_is_inet() check in sk_is_udp() (Willem feedback)
- add a sk_is_inet() check in sk_is_tcp().
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
bcachefs: Reduce would_deadlock restarts
We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.
This tweaks the reflink path to take extents btree locks in the right
order.
Kent Overstreet [Thu, 11 Jan 2024 04:08:30 +0000 (23:08 -0500)]
bcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT
Previously, we added logging in the write path to ensure that any
unexpected errors getting reported to userspace have a log message; but
BCH_WRITE_ALLOC_NOWAIT is a special case, it's used for promotes where
errors are expected and not reported out to userspace - so we need to
silence those.
Fullway Wang [Thu, 18 Jan 2024 06:24:43 +0000 (14:24 +0800)]
fbdev: sis: Error out if pixclock equals zero
The userspace program could pass any values to the driver through
ioctl() interface. If the driver doesn't check the value of pixclock,
it may cause divide-by-zero error.
In sisfb_check_var(), var->pixclock is used as a divisor to caculate
drate before it is checked against zero. Fix this by checking it
at the beginning.
This is similar to CVE-2022-3061 in i740fb which was fixed by
commit 15cf0b8.
Fullway Wang [Thu, 18 Jan 2024 03:49:40 +0000 (11:49 +0800)]
fbdev: savage: Error out if pixclock equals zero
The userspace program could pass any values to the driver through
ioctl() interface. If the driver doesn't check the value of pixclock,
it may cause divide-by-zero error.
Although pixclock is checked in savagefb_decode_var(), but it is not
checked properly in savagefb_probe(). Fix this by checking whether
pixclock is zero in the function savagefb_check_var() before
info->var.pixclock is used as the divisor.
This is similar to CVE-2022-3061 in i740fb which was fixed by
commit 15cf0b8.
- retry (reconnect) improvement including new retrans mount parm, and
handling of two additional return codes that need to be retried on
- two minor cleanup patches and another to remove duplicate query
info code
- two documentation cleanup, and one reviewer email correction"
* tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update iface_last_update on each query-and-update
cifs: handle servers that still advertise multichannel after disabling
cifs: new mount option called retrans
cifs: reschedule periodic query for server interfaces
smb: client: don't clobber ->i_rdev from cached reparse points
smb: client: get rid of smb311_posix_query_path_info()
smb: client: parse owner/group when creating reparse points
smb: client: fix parsing of SMB3.1.1 POSIX create context
cifs: update known bugs mentioned in kernel docs for cifs
cifs: new nt status codes from MS-SMB2
cifs: pick channel for tcon and tdis
cifs: open_cached_dir should not rely on primary channel
smb3: minor documentation updates
Update MAINTAINERS email address
cifs: minor comment cleanup
smb3: show beginning time for per share stats
cifs: remove redundant variable tcon_exist
Linus Torvalds [Sat, 20 Jan 2024 23:03:25 +0000 (15:03 -0800)]
Merge tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New support:
- Loongson LS2X APB DMA controller
- sf-pdma: mpfs-pdma support
- Qualcomm X1E80100 GPI dma controller support
Updates:
- Xilinx XDMA updates to support interleaved DMA transfers
- TI PSIL threads for AM62P and J722S and cfg register regions
description
- axi-dmac Improving the cyclic DMA transfers
- Tegra Support dma-channel-mask property
- Remaining platform remove callback returning void conversions
Driver fixes for:
- Xilinx xdma driver operator precedence and initialization fix
- Excess kernel-doc warning fix in imx-sdma xilinx xdma drivers
- format-overflow warning fix for rz-dmac, sh usb dmac drivers
- 'output may be truncated' fix for shdma, fsl-qdma and dw-edma
drivers"
* tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
dmaengine: dw-edma: increase size of 'name' in debugfs code
dmaengine: fsl-qdma: increase size of 'irq_name'
dmaengine: shdma: increase size of 'dev_id'
dmaengine: xilinx: xdma: Fix kernel-doc warnings
dmaengine: usb-dmac: Avoid format-overflow warning
dmaengine: sh: rz-dmac: Avoid format-overflow warning
dmaengine: imx-sdma: fix Excess kernel-doc warnings
dmaengine: xilinx: xdma: Fix initialization location of desc in xdma_channel_isr()
dmaengine: xilinx: xdma: Fix operator precedence in xdma_prep_interleaved_dma()
dmaengine: xilinx: xdma: statify xdma_prep_interleaved_dma
dmaengine: xilinx: xdma: Workaround truncation compilation error
dmaengine: pl330: issue_pending waits until WFP state
dmaengine: xilinx: xdma: Implement interleaved DMA transfers
dmaengine: xilinx: xdma: Prepare the introduction of interleaved DMA transfers
dmaengine: xilinx: xdma: Add transfer error reporting
dmaengine: xilinx: xdma: Add error checking in xdma_channel_isr()
dmaengine: xilinx: xdma: Rework xdma_terminate_all()
dmaengine: xilinx: xdma: Ease dma_pool alignment requirements
dmaengine: xilinx: xdma: Add necessary macro definitions
dmaengine: xilinx: xdma: Get rid of unused code
...
Linus Torvalds [Sat, 20 Jan 2024 22:20:34 +0000 (14:20 -0800)]
Merge tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux
Pull coccinelle updates from Julia Lawall:
"Updates to the device_attr_show semantic patch to reflect the new
guidelines of the Linux kernel documentation.
The problem was identified by Li Zhijian <[email protected]>, who
proposed an initial fix"
* tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
coccinelle: device_attr_show: simplify patch case
coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst
Aurelien Jarno [Sat, 13 Jan 2024 18:33:31 +0000 (19:33 +0100)]
media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)
This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver. This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
- the preprocessed size from 121 MiB to 4.5 MiB;
- the build CPU time from 46.8 s to 1.6 s;
- the build memory from 2786 MiB to 98MiB.
In fine, this allows this relatively simple C file to be built on a
32-bit system.
Linus Torvalds [Tue, 9 Jan 2024 00:43:04 +0000 (16:43 -0800)]
execve: open the executable file before doing anything else
No point in allocating a new mm, counting arguments and environment
variables etc if we're just going to return ENOENT.
This patch does expose the fact that 'do_filp_open()' that execve() uses
is still unnecessarily expensive in the failure case, because it
allocates the 'struct file *' early, even if the path lookup (which is
heavily optimized) fails.
So that remains an unnecessary cost in the "no such executable" case,
but it's a separate issue. Regardless, I do not want to do _both_ a
filename_lookup() and a later do_filp_open() like the origin patch by
Josh Triplett did in [1].
Linus Torvalds [Sat, 20 Jan 2024 19:06:04 +0000 (11:06 -0800)]
Merge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Support for tuning for systems with fast misaligned accesses.
- Support for SBI-based suspend.
- Support for the new SBI debug console extension.
- The T-Head CMOs now use PA-based flushes.
- Support for enabling the V extension in kernel code.
- Optimized IP checksum routines.
- Various ftrace improvements.
- Support for archrandom, which depends on the Zkr extension.
- The build is no longer broken under NET=n, KUNIT=y for ports that
don't define their own ipv6 checksum.
* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
lib: checksum: Fix build with CONFIG_NET=n
riscv: lib: Check if output in asm goto supported
riscv: Fix build error on rv32 + XIP
riscv: optimize ELF relocation function in riscv
RISC-V: Implement archrandom when Zkr is available
riscv: Optimize hweight API with Zbb extension
riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
riscv: ftrace: Make function graph use ftrace directly
riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
riscv: Restrict DWARF5 when building with LLVM to known working versions
riscv: Hoist linker relaxation disabling logic into Kconfig
kunit: Add tests for csum_ipv6_magic and ip_fast_csum
riscv: Add checksum library
riscv: Add checksum header
riscv: Add static key for misaligned accesses
asm-generic: Improve csum_fold
RISC-V: selftests: cbo: Ensure asm operands match constraints
...
Linus Torvalds [Sat, 20 Jan 2024 17:42:32 +0000 (09:42 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Final round of fixes that came in too late to send in the first
request.
It's nine bug fixes and one version update (because of a bug fix) and
one set of PCI ID additions. There's one bug fix in the core which is
really a one liner (except that an additional sdev pointer was added
for convenience) and the rest are in drivers"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: core: Add TMF to tmr_list handling
scsi: core: Kick the requeue list after inserting when flushing
scsi: fnic: unlock on error path in fnic_queuecommand()
scsi: fcoe: Fix unsigned comparison with zero in store_ctlr_mode()
scsi: mpi3mr: Fix mpi3mr_fw.c kernel-doc warnings
scsi: smartpqi: Bump driver version to 2.1.26-030
scsi: smartpqi: Fix logical volume rescan race condition
scsi: smartpqi: Add new controller PCI IDs
scsi: ufs: qcom: Remove unnecessary goto statement from ufs_qcom_config_esi()
scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
scsi: ufs: core: Simplify power management during async scan
Linus Torvalds [Sat, 20 Jan 2024 17:24:06 +0000 (09:24 -0800)]
Merge tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"Since the large patch series to convert arch/sh to device tree support
has not been finalized yet due to various maintainers still asking for
changes to the series, this ended up being rather small consisting of
just two fixes.
The first patch by Geert Uytterhoeven addresses a build failure in the
EcoVec platform code. And the second patch by Masahiro Yamada removes
an unnecessary $(foreach ...) found in a Makefile of the vsyscall
code.
- Rename missed backlight field from fbdev to dev
- Remove unnecessary $(foreach ...)"
* tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: vsyscall: Remove unnecessary $(foreach ...)
sh: ecovec24: Rename missed backlight field from fbdev to dev
Linus Torvalds [Sat, 20 Jan 2024 17:14:04 +0000 (09:14 -0800)]
Merge tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fix from Helge Deller:
"There were various reports from people without any graphics output on
the screen and it turns out one commit triggers the problem.
- Revert 'firmware/sysfb: Clear screen_info state after consuming it'"
* tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
Revert "firmware/sysfb: Clear screen_info state after consuming it"
llc_conn_handler() initialises local variables {saddr,daddr}.mac
based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes
them to __llc_lookup().
However, the initialisation is done only when skb->protocol is
htons(ETH_P_802_2), otherwise, __llc_lookup_established() and
__llc_lookup_listener() will read garbage.
The missing initialisation existed prior to commit 211ed865108e
("net: delete all instances of special processing for token ring").
It removed the part to kick out the token ring stuff but forgot to
close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv().
Let's remove llc_tr_packet_type and complete the deprecation.
Eric Dumazet [Thu, 18 Jan 2024 18:36:25 +0000 (18:36 +0000)]
llc: make llc_ui_sendmsg() more robust against bonding changes
syzbot was able to trick llc_ui_sendmsg(), allocating an skb with no
headroom, but subsequently trying to push 14 bytes of Ethernet header [1]
Like some others, llc_ui_sendmsg() releases the socket lock before
calling sock_alloc_send_skb().
Then it acquires it again, but does not redo all the sanity checks
that were performed.
This fix:
- Uses LL_RESERVED_SPACE() to reserve space.
- Check all conditions again after socket lock is held again.
- Do not account Ethernet header for mtu limitation.
Lin Ma [Thu, 18 Jan 2024 13:03:06 +0000 (21:03 +0800)]
vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING
In the vlan_changelink function, a loop is used to parse the nested
attributes IFLA_VLAN_EGRESS_QOS and IFLA_VLAN_INGRESS_QOS in order to
obtain the struct ifla_vlan_qos_mapping. These two nested attributes are
checked in the vlan_validate_qos_map function, which calls
nla_validate_nested_deprecated with the vlan_map_policy.
However, this deprecated validator applies a LIBERAL strictness, allowing
the presence of an attribute with the type IFLA_VLAN_QOS_UNSPEC.
Consequently, the loop in vlan_changelink may parse an attribute of type
IFLA_VLAN_QOS_UNSPEC and believe it carries a payload of
struct ifla_vlan_qos_mapping, which is not necessarily true.
To address this issue and ensure compatibility, this patch introduces two
type checks that skip attributes whose type is not IFLA_VLAN_QOS_MAPPING.
Jakub Kicinski [Sat, 20 Jan 2024 05:15:16 +0000 (21:15 -0800)]
Merge branch 'bnxt_en-bug-fixes'
Michael Chan says:
====================
bnxt_en: Bug fixes
This series contains 5 miscellaneous fixes. The fixes include adding
delay for FLR, buffer memory leak, RSS table size calculation,
ethtool self test kernel warning, and mqprio crash.
====================
Michael Chan [Wed, 17 Jan 2024 23:45:15 +0000 (15:45 -0800)]
bnxt_en: Fix possible crash after creating sw mqprio TCs
The driver relies on netdev_get_num_tc() to get the number of HW
offloaded mqprio TCs to allocate and free TX rings. This won't
work and can potentially crash the system if software mqprio or
taprio TCs have been setup. netdev_get_num_tc() will return the
number of software TCs and it may cause the driver to allocate or
free more TX rings that it should. Fix it by adding a bp->num_tc
field to store the number of HW offload mqprio TCs for the device.
Use bp->num_tc instead of netdev_get_num_tc().
Michael Chan [Wed, 17 Jan 2024 23:45:14 +0000 (15:45 -0800)]
bnxt_en: Prevent kernel warning when running offline self test
We call bnxt_half_open_nic() to setup the chip partially to run
loopback tests. The rings and buffers are initialized normally
so that we can transmit and receive packets in loopback mode.
That means page pool buffers are allocated for the aggregation ring
just like the normal case. NAPI is not needed because we are just
polling for the loopback packets.
When we're done with the loopback tests, we call bnxt_half_close_nic()
to clean up. When freeing the page pools, we hit a WARN_ON()
in page_pool_unlink_napi() because the NAPI state linked to the
page pool is uninitialized.
The simplest way to avoid this warning is just to initialize the
NAPIs during half open and delete the NAPIs during half close.
Trying to skip the page pool initialization or skip linking of
NAPI during half open will be more complicated.
Michael Chan [Wed, 17 Jan 2024 23:45:13 +0000 (15:45 -0800)]
bnxt_en: Fix RSS table entries calculation for P5_PLUS chips
The existing formula used in the driver to calculate the number of RSS
table entries is to round up the number of RX rings to the next integer
multiples of 64 (e.g. 64, 128, 192, ..). This is incorrect. The valid
values supported by the chip are 64, 128, 256, 512 only (power of 2
starting from 64). When the number of RX rings is greater than 128, the
entry size will likely be wrong. Firmware will round down the invalid
value (e.g. 192 rounded down to 128) provided by the driver, causing some
RSS rings to not receive any packets.
We already have an existing function bnxt_calc_nr_ring_pages() to
do this calculation. Use it in bnxt_get_nr_rss_ctxs() to calculate the
number of RSS contexts correctly for P5_PLUS chips.