Pavel Begunkov [Tue, 5 Nov 2024 02:12:33 +0000 (02:12 +0000)]
io_uring: avoid normal tw intermediate fallback
When a DEFER_TASKRUN io_uring is terminating it requeues deferred task
work items as normal tw, which can further fallback to kthread
execution. Avoid this extra step and always push them to the fallback
kthread.
Olivier Langlois [Sun, 13 Oct 2024 18:29:24 +0000 (14:29 -0400)]
io_uring/napi: add static napi tracking strategy
Add the static napi tracking strategy. That allows the user to manually
manage the napi ids list for busy polling, and eliminate the overhead of
dynamically updating the list from the fast path.
Olivier Langlois [Sun, 13 Oct 2024 18:29:12 +0000 (14:29 -0400)]
io_uring/napi: clean up __io_napi_do_busy_loop
__io_napi_do_busy_loop now requires to have loop_end in its parameters.
This makes the code cleaner and also has the benefit of removing a
branch since the only caller not passing NULL for loop_end_arg is also
setting the value conditionally.
Olivier Langlois [Sun, 13 Oct 2024 18:28:50 +0000 (14:28 -0400)]
io_uring/napi: improve __io_napi_add
1. move the sock->sk pointer validity test outside the function to
avoid the function call overhead and to make the function more
more reusable
2. change its name to __io_napi_add_id to be more precise about it is
doing
3. return an error code to report errors
Pavel Begunkov [Mon, 4 Nov 2024 12:02:47 +0000 (12:02 +0000)]
io_uring: prevent speculating sq_array indexing
The SQ index array consists of user provided indexes, which io_uring
then uses to index the SQ, and so it's susceptible to speculation. For
all other queues io_uring tracks heads and tails in kernel, and they
shouldn't need any special care.
Jens Axboe [Sun, 3 Nov 2024 17:23:38 +0000 (10:23 -0700)]
io_uring: move struct io_kiocb from task_struct to io_uring_task
Rather than store the task_struct itself in struct io_kiocb, store
the io_uring specific task_struct. The life times are the same in terms
of io_uring, and this avoids doing some dereferences through the
task_struct. For the hot path of putting local task references, we can
deref req->tctx instead, which we'll need anyway in that function
regardless of whether it's local or remote references.
This is mostly straight forward, except the original task PF_EXITING
check needs a bit of tweaking. task_work is _always_ run from the
originating task, except in the fallback case, where it's run from a
kernel thread. Replace the potentially racy (in case of fallback work)
checks for req->task->flags with current->flags. It's either the still
the original task, in which case PF_EXITING will be sane, or it has
PF_KTHREAD set, in which case it's fallback work. Both cases should
prevent moving forward with the given request.
Jens Axboe [Sun, 3 Nov 2024 17:22:43 +0000 (10:22 -0700)]
io_uring: move cancelations to be io_uring_task based
Right now the task_struct pointer is used as the key to match a task,
but in preparation for some io_kiocb changes, move it to using struct
io_uring_task instead. No functional changes intended in this patch.
Jens Axboe [Sun, 3 Nov 2024 15:46:07 +0000 (08:46 -0700)]
io_uring/rsrc: split io_kiocb node type assignments
Currently the io_rsrc_node assignment in io_kiocb is an array of two
pointers, as two nodes may be assigned to a request - one file node,
and one buffer node. However, the buffer node can co-exist with the
provided buffers, as currently it's not supported to use both provided
and registered buffers at the same time.
This crucially brings struct io_kiocb down to 4 cache lines again, as
before it spilled into the 5th cacheline.
Jens Axboe [Sun, 3 Nov 2024 15:17:28 +0000 (08:17 -0700)]
io_uring/rsrc: encode node type and ctx together
Rather than keep the type field separate rom ctx, use the fact that we
can encode up to 4 types of nodes in the LSB of the ctx pointer. Doesn't
reclaim any space right now on 64-bit archs, but it leaves a full int
for future use.
hexue [Fri, 1 Nov 2024 09:19:57 +0000 (17:19 +0800)]
io_uring: add support for hybrid IOPOLL
A new hybrid poll is implemented on the io_uring layer. Once an IO is
issued, it will not poll immediately, but rather block first and re-run
before IO complete, then poll to reap IO. While this poll method could
be a suboptimal solution when running on a single thread, it offers
performance lower than regular polling but higher than IRQ, and CPU
utilization is also lower than polling.
To use hybrid polling, the ring must be setup with both the
IORING_SETUP_IOPOLL and IORING_SETUP_HYBRID)IOPOLL flags set. Hybrid
polling has the same restrictions as IOPOLL, in that commands must
explicitly support it.
Jens Axboe [Tue, 29 Oct 2024 13:50:56 +0000 (07:50 -0600)]
io_uring/rsrc: allow cloning with node replacements
Currently cloning a buffer table will fail if the destination already has
a table. But it should be possible to use it to replace existing elements.
Add a IORING_REGISTER_DST_REPLACE cloning flag, which if set, will allow
the destination to already having a buffer table. If that is the case,
then entries designated by offset + nr buffers will be replaced if they
already exist.
Note that it's allowed to use IORING_REGISTER_DST_REPLACE and not have
an existing table, in which case it'll work just like not having the
flag set and an empty table - it'll just assign the newly created table
for that case.
Jens Axboe [Tue, 29 Oct 2024 00:43:13 +0000 (18:43 -0600)]
io_uring/rsrc: allow cloning at an offset
Right now buffer cloning is an all-or-nothing kind of thing - either the
whole table is cloned from a source to a destination ring, or nothing at
all.
However, it's not always desired to clone the whole thing. Allow for
the application to specify a source and destination offset, and a
number of buffers to clone. If the destination offset is non-zero, then
allocate sparse nodes upfront.
Jens Axboe [Wed, 30 Oct 2024 15:51:58 +0000 (09:51 -0600)]
io_uring/rsrc: get rid of the empty node and dummy_ubuf
The empty node was used as a placeholder for a sparse entry, but it
didn't really solve any issues. The caller still has to check for
whether it's the empty node or not, it may as well just check for a NULL
return instead.
The dummy_ubuf was used for a sparse buffer entry, but NULL will serve
the same purpose there of ensuring an -EFAULT on attempted import.
Just use NULL for a sparse node, regardless of whether or not it's a
file or buffer resource.
Jens Axboe [Tue, 29 Oct 2024 15:02:38 +0000 (09:02 -0600)]
io_uring/rsrc: add io_reset_rsrc_node() helper
Puts and reset an existing node in a slot, if one exists. Returns true
if a node was there, false if not. This helps cleanup some of the code
that does a lookup just to clear an existing node.
Jens Axboe [Sat, 26 Oct 2024 20:50:13 +0000 (14:50 -0600)]
io_uring/rsrc: unify file and buffer resource tables
For files, there's nr_user_files/file_table/file_data, and buffers have
nr_user_bufs/user_bufs/buf_data. There's no reason why file_table and
file_data can't be the same thing, and ditto for the buffer side. That
gets rid of more io_ring_ctx state that's in two spots rather than just
being in one spot, as it should be. Put all the registered file data in
one locations, and ditto on the buffer front.
This also avoids having both io_rsrc_data->nodes being an allocated
array, and ->user_bufs[] or ->file_table.nodes. There's no reason to
have this information duplicated. Keep it in one spot, io_rsrc_data,
along with how many resources are available.
Jens Axboe [Sat, 26 Oct 2024 16:41:51 +0000 (10:41 -0600)]
io_uring/rsrc: add an empty io_rsrc_node for sparse buffer entries
Rather than allocate an io_rsrc_node for an empty/sparse buffer entry,
add a const entry that can be used for that. This just needs checking
for writing the tag, and the put check needs to check for that sparse
node rather than NULL for validity.
This avoids allocating rsrc nodes for sparse buffer entries.
Jens Axboe [Sat, 26 Oct 2024 01:27:39 +0000 (19:27 -0600)]
io_uring/rsrc: get rid of per-ring io_rsrc_node list
Work in progress, but get rid of the per-ring serialization of resource
nodes, like registered buffers and files. Main issue here is that one
node can otherwise hold up a bunch of other nodes from getting freed,
which is especially a problem for file resource nodes and networked
workloads where some descriptors may not see activity in a long time.
As an example, instantiate an io_uring ring fd and create a sparse
registered file table. Even 2 will do. Then create a socket and register
it as fixed file 0, F0. The number of open files in the app is now 5,
with 0/1/2 being the usual stdin/out/err, 3 being the ring fd, and 4
being the socket. Register this socket (eg "the listener") in slot 0 of
the registered file table. Now add an operation on the socket that uses
slot 0. Finally, loop N times, where each loop creates a new socket,
registers said socket as a file, then unregisters the socket, and
finally closes the socket. This is roughly similar to what a basic
accept loop would look like.
At the end of this loop, it's not unreasonable to expect that there
would still be 5 open files. Each socket created and registered in the
loop is also unregistered and closed. But since the listener socket
registered first still has references to its resource node due to still
being active, each subsequent socket unregistration is stuck behind it
for reclaim. Hence 5 + N files are still open at that point, where N is
awaiting the final put held up by the listener socket.
Rewrite the io_rsrc_node handling to NOT rely on serialization. Struct
io_kiocb now gets explicit resource nodes assigned, with each holding a
reference to the parent node. A parent node is either of type FILE or
BUFFER, which are the two types of nodes that exist. A request can have
two nodes assigned, if it's using both registered files and buffers.
Since request issue and task_work completion is both under the ring
private lock, no atomics are needed to handle these references. It's a
simple unlocked inc/dec. As before, the registered buffer or file table
each hold a reference as well to the registered nodes. Final put of the
node will remove the node and free the underlying resource, eg unmap the
buffer or put the file.
Outside of removing the stall in resource reclaim described above, it
has the following advantages:
1) It's a lot simpler than the previous scheme, and easier to follow.
No need to specific quiesce handling anymore.
2) There are no resource node allocations in the fast path, all of that
happens at resource registration time.
3) The structs related to resource handling can all get simplified
quite a bit, like io_rsrc_node and io_rsrc_data. io_rsrc_put can
go away completely.
4) Handling of resource tags is much simpler, and doesn't require
persistent storage as it can simply get assigned up front at
registration time. Just copy them in one-by-one at registration time
and assign to the resource node.
The only real downside is that a request is now explicitly limited to
pinning 2 resources, one file and one buffer, where before just
assigning a resource node to a request would pin all of them. The upside
is that it's easier to follow now, as an individual resource is
explicitly referenced and assigned to the request.
With this in place, the above mentioned example will be using exactly 5
files at the end of the loop, not N.
Jens Axboe [Mon, 28 Oct 2024 14:41:24 +0000 (08:41 -0600)]
io_uring/rsrc: kill io_charge_rsrc_node()
It's only used from __io_req_set_rsrc_node(), and it takes both the ctx
and node itself, while never using the ctx. Just open-code the basic
refs++ in __io_req_set_rsrc_node() instead.
Jens Axboe [Mon, 28 Oct 2024 14:03:04 +0000 (08:03 -0600)]
io_uring/splice: open code 2nd direct file assignment
In preparation for not pinning the whole registered file table, open
code the second potential direct file assignment. This will be handled
by appropriate helpers in the future, for now just do it manually.
Jens Axboe [Tue, 15 Oct 2024 18:19:33 +0000 (12:19 -0600)]
io_uring: specify freeptr usage for SLAB_TYPESAFE_BY_RCU io_kiocb cache
Doesn't matter right now as there's still some bytes left for it, but
let's prepare for the io_kiocb potentially growing and add a specific
freeptr offset for it.
Jens Axboe [Tue, 22 Oct 2024 19:47:00 +0000 (13:47 -0600)]
io_uring: add support for fixed wait regions
Generally applications have 1 or a few waits of waiting, yet they pass
in a struct io_uring_getevents_arg every time. This needs to get copied
and, in turn, the timeout value needs to get copied.
Rather than do this for every invocation, allow the application to
register a fixed set of wait regions that can simply be indexed when
asking the kernel to wait on events.
At ring setup time, the application can register a number of these wait
regions and initialize region/index 0 upfront:
/* set timeout and mark as set, sigmask/sigmask_sz as needed */
reg->ts.tv_sec = 0;
reg->ts.tv_nsec = 100000;
reg->flags = IORING_REG_WAIT_TS;
where nr_regions >= 1 && nr_regions <= PAGE_SIZE / sizeof(*reg). The
above initializes index 0, but 63 other regions can be initialized,
if needed. Now, instead of doing:
to wait for events for each submit_and_wait, or just wait, operation, it
can just reference the above region at offset 0 and do:
io_uring_submit_and_wait_reg(ring, &cqe, nr, 0);
to achieve the same goal of waiting 100usec without needing to copy
both struct io_uring_getevents_arg (24b) and struct __kernel_timeout
(16b) for each invocation. Struct io_uring_reg_wait looks as follows:
embedding the timeout itself in the region, rather than passing it as
a pointer as well. Note that the signal mask is still passed as a
pointer, both for compatability reasons, but also because there doesn't
seem to be a lot of high frequency waits scenarios that involve setting
and resetting the signal mask for each wait.
The application is free to modify any region before a wait call, or it
can use keep multiple regions with different settings to avoid needing to
modify the same one for wait calls. Up to a page size of regions is mapped
by default, allowing PAGE_SIZE / 64 available regions for use.
The registered region must fit within a page. On a 4kb page size system,
that allows for 64 wait regions if a full page is used, as the size of
struct io_uring_reg_wait is 64b. The region registered must be aligned
to io_uring_reg_wait in size. It's valid to register less than 64
entries.
In network performance testing with zero-copy, this reduced the time
spent waiting on the TX side from 3.12% to 0.3% and the RX side from 4.4%
to 0.3%.
Wait regions are fixed for the lifetime of the ring - once registered,
they are persistent until the ring is torn down. The regions support
minimum wait timeout as well as the regular waits.
Jens Axboe [Tue, 22 Oct 2024 19:41:42 +0000 (13:41 -0600)]
io_uring: change io_get_ext_arg() to use uaccess begin + end
In scenarios where a high frequency of wait events are seen, the copy
of the struct io_uring_getevents_arg is quite noticeable in the
profiles in terms of time spent. It can be seen as up to 3.5-4.5%.
Rewrite the copy-in logic, saving about 0.5% of the time.
Jens Axboe [Mon, 28 Oct 2024 19:18:27 +0000 (13:18 -0600)]
io_uring/sqpoll: wait on sqd->wait for thread parking
io_sqd_handle_event() just does a mutex unlock/lock dance when it's
supposed to park, somewhat relying on full ordering with the thread
trying to park it which does a similar unlock/lock dance on sqd->lock.
However, with adaptive spinning on mutexes, this can waste an awful
lot of time. Normally this isn't very noticeable, as parking and
unparking the thread isn't a common (or fast path) occurence. However,
in testing ring resizing, it's testing exactly that, as each resize
will require the SQPOLL to safely park and unpark.
Have io_sq_thread_park() explicitly wait on sqd->park_pending being
zero before attempting to grab the sqd->lock again.
In a resize test, this brings the runtime of SQPOLL down from about
60 seconds to a few seconds, just like the !SQPOLL tests. And saves
a ton of spinning time on the mutex, on both sides.
Once a ring has been created, the size of the CQ and SQ rings are fixed.
Usually this isn't a problem on the SQ ring side, as it merely controls
the available number of requests that can be submitted in a single
system call, and there's rarely a need to change that.
For the CQ ring, it's a different story. For most efficient use of
io_uring, it's important that the CQ ring never overflows. This means
that applications must size it for the worst case scenario, which can
be wasteful.
Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize
the existing rings. It takes a struct io_uring_params argument, the same
one which is used to setup the ring initially, and resizes rings
according to the sizes given.
Certain properties are always inherited from the original ring setup,
like SQE128/CQE32 and other setup options. The implementation only
allows flag associated with how the CQ ring is sized and clamped.
Existing unconsumed SQE and CQE entries are copied as part of the
process. If either the SQ or CQ resized destination ring cannot hold the
entries already present in the source rings, then the operation is failed
with -EOVERFLOW. Any register op holds ->uring_lock, which prevents new
submissions, and the internal mapping holds the completion lock as well
across moving CQ ring state.
To prevent races between mmap and ring resizing, add a mutex that's
solely used to serialize ring resize and mmap. mmap_sem can't be used
here, as as fork'ed process may be doing mmaps on the ring as well.
The ctx->resize_lock is held across mmap operations, and the resize
will grab it before swapping out the already mapped new data.
Jens Axboe [Thu, 24 Oct 2024 16:52:02 +0000 (10:52 -0600)]
io_uring/memmap: explicitly return -EFAULT for mmap on NULL rings
The later mapping will actually check this too, but in terms of code
clarify, explicitly check for whether or not the rings and sqes are
valid during validation. That makes it explicit that if they are
non-NULL, they are valid and can get mapped.
Jens Axboe [Mon, 21 Oct 2024 19:32:19 +0000 (13:32 -0600)]
io_uring: abstract out a bit of the ring filling logic
Abstract out a io_uring_fill_params() helper, which fills out the
necessary bits of struct io_uring_params. Add it to io_uring.h as well,
in preparation for having another internal user of it.
Jens Axboe [Mon, 21 Oct 2024 19:29:39 +0000 (13:29 -0600)]
io_uring: move max entry definition and ring sizing into header
In preparation for needing this somewhere else, move the definitions
for the maximum CQ and SQ ring size into io_uring.h. Make the
rings_size() helper available as well, and have it take just the setup
flags argument rather than the fill ring pointer. That's all that is
needed.
Pavel Begunkov [Tue, 22 Oct 2024 14:43:15 +0000 (15:43 +0100)]
io_uring/net: clean up io_msg_copy_hdr
Put sr->umsg into a local variable, so it doesn't repeat "sr->umsg->"
for every field. It looks nicer, and likely without the patch it
compiles into a bunch of umsg memory reads.
Pavel Begunkov [Tue, 22 Oct 2024 14:43:14 +0000 (15:43 +0100)]
io_uring/net: don't alias send user pointer reads
We keep user pointers in an union, which could be a user buffer or a
user pointer to msghdr. What is confusing is that it potenitally reads
and assigns sqe->addr as one type but then uses it as another via the
union. Even more, it's not even consistent across copy and zerocopy
versions.
Make send and sendmsg setup helpers read sqe->addr and treat it as the
right type from the beginning. The end goal would be to get rid of
the use of struct io_sr_msg::umsg for send requests as we only need it
at the prep side.
Pavel Begunkov [Tue, 22 Oct 2024 14:43:13 +0000 (15:43 +0100)]
io_uring/net: don't store send address ptr
For non "msg" requests we copy the address at the prep stage and there
is no need to store the address user pointer long term. Pass the SQE
into io_send_setup(), let it parse it, and remove struct io_sr_msg addr
addr_len fields. It saves some space and also less confusing.
Jens Axboe [Wed, 16 Oct 2024 13:39:31 +0000 (07:39 -0600)]
io_uring/net: move send zc fixed buffer import to issue path
Let's keep it close with the actual import, there's no reason to do this
on the prep side. With that, we can drop one of the branches checking
for whether or not IORING_RECVSEND_FIXED_BUF is set.
Jens Axboe [Wed, 16 Oct 2024 21:48:38 +0000 (15:48 -0600)]
io_uring/uring_cmd: get rid of using req->imu
It's pretty pointless to use io_kiocb as intermediate storage for this,
so split the validity check and the actual usage. The resource node is
assigned upfront at prep time, to prevent it from going away. The actual
import is never called with the ctx->uring_lock held, so grab it for
the import.
Pavel Begunkov [Fri, 18 Oct 2024 16:14:00 +0000 (17:14 +0100)]
io_uring: clean up cqe trace points
We have too many helpers posting CQEs, instead of tracing completion
events before filling in a CQE and thus having to pass all the data,
set the CQE first, pass it to the tracing helper and let it extract
everything it needs.
io_uring maintains two hash lists of inflight requests:
1) ctx->cancel_table_locked. This is used when the caller has the
ctx->uring_lock held already. This is only an issue side parameter,
as removal or task_work will always have it held.
2) ctx->cancel_table. This is used when the issuer does NOT have the
ctx->uring_lock held, and relies on the table spinlocks for access.
However, it's pretty trivial to simply grab the lock in the one spot
where we care about it, for insertion. With that, we can kill the
unlocked table (and get rid of the _locked postfix for the other one).
io_uring/msg_ring: add support for sending a sync message
Normally MSG_RING requires both a source and a destination ring. But
some users don't always have a ring avilable to send a message from, yet
they still need to notify a target ring.
Add support for using io_uring_register(2) without having a source ring,
using a file descriptor of -1 for that. Internally those are called
blind registration opcodes. Implement IORING_REGISTER_SEND_MSG_RING as a
blind opcode, which simply takes an sqe that the application can put on
the stack and use the normal liburing helpers to initialize it. Then the
app can call:
and get the same behavior in terms of the target, where a CQE is posted
with the details given in the sqe.
For now this takes a single sqe pointer argument, and hence arg must
be set to that, and nr_args must be 1. Could easily be extended to take
an array of sqes, but for now let's keep it simple.
io_uring/eventfd: move ctx->evfd_last_cq_tail into io_ev_fd
Everything else about the io_uring eventfd support is nicely kept
private to that code, except the cached_cq_tail tracking. With
everything else in place, move io_eventfd_flush_signal() to using
the ev_fd grab+release helpers, which then enables the direct use of
io_ev_fd for this tracking too.
io_uring/eventfd: move trigger check into a helper
It's a bit hard to read what guards the triggering, move it into a
helper and add a comment explaining it too. This additionally moves
the ev_fd == NULL check in there as well.
Linus Torvalds [Sun, 20 Oct 2024 21:08:17 +0000 (14:08 -0700)]
Merge tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Pull bluetooth fixes from Luiz Augusto Von Dentz:
- ISO: Fix multiple init when debugfs is disabled
- Call iso_exit() on module unload
- Remove debugfs directory on module init failure
- btusb: Fix not being able to reconnect after suspend
- btusb: Fix regression with fake CSR controllers 0a12:0001
- bnep: fix wild-memory-access in proto_unregister
Note: normally the bluetooth fixes go through the networking tree, but
this missed the weekly merge, and two of the commits fix regressions
that have caused a fair amount of noise and have now hit stable too:
So I'm pulling it directly just to expedite things and not miss yet
another -rc release. This is not meant to become a new pattern.
* tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Bluetooth: bnep: fix wild-memory-access in proto_unregister
Bluetooth: btusb: Fix not being able to reconnect after suspend
Bluetooth: Remove debugfs directory on module init failure
Bluetooth: Call iso_exit() on module unload
Bluetooth: ISO: Fix multiple init when debugfs is disabled
Linus Torvalds [Sun, 20 Oct 2024 20:55:46 +0000 (13:55 -0700)]
Merge tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Mostly error path fixes, but one pretty serious interrupt problem in
the Ocelot driver as well:
- Fix two error paths and a missing semicolon in the Intel driver
- Add a missing ACPI ID for the Intel Panther Lake
- Check return value of devm_kasprintf() in the Apple and STM32
drivers
- Add a missing mutex_destroy() in the aw9523 driver
- Fix a double free in cv1800_pctrl_dt_node_to_map() in the Sophgo
driver
- Fix a double free in ma35_pinctrl_dt_node_to_map_func() in the
Nuvoton driver
- Fix a bug in the Ocelot interrupt handler making the system hang"
* tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: ocelot: fix system hang on level based interrupts
pinctrl: nuvoton: fix a double free in ma35_pinctrl_dt_node_to_map_func()
pinctrl: sophgo: fix double free in cv1800_pctrl_dt_node_to_map()
pinctrl: intel: platform: Add Panther Lake to the list of supported
pinctrl: aw9523: add missing mutex_destroy
pinctrl: stm32: check devm_kasprintf() returned value
pinctrl: apple: check devm_kasprintf() returned value
pinctrl: intel: platform: use semicolon instead of comma in ncommunities assignment
pinctrl: intel: platform: fix error path in device_for_each_child_node()
Linus Torvalds [Sun, 20 Oct 2024 20:10:44 +0000 (13:10 -0700)]
Merge tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull misc driver fixes from Greg KH:
"Here are a number of small char/misc/iio driver fixes for 6.12-rc4:
- loads of small iio driver fixes for reported problems
- parport driver out-of-bounds fix
- Kconfig description and MAINTAINERS file updates
All of these, except for the Kconfig and MAINTAINERS file updates have
been in linux-next all week. Those other two are just documentation
changes and will have no runtime issues and were merged on Friday"
* tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (39 commits)
misc: rtsx: list supported models in Kconfig help
MAINTAINERS: Remove some entries due to various compliance requirements.
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device
parport: Proper fix for array out-of-bounds access
iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig
iio: frequency: {admv4420,adrf6780}: format Kconfig entries
iio: adc: ad4695: Add missing Kconfig select
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config()
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig
iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig
iio: resolver: ad2s1210 add missing select REGMAP in Kconfig
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
...
Linus Torvalds [Sun, 20 Oct 2024 19:57:53 +0000 (12:57 -0700)]
Merge tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes and new device ids for 6.12-rc4:
- xhci driver fixes for a number of reported issues
- new usb-serial driver ids
- dwc3 driver fixes for reported problems.
- usb gadget driver fixes for reported problems
- typec driver fixes
- MAINTAINER file updates
All of these have been in linux-next this week with no reported issues"
* tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Telit FN920C04 MBIM compositions
USB: serial: option: add support for Quectel EG916Q-GL
xhci: dbc: honor usb transfer size boundaries.
usb: xhci: Fix handling errors mid TD followed by other errors
xhci: Mitigate failed set dequeue pointer commands
xhci: Fix incorrect stream context type macro
USB: gadget: dummy-hcd: Fix "task hung" problem
usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store
usb: dwc3: core: Fix system suspend on TI AM62 platforms
xhci: tegra: fix checked USB2 port number
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF
usb: typec: altmode should keep reference to parent
MAINTAINERS: usb: raw-gadget: add bug tracker link
MAINTAINERS: Add an entry for the LJCA drivers
Linus Torvalds [Sun, 20 Oct 2024 19:04:32 +0000 (12:04 -0700)]
Merge tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Explicitly disable the TSC deadline timer when going idle to address
some CPU errata in that area
- Do not apply the Zenbleed fix on anything else except AMD Zen2 on the
late microcode loading path
- Clear CPU buffers later in the NMI exit path on 32-bit to avoid
register clearing while they still contain sensitive data, for the
RDFS mitigation
- Do not clobber EFLAGS.ZF with VERW on the opportunistic SYSRET exit
path on 32-bit
- Fix parsing issues of memory bandwidth specification in sysfs for
resctrl's memory bandwidth allocation feature
- Other small cleanups and improvements
* tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Always explicitly disarm TSC-deadline timer
x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load
x86/bugs: Use code segment selector for VERW operand
x86/entry_32: Clear CPU buffers after register restore in NMI return
x86/entry_32: Do not clobber user EFLAGS.ZF
x86/resctrl: Annotate get_mem_config() functions as __init
x86/resctrl: Avoid overflow in MB settings in bw_validate()
x86/amd_nb: Add new PCI ID for AMD family 1Ah model 20h
Linus Torvalds [Sun, 20 Oct 2024 18:30:56 +0000 (11:30 -0700)]
Merge tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduling fixes from Borislav Petkov:
- Add PREEMPT_RT maintainers
- Fix another aspect of delayed dequeued tasks wrt determining their
state, i.e., whether they're runnable or blocked
- Handle delayed dequeued tasks and their migration wrt PSI properly
- Fix the situation where a delayed dequeue task gets enqueued into a
new class, which should not happen
- Fix a case where memory allocation would happen while the runqueue
lock is held, which is a no-no
- Do not over-schedule when tasks with shorter slices preempt the
currently running task
- Make sure delayed to deque entities are properly handled before
unthrottling
- Other smaller cleanups and improvements
* tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add an entry for PREEMPT_RT.
sched/fair: Fix external p->on_rq users
sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug
sched/core: Dequeue PSI signals for blocked tasks that are delayed
sched: Fix delayed_dequeue vs switched_from_fair()
sched/core: Disable page allocation in task_tick_mm_cid()
sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl()
sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running
sched: Fix sched_delayed vs cfs_bandwidth
Linus Torvalds [Sun, 20 Oct 2024 00:04:52 +0000 (17:04 -0700)]
Merge tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux
Pull one more io_uring fix from Jens Axboe:
"Fix for a regression introduced in 6.12-rc2, where a condition check
was negated and hence -EAGAIN would bubble back up up to userspace
rather than trigger a retry condition"
* tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux:
io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()
Linus Torvalds [Sat, 19 Oct 2024 19:52:19 +0000 (12:52 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Fixes all in drivers. The largest is the mpi3mr which corrects a phy
count limit that should only apply to the controller but was being
incorrectly applied to expander phys"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: core: Fix null-ptr-deref in target_alloc_device()
scsi: mpi3mr: Validate SAS port assignments
scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
scsi: ufs: core: Requeue aborted request
scsi: ufs: core: Fix the issue of ICU failure
Linus Torvalds [Sat, 19 Oct 2024 19:42:14 +0000 (12:42 -0700)]
Merge tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace fixes from Steven Rostedt:
"A couple of fixes to function graph infrastructure:
- Fix allocation of idle shadow stack allocation during hotplug
If function graph tracing is started when a CPU is offline, if it
were come online during the trace then the idle task that
represents the CPU will not get a shadow stack allocated for it.
This means all function graph hooks that happen while that idle
task is running (including in interrupt mode) will have all its
events dropped.
Switch over to the CPU hotplug mechanism that will have any newly
brought on line CPU get a callback that can allocate the shadow
stack for its idle task.
- Fix allocation size of the ret_stack_list array
When function graph tracing converted over to allowing more than
one user at a time, it had to convert its shadow stack from an
array of ret_stack structures to an array of unsigned longs. The
shadow stacks are allocated in batches of 32 at a time and assigned
to every running task. The batch is held by the ret_stack_list
array.
But when the conversion happened, instead of allocating an array of
32 pointers, it was allocated as a ret_stack itself (PAGE_SIZE).
This ret_stack_list gets passed to a function that iterates over
what it believes is its size defined by the
FTRACE_RETSTACK_ALLOC_SIZE macro (which is 32).
Luckily (PAGE_SIZE) is greater than 32 * sizeof(long), otherwise
this would have been an array overflow. This still should be fixed
and the ret_stack_list should be allocated to the size it is
expected to be as someday it may end up being bigger than
SHADOW_STACK_SIZE"
* tag 'ftrace-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
fgraph: Allocate ret_stack_list with proper size
fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks
Linus Torvalds [Sat, 19 Oct 2024 18:48:14 +0000 (11:48 -0700)]
Merge tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull ipe fixes from Fan Wu:
"This addresses several issues identified by Luca when attempting to
enable IPE on Debian and systemd:
- address issues with IPE policy update errors and policy update
version check, improving the clarity of error messages for better
understanding by userspace programs.
- enable IPE policies to be signed by secondary and platform
keyrings, facilitating broader use across general Linux
distributions like Debian.
- updates the IPE entry in the MAINTAINERS file to reflect the new
tree URL and my updated email from kernel.org"
* tag 'ipe-pr-20241018' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
MAINTAINERS: update IPE tree url and Fan Wu's email
ipe: fallback to platform keyring also if key in trusted keyring is rejected
ipe: allow secondary and platform keyrings to install/update policies
ipe: also reject policy updates with the same version
ipe: return -ESTALE instead of -EINVAL on update when new policy has a lower version
Linus Torvalds [Sat, 19 Oct 2024 17:18:03 +0000 (10:18 -0700)]
Merge tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a fix for Zinitix driver to not fail probing if the property enabling
touch keys functionality is not defined. Support for touch keys was
added in 6.12 merge window so this issue does not affect users of
released kernels
- a couple new vendor/device IDs in xpad driver to enable support for
more hardware
* tag 'input-for-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: zinitix - don't fail if linux,keycodes prop is absent
Input: xpad - add support for MSI Claw A1M
Input: xpad - add support for 8BitDo Ultimate 2C Wireless Controller
Linus Torvalds [Sat, 19 Oct 2024 15:44:10 +0000 (08:44 -0700)]
Merge tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux
Pull 9p fixes from Dominique Martinet:
"Mashed-up update that I sat on too long:
- fix for multiple slabs created with the same name
- enable multipage folios
- theorical fix to also look for opened fids by inode if none was
found by dentry"
[ Enabling multi-page folios should have been done during the merge
window, but it's a one-liner, and the actual meat of the enablement
is in netfs and already in use for other filesystems... - Linus ]
* tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux:
9p: Avoid creating multiple slab caches with the same name
9p: Enable multipage folios
9p: v9fs_fid_find: also lookup by inode if not found dentry
Linus Torvalds [Sat, 19 Oct 2024 15:32:47 +0000 (08:32 -0700)]
Merge tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux
Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Fix several issues with the 'rustc-option' macro. It includes a
refactor from Masahiro of three '{cc,rust}-*' macros, which is not
a fix but avoids repeating the same commands (which would be
several lines in the case of 'rustc-option').
- Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not
a fix but is needed for the actual fix.
And a trivial grammar fix"
* tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux:
cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS
kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`
kbuild: fix issues with rustc-option
kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros
lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
Jens Axboe [Sat, 19 Oct 2024 15:16:51 +0000 (09:16 -0600)]
io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()
A previous commit improved how !FMODE_NOWAIT is dealt with, but
inadvertently negated a check whilst doing so. This caused -EAGAIN to be
returned from reading files with O_NONBLOCK set. Fix up the check for
REQ_F_SUPPORT_NOWAIT.
Steven Rostedt [Sat, 19 Oct 2024 01:52:12 +0000 (21:52 -0400)]
fgraph: Allocate ret_stack_list with proper size
The ret_stack_list is an array of ret_stack shadow stacks for the function
graph usage. When the first function graph is enabled, all tasks in the
system get a shadow stack. The ret_stack_list is a 32 element array of
pointers to these shadow stacks. It allocates the shadow stack in batches
(32 stacks at a time), assigns them to running tasks, and continues until
all tasks are covered.
When the function graph shadow stack changed from an array of
ftrace_ret_stack structures to an array of longs, the allocation of
ret_stack_list went from allocating an array of 32 elements to just a
block defined by SHADOW_STACK_SIZE. Luckily, that's defined as PAGE_SIZE
and is much more than enough to hold 32 pointers. But it is way overkill
for the amount needed to allocate.
Change the allocation of ret_stack_list back to a kcalloc() of
FTRACE_RETSTACK_ALLOC_SIZE pointers.
Steven Rostedt [Sat, 19 Oct 2024 01:43:00 +0000 (21:43 -0400)]
fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks
The function graph infrastructure allocates a shadow stack for every task
when enabled. This includes the idle tasks. The first time the function
graph is invoked, the shadow stacks are created and never freed until the
task exits. This includes the idle tasks.
Only the idle tasks that were for online CPUs had their shadow stacks
created when function graph tracing started. If function graph tracing is
enabled and a CPU comes online, the idle task representing that CPU will
not have its shadow stack created, and all function graph tracing for that
idle task will be silently dropped.
Instead, use the CPU hotplug mechanism to allocate the idle shadow stacks.
This will include idle tasks for CPUs that come online during tracing.
Linus Torvalds [Fri, 18 Oct 2024 23:27:14 +0000 (16:27 -0700)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Daniel Borkmann:
- Fix BPF verifier to not affect subreg_def marks in its range
propagation (Eduard Zingerman)
- Fix a truncation bug in the BPF verifier's handling of
coerce_reg_to_size_sx (Dimitar Kanaliev)
- Fix the BPF verifier's delta propagation between linked registers
under 32-bit addition (Daniel Borkmann)
- Fix a NULL pointer dereference in BPF devmap due to missing rxq
information (Florian Kauer)
- Fix a memory leak in bpf_core_apply (Jiri Olsa)
- Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
arrays of nested structs (Hou Tao)
- Fix build ID fetching where memory areas backing the file were
created with memfd_secret (Andrii Nakryiko)
- Fix BPF task iterator tid filtering which was incorrectly using pid
instead of tid (Jordan Rome)
- Several fixes for BPF sockmap and BPF sockhash redirection in
combination with vsocks (Michal Luczaj)
- Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)
- Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
of an infinite BPF tailcall (Pu Lehui)
- Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
be resolved (Thomas Weißschuh)
- Fix a bug in kfunc BTF caching for modules where the wrong BTF object
was returned (Toke Høiland-Jørgensen)
- Fix a BPF selftest compilation error in cgroup-related tests with
musl libc (Tony Ambardar)
- Several fixes to BPF link info dumps to fill missing fields (Tyrone
Wu)
- Add BPF selftests for kfuncs from multiple modules, checking that the
correct kfuncs are called (Simon Sundberg)
- Ensure that internal and user-facing bpf_redirect flags don't overlap
(Toke Høiland-Jørgensen)
- Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
Riel)
- Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
under RT (Wander Lairson Costa)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
lib/buildid: Handle memfd_secret() files in build_id_parse()
selftests/bpf: Add test case for delta propagation
bpf: Fix print_reg_state's constant scalar dump
bpf: Fix incorrect delta propagation between linked registers
bpf: Properly test iter/task tid filtering
bpf: Fix iter/task tid filtering
riscv, bpf: Make BPF_CMPXCHG fully ordered
bpf, vsock: Drop static vsock_bpf_prot initialization
vsock: Update msg_count on read_skb()
vsock: Update rx_bytes on read_skb()
bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
selftests/bpf: Add asserts for netfilter link info
bpf: Fix link info netfilter flags to populate defrag flag
selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
bpf: Fix truncation bug in coerce_reg_to_size_sx()
selftests/bpf: Assert link info uprobe_multi count & path_size if unset
bpf: Fix unpopulated path_size when uprobe_multi fields unset
selftests/bpf: Fix cross-compiling urandom_read
selftests/bpf: Add test for kfunc module order
...
Linus Torvalds [Fri, 18 Oct 2024 23:11:17 +0000 (16:11 -0700)]
Merge tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
- fix test makefile to install tests directory without which the test
fails with errors
* tag 'linux_kselftest-fixes-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftest: hid: add the missing tests directory
Nikita Travkin [Fri, 4 Oct 2024 16:17:30 +0000 (21:17 +0500)]
Input: zinitix - don't fail if linux,keycodes prop is absent
When initially adding the touchkey support, a mistake was made in the
property parsing code. The possible negative errno from
device_property_count_u32() was never checked, which was an oversight
left from converting to it from the of_property as part of the review
fixes.
Re-add the correct handling of the absent property, in which case zero
touchkeys should be assumed, which would disable the feature.
- Cleanup of checking for always-set tag_set (SurajSonawane2415)
- Fix for a crash with CPU hotplug notifiers (Ming)
- Don't allow zero-copy ublk on unprivileged device (Ming)
- Use array_index_nospec() for CDROM (Josh)
- Remove dead code in drbd (David)
- Tweaks to elevator loading (Breno)
* tag 'block-6.12-20241018' of git://git.kernel.dk/linux:
cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed()
nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function
nvme: make keep-alive synchronous operation
nvme-loop: flush off pending I/O while shutting down loop controller
nvme-pci: fix race condition between reset and nvme_dev_disable()
ublk: don't allow user copy for unprivileged device
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
nvme-multipath: defer partition scanning
blk-mq: setup queue ->tag_set before initializing hctx
elevator: Remove argument from elevator_find_get
elevator: do not request_module if elevator exists
drbd: Remove unused conn_lowest_minor
nvme: disable CC.CRIME (NVME_CC_CRIME)
nvme: delete unnecessary fallthru comment
nvmet-rdma: use sbitmap to replace rsp free list
block: Fix elevator_get_default() checking for NULL q->tag_set
nvme: tcp: avoid race between queue_lock lock and destroy
nvmet-passthru: clear EUID/NGUID/UUID while using loop target
block: fix blk_rq_map_integrity_sg kernel-doc
Linus Torvalds [Fri, 18 Oct 2024 22:38:37 +0000 (15:38 -0700)]
Merge tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix a regression this merge window where cloning of registered
buffers didn't take into account the dummy_ubuf
- Fix a race with reading how many SQRING entries are available,
causing userspace to need to loop around io_uring_sqring_wait()
rather than being able to rely on SQEs being available when it
returned
- Ensure that the SQPOLL thread is TASK_RUNNING before running
task_work off the cancelation exit path
* tag 'io_uring-6.12-20241018' of git://git.kernel.dk/linux:
io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work
io_uring/rsrc: ignore dummy_ubuf for buffer cloning
io_uring/sqpoll: close race on waiting for sqring entries
ipe: fallback to platform keyring also if key in trusted keyring is rejected
If enabled, we fallback to the platform keyring if the trusted keyring
doesn't have the key used to sign the ipe policy. But if pkcs7_verify()
rejects the key for other reasons, such as usage restrictions, we do not
fallback. Do so, following the same change in dm-verity.
Signed-off-by: Luca Boccassi <[email protected]> Suggested-by: Serge Hallyn <[email protected]>
[FW: fixed some line length issues and a typo in the commit message] Signed-off-by: Fan Wu <[email protected]>
Linus Torvalds [Fri, 18 Oct 2024 18:37:12 +0000 (11:37 -0700)]
Merge tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix possible double free setting xattrs
- Fix slab out of bounds with large ioctl payload
- Remove three unused functions, and an unused variable that could be
confusing
* tag 'v6.12-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Remove unused functions
smb/client: Fix logically dead code
smb: client: fix OOBs when building SMB2_IOCTL request
smb: client: fix possible double free in smb2_set_ea()
Linus Torvalds [Fri, 18 Oct 2024 18:28:39 +0000 (11:28 -0700)]
Merge tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
- Fix integer overflow in xrep_bmap
- Fix stale dealloc punching for COW IO
* tag 'xfs-6.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: punch delalloc extents from the COW fork for COW writes
xfs: set IOMAP_F_SHARED for all COW fork allocations
xfs: share more code in xfs_buffered_write_iomap_begin
xfs: support the COW fork in xfs_bmap_punch_delalloc_range
xfs: IOMAP_ZERO and IOMAP_UNSHARE already hold invalidate_lock
xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof
xfs: factor out a xfs_file_write_zero_eof helper
iomap: move locking out of iomap_write_delalloc_release
iomap: remove iomap_file_buffered_write_punch_delalloc
iomap: factor out a iomap_last_written_block helper
xfs: fix integer overflow in xrep_bmap
Linus Torvalds [Fri, 18 Oct 2024 18:16:01 +0000 (11:16 -0700)]
Merge tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two issues in the amd-pstate cpufreq driver and update the
intel_rapl power capping driver with a new processor ID.
Specifics:
- Enable ACPI CPPC in amd_pstate_register_driver() after disabling it
in amd_pstate_unregister_driver() when switching driver operation
modes (Dhananjay Ugwekar)
- Make amd-pstate use nominal performance as the maximum performance
level when boost is disabled (Mario Limonciello)
- Add ArrowLake-H to the list of processors where PL4 is supported in
the MSR part of the intel_rapl power capping driver (Srinivas
Pandruvada)"
* tag 'pm-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
powercap: intel_rapl_msr: Add PL4 support for ArrowLake-H
cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems
Linus Torvalds [Fri, 18 Oct 2024 18:13:53 +0000 (11:13 -0700)]
Merge tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix auto-detect regression in jc42 driver"
* tag 'hwmon-for-v6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
[PATCH} hwmon: (jc42) Properly detect TSE2004-compliant devices again
Linus Torvalds [Fri, 18 Oct 2024 18:03:21 +0000 (11:03 -0700)]
Merge tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, msm and xe are the two main ones, with a bunch of
scattered fixes including a largish revert in mgag200, then amdgpu,
vmwgfx and scattering of other minor ones.
All seems pretty regular.
msm:
- Display:
- move CRTC resource assignment to atomic_check otherwise to make
consecutive calls to atomic_check() consistent
- fix rounding / sign-extension issues with pclk calculation in
case of DSC
- cleanups to drop incorrect null checks in dpu snapshots
- fix to use kvzalloc in dpu snapshot to avoid allocation issues
in heavily loaded system cases
- Fix to not program merge_3d block if dual LM is not being used
- Fix to not flush merge_3d block if its not enabled otherwise
this leads to false timeouts
- GPU:
- a7xx: add a fence wait before SMMU table update
xe:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usag (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew
Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)
host1x:
- Fix boot on Tegra186
- Set DMA parameters
mgag200:
- Revert VBLANK support
panel:
- himax-hx83192: Adjust power and gamma
qaic:
- Sgtable loop fixes
vmwgfx:
- Limit display layout allocatino size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()"
* tag 'drm-fixes-2024-10-18' of https://gitlab.freedesktop.org/drm/kernel: (45 commits)
drm/ast: vga: Clear EDID if no display is connected
drm/ast: sil164: Clear EDID if no display is connected
Revert "drm/mgag200: Add vblank support"
drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs
drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater
drm/xe/bmg: improve cache flushing behaviour
drm/xe/xe_sync: initialise ufence.signalled
drm/xe/ufence: ufence can be signaled right after wait_woken
drm/xe: Use bookkeep slots for external BO's in exec IOCTL
drm/xe/query: Increase timestamp width
drm/xe: Don't free job in TDR
drm/xe: Take job list lock in xe_sched_add_pending_job
drm/xe: fix unbalanced rpm put() with declare_wedged()
drm/xe: fix unbalanced rpm put() with fence_fini()
drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg
drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode
drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation
drm/msm/a6xx+: Insert a fence wait before SMMU table update
drm/msm/dpu: don't always program merge_3d block
drm/msm/dpu: Don't always set merge_3d pending flush
...