Ian Rogers [Fri, 12 Aug 2022 23:09:49 +0000 (16:09 -0700)]
perf jevents: Fold strings optimization
If a shorter string is a suffix of a longer string, the shorter string may
reuse the longer string at an offset. For example, on x86 the events
arith.cycles_div_busy and cycles_div_busy can be folded: even though they
have different names, the strings are identical after the first 6
characters, so cycles_div_busy can reuse the arith.cycles_div_busy string
at an offset of 6.
In pmu-events.c this looks like the following where the 'also:' lists
folded strings:
/* offset=177541 */ "arith.cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000" /* also: cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000 */
As jevents.py combines the multiple strings for an event into one larger
string, the amount of folding is minimal, since all parts of the event
must align. Other layouts could benefit more from folding, but would lose
space elsewhere, for example by recording more offsets.
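To illustrate the idea (this is not the jevents.py code itself; the names
and offsets below are taken from the example above), the suffix test in C
could look like:

  /* If 'shorter' is a suffix of 'longer', it can be emitted as the
   * longer string's offset plus the length difference. */
  #include <stdio.h>
  #include <string.h>

  static long fold_offset(const char *longer, long longer_off,
                          const char *shorter)
  {
          size_t ll = strlen(longer), ls = strlen(shorter);

          if (ls <= ll && !strcmp(longer + (ll - ls), shorter))
                  return longer_off + (long)(ll - ls);
          return -1; /* not a suffix: needs its own copy */
  }

  int main(void)
  {
          /* "cycles_div_busy" reuses "arith.cycles_div_busy" at +6 */
          printf("%ld\n", fold_offset("arith.cycles_div_busy", 177541,
                                      "cycles_div_busy"));
          return 0;
  }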
Ian Rogers [Fri, 12 Aug 2022 23:09:48 +0000 (16:09 -0700)]
perf jevents: Compress the pmu_events_table
The pmu_events array requires 15 pointers per entry, which in
position-independent code need relocating. Change the array to an array of
offsets within one big C string. Only the offset of the first variable is
required; subsequent variables are stored in order after the \0
terminator (requiring a byte per variable rather than 4 bytes per
offset).
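As a hedged sketch (the field order and names here are illustrative, not
the exact pmu-events.c layout), decoding one entry amounts to walking the
string from the stored offset:

  #include <stdio.h>
  #include <string.h>

  static void decode_entry(const char *big_c_string, unsigned int offset)
  {
          const char *p = big_c_string + offset;
          const char *name, *topic, *desc;

          name  = p; p += strlen(p) + 1; /* field 1: at the stored offset */
          topic = p; p += strlen(p) + 1; /* field 2: after the first \0 */
          desc  = p; p += strlen(p) + 1; /* field 3: after the next \0 */
          /* ... the remaining fields follow in the same way ... */
          printf("%s (%s): %s\n", name, topic, desc);
  }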
The file size savings, with default build options plus NO_LIBBFD=1, are:
no jevents - the same, 19,788,464 bytes
x86 jevents - ~16.7% file size saving, 23,744,288 bytes vs 28,502,632 bytes
all jevents - ~19.5% file size saving, 24,469,056 bytes vs 30,379,920 bytes
For example, the x86 build savings come from .rela.dyn and .data.rel.ro
shrinking by 3,157,032 bytes and 3,042,016 bytes respectively, while
.rodata grows by 1,432,448 bytes, giving an overall saving of
4,766,600 bytes.
To make metric strings more shareable, the topic is changed from, say,
'skx metrics' to just 'metrics'.
To help with the memory layout, the pmu_events are ordered as used by the
perf qsort comparator functions.
Ian Rogers [Fri, 12 Aug 2022 23:09:47 +0000 (16:09 -0700)]
perf metrics: Copy entire pmu_event in find metric
The pmu_event passed to the pmu_events_table_for_each_event callback is
invalid after the loop, so copy the entire struct in
metricgroup__find_metric. Also reduce the scope of this function to
static.
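A hedged sketch of the pattern (the callback signature is an assumption
based on pmu_events_table_for_each_event; the other names are
illustrative):

  struct find_data {
          const char *name;
          struct pmu_event copy; /* survives after the iteration */
          bool found;
  };

  static int find_cb(const struct pmu_event *pe,
                     const struct pmu_events_table *table,
                     void *vdata)
  {
          struct find_data *d = vdata;

          if (pe->metric_name && !strcasecmp(pe->metric_name, d->name)) {
                  d->copy = *pe; /* copy the struct, not just the pointer */
                  d->found = true;
          }
          return 0;
  }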
Ian Rogers [Fri, 12 Aug 2022 23:09:43 +0000 (16:09 -0700)]
perf test: Use full metric resolution
The simple metric resolution doesn't handle recursion properly; switch to
the full resolution used by the parse-metric tests, which also increases
coverage. Don't set the values for the metric backward, as failures to
generate a result are ignored.
Ian Rogers [Fri, 12 Aug 2022 23:09:41 +0000 (16:09 -0700)]
perf pmu-events: Avoid passing pmu_events_map
Preparation for hiding pmu_events_map as an implementation detail. While
the map is passed, the table of events is all that is normally wanted.
While modifying the function's types, rename pmu_events_map__find to
pmu_events_table__find to match later encapsulation. Similarly rename
pmu_add_cpu_aliases_map to pmu_add_cpu_aliases_table.
Ian Rogers [Fri, 12 Aug 2022 23:09:39 +0000 (16:09 -0700)]
perf jevents: Sort JSON files entries
Sort the JSON file entries on conversion to C. The sort order tries to
replicate cmp_sevent from pmu.c, so that the input there is already
sorted except for sysfs events. Specifically, the sort order is given by
the tuple:
(not j.desc is None, fix_none(j.topic), fix_none(j.name), fix_none(j.pmu), fix_none(j.metric_name))
which puts events with descriptions and topics before those without, then
sorts by name, then pmu, and finally metric_name.
Add the topic to JsonEvent on reading, to simplify the code. Remove an
unnecessary lambda in the JSON reading.
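A C analogue of that key, as a hedged sketch (the real sort is done in
Python in jevents.py):

  static const char *fix_none(const char *s)
  {
          return s ? s : "";
  }

  static int cmp_events(const struct pmu_event *a, const struct pmu_event *b)
  {
          int ret;

          /* events with a description sort before those without */
          if ((a->desc == NULL) != (b->desc == NULL))
                  return (a->desc == NULL) - (b->desc == NULL);
          ret = strcmp(fix_none(a->topic), fix_none(b->topic));
          if (ret)
                  return ret;
          ret = strcmp(fix_none(a->name), fix_none(b->name));
          if (ret)
                  return ret;
          ret = strcmp(fix_none(a->pmu), fix_none(b->pmu));
          if (ret)
                  return ret;
          return strcmp(fix_none(a->metric_name), fix_none(b->metric_name));
  }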
Ian Rogers [Fri, 12 Aug 2022 23:09:37 +0000 (16:09 -0700)]
perf jevents: Remove the type/version variables
pmu_events_map has a type variable that is always initialized to "core"
and a version variable that is never read. Remove these from the API as
it is straightforward to add them back when necessary.
Ian Rogers [Fri, 12 Aug 2022 23:09:36 +0000 (16:09 -0700)]
perf jevent: Add an 'all' architecture argument
When 'all' is passed as the architecture, generate a mapping table for all
architectures. This simplifies testing. To identify the table for an
architecture, add an arch variable to the pmu_events_map.
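The map entry then looks roughly like this (a sketch of the shape, not
the verbatim struct):

  struct pmu_events_map {
          const char *arch;  /* e.g. "x86"; with 'all', one entry per arch */
          const char *cpuid;
          struct pmu_events_table table;
  };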
Leo Yan [Thu, 11 Aug 2022 06:24:50 +0000 (14:24 +0800)]
perf c2c: Use 'peer' as default display for Arm64
Since the Arm64 arch doesn't support HITM flags, this patch changes the
default display to 'peer' if the user doesn't specify any type; on other
arches the default display type is still 'tot' when the user doesn't
specify one.
This patch also calls perf_session__new() earlier, so that the session
environment is initialized ahead of time and the arch info can be used to
set the display type.
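A hedged sketch of the default selection (perf_env__arch() is perf's
existing helper; the display constants are illustrative):

  if (!display_arg) { /* user gave no display type */
          const char *arch = perf_env__arch(&session->header.env);

          c2c.display = !strcmp(arch, "arm64") ?
                        DISPLAY_PEER /* 'peer' */ :
                        DISPLAY_TOT  /* 'tot' */;
  }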
Leo Yan [Thu, 11 Aug 2022 06:24:49 +0000 (14:24 +0800)]
perf c2c: Sort on peer snooping for load operations
This patch adds a new option, 'peer', to sort on the cache hits from peer
snooping.
When displaying with the 'peer' option, the "Shared Data Cache Line
Table" and the "Shared Cache Line Distribution Pareto" both sort by the
"tot_peer" metric.
Leo Yan [Thu, 11 Aug 2022 06:24:48 +0000 (14:24 +0800)]
perf c2c: Refactor display string
The display type is shown by combining the display string array with the
suffix string "HITMs", which is not friendly to extending the display to
other sorting types (e.g. peer operations).
This patch moves the "HITMs" suffix into the display string array for the
HITM types, so that new display types are not forced to output the string
"HITMs".
Leo Yan [Thu, 11 Aug 2022 06:24:47 +0000 (14:24 +0800)]
perf c2c: Refactor node header
The node header array contains 3 items, one for each of the 3 flavors of
node accessing info. To extend sorting to other snooping types rather
than always sticking to HITMs, the second header string
"Node{cpus %hitms %stores}" needs to be adjustable (e.g. changed to
"Node{cpus %peer %stores}").
For this reason, this patch changes the node header array to three flat
variables and uses a switch-case in setup_nodes_header(), making it
easier to alter the header string.
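A hedged sketch of the switch-based selection (the function and enum
names are illustrative):

  static const char *nodes_header(int display)
  {
          switch (display) {
          case DISPLAY_LCL_PEER:
          case DISPLAY_RMT_PEER:
          case DISPLAY_TOT_PEER:
                  return "Node{cpus %peer %stores}";
          default:
                  return "Node{cpus %hitms %stores}";
          }
  }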
Leo Yan [Thu, 11 Aug 2022 06:24:46 +0000 (14:24 +0800)]
perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'
Use more general naming for the main sort dimension; this allows sorting
not only on the HITM snoop type but also on other costly snooping
operations. So rename the dimension to use the prefix 'percent_costly_'.
Leo Yan [Thu, 11 Aug 2022 06:24:45 +0000 (14:24 +0800)]
perf c2c: Use explicit names for display macros
The perf c2c tool heavily depends on the HITM snoop type to detect cache
false sharing; unfortunately, HITM is not supported on some
architectures.
Essentially, perf c2c wants to find very costly snooping operations that
indicate false cache sharing, which means it is not necessary to stick to
HITM tags and other snooping types (e.g. SNOOPX_PEER) can be explored.
For this reason, this patch renames the HITM-related display macros with
the suffix '_HITM', so they remain distinct if more display types are
later added for other snooping types.
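Sketched, assuming the usual three display modes, the rename looks like:

  enum {
          DISPLAY_LCL_HITM, /* was DISPLAY_LCL */
          DISPLAY_RMT_HITM, /* was DISPLAY_RMT */
          DISPLAY_TOT_HITM, /* was DISPLAY_TOT */
          DISPLAY_MAX,
  };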
Leo Yan [Thu, 11 Aug 2022 06:24:43 +0000 (14:24 +0800)]
perf c2c: Add dimensions of peer metrics for cache line view
This patch adds dimensions for peer ops, which will be used in the Shared
Cache Line Distribution Pareto.
It adds the percentage dimensions for local and remote peer operations,
and the dimensions accounting operation counts, which are used in stdio
mode.
Leo Yan [Thu, 11 Aug 2022 06:24:42 +0000 (14:24 +0800)]
perf c2c: Add dimensions for peer load operations
This patch adds three dimensions for peer load operations: 'lcl_peer',
'rmt_peer' and 'tot_peer'. These three dimensions will be used in the
shared data cache line table.
Leo Yan [Thu, 11 Aug 2022 06:24:40 +0000 (14:24 +0800)]
perf mem: Add statistics for peer snooping
The flag PERF_MEM_SNOOPX_PEER is added to support cache snooping from a
peer cache line, which can come from a peer core, a peer cluster, or a
remote NUMA node.
This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER. Note that
PERF_MEM_SNOOPX_PEER is treated as affiliated info that cooperates with
the cache level statistics: when the flag is set, a load operation is
accounted both in the cache level's metrics (e.g. ld_l2hit, ld_llchit,
etc.) and in the peer-related metrics.
So three new metrics are introduced: 'lcl_peer' for local cache access,
'rmt_peer' for remote access (including remote DRAM and any caches in a
remote node), and 'tot_peer' for the sum of 'lcl_peer' and 'rmt_peer'.
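A hedged sketch of the accounting rule (the struct and variable names are
assumptions):

  /* a peer-snooped load still counts at its cache level and, in
   * addition, in the peer metrics */
  if (data_src->mem_snoopx & PERF_MEM_SNOOPX_PEER) {
          if (remote) /* remote DRAM or a cache in a remote node */
                  stats->rmt_peer++;
          else
                  stats->lcl_peer++;
          stats->tot_peer++; /* tot_peer = lcl_peer + rmt_peer */
  }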
Ali Saidi [Thu, 11 Aug 2022 06:24:39 +0000 (14:24 +0800)]
perf arm-spe: Use SPE data source for neoverse cores
When synthesizing data from SPE, augment the type with source information
for Arm Neoverse cores. The field is IMPLDEF, but the Neoverse cores all
use the same encoding. I can't find encoding information for any other
SPE implementations to unify their choices with Arm's, so that is left
for future work.
This change populates mem_lvl_num for Neoverse cores as well as the
deprecated mem_lvl namespace.
Ali Saidi [Thu, 11 Aug 2022 06:24:37 +0000 (14:24 +0800)]
perf tools: Sync addition of PERF_MEM_SNOOPX_PEER
Add a flag to the 'perf mem' data struct to signal that a request caused
a cache-to-cache transfer of a line from a peer of the requestor and
wasn't sourced from a lower cache level.
Moving a line from one peer cache to another has latency and performance
implications.
On Arm64 Neoverse systems the data source can indicate a cache-to-cache
transfer, but not whether the line is dirty or clean, so instead of
overloading HITM, define a new flag that indicates this type of transfer.
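Decoding a sample would then look roughly like this (mem_snoopx is the
existing snoop-extension bitfield in union perf_mem_data_src):

  union perf_mem_data_src dsrc = { .val = sample->data_src };

  if (dsrc.mem_snoopx & PERF_MEM_SNOOPX_PEER) {
          /* the line arrived cache-to-cache from a peer of the
           * requestor; whether it was dirty or clean is unknown */
  }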
Committer notes:
This really is not syncing with the kernel, since the patch to the kernel
wasn't merged.
But we're going ahead with this as it seems trivial; it is just a matter
of the perf kernel maintainers giving their ack, or of us finding another
way of expressing this in the perf records synthesized in userspace from
the ARM64 hardware traces.
Adrian Hunter [Thu, 11 Aug 2022 17:04:11 +0000 (20:04 +0300)]
perf tools: Tidy guest option documentation
Move common guest options into include files. Use attribute substitution
to customize an example, using "[verse]" to define the block instead of a
"literal" block, which does not permit substitution.
Namhyung Kim [Thu, 11 Aug 2022 18:54:56 +0000 (11:54 -0700)]
perf offcpu: Update offcpu test for child process
Record off-cpu data with the perf bench sched messaging workload and
count the number of offcpu-time events. Also update the test script so
that it does not run the next tests after a failure, and revise the error
messages.
$ sudo ./perf test offcpu -v
88: perf record offcpu profiling tests :
--- start ---
test child forked, pid 344780
Checking off-cpu privilege
Basic off-cpu test
Basic off-cpu test [Success]
Child task off-cpu test
Child task off-cpu test [Success]
test child finished with 0
---- end ----
perf record offcpu profiling tests: Ok
Namhyung Kim [Thu, 11 Aug 2022 18:54:55 +0000 (11:54 -0700)]
perf offcpu: Track child processes
When the -p option is used or a workload is given, child processes need
to be handled. perf_event can inherit those task events automatically.
Add a new BPF program on the task_newtask tracepoint to track child
processes.
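A hedged sketch of such a program (the map name and details are
assumptions, not the exact off-cpu BPF code):

  SEC("tp_btf/task_newtask")
  int on_newtask(u64 *ctx)
  {
          struct task_struct *task = (struct task_struct *)ctx[0];
          u32 parent = bpf_get_current_pid_tgid() >> 32;
          u8 *tracked = bpf_map_lookup_elem(&task_filter, &parent);

          if (tracked) {
                  u32 child = task->tgid;
                  u8 one = 1;

                  /* the parent is profiled: track the child too */
                  bpf_map_update_elem(&task_filter, &child, &one, BPF_ANY);
          }
          return 0;
  }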
Namhyung Kim [Thu, 11 Aug 2022 18:54:54 +0000 (11:54 -0700)]
perf offcpu: Parse process id separately
The current target code uses the thread id for tracking tasks, because
perf_events need to be opened for each task. But we can use the tgid in
BPF maps and check it easily.
Namhyung Kim [Thu, 11 Aug 2022 18:54:53 +0000 (11:54 -0700)]
perf offcpu: Check process id for the given workload
The current task filter checks task->pid, which is different for each
thread. But we want to profile all the threads in the process, so compare
the process id (or thread-group id: tgid) instead.
Adrian Hunter [Tue, 9 Aug 2022 08:07:02 +0000 (11:07 +0300)]
perf tools: Do not pass NULL to parse_events()
Many cases do not use the extra error information provided by
parse_events and instead pass NULL as the struct parse_events_error
pointer. Add a wrapper for those cases so that the pointer is never
NULL.
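The wrapper can be as simple as the following sketch (using perf's
existing parse_events_error helpers):

  int parse_event(struct evlist *evlist, const char *str)
  {
          struct parse_events_error err;
          int ret;

          parse_events_error__init(&err);
          ret = parse_events(evlist, str, &err);
          parse_events_error__exit(&err);
          return ret;
  }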
Yang Jihong [Mon, 8 Aug 2022 09:24:07 +0000 (17:24 +0800)]
perf kvm: Fix subcommand matching error
Currently the 'diff', 'top', 'buildid-list' and 'stat' perf commands use
strncmp() to match subcommands. As a result, matching does not meet
expectation.
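The problem in miniature: a prefix comparison accepts a bogus subcommand,
an exact comparison rejects it.

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          const char *cmd = "diffxyz"; /* bogus subcommand */

          printf("%d\n", strncmp(cmd, "diff", 4) == 0); /* 1: accepted */
          printf("%d\n", strcmp(cmd, "diff") == 0);     /* 0: rejected */
          return 0;
  }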
Brian Robbins [Fri, 5 Aug 2022 22:06:45 +0000 (15:06 -0700)]
perf inject jit: Ignore memfd and anonymous mmap events if jitdump present
Some processes store jitted code in memfd mappings to avoid having rwx
mappings. These processes map the code with a writeable mapping and a
read-execute mapping. They write the code using the writeable mapping
and then unmap the writeable mapping. All subsequent execution is
through the read-execute mapping.
perf inject --jit ignores //anon* mappings for each process where a
jitdump is present, because it expects to inject mmap events for each
jitted code range, and those jitted code ranges will overlap with the
//anon* mappings.
Ignore /memfd: and [anon:* mappings so that jitted code contained in
/memfd: and [anon:* mappings is treated the same way as jitted code
contained in //anon* mappings.
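A hedged sketch of the name test (the prefixes come from the text above;
the helper itself is illustrative):

  static bool is_jit_mmap_name(const char *filename)
  {
          return !strncmp(filename, "//anon", 6) ||
                 !strncmp(filename, "/memfd:", 7) ||
                 !strncmp(filename, "[anon:", 6);
  }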
Thomas Richter [Thu, 4 Aug 2022 07:52:21 +0000 (09:52 +0200)]
perf list: Add PMU pai_crypto event description for IBM z16
Add the event description for the IBM z16 pai_crypto PMU released with
commit 1bf54f32f525 ("s390/pai: Add support for cryptography counters").
The document SA22-7832-13 "z/Architecture Principles of Operation",
published May 2022, contains the description of the Processor Activity
Instrumentation Facility and the cryptography counter set; see pages
5-110 to 5-113.
The patch was reworked to fit the converted jevents processing.
Committer notes:
Couldn't find 1bf54f32f525 ("s390/pai: Add support for cryptography
counters") in torvalds/master, in what tree is that cset?
Filter values from data on 01.org are bogus in a perf command line and
can cause perf to infinitely recurse in parse events. Remove such events
or filters using the updated patch.
Ian Rogers [Thu, 4 Aug 2022 22:18:01 +0000 (15:18 -0700)]
perf jevents: Simplify generation of C-string
The previous implementation wanted variable order and '(null)' string
output to match the C implementation. The '(null)' string output was a
quirk/bug, so there is no need to carry it forward.
As the building mechanism is now able to retry detection with different
combinations of linking flags, setting
FEATURE_CHECK_LDFLAGS-disassembler-four-args and
FEATURE_CHECK_LDFLAGS-disassembler-init-styled is no longer necessary, so
remove it.
Committer notes:
Use the same technique to find the set of bfd-related libraries to link as in:
3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features")
Commit 6e8ccb4f624a7 ("tools/bpf: properly account for libbfd variations")
sets the linking flags depending on which flavor of the libbfd feature was
detected.
However, the flavors except libbfd cannot be detected, as they are not in
the feature list.
Complete the list of features to detect by adding libbfd-liberty and
libbfd-liberty-z.
Committer notes:
Adjust conflicts with:
1e1613f64cc8a09d ("tools bpftool: Don't display disassembler-four-args feature test")
600b7b26c07a070d ("tools bpftool: Fix compilation error with new binutils")
tools, build: Retry detection of bfd-related features
While separate features have been defined to determine which linking flags
are required to use libbfd depending on the distribution (libbfd,
libbfd-liberty and libbfd-liberty-z), the same has not been done for other
features requiring linking to libbfd.
For example, disassembler-four-args requires linking to libbfd too, and
it should use the right linking flags. If not all the required ones are
specified, e.g. -liberty, detection will always fail even if the feature
is available.
Instead of creating new features, simply retry detection with the
different sets of flags, similarly to libbfd, until detection succeeds
(or fails, if the libraries are missing). In this way, feature detection
is transparent to the users of this building mechanism (e.g. perf), and
those users don't have to set an appropriate value for, e.g., the
FEATURE_CHECK_LDFLAGS-disassembler-four-args variable.
The number of retries and the number of features for which the retry
mechanism is implemented are low enough to make the increase in the
complexity of the Makefile negligible.
Tested with perf and bpftool on Ubuntu 20.04.4 LTS, Fedora 36 and openSUSE
Tumbleweed.
Committer notes:
Do the retry for disassembler-init-styled as well.
[root@quaco ~]# perf test json
90: perf stat JSON output linter : Ok
[root@quaco ~]# set -o vi
[root@quaco ~]# perf test -v json
90: perf stat JSON output linter :
--- start ---
test child forked, pid 560794
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: system wide no aggregation [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per core [Success]
Checking json output: per thread [Success]
Checking json output: per die [Success]
Checking json output: per node [Success]
Checking json output: per socket [Success]
test child finished with 0
---- end ----
perf stat JSON output linter: Ok
[root@quaco ~]#
Claire Jensen [Fri, 5 Aug 2022 20:01:04 +0000 (13:01 -0700)]
perf stat: Add JSON output option
CSV output is tricky to format, and column layout changes are liable to
break parsers. The new JSON-formatted output identifies each field with a
consistent, informative variable name, making the output parseable.
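For example:

  $ perf stat -j -- sleep 1

where -j/--json-output emits each measurement as a JSON object with named
fields instead of positional CSV columns.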
Linus Torvalds [Tue, 9 Aug 2022 03:15:13 +0000 (20:15 -0700)]
Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull ksmbd updates from Steve French:
- fixes for memory access bugs (out of bounds access, oops, leak)
- multichannel fixes
- session disconnect performance improvement, and session register
improvement
- cleanup
* tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix heap-based overflow in set_ntacl_dacl()
ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
ksmbd: prevent out of bound read for SMB2_WRITE
ksmbd: fix use-after-free bug in smb2_tree_disconect
ksmbd: fix memory leak in smb2_handle_negotiate
ksmbd: fix racy issue while destroying session on multichannel
ksmbd: use wait_event instead of schedule_timeout()
ksmbd: fix kernel oops from idr_remove()
ksmbd: add channel rwlock
ksmbd: replace sessions list in connection with xarray
MAINTAINERS: ksmbd: add entry for documentation
ksmbd: remove unused ksmbd_share_configs_cleanup function
Linus Torvalds [Tue, 9 Aug 2022 03:04:35 +0000 (20:04 -0700)]
Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more iov_iter updates from Al Viro:
- more new_sync_{read,write}() speedups
- ITER_UBUF introduction
- ITER_PIPE cleanups
- unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
switching them to advancing semantics
- making ITER_PIPE take high-order pages without splitting them
- handling copy_page_from_iter() for high-order pages properly
* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
fix copy_page_from_iter() for compound destinations
hugetlbfs: copy_page_to_iter() can deal with compound pages
copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
expand those iov_iter_advance()...
pipe_get_pages(): switch to append_pipe()
get rid of non-advancing variants
ceph: switch the last caller of iov_iter_get_pages_alloc()
9p: convert to advancing variant of iov_iter_get_pages_alloc()
af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
iov_iter: saner helper for page array allocation
fold __pipe_get_pages() into pipe_get_pages()
ITER_XARRAY: don't open-code DIV_ROUND_UP()
unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
unify xarray_get_pages() and xarray_get_pages_alloc()
unify pipe_get_pages() and pipe_get_pages_alloc()
iov_iter_get_pages(): sanity-check arguments
iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
...
Al Viro [Fri, 17 Jun 2022 18:45:41 +0000 (14:45 -0400)]
iov_iter: saner helper for page array allocation
All call sites of get_pages_array() are essentially identical now.
Replace them with a common helper...
It returns the number of slots available in the resulting array, or 0 on
OOM; it's up to the caller to make sure it doesn't ask for a zero-entry
array (i.e. neither maxpages nor size is allowed to be zero).
Al Viro [Fri, 17 Jun 2022 18:30:39 +0000 (14:30 -0400)]
fold __pipe_get_pages() into pipe_get_pages()
... and don't mangle maxsize there - turn the loop into a counting one
instead, making it easier to see that we won't run out of the array.
Note that the special treatment of the partial buffer in that
thing is an artifact of the non-advancing semantics of
iov_iter_get_pages() - if not for that, it would be append_pipe(),
same as the body of the loop that follows it. IOW, once we make
iov_iter_get_pages() advancing, the whole thing will turn into
calculate how many pages do we want
allocate an array (if needed)
call append_pipe() that many times.
Al Viro [Fri, 17 Jun 2022 17:35:35 +0000 (13:35 -0400)]
unify pipe_get_pages() and pipe_get_pages_alloc()
The differences between those two are
* pipe_get_pages() gets a non-NULL struct page ** value pointing to
preallocated array + array size.
* pipe_get_pages_alloc() gets an address of struct page ** variable that
contains NULL, allocates the array and (on success) stores its address in
that variable.
Not hard to combine - always pass struct page ***, and have the previous
pipe_get_pages_alloc() caller pass ~0U as the cap for the array size.
Al Viro [Wed, 15 Jun 2022 06:02:51 +0000 (02:02 -0400)]
ITER_PIPE: cache the type of last buffer
We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
check if ->iov_offset is non-zero (i.e. that pipe is not empty)
if so, get the corresponding pipe_buffer and check its ->ops
if it's &default_pipe_buf_ops, we have an anon buffer.
Let's replace the use of ->iov_offset (whose role here is nowhere near
similar to its role for other flavours) with a signed field
(->last_offset), with the following rules:
empty, no buffers occupied: 0
anon, with bytes up to N-1 filled: N
zero-copy, with bytes up to N-1 filled: -N
That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.
Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.
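In other words (a hedged sketch; only ->last_offset is taken from the
rules above):

  static inline int last_buffer_kind(const struct iov_iter *i)
  {
          if (!i->last_offset)
                  return 0;                   /* pipe empty */
          return i->last_offset > 0 ? 1 : -1; /* 1: anon, -1: zero-copy */
  }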
Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator. About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe. There are only two cases where we leave the
sane state:
1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
2) iov_iter copied, then something is put into the copy. Since
they share the underlying pipe, the original gets behind. When we
decide that we are done with the copy (original is not usable until then)
we advance the original. direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data. At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.
Al Viro [Sun, 12 Jun 2022 21:54:35 +0000 (17:54 -0400)]
ITER_PIPE: clean iov_iter_revert()
Fold pipe_truncate() into it and clean up. We can release the buffers in
the same loop where we walk backwards to the beginning of the iterator,
looking for the place where the new position will be.
Al Viro [Wed, 15 Jun 2022 20:03:25 +0000 (16:03 -0400)]
ITER_PIPE: clean pipe_advance() up
Instead of setting ->iov_offset to the new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard everything
after it, adjust ->len at the same time we set ->iov_offset and use
pipe_discard_from() to deal with the buffers past that point.