Yang Li [Sat, 6 Aug 2022 07:19:33 +0000 (15:19 +0800)]
LoongArch: Fix unsigned comparison with less than zero
The return value from the call to get_timer_irq() is int, which can be
a negative error code. However, the return value is being assigned to an
unsigned int variable 'irq', so making 'irq' an int.
Eliminate the following coccicheck warning:
./arch/loongarch/kernel/time.c:146:5-8: WARNING: Unsigned expression compared with zero: irq < 0
Huacai Chen [Sat, 6 Aug 2022 07:19:32 +0000 (15:19 +0800)]
LoongArch: Adjust arch/loongarch/Kconfig
1, ACPI, EFI and SMP are mandatories for LoongArch, select them
unconditionally to avoid various build errors for 'make randconfig'.
2, Move the MMU_GATHER_MERGE_VMAS selection to the correct place.
LoongArch: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS is selected,
cpu_max_bits_warn() generates a runtime warning similar as below while
we show /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit)
instead of NR_CPUS to iterate CPUs.
Linus Torvalds [Fri, 12 Aug 2022 02:46:48 +0000 (19:46 -0700)]
Merge tag 'for-6.0/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- A few fixes for the DM verity and bufio changes in this merge window
- A smatch warning fix for DM writecache locking in writecache_map
* tag 'for-6.0/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm bufio: fix some cases where the code sleeps with spinlock held
dm writecache: fix smatch warning about invalid return from writecache_map
dm verity: fix verity_parse_opt_args parsing
dm verity: fix DM_VERITY_OPTS_MAX value yet again
dm bufio: simplify DM_BUFIO_CLIENT_NO_SLEEP locking
Linus Torvalds [Fri, 12 Aug 2022 02:12:15 +0000 (19:12 -0700)]
Merge tag 'drm-next-2022-08-12-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Not much to squeeze into rc1, just two small fixes, one for core gem
and one for shmem-helpers:
gem:
- Annotate WW context in error paths
shmem-helper:
- Add missing vunmap in error paths"
* tag 'drm-next-2022-08-12-1' of git://anongit.freedesktop.org/drm/drm:
drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
drm/shmem-helper: Add missing vunmap on error
Bharath SM [Thu, 11 Aug 2022 19:46:11 +0000 (19:46 +0000)]
SMB3: fix lease break timeout when multiple deferred close handles for the same file.
Solution is to send lease break ack immediately even in case of
deferred close handles to avoid lease break request timing out
and let deferred closed handle gets closed as scheduled.
Later patches could optimize cases where we then close some
of these handles sooner for the cases where lease break is to 'none'
Steve French [Thu, 11 Aug 2022 05:53:00 +0000 (00:53 -0500)]
smb3: allow deferred close timeout to be configurable
Deferred close can be a very useful feature for allowing
caching data for read, and for minimizing the number of
reopens needed for a file that is repeatedly opened and
close but there are workloads where its default (1 second,
similar to actimeo/acregmax) is much too small.
Allow the user to configure the amount of time we can
defer sending the final smb3 close when we have a
handle lease on the file (rather than forcing it to depend
on value of actimeo which is often unrelated, and less safe).
Adds new mount parameter "closetimeo=" which is the maximum
number of seconds we can wait before sending an SMB3
close when we have a handle lease for it. Default value
also is set to slightly larger at 5 seconds (although some
other clients use larger default this should still help).
Leo Yan [Thu, 11 Aug 2022 06:24:50 +0000 (14:24 +0800)]
perf c2c: Use 'peer' as default display for Arm64
Since Arm64 arch doesn't support HITMs flags, this patch changes to use
'peer' as default display if user doesn't specify any type; for other
arches, it still uses 'tot' as default display type if user doesn't
specify it.
This patch changes to call perf_session__new() in an earlier place, so
session environment can be initialized ahead and arch info can be used
for setting display type.
Leo Yan [Thu, 11 Aug 2022 06:24:49 +0000 (14:24 +0800)]
perf c2c: Sort on peer snooping for load operations
This patch adds a new option 'peer' so can sort on the cache hit for
peer snooping.
For displaying with option 'peer', the "Shared Data Cache Line Table"
and "Shared Cache Line Distribution Pareto" both sort with the metrics
"tot_peer".
Leo Yan [Thu, 11 Aug 2022 06:24:48 +0000 (14:24 +0800)]
perf c2c: Refactor display string
The display type is shown by combination the display string array and a
suffix string "HITMs", which is not friendly to extend display for other
sorting type (e.g. extension for peer operations).
This patch moves the suffix string "HITMs" into display string array for
HITM types, so it can allow us to not necessarily to output string
"HITMs" for new incoming display type.
Leo Yan [Thu, 11 Aug 2022 06:24:47 +0000 (14:24 +0800)]
perf c2c: Refactor node header
The node header array contains 3 items, each item is used for one of
the 3 flavors for node accessing info. To extend sorting on other
snooping type and not always stick to HITMs, the second header string
"Node{cpus %hitms %stores}" should be adjusted (e.g. it's changed as
"Node{cpus %peer %stores}").
For this reason, this patch changes the node header array to three
flat variables and uses switch-case in function setup_nodes_header(),
thus it is easier for altering the header string.
Leo Yan [Thu, 11 Aug 2022 06:24:46 +0000 (14:24 +0800)]
perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'
Use more general naming for the main sort dimension, this can allow us
not to sort only on HITM snoop type, so it can be extended to support
other costly snooping operations. So rename the dimension to the prefix
'percent_costly_".
Leo Yan [Thu, 11 Aug 2022 06:24:45 +0000 (14:24 +0800)]
perf c2c: Use explicit names for display macros
Perf c2c tool has an assumption that it heavily depends on HITM snoop
type to detect cache false sharing, unfortunately, HITM is not supported
on some architectures.
Essentially, perf c2c tool wants to find some very costly snooping
operations for false cache sharing, this means it's not necessarily
to stick using HITM tags and we can explore other snooping types
(e.g. SNOOPX_PEER).
For this reason, this patch renames HITM related display macros with
suffix '_HITM', so it can be distinct if later add more display types
for on other snooping type.
Leo Yan [Thu, 11 Aug 2022 06:24:43 +0000 (14:24 +0800)]
perf c2c: Add dimensions of peer metrics for cache line view
This patch adds dimensions of peer ops, which will be used for Shared
cache line distribution pareto.
It adds the percentage dimensions for local and remote peer operations,
and the dimensions for accounting operation numbers which is used for
stdio mode.
Leo Yan [Thu, 11 Aug 2022 06:24:42 +0000 (14:24 +0800)]
perf c2c: Add dimensions for peer load operations
This patch adds three dimensions for peer load operations of 'lcl_peer',
'rmt_peer' and 'tot_peer'. These three dimensions will be used in the
shared data cache line table.
Leo Yan [Thu, 11 Aug 2022 06:24:40 +0000 (14:24 +0800)]
perf mem: Add statistics for peer snooping
Since the flag PERF_MEM_SNOOPX_PEER is added to support cache snooping
from peer cache line, it can come from a peer core, a peer cluster, or
a remote NUMA node.
This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER. Note, we
take PERF_MEM_SNOOPX_PEER as an affiliated info, it needs to cooperate
with cache level statistics. Therefore, we account the load operations
for both the cache level's metrics (e.g. ld_l2hit, ld_llchit, etc.) and
peer related metrics when flag PERF_MEM_SNOOPX_PEER is set.
So three new metrics are introduced: 'lcl_peer' is for local cache
access, the metric 'rmt_peer' is for remote access (includes remote DRAM
and any caches in remote node), and the metric 'tot_peer' is accounting
the sum value of 'lcl_peer' and 'rmt_peer'.
Ali Saidi [Thu, 11 Aug 2022 06:24:39 +0000 (14:24 +0800)]
perf arm-spe: Use SPE data source for neoverse cores
When synthesizing data from SPE, augment the type with source information
for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
the same encoding. I can't find encoding information for any other SPE
implementations to unify their choices with Arm's thus that is left for
future work.
This change populates the mem_lvl_num for Neoverse cores as well as the
deprecated mem_lvl namespace.
Ali Saidi [Thu, 11 Aug 2022 06:24:37 +0000 (14:24 +0800)]
perf tools: Sync addition of PERF_MEM_SNOOPX_PEER
Add a flag to the 'perf mem' data struct to signal that a request caused
a cache-to-cache transfer of a line from a peer of the requestor and
wasn't sourced from a lower cache level.
The line being moved from one peer cache to another has latency and
performance implications.
On Arm64 Neoverse systems the data source can indicate a cache-to-cache
transfer but not if the line is dirty or clean, so instead of
overloading HITM define a new flag that indicates this type of transfer.
Committer notes:
This really is not syncing with the kernel since the patch to the kernel
wasn't merged.
But we're going ahead of this as it seems trivial and is just a matter
of the perf kernel maintainers to give their ack or for us to find
another way of expressing this in the perf records synthesized in
userspace from the ARM64 hardware traces.
RISC-V: Move counter info definition to sbi header file
Counter info encoding format is defined by the SBI specificaiton.
KVM implementation of SBI PMU extension will also leverage this definition.
Move the definition to common sbi header file from the sbi pmu driver.
Some of the SBI PMU calls does not pass 64bit arguments
correctly and not under RV32 compile time flags. Currently,
this doesn't create any incorrect results as RV64 ignores
any value in the additional register and qemu doesn't support
raw events.
Fix those SBI calls in order to set correct values for RV32.
RISC-V: Update user page mapping only once during start
Currently, riscv_pmu_event_set_period updates the userpage mapping.
However, the caller of riscv_pmu_event_set_period should update
the userpage mapping because the counter can not be updated/started
from set_period function in counter overflow path.
Invoke the perf_event_update_userpage at the caller so that it
doesn't get invoked twice during counter start path.
Adrian Hunter [Thu, 11 Aug 2022 17:04:11 +0000 (20:04 +0300)]
perf tools: Tidy guest option documentation
Move common guest options into include files. Use attribute substitution to
customize an example, using "[verse]" to define the block instead of a
"literal" block which does not permit substitution.
Palmer Dabbelt [Thu, 11 Aug 2022 21:41:52 +0000 (14:41 -0700)]
RISC-V: Add Sstc extension support
This series implements Sstc extension support which was ratified
recently. Before the Sstc extension, an SBI call is necessary to
generate timer interrupts as only M-mode have access to the timecompare
registers. Thus, there is significant latency to generate timer
interrupts at kernel. For virtualized enviornments, its even worse as
the KVM handles the SBI call and uses a software timer to emulate the
timecomapre register.
Sstc extension solves both these problems by defining a
stimecmp/vstimecmp at supervisor (host/guest) level. It allows kernel to
program a timer and recieve interrupt without supervisor execution
enviornment (M-mode/HS mode) intervention.
* palmer/riscv-sstc:
RISC-V: Prefer sstc extension if available
RISC-V: Enable sstc extension parsing from DT
RISC-V: Add SSTC extension CSR details
RISC-V ISA has sstc extension which allows updating the next clock event
via a CSR (stimecmp) instead of an SBI call. This should happen dynamically
if sstc extension is available. Otherwise, it will fallback to SBI call
to maintain backward compatibility.
Yipeng Zou [Thu, 21 Jul 2022 06:58:20 +0000 (14:58 +0800)]
riscv:uprobe fix SR_SPIE set/clear handling
In riscv the process of uprobe going to clear spie before exec
the origin insn,and set spie after that.But When access the page
which origin insn has been placed a page fault may happen and
irq was disabled in arch_uprobe_pre_xol function,It cause a WARN
as follows.
There is no need to clear/set spie in arch_uprobe_pre/post/abort_xol.
We can just remove it.
This sentence dates back to the pre-git era and it does not look very
professional... As there is no clear definition of "finished", and given
this page is already a pretty good overview, not to mention it is not the
kernel responsibility to document the protocol in detail, let's update the
text accordingly.
Wolfram Sang [Thu, 11 Aug 2022 12:08:08 +0000 (14:08 +0200)]
i2c: move core from strlcpy to strscpy
Follow the advice of the below link and prefer 'strscpy'. Conversion is
easy because no code used the return value. It has been done with a
simple sed invocation.
Wolfram Sang [Thu, 11 Aug 2022 07:10:30 +0000 (09:10 +0200)]
i2c: move drivers from strlcpy to strscpy
Follow the advice of the below link and prefer 'strscpy'. Conversion is
easy because no driver used the return value and has been done with a
simple sed invocation.
Fix device tree schema validation error messages for the SiFive
Unmatched: ' cache-sets:0:0: 1024 was expected'.
The existing bindings allow for just 1024 cache-sets but the fu740 on
Unmatched the has 2048 cache-sets. The ISA itself permits any arbitrary
power of two, however this is not supported by dt-schema. The RTL for
the IP, to which the number of cache-sets is a tunable parameter, has
been released publicly so speculatively adding a small number of
"reasonable" values seems unwise also.
Instead, as the binding only supports two distinct controllers: add 2048
and explicitly lock it to the fu740's l2 cache while limiting 1024 to
the l2 cache on the fu540.
Namhyung Kim [Thu, 11 Aug 2022 18:54:56 +0000 (11:54 -0700)]
perf offcpu: Update offcpu test for child process
Record off-cpu data with perf bench sched messaging workload and count
the number of offcpu-time events. Also update the test script not to
run next tests if failed already and revise the error messages.
$ sudo ./perf test offcpu -v
88: perf record offcpu profiling tests :
--- start ---
test child forked, pid 344780
Checking off-cpu privilege
Basic off-cpu test
Basic off-cpu test [Success]
Child task off-cpu test
Child task off-cpu test [Success]
test child finished with 0
---- end ----
perf record offcpu profiling tests: Ok
Namhyung Kim [Thu, 11 Aug 2022 18:54:55 +0000 (11:54 -0700)]
perf offcpu: Track child processes
When -p option used or a workload is given, it needs to handle child
processes. The perf_event can inherit those task events
automatically. We can add a new BPF program in task_newtask
tracepoint to track child processes.
Namhyung Kim [Thu, 11 Aug 2022 18:54:54 +0000 (11:54 -0700)]
perf offcpu: Parse process id separately
The current target code uses thread id for tracking tasks because
perf_events need to be opened for each task. But we can use tgid in
BPF maps and check it easily.
Namhyung Kim [Thu, 11 Aug 2022 18:54:53 +0000 (11:54 -0700)]
perf offcpu: Check process id for the given workload
Current task filter checks task->pid which is different for each
thread. But we want to profile all the threads in the process. So
let's compare process id (or thread-group id: tgid) instead.
Sparse complains that cpu_ops_sbi is used undeclared:
arch/riscv/kernel/cpu_ops_sbi.c:17:29: warning: symbol 'cpu_ops_sbi' was not declared. Should it be static?
Fix the warning by adding cpu_ops_sbi to cpu_ops_sbi.h & including that
where used.
Linus Torvalds [Thu, 11 Aug 2022 20:45:37 +0000 (13:45 -0700)]
Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, bpf, can and netfilter.
A little larger than usual but it's all fixes, no late features. It's
large partially because of timing, and partially because of follow ups
to stuff that got merged a week or so before the merge window and
wasn't as widely tested. Maybe the Bluetooth fixes are a little
alarming so we'll address that, but the rest seems okay and not scary.
Notably we're including a fix for the netfilter Kconfig [1], your WiFi
warning [2] and a bluetooth fix which should unblock syzbot [3].
Current release - regressions:
- Bluetooth:
- don't try to cancel uninitialized works [3]
- L2CAP: fix use-after-free caused by l2cap_chan_put
- tls: rx: fix device offload after recent rework
- devlink: fix UAF on failed reload and leftover locks in mlxsw
Current release - new code bugs:
- netfilter:
- flowtable: fix incorrect Kconfig dependencies [1]
- nf_tables: fix crash when nf_trace is enabled
- bpf:
- use proper target btf when exporting attach_btf_obj_id
- arm64: fixes for bpf trampoline support
- Bluetooth:
- ISO: unlock on error path in iso_sock_setsockopt()
- ISO: fix info leak in iso_sock_getsockopt()
- ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
- ISO: fix memory corruption on iso_pinfo.base
- ISO: fix not using the correct QoS
- hci_conn: fix updating ISO QoS PHY
- phy: dp83867: fix get nvmem cell fail
Previous releases - regressions:
- wifi: cfg80211: fix validating BSS pointers in
__cfg80211_connect_result [2]
- atm: bring back zatm uAPI after ATM had been removed
- properly fix old bug making bonding ARP monitor mode not being able
to work with software devices with lockless Tx
- tap: fix null-deref on skb->dev in dev_parse_header_protocol
- revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
devices and breaks others
- netfilter:
- nf_tables: many fixes rejecting cross-object linking which may
lead to UAFs
- nf_tables: fix null deref due to zeroed list head
- nf_tables: validate variable length element extension
- bgmac: fix a BUG triggered by wrong bytes_compl
- bcmgenet: indicate MAC is in charge of PHY PM
Previous releases - always broken:
- bpf:
- fix bad pointer deref in bpf_sys_bpf() injected via test infra
- disallow non-builtin bpf programs calling the prog_run command
- don't reinit map value in prealloc_lru_pop
- fix UAFs during the read of map iterator fd
- fix invalidity check for values in sk local storage map
- reject sleepable program for non-resched map iterator
- mptcp:
- move subflow cleanup in mptcp_destroy_common()
- do not queue data on closed subflows
- virtio_net: fix memory leak inside XDP_TX with mergeable
- vsock: fix memory leak when multiple threads try to connect()
- rework sk_user_data sharing to prevent psock leaks
- geneve: fix TOS inheriting for ipv4
- tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
- phy: c45 baset1: do not skip aneg configuration if clock role is
not specified
- rose: avoid overflow when /proc displays timer information
- x25: fix call timeouts in blocking connects
- can: mcp251x: fix race condition on receive interrupt
- can: j1939:
- replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
- fix memory leak of skbs in j1939_session_destroy()
Misc:
- docs: bpf: clarify that many things are not uAPI
- seg6: initialize induction variable to first valid array index (to
silence clang vs objtool warning)
* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
net: atm: bring back zatm uAPI
dpaa2-eth: trace the allocated address instead of page struct
net: add missing kdoc for struct genl_multicast_group::flags
nfp: fix use-after-free in area_cache_get()
MAINTAINERS: use my korg address for mt7601u
mlxsw: minimal: Fix deadlock in ports creation
bonding: fix reference count leak in balance-alb mode
net: usb: qmi_wwan: Add support for Cinterion MV32
bpf: Shut up kern_sys_bpf warning.
net/tls: Use RCU API to access tls_ctx->netdev
tls: rx: device: don't try to copy too much on detach
tls: rx: device: bound the frag walk
net_sched: cls_route: remove from list when handle is 0
selftests: forwarding: Fix failing tests with old libnet
net: refactor bpf_sk_reuseport_detach()
net: fix refcount bug in sk_psock_get (2)
selftests/bpf: Ensure sleepable program is rejected by hash map iter
selftests/bpf: Add write tests for sk local storage map iterator
selftests/bpf: Add tests for reading a dangling map iter fd
bpf: Only allow sleepable program for resched-able iterator
...
Linus Torvalds [Thu, 11 Aug 2022 20:26:09 +0000 (13:26 -0700)]
Merge tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These fix up direct references to the fwnode field in struct device
and extend ACPI device properties support.
Specifics:
- Replace direct references to the fwnode field in struct device with
dev_fwnode() and device_match_fwnode() (Andy Shevchenko)
- Make the ACPI code handling device properties support properties
with buffer values (Sakari Ailus)"
* tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: property: Fix error handling in acpi_init_properties()
ACPI: VIOT: Do not dereference fwnode in struct device
ACPI: property: Read buffer properties as integers
ACPI: property: Add support for parsing buffer property UUID
ACPI: property: Unify integer value reading functions
ACPI: property: Switch node property referencing from ifs to a switch
ACPI: property: Move property ref argument parsing into a new function
ACPI: property: Use acpi_object_type consistently in property ref parsing
ACPI: property: Tie data nodes to acpi handles
ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
Ben Dooks [Wed, 13 Jul 2022 21:53:06 +0000 (22:53 +0100)]
RISC-V: cpu_ops_spinwait.c should include head.h
Running sparse shows cpu_ops_spinwait.c is missing two definitions
found in head.h, so include it to stop the following warnings:
arch/riscv/kernel/cpu_ops_spinwait.c:15:6: warning: symbol '__cpu_spinwait_stack_pointer' was not declared. Should it be static?
arch/riscv/kernel/cpu_ops_spinwait.c:16:6: warning: symbol '__cpu_spinwait_task_pointer' was not declared. Should it be static?
Linus Torvalds [Thu, 11 Aug 2022 20:11:49 +0000 (13:11 -0700)]
Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull more iomap updates from Darrick Wong:
"In the past 10 days or so I've not heard any ZOMG STOP style
complaints about removing ->writepage support from gfs2 or zonefs, so
here's the pull request removing them (and the underlying fs iomap
support) from the kernel:
- Remove iomap_writepage and all callers, since the mm apparently
never called the zonefs or gfs2 writepage functions"
* tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: remove iomap_writepage
zonefs: remove ->writepage
gfs2: remove ->writepage
gfs2: stop using generic_writepages in gfs2_ail1_start_one
Ben Dooks [Thu, 14 Jul 2022 07:18:11 +0000 (08:18 +0100)]
RISC-V: Declare cpu_ops_spinwait in <asm/cpu_ops.h>
The cpu_ops_spinwait is used in a couple of places in arch/riscv
and is causing a sparse warning due to no declaration. Add this
to <asm/cpu_ops.h> with the others to fix the following:
arch/riscv/kernel/cpu_ops_spinwait.c:16:29: warning: symbol 'cpu_ops_spinwait' was not declared. Should it be static?
Linus Torvalds [Thu, 11 Aug 2022 19:41:07 +0000 (12:41 -0700)]
Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"We have a good pile of various fixes and cleanups from Xiubo, Jeff,
Luis and others, almost exclusively in the filesystem.
Several patches touch files outside of our normal purview to set the
stage for bringing in Jeff's long awaited ceph+fscrypt series in the
near future. All of them have appropriate acks and sat in linux-next
for a while"
* tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits)
libceph: clean up ceph_osdc_start_request prototype
libceph: fix ceph_pagelist_reserve() comment typo
ceph: remove useless check for the folio
ceph: don't truncate file in atomic_open
ceph: make f_bsize always equal to f_frsize
ceph: flush the dirty caps immediatelly when quota is approaching
libceph: print fsid and epoch with osd id
libceph: check pointer before assigned to "c->rules[]"
ceph: don't get the inline data for new creating files
ceph: update the auth cap when the async create req is forwarded
ceph: make change_auth_cap_ses a global symbol
ceph: fix incorrect old_size length in ceph_mds_request_args
ceph: switch back to testing for NULL folio->private in ceph_dirty_folio
ceph: call netfs_subreq_terminated with was_async == false
ceph: convert to generic_file_llseek
ceph: fix the incorrect comment for the ceph_mds_caps struct
ceph: don't leak snap_rwsem in handle_cap_grant
ceph: prevent a client from exceeding the MDS maximum xattr size
ceph: choose auth MDS for getxattr with the Xs caps
ceph: add session already open notify support
...
Linus Torvalds [Thu, 11 Aug 2022 19:10:08 +0000 (12:10 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
- Xen timer fixes
- Documentation formatting fixes
- Make rseq selftest compatible with glibc-2.35
- Fix handling of illegal LEA reg, reg
- Cleanup creation of debugfs entries
- Fix steal time cache handling bug
- Fixes for MMIO caching
- Optimize computation of number of LBRs
- Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh
KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers
KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES
KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals
KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test
KVM: selftests: Make rseq compatible with glibc-2.35
KVM: Actually create debugfs in kvm_create_vm()
KVM: Pass the name of the VM fd to kvm_create_vm_debugfs()
KVM: Get an fd before creating the VM
KVM: Shove vcpu stats_id init into kvm_vcpu_init()
KVM: Shove vm stats_id init into kvm_create_vm()
KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen
KVM: x86/mmu: rename trace function name for asynchronous page fault
KVM: x86/xen: Stop Xen timer before changing IRQ
KVM: x86/xen: Initialize Xen timer only once
KVM: SVM: Disable SEV-ES support if MMIO caching is disable
KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
KVM: x86: Tag kvm_mmu_x86_module_init() with __init
...
Mark Kettenis [Thu, 7 Jul 2022 18:55:28 +0000 (20:55 +0200)]
riscv: dts: starfive: correct number of external interrupts
The PLIC integrated on the Vic_U7_Core integrated on the StarFive
JH7100 SoC actually supports 133 external interrupts. 127 of these
are exposed to the outside world; the remainder are used by other
devices that are part of the core-complex such as the L2 cache
controller. But all 133 interrupts are external interrupts as far
as the PLIC is concerned. Fix the property so that the driver can
manage these additional interrupts, which is important since the
interrupts for the L2 cache controller are enabled by default.
This adds the two PWM controlled LEDs to the HiFive Unmatched device
tree. D12 is just a regular green diode, but D2 is an RGB diode with 3
PWM inputs controlling the three different colours.
Jakub Kicinski [Wed, 10 Aug 2022 16:45:47 +0000 (09:45 -0700)]
net: atm: bring back zatm uAPI
Jiri reports that linux-atm does not build without this header.
Bring it back. It's completely dead code but we can't break
the build for user space :(
Merge changes adding support for device properties with buffer values
to the ACPI device properties handling code.
* acpi-properties:
ACPI: property: Fix error handling in acpi_init_properties()
ACPI: property: Read buffer properties as integers
ACPI: property: Add support for parsing buffer property UUID
ACPI: property: Unify integer value reading functions
ACPI: property: Switch node property referencing from ifs to a switch
ACPI: property: Move property ref argument parsing into a new function
ACPI: property: Use acpi_object_type consistently in property ref parsing
ACPI: property: Tie data nodes to acpi handles
ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
Anuj Gupta [Thu, 11 Aug 2022 09:14:59 +0000 (14:44 +0530)]
io_uring: fix error handling for io_uring_cmd
Commit 97b388d70b53 ("io_uring: handle completions in the core") moved the
error handling from handler to core. But for io_uring_cmd handler we end
up completing more than once (both in handler and in core) leading to
use_after_free.
Change io_uring_cmd handler to avoid calling io_uring_cmd_done in case
of error.
Masahiro Yamada [Sat, 25 Jun 2022 22:34:37 +0000 (07:34 +0900)]
riscv/purgatory: Omit use of bin2c
The .incbin assembler directive is much faster than bin2c + $(CC).
Do similar refactoring as in commit 4c0f032d4963 ("s390/purgatory:
Omit use of bin2c").
Please note the .quad directive matches to size_t in C (both 8 byte)
because the purgatory is compiled only for the 64-bit kernel.
(KEXEC_FILE depends on 64BIT).
Linus Torvalds [Thu, 11 Aug 2022 16:23:08 +0000 (09:23 -0700)]
Merge tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- changes to input core to properly queue synthetic events (such as
autorepeat) and to release multitouch contacts when an input device
is inhibited or suspended
- reworked quirk handling in i8042 driver that consolidates multiple
DMI tables into one and adds several quirks for TUXEDO line of
laptops
- update to mt6779 keypad to better reflect organization of the
hardware
- changes to mtk-pmic-keys driver preparing it to handle more variants
- facelift of adp5588-keys driver
- improvements to iqs7222 driver
- adjustments to various DT binding documents for input devices
- other assorted driver fixes.
* tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits)
Input: adc-joystick - fix ordering in adc_joystick_probe()
dt-bindings: input: ariel-pwrbutton: use spi-peripheral-props.yaml
Input: deactivate MT slots when inhibiting or suspending devices
Input: properly queue synthetic events
dt-bindings: input: iqs7222: Use central 'linux,code' definition
Input: i8042 - add dritek quirk for Acer Aspire One AO532
dt-bindings: input: gpio-keys: accept also interrupt-extended
dt-bindings: input: gpio-keys: reference input.yaml and document properties
dt-bindings: input: gpio-keys: enforce node names to match all properties
dt-bindings: input: Convert adc-keys to DT schema
dt-bindings: input: Centralize 'linux,input-type' definition
dt-bindings: input: Use common 'linux,keycodes' definition
dt-bindings: input: Centralize 'linux,code' definition
dt-bindings: input: Increase maximum keycode value to 0x2ff
Input: mt6779-keypad - implement row/column selection
Input: mt6779-keypad - match hardware matrix organization
Input: i8042 - add additional TUXEDO devices to i8042 quirk tables
Input: goodix - switch use of acpi_gpio_get_*_resource() APIs
Input: i8042 - add TUXEDO devices to i8042 quirk tables
Input: i8042 - add debug output for quirks
...
Jialiang Wang [Wed, 10 Aug 2022 07:30:57 +0000 (15:30 +0800)]
nfp: fix use-after-free in area_cache_get()
area_cache_get() is used to distribute cache->area and set cache->id,
and if cache->id is not 0 and cache->area->kref refcount is 0, it will
release the cache->area by nfp_cpp_area_release(). area_cache_get()
set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
is already set but the refcount is not increased as expected. At this
time, calling the nfp_cpp_area_release() will cause use-after-free.
To avoid the use-after-free, set cache->id after area_init() and
nfp_cpp_area_acquire() complete successfully.
Note: This vulnerability is triggerable by providing emulated device
equipped with specified configuration.
BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
Xianting Tian [Thu, 11 Aug 2022 07:41:48 +0000 (15:41 +0800)]
RISC-V: Add modules to virtual kernel memory layout dump
Modules always live before the kernel, MODULES_END is fixed but
MODULES_VADDR isn't fixed, it depends on the kernel size.
Let's add it to virtual kernel memory layout dump.
As MODULES is only defined for CONFIG_64BIT, so we dump it when
CONFIG_64BIT=y.
Xianting Tian [Thu, 11 Aug 2022 07:41:47 +0000 (15:41 +0800)]
RISC-V: Fixup schedule out issue in machine_crash_shutdown()
Current task of executing crash kexec will be schedule out when panic is
triggered by RCU Stall, as it needs to wait rcu completion. It lead to
inability to enter the crash system.
The implementation of machine_crash_shutdown() is non-standard for RISC-V
according to other Arch's implementation(eg, x86, arm64), we need to send
IPI to stop secondary harts.
Xianting Tian [Thu, 11 Aug 2022 07:41:46 +0000 (15:41 +0800)]
RISC-V: Fixup get incorrect user mode PC for kernel mode regs
When use 'echo c > /proc/sysrq-trigger' to trigger kdump, riscv_crash_save_regs()
will be called to save regs for vmcore, we found "epc" value 00ffffffa5537400
is not a valid kernel virtual address, but is a user virtual address. Other
regs(eg, ra, sp, gp...) are correct kernel virtual address.
Actually 0x00ffffffb0dd9400 is the user mode PC of 'PID: 113 Comm: sh', which
is saved in the task's stack.
Xianting Tian [Thu, 11 Aug 2022 07:41:45 +0000 (15:41 +0800)]
RISC-V: kexec: Fixup use of smp_processor_id() in preemptible context
Use __smp_processor_id() to avoid check the preemption context when
CONFIG_DEBUG_PREEMPT enabled, as we will enter crash kernel and no
return.
Without the patch,
[ 103.781044] sysrq: Trigger a crash
[ 103.784625] Kernel panic - not syncing: sysrq triggered crash
[ 103.837634] CPU1: off
[ 103.889668] CPU2: off
[ 103.933479] CPU3: off
[ 103.939424] Starting crashdump kernel...
[ 103.943442] BUG: using smp_processor_id() in preemptible [00000000] code: sh/346
[ 103.950884] caller is debug_smp_processor_id+0x1c/0x26
[ 103.956051] CPU: 0 PID: 346 Comm: sh Kdump: loaded Not tainted 5.10.113-00002-gce03f03bf4ec-dirty #149
[ 103.965355] Call Trace:
[ 103.967805] [<ffffffe00020372a>] walk_stackframe+0x0/0xa2
[ 103.973206] [<ffffffe000bcf1f4>] show_stack+0x32/0x3e
[ 103.978258] [<ffffffe000bd382a>] dump_stack_lvl+0x72/0x8e
[ 103.983655] [<ffffffe000bd385a>] dump_stack+0x14/0x1c
[ 103.988705] [<ffffffe000bdc8fe>] check_preemption_disabled+0x9e/0xaa
[ 103.995057] [<ffffffe000bdc926>] debug_smp_processor_id+0x1c/0x26
[ 104.001150] [<ffffffe000206c64>] machine_kexec+0x22/0xd0
[ 104.006463] [<ffffffe000291a7e>] __crash_kexec+0x6a/0xa4
[ 104.011774] [<ffffffe000bcf3fa>] panic+0xfc/0x2b0
[ 104.016480] [<ffffffe000656ca4>] sysrq_reset_seq_param_set+0x0/0x70
[ 104.022745] [<ffffffe000657310>] __handle_sysrq+0x8c/0x154
[ 104.028229] [<ffffffe0006577e8>] write_sysrq_trigger+0x5a/0x6a
[ 104.034061] [<ffffffe0003d90e0>] proc_reg_write+0x58/0xd4
[ 104.039459] [<ffffffe00036cff4>] vfs_write+0x7e/0x254
[ 104.044509] [<ffffffe00036d2f6>] ksys_write+0x58/0xbe
[ 104.049558] [<ffffffe00036d36a>] sys_write+0xe/0x16
[ 104.054434] [<ffffffe000201b9a>] ret_from_syscall+0x0/0x2
[ 104.067863] Will call new kernel at ecc00000 from hart id 0
[ 104.074939] FDT image at fc5ee000
[ 104.079523] Bye...
With the patch we can got clear output,
[ 67.740553] sysrq: Trigger a crash
[ 67.744166] Kernel panic - not syncing: sysrq triggered crash
[ 67.809123] CPU1: off
[ 67.865210] CPU2: off
[ 67.909075] CPU3: off
[ 67.919123] Starting crashdump kernel...
[ 67.924900] Will call new kernel at ecc00000 from hart id 0
[ 67.932045] FDT image at fc5ee000
[ 67.935560] Bye...
Jay Vosburgh [Thu, 11 Aug 2022 05:06:53 +0000 (22:06 -0700)]
bonding: fix reference count leak in balance-alb mode
Commit d5410ac7b0ba ("net:bonding:support balance-alb interface
with vlan to bridge") introduced a reference count leak by not releasing
the reference acquired by ip_dev_find(). Remedy this by insuring the
reference is released.
The clang -Wformat warning is terminally broken, and the clang people
can't seem to get their act together.
This test program causes a warning with clang:
#include <stdio.h>
int main(int argc, char **argv)
{
printf("%hhu\n", 'a');
}
resulting in
t.c:5:19: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]
printf("%hhu\n", 'a');
~~~~ ^~~
%d
and apparently clang people consider that a feature, because they don't
want to face the reality of how either C character constants, C
arithmetic, and C varargs functions work.
The rest of the world just shakes their head at that kind of
incompetence, and turns off -Wformat for clang again.
And no, the "you should use a pointless cast to shut this up" is not a
valid answer. That warning should not exist in the first place, or at
least be optinal with some "-Wformat-me-harder" kind of option.
[ Admittedly, there's also very little reason to *ever* use '%hh[ud]' in
C, but what little reason there is is entirely about 'I want to see
only the low 8 bits of the argument'. So I would suggest nobody ever
use that format in the first place, but if they do, the clang
behavious is simply always wrong. Because '%hhu' takes an 'int'. It's
that simple. ]
Mikulas Patocka [Wed, 10 Aug 2022 20:16:34 +0000 (16:16 -0400)]
dm bufio: fix some cases where the code sleeps with spinlock held
Commit b32d45824aa7 ("dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag")
added a "NO_SLEEP" mode, it replaces a mutex with a spinlock, and it
is only usable when the device is in read-only mode (because the write
path may be sleeping while holding the dm_bufio_client lock).
However, there are still two points where the code could sleep even in
read-only mode. One is in __get_unclaimed_buffer -> __make_buffer_clean.
The other is in __try_evict_buffer -> __make_buffer_clean. These functions
will call __make_buffer_clean which sleeps if the buffer is being read.
Fix these cases so that if c->no_sleep is set __make_buffer_clean
will not be called and the buffer will be skipped instead.
Slark Xiao [Wed, 10 Aug 2022 01:45:21 +0000 (09:45 +0800)]
net: usb: qmi_wwan: Add support for Cinterion MV32
There are 2 models for MV32 serials. MV32-W-A is designed
based on Qualcomm SDX62 chip, and MV32-W-B is designed based
on Qualcomm SDX65 chip. So we use 2 different PID to separate it.
Jens Axboe [Thu, 11 Aug 2022 14:37:48 +0000 (08:37 -0600)]
Merge tag 'nvme-6.0-2022-08-11' of git://git.infradead.org/nvme into block-6.0
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.0
- print nvme connect Linux error codes properly (Amit Engel)
- fix the fc_appid_store return value (Christoph Hellwig)
- fix a typo in an error message (Christophe JAILLET)
- add another non-unique identifier quirk (Dennis P. Kliem)
- check if the queue is allocated before stopping it in nvme-tcp
(Maurizio Lombardi)
- restart admin queue if the caller needs to restart queue in nvme-fc
(Ming Lei)
- use kmemdup instead of kmalloc + memcpy in nvme-auth (Zhang Xiaoxu)"
* tag 'nvme-6.0-2022-08-11' of git://git.infradead.org/nvme:
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S70
nvme-tcp: check if the queue is allocated before stopping it
nvme-fabrics: Fix a typo in an error message
nvme-fabrics: parse nvme connect Linux error codes
nvmet-auth: use kmemdup instead of kmalloc + memcpy
nvme-fc: fix the fc_appid_store return value
nvme-fc: restart admin queue if the caller needs to restart queue
Conor Dooley [Fri, 5 Aug 2022 07:43:46 +0000 (08:43 +0100)]
i2c: microchip-corei2c: fix erroneous late ack send
A late ack is currently being sent at the end of a transfer due to
incorrect logic in mchp_corei2c_empty_rx(). Currently the Assert Ack
bit is being written to the controller's control reg after the last
byte has been received, causing it to sent another byte with the ack.
Instead, the AA flag should be written to the control register when
the penultimate byte is read so it is sent out for the last byte.
Reported-by: Andreas Buerkler <[email protected]> Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers") Tested-by: Lewis Hanly <[email protected]> Signed-off-by: Conor Dooley <[email protected]>
[wsa: fixed typos in commit message] Signed-off-by: Wolfram Sang <[email protected]>
dt-bindings: i2c: qcom,i2c-cci: convert to dtschema
Convert the Qualcomm Camera Control Interface (CCI) I2C controller to DT
schema. The original bindings were not complete, so this includes
changes:
1. Add address/size-cells.
2. Describe the clocks per variant.
3. Use more descriptive example based on sdm845.
Dennis P. Kliem [Thu, 11 Aug 2022 10:56:57 +0000 (12:56 +0200)]
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S70
ADATA XPG GAMMIX S70 reports bogus eui64 values that appear to be the same
across all drives. Quirk them out so they are not marked as "non globally
unique" duplicates.
vdpa_sim_blk: add support for discard and write-zeroes
Expose VIRTIO_BLK_F_DISCARD and VIRTIO_BLK_F_WRITE_ZEROES features
to the drivers and handle VIRTIO_BLK_T_DISCARD and
VIRTIO_BLK_T_WRITE_ZEROES requests checking ranges and flags.
The simulator behaves like a ramdisk, so for VIRTIO_BLK_F_DISCARD
does nothing, while for VIRTIO_BLK_T_WRITE_ZEROES sets to 0 the
specified region.