Git Repo - linux.git/log

Jinrong Liang [Tue, 25 Jan 2022 09:59:03 +0000 (17:59 +0800)]

KVM: x86/ioapic: Remove unused "addr" and "length" of ioapic_read_indirect()

The "unsigned long addr" and "unsigned long length" parameters of
ioapic_read_indirect() are not used, so remove them.

No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:59:02 +0000 (17:59 +0800)]

KVM: x86/i8259: Remove unused "addr" of elcr_ioport_{read,write}()

The "u32 addr" parameter of elcr_ioport_write() and elcr_ioport_read()
is not used, so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Paolo Bonzini [Tue, 25 Jan 2022 16:11:30 +0000 (11:11 -0500)]

KVM: SVM: improve split between svm_prepare_guest_switch and sev_es_prepare_guest_switch

KVM performs the VMSAVE to the host save area for both regular and SEV-ES
guests, so hoist it up to svm_prepare_guest_switch. And because
sev_es_prepare_guest_switch does not really need to know the details
of struct svm_cpu_data *, just pass it the pointer to the host save area
inside the HSAVE page.

Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:56 +0000 (17:58 +0800)]

KVM: x86/svm: Remove unused "vcpu" of svm_check_exit_valid()

The "struct kvm_vcpu *vcpu" parameter of svm_check_exit_valid()
is not used, so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:55 +0000 (17:58 +0800)]

KVM: x86/mmu_audit: Remove unused "level" of audit_spte_after_sync()

The "int level" parameter of audit_spte_after_sync() is not used,
so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:54 +0000 (17:58 +0800)]

KVM: x86/tdp_mmu: Remove unused "kvm" of kvm_tdp_mmu_get_root()

The "struct kvm *kvm" parameter of kvm_tdp_mmu_get_root() is not used,
so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:53 +0000 (17:58 +0800)]

KVM: x86/mmu: Remove unused "vcpu" of reset_{tdp,ept}_shadow_zero_bits_mask()

The "struct kvm_vcpu *vcpu" parameter of reset_ept_shadow_zero_bits_mask()
and reset_tdp_shadow_zero_bits_mask() is not used, so remove it.

No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:52 +0000 (17:58 +0800)]

KVM: x86/mmu: Remove unused "kvm" of __rmap_write_protect()

The "struct kvm *kvm" parameter of __rmap_write_protect()
is not used, so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Jinrong Liang [Tue, 25 Jan 2022 09:58:51 +0000 (17:58 +0800)]

KVM: x86/mmu: Remove unused "kvm" of kvm_mmu_unlink_parents()

The "struct kvm *kvm" parameter of kvm_mmu_unlink_parents()
is not used, so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <[email protected]>
Message-Id: <20220125095909 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Sean Christopherson [Wed, 8 Dec 2021 01:52:34 +0000 (01:52 +0000)]

KVM: x86: Skip APICv update if APICv is disable at the module level

Bail from the APICv update paths _before_ taking apicv_update_lock if
APICv is disabled at the module level. kvm_request_apicv_update() in
particular is invoked from multiple paths that can be reached without
APICv being enabled, e.g. svm_enable_irq_window(), and taking the
rw_sem for write when APICv is disabled may introduce unnecessary
contention and stalls.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211208015236.1616697 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
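
The early bail described in this commit can be sketched in user-space C. This is only a model under stated assumptions: every name below (enable_apicv_model, lock_for_write_model(), request_apicv_update_model() and so on) is hypothetical, the lock is reduced to a counter, and none of it is KVM's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are KVM's real symbols. */
static bool enable_apicv_model;          /* models the module-level APICv switch */
static unsigned long inhibit_reasons_model;
static unsigned int write_lock_count;    /* how many times the write lock was taken */

/* Model taking and releasing apicv_update_lock for write. */
static void lock_for_write_model(void)   { write_lock_count++; }
static void unlock_for_write_model(void) { }

/* Returns true if the update ran, false if it bailed out early. */
static bool request_apicv_update_model(bool activate, unsigned int bit)
{
	assert(bit < 32);

	/* Bail before touching the lock when APICv is disabled module-wide. */
	if (!enable_apicv_model)
		return false;

	lock_for_write_model();
	if (activate)
		inhibit_reasons_model &= ~(1UL << bit);
	else
		inhibit_reasons_model |= 1UL << bit;
	unlock_for_write_model();
	return true;
}
```

The point of the ordering is that callers reachable with APICv disabled never contend on the lock at all.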

Sean Christopherson [Wed, 8 Dec 2021 01:52:35 +0000 (01:52 +0000)]

KVM: x86: Drop NULL check on kvm_x86_ops.check_apicv_inhibit_reasons

Drop the useless NULL check on kvm_x86_ops.check_apicv_inhibit_reasons
when handling an APICv update, both VMX and SVM unconditionally implement
the helper and leave it non-NULL even if APICv is disabled at the module
level. The latter is a moot point now that __kvm_request_apicv_update()
is called if and only if enable_apicv is true.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211208015236.1616697 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Sean Christopherson [Wed, 8 Dec 2021 01:52:36 +0000 (01:52 +0000)]

KVM: x86: Unexport __kvm_request_apicv_update()

Unexport __kvm_request_apicv_update(), it's not used by vendor code and
should never be used by vendor code. The only reason it's exposed at all
is because Hyper-V's SynIC needs to track how many auto-EOIs are in use,
and it's convenient to use apicv_update_lock to guard that tracking.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211208015236.1616697 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Sean Christopherson [Wed, 15 Dec 2021 01:15:56 +0000 (01:15 +0000)]

KVM: x86/mmu: Zap _all_ roots when unmapping gfn range in TDP MMU

Zap both valid and invalid roots when zapping/unmapping a gfn range, as
KVM must ensure it holds no references to the freed page after returning
from the unmap operation.  Most notably, the TDP MMU doesn't zap invalid
roots in mmu_notifier callbacks.  This leads to use-after-free and other
issues if the mmu_notifier runs to completion while an invalid root
zapper yields as KVM fails to honor the requirement that there must be
_no_ references to the page after the mmu_notifier returns.

The bug is most easily reproduced by hacking KVM to cause a collision
between set_nx_huge_pages() and kvm_mmu_notifier_release(), but the bug
exists between kvm_mmu_notifier_invalidate_range_start() and memslot
updates as well.  Invalidating a root ensures pages aren't accessible by
the guest, and KVM won't read or write page data itself, but KVM will
trigger e.g. kvm_set_pfn_dirty() when zapping SPTEs, and thus completing
a zap of an invalid root _after_ the mmu_notifier returns is fatal.

  WARNING: CPU: 24 PID: 1496 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:173 [kvm]
  RIP: 0010:kvm_is_zone_device_pfn+0x96/0xa0 [kvm]
  Call Trace:
   <TASK>
   kvm_set_pfn_dirty+0xa8/0xe0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   zap_gfn_range+0x1f3/0x310 [kvm]
   kvm_tdp_mmu_zap_invalidated_roots+0x50/0x90 [kvm]
   kvm_mmu_zap_all_fast+0x177/0x1a0 [kvm]
   set_nx_huge_pages+0xb4/0x190 [kvm]
   param_attr_store+0x70/0x100
   module_attr_store+0x19/0x30
   kernfs_fop_write_iter+0x119/0x1b0
   new_sync_write+0x11c/0x1b0
   vfs_write+0x1cc/0x270
   ksys_write+0x5f/0xe0
   do_syscall_64+0x38/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>

Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
Cc: [email protected]
Cc: Ben Gardon <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211215011557 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
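
The requirement that no references survive the unmap can be illustrated with a small model. The sketch below assumes a flat array of roots and a hypothetical skip_invalid switch that reproduces the old behaviour; the struct and function names are invented for illustration and do not correspond to the TDP MMU's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a TDP MMU root. */
struct root_model {
	bool invalid;       /* root was invalidated but may still hold SPTEs */
	int  mapped_sptes;  /* SPTEs that still reference guest pages */
};

/*
 * Model of the unmap path. With skip_invalid set, invalid roots are
 * ignored (the behaviour the commit describes as buggy); otherwise
 * every root is zapped. Returns the number of SPTEs zapped.
 */
static int unmap_roots_model(struct root_model *roots, size_t n, bool skip_invalid)
{
	int zapped = 0;

	for (size_t i = 0; i < n; i++) {
		if (skip_invalid && roots[i].invalid)
			continue;
		zapped += roots[i].mapped_sptes;
		roots[i].mapped_sptes = 0;
	}
	return zapped;
}

/* References still held after the unmap returns; must be 0 to be safe. */
static int remaining_references_model(const struct root_model *roots, size_t n)
{
	int remaining = 0;

	for (size_t i = 0; i < n; i++)
		remaining += roots[i].mapped_sptes;
	return remaining;
}
```

In this model, skipping the invalid root leaves references behind after the call returns, which is the window in which the use-after-free described above becomes possible.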

Sean Christopherson [Wed, 15 Dec 2021 01:15:55 +0000 (01:15 +0000)]

KVM: x86/mmu: Move "invalid" check out of kvm_tdp_mmu_get_root()

Move the check for an invalid root out of kvm_tdp_mmu_get_root() and into
the one place it actually matters, tdp_mmu_next_root(), as the other user
already has an implicit validity check. A future bug fix will need to
get references to invalid roots to honor mmu_notifier requests; there's
no point in forcing what will be a common path to open code getting a
reference to a root.

No functional change intended.

Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211215011557 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Sean Christopherson [Wed, 15 Dec 2021 01:15:54 +0000 (01:15 +0000)]

KVM: x86/mmu: Use common TDP MMU zap helper for MMU notifier unmap hook

Use the common TDP MMU zap helper when handling an MMU notifier unmap
event, the two flows are semantically identical. Consolidate the code in
preparation for a future bug fix, as both kvm_tdp_mmu_unmap_gfn_range()
and __kvm_tdp_mmu_zap_gfn_range() are guilty of not zapping SPTEs in
invalid roots.

No functional change intended.

Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20211215011557 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

David Woodhouse [Mon, 25 Oct 2021 13:29:01 +0000 (14:29 +0100)]

KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU

There are circumstances when kvm_xen_update_runstate_guest() should not
sleep because it ends up being called from __schedule() when the vCPU
is preempted:

[  222.830825]  kvm_xen_update_runstate_guest+0x24/0x100
[  222.830878]  kvm_arch_vcpu_put+0x14c/0x200
[  222.830920]  kvm_sched_out+0x30/0x40
[  222.830960]  __schedule+0x55c/0x9f0

To handle this, make it use the same trick as __kvm_xen_has_interrupt(),
of using the hva from the gfn_to_hva_cache directly. Then it can use
pagefault_disable() around the accesses and just bail out if the page
is absent (which is unlikely).

I almost switched to using a gfn_to_pfn_cache here and bailing out if
kvm_map_gfn() fails, like kvm_steal_time_set_preempted() does — but on
closer inspection it looks like kvm_map_gfn() will *always* fail in
atomic context for a page in IOMEM, which means it will silently fail
to make the update every single time for such guests, AFAICT. So I
didn't do it that way after all. And will probably fix that one too.

Cc: [email protected]
Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <[email protected]>
Message-Id: <b17a93e5ff4561e57b1238e3e7ccd0b613eb827e [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Linus Torvalds [Wed, 9 Feb 2022 17:14:22 +0000 (09:14 -0800)]

Merge tag 'kvm-s390-kernel-access' from emailed bundle

Pull s390 kvm fix from Christian Borntraeger:
"Add missing check for the MEMOP ioctl

  The SIDA MEMOPs must only be used for secure guests, otherwise
  userspace can do unwanted memory accesses"

* tag 'kvm-s390-kernel-access' from emailed bundle:
  KVM: s390: Return error on SIDA memop on normal guest

Linus Torvalds [Tue, 8 Feb 2022 20:03:07 +0000 (12:03 -0800)]

Merge tag 'nfs-for-5.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
"Stable Fixes:

   - Fix initialization of nfs_client cl_flags

  Other Fixes:

   - Fix performance issues with uncached readdir calls

   - Fix potential pointer dereferences in rpcrdma_ep_create

   - Fix nfs4_proc_get_locations() kernel-doc comment

   - Fix locking during sunrpc sysfs reads

   - Update my email address in the MAINTAINERS file to my new
     kernel.org email"

* tag 'nfs-for-5.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  SUNRPC: lock against ->sock changing during sysfs read
  MAINTAINERS: Update my email address
  NFS: Fix nfs4_proc_get_locations() kernel-doc comment
  xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create
  NFS: Fix initialisation of nfs_client cl_flags field
  NFS: Avoid duplicate uncached readdir calls on eof
  NFS: Don't skip directory entries when doing uncached readdir
  NFS: Don't overfill uncached readdir pages

Maxim Levitsky [Mon, 7 Feb 2022 15:54:26 +0000 (17:54 +0200)]

KVM: x86: SVM: move avic definitions from AMD's spec to svm.h

asm/svm.h is the correct place for all values that are defined in
the SVM spec, and that includes AVIC.

Also add some values from the spec that were not defined before
and will soon be useful.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:25 +0000 (17:54 +0200)]

KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it

kvm_apic_update_apicv is called when AVIC is still active, thus IRR bits
can be set by the CPU after it is called, without causing irr_pending
to be set to true.

Also, the logic in avic_kick_target_vcpu doesn't expect a race with this
function, so to keep it simple, just keep irr_pending set to true and
let the next interrupt injection to the guest clear it.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:24 +0000 (17:54 +0200)]

KVM: x86: nSVM: deal with L1 hypervisor that intercepts interrupts but lets L2 control them

Fix a corner case in which the L1 hypervisor intercepts
interrupts (INTERCEPT_INTR) and either doesn't set
virtual interrupt masking (V_INTR_MASKING) or enters a
nested guest with EFLAGS.IF disabled prior to the entry.

In this case, despite the fact that L1 intercepts the interrupts,
KVM still needs to set up an interrupt window to wait before
injecting the INTR vmexit.

Currently, KVM instead enters an endless loop of 'req_immediate_exit'.

Exactly the same issue also happens for SMIs and NMIs.
Fix this as well.

Note that on VMX, this case is impossible as there is only the
'vmexit on external interrupts' execution control, which is either set,
in which case both host and guest's EFLAGS.IF
are ignored, or not set, in which case no VMexits are delivered.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:22 +0000 (17:54 +0200)]

KVM: x86: nSVM: expose clean bit support to the guest

KVM already honours a few clean bits, so it makes sense
to let the nested guest know about it.

Note that KVM also doesn't check if the hardware supports
clean bits, and therefore nested KVM was
already setting clean bits and L0 KVM
was already honouring them.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:21 +0000 (17:54 +0200)]

KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM

While RSM induced VM entries are not full VM entries,
they still need to be followed by an actual VM entry to complete them,
unlike setting the nested state.

This patch fixes the boot of Hyper-V and SMM enabled
Windows VMs running nested on KVM, which fail due
to this issue combined with the lack of dirty bit setting.

Signed-off-by: Maxim Levitsky <[email protected]>
Cc: [email protected]
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:20 +0000 (17:54 +0200)]

KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state

Restoring the SMM state usually makes KVM enter
the nested guest and thus a different vmcb (vmcb02 vs vmcb01),
but KVM should still mark vmcb01 as dirty, since hardware
can in theory cache multiple vmcbs.

Failure to do so, combined with the lack of setting
nested_run_pending (which is fixed in the next patch),
might make KVM re-enter vmcb01, which was just exited from,
with a completely different set of guest state registers
(SMM vs non SMM) and without proper dirty bits set,
which results in the CPU reusing a stale IDTR pointer,
which leads to a guest shutdown on any interrupt.

On real hardware this usually doesn't happen,
but when running nested, L0's KVM does check and
honour a few dirty bits, causing this issue to happen.

This patch fixes the boot of Hyper-V and SMM enabled
Windows VMs running nested on KVM.

Signed-off-by: Maxim Levitsky <[email protected]>
Cc: [email protected]
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
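
How a missed dirty marking can leave a stale IDTR in use can be sketched with a toy model of VMCB clean bits. Everything here is an assumption made for illustration: the structs, the single modelled field, and the CLEAN_DT_MODEL bit are hypothetical and do not describe how any particular CPU or KVM caches VMCB state.

```c
#include <stdint.h>

#define CLEAN_DT_MODEL (1u << 7)  /* assumed clean bit covering GDTR/IDTR */

/* Hypothetical VMCB with a single piece of guest state. */
struct vmcb_model {
	uint32_t clean;      /* set bit: cached copy may be reused */
	uint64_t idtr_base;
};

/* Hypothetical state cached by the (virtual) CPU between runs. */
struct cpu_cache_model {
	uint64_t idtr_base;
};

/* VMRUN model: reload a field only if its clean bit is clear. */
static void vmrun_model(struct cpu_cache_model *cache, struct vmcb_model *vmcb)
{
	if (!(vmcb->clean & CLEAN_DT_MODEL))
		cache->idtr_base = vmcb->idtr_base;
	/* After a run, the cached state matches the VMCB again. */
	vmcb->clean |= CLEAN_DT_MODEL;
}

/* Model of marking the whole VMCB dirty: nothing may be reused. */
static void mark_all_dirty_model(struct vmcb_model *vmcb)
{
	vmcb->clean = 0;
}
```

In this model, rewriting idtr_base (as restoring SMM saved state would) without calling mark_all_dirty_model() makes the next vmrun_model() keep the old cached value.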

Maxim Levitsky [Mon, 7 Feb 2022 15:54:19 +0000 (17:54 +0200)]

KVM: x86: nSVM: fix potential NULL derefernce on nested migration

It turns out that, due to review feedback and/or rebases,
I accidentally moved the call to nested_svm_load_cr3 too early,
before NPT is enabled, which is very wrong to do.

KVM can't even access guest memory at that point as nested NPT
is needed for that, and of course it won't initialize the walk_mmu,
which is the main issue the patch was addressing.

Fix this for real.

Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load")
Cc: [email protected]
Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Maxim Levitsky [Mon, 7 Feb 2022 15:54:18 +0000 (17:54 +0200)]

KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case

When the guest doesn't enable paging, and NPT/EPT is disabled, we
use the guest's paging CR3 as KVM's shadow paging pointer and
we are technically in direct mode as if we were to use NPT/EPT.

In direct mode we create SPTEs with user mode permissions
because usually in direct mode the NPT/EPT doesn't
need to restrict access based on guest CPL
(there are MBE/GMET extensions for that but KVM doesn't use them).

In this special "use guest paging as direct" mode however,
if CR4.SMAP/CR4.SMEP are enabled, that will make the CPU
fault on each access and KVM will enter an endless loop of page faults.

Since page protection doesn't have any meaning in the !PG case,
just don't pass through these bits.

The fix is the same as was done for VMX in
commit 656ec4a4928a ("KVM: VMX: fix SMEP and SMAP without EPT")

This fixes the boot of Windows 10 without NPT for good.
(Without this patch, the BSP boots, but the APs were stuck in an endless
loop of page faults, causing the VM to boot with 1 CPU)

Signed-off-by: Maxim Levitsky <[email protected]>
Cc: [email protected]
Message-Id: <20220207155447 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
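
The masking rule from this commit can be written as a pure function. This is a sketch, not the SVM code: hw_cr4_model() and its npt_enabled parameter are hypothetical, and any other adjustments KVM makes to the hardware CR4 are deliberately left out. The bit positions are the architectural ones for CR0.PG and CR4.SMEP/SMAP/PKE.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG_MODEL   (UINT64_C(1) << 31)
#define CR4_SMEP_MODEL (UINT64_C(1) << 20)
#define CR4_SMAP_MODEL (UINT64_C(1) << 21)
#define CR4_PKE_MODEL  (UINT64_C(1) << 22)

/*
 * Hypothetical helper: compute the CR4 value given to hardware from the
 * guest's CR0/CR4. Without NPT and without guest paging, the protection
 * bits are not passed through.
 */
static uint64_t hw_cr4_model(uint64_t guest_cr0, uint64_t guest_cr4, bool npt_enabled)
{
	uint64_t hw_cr4 = guest_cr4;

	if (!npt_enabled && !(guest_cr0 & CR0_PG_MODEL))
		hw_cr4 &= ~(CR4_SMEP_MODEL | CR4_SMAP_MODEL | CR4_PKE_MODEL);

	return hw_cr4;
}
```

Once the guest sets CR0.PG, or when NPT is in use, the function returns the guest's CR4 unchanged.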

Sean Christopherson [Fri, 4 Feb 2022 21:41:55 +0000 (21:41 +0000)]

Revert "svm: Add warning message for AVIC IPI invalid target"

Remove a WARN on an "AVIC IPI invalid target" exit, the WARN is trivial
to trigger from guest as it will fail on any destination APIC ID that
doesn't exist from the guest's perspective.

Don't bother recording anything in the kernel log, the common tracepoint
for kvm_avic_incomplete_ipi() is sufficient for debugging.

This reverts commit 37ef0c4414c9743ba7f1af4392f0a27a99649f2a.

Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20220204214205.3306634 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Marc Zyngier [Tue, 8 Feb 2022 17:54:41 +0000 (17:54 +0000)]

Merge branch kvm-arm64/pmu-bl into kvmarm-master/next

* kvm-arm64/pmu-bl:
  : .
  : Improve PMU support on heterogeneous systems, courtesy of Alexandru Elisei
  : .
  KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU
  KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute
  KVM: arm64: Keep a list of probed PMUs
  KVM: arm64: Keep a per-VM pointer to the default PMU
  perf: Fix wrong name in comment for struct perf_cpu_context
  KVM: arm64: Do not change the PMU event filter after a VCPU has run

Signed-off-by: Marc Zyngier <[email protected]>

Alexandru Elisei [Thu, 27 Jan 2022 16:17:59 +0000 (16:17 +0000)]

KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU

Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU
device ioctl. If the VCPU is scheduled on a physical CPU which has a
different PMU, the perf events needed to emulate a guest PMU won't be
scheduled in and the guest performance counters will stop counting. Treat
it as a userspace error and refuse to run the VCPU in this situation.

Suggested-by: Marc Zyngier <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
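
The check described here amounts to testing whether the physical CPU belongs to the assigned PMU. A minimal sketch, assuming a PMU is described by a 64-bit mask of the CPUs it covers; struct pmu_model and vcpu_may_run_model() are hypothetical names, and how KVM actually reports the refusal to userspace is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PMU description: one bit per physical CPU it covers. */
struct pmu_model {
	uint64_t supported_cpus;
};

static bool cpu_in_pmu_model(const struct pmu_model *pmu, unsigned int cpu)
{
	return cpu < 64 && ((pmu->supported_cpus >> cpu) & 1);
}

/*
 * Returns true if the VCPU may run on the given physical CPU.
 * A NULL PMU models a VCPU with no explicitly assigned PMU.
 */
static bool vcpu_may_run_model(const struct pmu_model *assigned, unsigned int cpu)
{
	if (assigned == NULL)
		return true;
	return cpu_in_pmu_model(assigned, cpu);
}
```

On a system with two CPU clusters, a VCPU tied to the PMU of one cluster is refused on the other, so keeping the thread's affinity inside the right cluster is left to userspace.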

Alexandru Elisei [Thu, 27 Jan 2022 16:17:58 +0000 (16:17 +0000)]

KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute

When KVM creates an event and there is more than one PMU present on the
system, perf_init_event() will go through the list of available PMUs and
will choose the first one that can create the event. The order of the PMUs
in this list depends on the probe order, which can change under various
circumstances, for example if the order of the PMU nodes changes in the DTB
or if asynchronous driver probing is enabled on the kernel command line
(with the driver_async_probe=armv8-pmu option).

Another consequence of this approach is that on heterogeneous systems all
virtual machines that KVM creates will use the same PMU. This might cause
unexpected behaviour for userspace: when a VCPU is executing on the
physical CPU that uses this default PMU, PMU events in the guest work
correctly; but when the same VCPU executes on another CPU, PMU events in
the guest will suddenly stop counting.

Fortunately, perf core allows the user to specify on which PMU to create an
event by using the perf_event_attr->type field, which is used by
perf_init_event() as an index in the radix tree of available PMUs.

Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU
attribute to allow userspace to specify the arm_pmu that KVM will use when
creating events for that VCPU. KVM will make no attempt to run the VCPU on
the physical CPUs that share the PMU, leaving it up to userspace to manage
the VCPU threads' affinity accordingly.

To ensure that KVM doesn't expose an asymmetric system to the guest, the
PMU set for one VCPU will be used by all other VCPUs. Once a VCPU has run,
the PMU cannot be changed in order to avoid changing the list of available
events for a VCPU, or to change the semantics of existing events.

Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Alexandru Elisei [Thu, 27 Jan 2022 16:17:57 +0000 (16:17 +0000)]

KVM: arm64: Keep a list of probed PMUs

The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Reviewed-by: Reiji Watanabe <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Marc Zyngier [Thu, 27 Jan 2022 16:17:56 +0000 (16:17 +0000)]

KVM: arm64: Keep a per-VM pointer to the default PMU

As we are about to allow selection of the PMU exposed to a guest, start by
keeping track of the default one instead of only the PMU version.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Alexandru Elisei [Thu, 27 Jan 2022 16:17:55 +0000 (16:17 +0000)]

perf: Fix wrong name in comment for struct perf_cpu_context

Commit 0793a61d4df8 ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9ff ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Marc Zyngier [Thu, 27 Jan 2022 16:17:54 +0000 (16:17 +0000)]

KVM: arm64: Do not change the PMU event filter after a VCPU has run

Userspace can specify which events a guest is allowed to use with the
KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be
identified by a guest from reading the PMCEID{0,1}_EL0 registers.

Changing the PMU event filter after a VCPU has run can cause reads of the
registers performed before the filter is changed to return different values
than reads performed with the new event filter in place. The architecture
defines the two registers as read-only, and this behaviour contradicts
that.

Keep track when the first VCPU has run and deny changes to the PMU event
filter to prevent this from happening.

Signed-off-by: Marc Zyngier <[email protected]>
[ Alexandru E: Added commit message, updated ioctl documentation ]
Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
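
The "no filter changes after the first run" rule is a small state machine. The sketch below is a model only: struct vm_model, its ran_once flag, and the choice of -EBUSY as the error are assumptions made for illustration, not a statement of what the ioctl returns.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-VM state. */
struct vm_model {
	bool ran_once;           /* set when the first VCPU has run */
	unsigned int pmu_filter; /* stand-in for the event filter */
};

/* Model of the first VCPU entering the guest. */
static void vcpu_first_run_model(struct vm_model *vm)
{
	vm->ran_once = true;
}

/* Model of the filter attribute: denied once any VCPU has run. */
static int set_pmu_filter_model(struct vm_model *vm, unsigned int filter)
{
	if (vm->ran_once)
		return -EBUSY;  /* assumed error code */

	vm->pmu_filter = filter;
	return 0;
}
```

Because the filter is frozen before the guest can read the event-ID registers, repeated reads in the guest cannot observe different values.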

Marc Zyngier [Tue, 8 Feb 2022 15:29:28 +0000 (15:29 +0000)]

Merge branch kvm-arm64/misc-5.18 into kvmarm-master/next

* kvm-arm64/misc-5.18:
  : .
  : Misc fixes for KVM/arm64 5.18:
  :
  : - Drop unused kvm parameter to kvm_psci_version()
  :
  : - Implement CONFIG_DEBUG_LIST at EL2
  : .
  KVM: arm64: pkvm: Implement CONFIG_DEBUG_LIST at EL2
  KVM: arm64: Drop unused param from kvm_psci_version()

Signed-off-by: Marc Zyngier <[email protected]>

Keir Fraser [Mon, 31 Jan 2022 12:40:53 +0000 (12:40 +0000)]

KVM: arm64: pkvm: Implement CONFIG_DEBUG_LIST at EL2

Currently the check functions are stubbed out at EL2. Implement
versions suitable for the constrained EL2 environment.

Signed-off-by: Keir Fraser <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Oliver Upton [Tue, 8 Feb 2022 01:27:05 +0000 (01:27 +0000)]

KVM: arm64: Drop unused param from kvm_psci_version()

kvm_psci_version() consumes a pointer to struct kvm in addition to a
vcpu pointer. Drop the kvm pointer as it is unused. While the comment
suggests the explicit kvm pointer was useful for calling from hyp, there
exists no such callsite in hyp.

Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Marc Zyngier [Tue, 8 Feb 2022 15:20:16 +0000 (15:20 +0000)]

Merge branch kvm-arm64/selftest/vgic-5.18 into kvmarm-master/next

* kvm-arm64/selftest/vgic-5.18:
  : .
  : A bunch of selftest fixes, courtesy of Ricardo Koller
  : .
  kvm: selftests: aarch64: use a tighter assert in vgic_poke_irq()
  kvm: selftests: aarch64: fix some vgic related comments
  kvm: selftests: aarch64: fix the failure check in kvm_set_gsi_routing_irqchip_check
  kvm: selftests: aarch64: pass vgic_irq guest args as a pointer
  kvm: selftests: aarch64: fix assert in gicv3_access_reg

Signed-off-by: Marc Zyngier <[email protected]>

Ricardo Koller [Thu, 27 Jan 2022 03:08:58 +0000 (19:08 -0800)]

kvm: selftests: aarch64: use a tighter assert in vgic_poke_irq()

vgic_poke_irq() checks that the attr argument passed to the vgic device
ioctl is sane. Make this check tighter by moving it to after the last
attr update.

Signed-off-by: Ricardo Koller <[email protected]>
Reported-by: Reiji Watanabe <[email protected]>
Cc: Andrew Jones <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Ricardo Koller [Thu, 27 Jan 2022 03:08:57 +0000 (19:08 -0800)]

kvm: selftests: aarch64: fix some vgic related comments

Fix the formatting of some comments and the wording of one of them (in
gicv3_access_reg).

Signed-off-by: Ricardo Koller <[email protected]>
Reported-by: Reiji Watanabe <[email protected]>
Cc: Andrew Jones <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Ricardo Koller [Thu, 27 Jan 2022 03:08:56 +0000 (19:08 -0800)]

kvm: selftests: aarch64: fix the failure check in kvm_set_gsi_routing_irqchip_check

kvm_set_gsi_routing_irqchip_check(expect_failure=true) is used to check
the error code returned by the kernel when trying to set up an invalid
gsi routing table. The ioctl fails if "pin >= KVM_IRQCHIP_NUM_PINS", so
kvm_set_gsi_routing_irqchip_check() should test the error only when
"intid >= KVM_IRQCHIP_NUM_PINS+32". The issue is that the test check is
"intid >= KVM_IRQCHIP_NUM_PINS", so for a case like "intid =
KVM_IRQCHIP_NUM_PINS" the test wrongly assumes that the kernel will
return an error. Fix this by using the right check.

Signed-off-by: Ricardo Koller <[email protected]>
Reported-by: Reiji Watanabe <[email protected]>
Cc: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
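
The off-by-32 in the test can be checked with a few lines of arithmetic. The sketch assumes, as the "+32" in the message suggests, that the pin is the intid minus 32. NUM_PINS_MODEL is a placeholder value rather than the real KVM_IRQCHIP_NUM_PINS, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PINS_MODEL  100u /* placeholder, not the real KVM_IRQCHIP_NUM_PINS */
#define FIRST_SPI_MODEL 32u  /* assumed offset between intid and pin */

/* Model of the kernel side: the ioctl fails if pin >= number of pins. */
static bool kernel_rejects_model(unsigned int intid)
{
	assert(intid >= FIRST_SPI_MODEL);
	return (intid - FIRST_SPI_MODEL) >= NUM_PINS_MODEL;
}

/* The check the test used before the fix. */
static bool old_check_expects_error(unsigned int intid)
{
	return intid >= NUM_PINS_MODEL;
}

/* The check after the fix, matching the kernel's pin arithmetic. */
static bool fixed_check_expects_error(unsigned int intid)
{
	return intid >= NUM_PINS_MODEL + FIRST_SPI_MODEL;
}
```

For intid equal to NUM_PINS_MODEL, the old check predicts an error that the modelled kernel does not return, while the fixed check agrees with it.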

Ricardo Koller [Thu, 27 Jan 2022 03:08:55 +0000 (19:08 -0800)]

kvm: selftests: aarch64: pass vgic_irq guest args as a pointer

The guest in vgic_irq gets its arguments in a struct. This struct used
to fit nicely in a single register so vcpu_args_set() was able to pass
it by value by setting x0 with it. Unfortunately, this args struct grew
after some commits and some guest args became random (specifically
kvm_supports_irqfd).

Fix this by passing the guest args as a pointer (after allocating some
guest memory for it).

Signed-off-by: Ricardo Koller <[email protected]>
Reported-by: Reiji Watanabe <[email protected]>
Cc: Andrew Jones <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
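
Why a grown struct no longer survives being passed in one register can be shown in plain C. The struct layout below is hypothetical (only kvm_supports_irqfd is named in the message), and the "register" is modelled as a uint64_t. In the real selftest the pointer would refer to guest memory, whereas here both sides share one address space.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument struct that has grown past 8 bytes. */
struct args_model {
	uint32_t nr_irqs;
	bool     level_sensitive;
	int      gic_version;
	bool     kvm_supports_irqfd;
};

/* By value: only the first 8 bytes fit into the modelled register. */
static uint64_t pack_by_value_model(const struct args_model *a)
{
	uint64_t reg = 0;
	size_t n = sizeof(*a) < sizeof(reg) ? sizeof(*a) : sizeof(reg);

	memcpy(&reg, a, n);
	return reg;
}

/* Receiver side: everything beyond the register's width is lost. */
static struct args_model unpack_by_value_model(uint64_t reg)
{
	struct args_model out;
	size_t n = sizeof(out) < sizeof(reg) ? sizeof(out) : sizeof(reg);

	memset(&out, 0, sizeof(out));
	memcpy(&out, &reg, n);
	return out;
}

/* By pointer: the register carries an address, so size no longer matters. */
static uint64_t pack_by_pointer_model(const struct args_model *a)
{
	return (uint64_t)(uintptr_t)a;
}

static const struct args_model *unpack_by_pointer_model(uint64_t reg)
{
	return (const struct args_model *)(uintptr_t)reg;
}
```

In this model the lost fields read back as zero because the receiver zeroes its copy first; in the situation the commit describes they were effectively random.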

Ricardo Koller [Thu, 27 Jan 2022 03:08:54 +0000 (19:08 -0800)]

kvm: selftests: aarch64: fix assert in gicv3_access_reg

The val argument in gicv3_access_reg can have any value when used for a
read, not necessarily 0. Fix the assert by checking val only for
writes.

Signed-off-by: Ricardo Koller <[email protected]>
Reported-by: Reiji Watanabe <[email protected]>
Cc: Andrew Jones <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Marc Zyngier [Tue, 8 Feb 2022 14:58:38 +0000 (14:58 +0000)]

Merge branch kvm-arm64/vmid-allocator into kvmarm-master/next

* kvm-arm64/vmid-allocator:
  : .
  : VMID allocation rewrite from Shameerali Kolothum Thodi, paving the
  : way for pinned VMIDs and SVA.
  : .
  KVM: arm64: Make active_vmids invalid on vCPU schedule out
  KVM: arm64: Align the VMID allocation with the arm64 ASID
  KVM: arm64: Make VMID bits accessible outside of allocator
  KVM: arm64: Introduce a new VMID allocator for KVM

Signed-off-by: Marc Zyngier <[email protected]>

commit | commitdiff | tree

Shameer Kolothum [Mon, 22 Nov 2021 12:18:44 +0000 (12:18 +0000)]

KVM: arm64: Make active_vmids invalid on vCPU schedule out

Like the ASID allocator, we copy the active_vmids into the
reserved_vmids on a rollover. But it's unlikely that
every CPU will have a vCPU as current task and we may
end up unnecessarily reserving the VMID space.

Hence, set active_vmids to an invalid one when scheduling
out a vCPU.

Signed-off-by: Shameer Kolothum <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Julien Grall [Mon, 22 Nov 2021 12:18:43 +0000 (12:18 +0000)]

KVM: arm64: Align the VMID allocation with the arm64 ASID

At the moment, the VMID algorithm will send an SGI to all the
CPUs to force an exit and then broadcast a full TLB flush and
I-Cache invalidation.

This patch uses the new VMID allocator. The benefits are:
   - Aligns with arm64 ASID algorithm.
   - CPUs are not forced to exit at roll-over. Instead,
     the VMID will be marked reserved and context invalidation
     is broadcast. This will reduce IPI traffic.
   - More flexible to add support for pinned KVM VMIDs in
     the future.

With the new algo, the code is now adapted:
    - The call to update_vmid() will be done with preemption
      disabled as the new algo requires to store information
      per-CPU.

Signed-off-by: Julien Grall <[email protected]>
Signed-off-by: Shameer Kolothum <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Shameer Kolothum [Mon, 22 Nov 2021 12:18:42 +0000 (12:18 +0000)]

KVM: arm64: Make VMID bits accessible outside of allocator

Since we already set the kvm_arm_vmid_bits in the VMID allocator
init function, make it accessible outside as well so that it can
be used in the subsequent patch.

Suggested-by: Will Deacon <[email protected]>
Signed-off-by: Shameer Kolothum <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Shameer Kolothum [Mon, 22 Nov 2021 12:18:41 +0000 (12:18 +0000)]

KVM: arm64: Introduce a new VMID allocator for KVM

A new VMID allocator for arm64 KVM use. This is based on
arm64 ASID allocator algorithm.

One major deviation from the ASID allocator is the way we
flush the context. Unlike ASID allocator, we expect less
frequent rollover in the case of VMIDs. Hence, instead of
marking the CPU as flush_pending and issuing a local context
invalidation on the next context switch, we broadcast TLB
flush + I-cache invalidation over the inner shareable domain
on rollover.
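
A heavily simplified, single-threaded sketch of generation-based
allocation with a flush on rollover. This is not the code added by the
patch: it has no per-CPU state, does not carry active VMIDs over as
reserved, and only counts where the broadcast TLB flush + I-cache
invalidation would happen. All sketch_* names, the 2-bit VMID width and
the choice to keep VMID 0 reserved are assumptions for illustration:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VMID_BITS	2u	/* tiny on purpose: easy to roll over */
#define SKETCH_NUM_VMIDS	(1u << SKETCH_VMID_BITS)
#define SKETCH_VMID_MASK	((uint64_t)SKETCH_NUM_VMIDS - 1)
#define SKETCH_GEN_FIRST	((uint64_t)SKETCH_NUM_VMIDS)

struct sketch_vmid_allocator {
	uint64_t generation;		/* lives in the bits above the VMID */
	bool used[SKETCH_NUM_VMIDS];
	unsigned int flushes;		/* simulated broadcast invalidations */
};

static void sketch_vmid_init(struct sketch_vmid_allocator *a)
{
	memset(a, 0, sizeof(*a));
	a->generation = SKETCH_GEN_FIRST;
	a->used[0] = true;		/* assumption: VMID 0 stays reserved */
}

/* Returns generation | vmid; pass 0 for "no VMID allocated yet". */
static uint64_t sketch_vmid_update(struct sketch_vmid_allocator *a,
				   uint64_t cur)
{
	unsigned int id;

	/* Fast path: the VMID belongs to the current generation. */
	if (cur != 0 && (cur & ~SKETCH_VMID_MASK) == a->generation)
		return cur;

	for (id = 1; id < SKETCH_NUM_VMIDS; id++)
		if (!a->used[id])
			break;

	if (id == SKETCH_NUM_VMIDS) {
		/* Rollover: new generation, flush once for everyone. */
		a->generation += SKETCH_NUM_VMIDS;
		memset(a->used, 0, sizeof(a->used));
		a->used[0] = true;
		a->flushes++;
		id = 1;
	}

	a->used[id] = true;
	return a->generation | id;
}
```

With three usable VMIDs, the fourth allocation bumps the generation and
records one flush; a VMID from the old generation then misses the fast
path and is reallocated.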

Signed-off-by: Shameer Kolothum <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Marc Zyngier [Tue, 8 Feb 2022 14:44:46 +0000 (14:44 +0000)]

Merge branch kvm-arm64/fpsimd-doc into kvmarm-master/next

* kvm-arm64/fpsimd-doc:
  : .
  : FPSIMD documentation update, courtesy of Mark Brown
  : .
  arm64/fpsimd: Clarify the purpose of using last in fpsimd_save()
  KVM: arm64: Add some more comments in kvm_hyp_handle_fpsimd()
  KVM: arm64: Add comments for context flush and sync callbacks

Signed-off-by: Marc Zyngier <[email protected]>

commit | commitdiff | tree

Mark Brown [Mon, 24 Jan 2022 16:11:15 +0000 (16:11 +0000)]

arm64/fpsimd: Clarify the purpose of using last in fpsimd_save()

When saving the floating point context in fpsimd_save() we always reference
the state using last-> rather than using current->. Looking at the FP code
in isolation the reason for this is not entirely obvious; it's done because
when KVM is running it will bind the guest context and rely on the host
writing out the guest state on context switch away from the guest.

There's a slight trick here in that KVM still uses TIF_FOREIGN_FPSTATE and
TIF_SVE to communicate what needs to be saved; it maintains those flags
and restores them when it is done running the guest so that the normal
restore paths function when we return back to userspace.

Add a comment to explain this to help future readers work out what's going
on a bit faster.

Signed-off-by: Mark Brown <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Mark Brown [Mon, 24 Jan 2022 15:57:20 +0000 (15:57 +0000)]

KVM: arm64: Add some more comments in kvm_hyp_handle_fpsimd()

The handling for FPSIMD/SVE traps is multi stage and involves some trap
manipulation which isn't quite so immediately obvious as might be desired
so add a few more comments.

Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Mark Brown [Mon, 24 Jan 2022 15:57:19 +0000 (15:57 +0000)]

KVM: arm64: Add comments for context flush and sync callbacks

Add a little bit of information on where _ctxflush_fp() and _ctxsync_fp()
are called to help people unfamiliar with the code get up to speed.

Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Marc Zyngier [Tue, 8 Feb 2022 14:29:29 +0000 (14:29 +0000)]

Merge branch kvm-arm64/mmu-rwlock into kvmarm-master/next

* kvm-arm64/mmu-rwlock:
  : .
  : MMU locking optimisations from Jing Zhang, allowing permission
  : relaxations to occur in parallel.
  : .
  KVM: selftests: Add vgic initialization for dirty log perf test for ARM
  KVM: arm64: Add fast path to handle permission relaxation during dirty logging
  KVM: arm64: Use read/write spin lock for MMU protection

Signed-off-by: Marc Zyngier <[email protected]>

commit | commitdiff | tree

Jing Zhang [Tue, 18 Jan 2022 01:57:03 +0000 (01:57 +0000)]

KVM: selftests: Add vgic initialization for dirty log perf test for ARM

For ARM64, if no vgic is set up before the dirty log perf test, the
userspace irqchip would be used, which would affect the dirty log perf
test result.

Signed-off-by: Jing Zhang <[email protected]>
Tested-by: Fuad Tabba <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Jing Zhang [Tue, 18 Jan 2022 01:57:02 +0000 (01:57 +0000)]

KVM: arm64: Add fast path to handle permission relaxation during dirty logging

To reduce MMU lock contention during dirty logging, all permission
relaxation operations would be performed under read lock.

Signed-off-by: Jing Zhang <[email protected]>
Tested-by: Fuad Tabba <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Jing Zhang [Tue, 18 Jan 2022 01:57:01 +0000 (01:57 +0000)]

KVM: arm64: Use read/write spin lock for MMU protection

Replace MMU spinlock with rwlock and update all instances of the lock
being acquired with a write lock acquisition.
A future commit will add a fast path for permission relaxation during
dirty logging under a read lock.

Signed-off-by: Jing Zhang <[email protected]>
Tested-by: Fuad Tabba <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Marc Zyngier [Tue, 8 Feb 2022 14:26:30 +0000 (14:26 +0000)]

Merge branch kvm-arm64/oslock into kvmarm-master/next

* kvm-arm64/oslock:
  : .
  : Debug OS-Lock emulation courtesy of Oliver Upton. From the cover letter:
  :
  : "KVM does not implement the debug architecture to the letter of the
  : specification. One such issue is the fact that KVM treats the OS Lock as
  : RAZ/WI, rather than emulating its behavior on hardware. This series adds
  : emulation support for the OS Lock to KVM. Emulation is warranted as the
  : OS Lock affects debug exceptions taken from all ELs, and is not limited
  : to only the context of the guest."
  : .
  selftests: KVM: Test OS lock behavior
  selftests: KVM: Add OSLSR_EL1 to the list of blessed regs
  KVM: arm64: Emulate the OS Lock
  KVM: arm64: Allow guest to set the OSLK bit
  KVM: arm64: Stash OSLSR_EL1 in the cpu context
  KVM: arm64: Correctly treat writes to OSLSR_EL1 as undefined

Signed-off-by: Marc Zyngier <[email protected]>

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:59 +0000 (17:41 +0000)]

selftests: KVM: Test OS lock behavior

KVM now correctly handles the OS Lock for its guests. When set, KVM
blocks all debug exceptions originating from the guest. Add test cases
to the debug-exceptions test to assert that software breakpoint,
hardware breakpoint, watchpoint, and single-step exceptions are in fact
blocked.

Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:58 +0000 (17:41 +0000)]

selftests: KVM: Add OSLSR_EL1 to the list of blessed regs

OSLSR_EL1 is now part of the visible system register state. Add it to
the get-reg-list selftest to ensure we keep it that way.

Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:57 +0000 (17:41 +0000)]

KVM: arm64: Emulate the OS Lock

The OS lock blocks all debug exceptions at every EL. To date, KVM has
not implemented the OS lock for its guests, despite the fact that it is
mandatory per the architecture. Simple context switching between the
guest and host is not appropriate, as its effects are not constrained to
the guest context.

Emulate the OS Lock by clearing MDE and SS in MDSCR_EL1, thereby
blocking all but software breakpoint instructions.
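
The masking described above can be sketched as a pure function. This is
an illustration, not the KVM implementation: the sketch_* names are made
up, and the bit positions (MDSCR_EL1.SS = bit 0, MDSCR_EL1.MDE = bit 15,
OSLSR_EL1.OSLK = bit 1) are stated here from the architecture as an
assumption to be checked against the Arm ARM:

```c
#include <stdint.h>

/* Bit positions assumed from the architecture; verify before reuse. */
#define SKETCH_MDSCR_SS		(UINT64_C(1) << 0)
#define SKETCH_MDSCR_MDE	(UINT64_C(1) << 15)
#define SKETCH_OSLSR_OSLK	(UINT64_C(1) << 1)

/*
 * Compute the MDSCR_EL1 value to run the guest with: if the guest has
 * the OS Lock set, drop MDE and SS so that breakpoint, watchpoint and
 * single-step exceptions are not generated.
 */
static uint64_t sketch_effective_mdscr(uint64_t guest_mdscr, uint64_t oslsr)
{
	if (oslsr & SKETCH_OSLSR_OSLK)
		guest_mdscr &= ~(SKETCH_MDSCR_MDE | SKETCH_MDSCR_SS);

	return guest_mdscr;
}
```

Other MDSCR_EL1 bits pass through unchanged, and the guest value is left
alone while the lock is clear.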

Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:56 +0000 (17:41 +0000)]

KVM: arm64: Allow guest to set the OSLK bit

Allow writes to OSLAR and forward the OSLK bit to OSLSR. Do nothing with
the value for now.

Reviewed-by: Reiji Watanabe <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:55 +0000 (17:41 +0000)]

KVM: arm64: Stash OSLSR_EL1 in the cpu context

An upcoming change to KVM will emulate the OS Lock from the PoV of the
guest. Add OSLSR_EL1 to the cpu context and handle reads using the
stored value. Define some mnemonics for handling the OSLM field and
use them to make the reset value of OSLSR_EL1 more readable.

Wire up a custom handler for writes from userspace and prevent any of
the invariant bits from changing. Note that the OSLK bit is not
invariant and will be made writable by the aforementioned change.

Reviewed-by: Reiji Watanabe <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

Oliver Upton [Thu, 3 Feb 2022 17:41:54 +0000 (17:41 +0000)]

KVM: arm64: Correctly treat writes to OSLSR_EL1 as undefined

Writes to OSLSR_EL1 are UNDEFINED and should never trap from EL1 to
EL2, but the kvm trap handler for OSLSR_EL1 handles writes via
ignore_write(). This is confusing to readers of code, but should have
no functional impact.

For clarity, use write_to_read_only() rather than ignore_write(). If a
trap is unexpectedly taken to EL2 in violation of the architecture, this
will WARN_ONCE() and inject an undef into the guest.

Reviewed-by: Reiji Watanabe <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
[adopted Mark's changelog suggestion, thanks!]
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

commit | commitdiff | tree

NeilBrown [Mon, 17 Jan 2022 05:36:53 +0000 (16:36 +1100)]

SUNRPC: lock against ->sock changing during sysfs read

->sock can be set to NULL asynchronously unless ->recv_mutex is held.
So it is important to hold that mutex. Otherwise a sysfs read can
trigger an oops.
Commit 17f09d3f619a ("SUNRPC: Check if the xprt is connected before
handling sysfs reads") appears to attempt to fix this problem, but it
only narrows the race window.

Fixes: 17f09d3f619a ("SUNRPC: Check if the xprt is connected before handling sysfs reads")
Fixes: a8482488a7d6 ("SUNRPC query transport's source port")
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

commit | commitdiff | tree

Anna Schumaker [Mon, 7 Feb 2022 16:14:47 +0000 (11:14 -0500)]

MAINTAINERS: Update my email address

Signed-off-by: Anna Schumaker <[email protected]>

commit | commitdiff | tree

Yang Li [Thu, 13 Jan 2022 02:26:04 +0000 (10:26 +0800)]

NFS: Fix nfs4_proc_get_locations() kernel-doc comment

Add the description of @server and @fhandle, and remove the excess
@inode in nfs4_proc_get_locations() kernel-doc comment to remove
warnings found by running scripts/kernel-doc, which is caused by
using 'make W=1'.

fs/nfs/nfs4proc.c:8219: warning: Function parameter or member 'server'
not described in 'nfs4_proc_get_locations'
fs/nfs/nfs4proc.c:8219: warning: Function parameter or member 'fhandle'
not described in 'nfs4_proc_get_locations'
fs/nfs/nfs4proc.c:8219: warning: Excess function parameter 'inode'
description in 'nfs4_proc_get_locations'

Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

commit | commitdiff | tree

Dan Aloni [Tue, 25 Jan 2022 20:06:46 +0000 (22:06 +0200)]

xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create

If there are failures then we must not leave the non-NULL pointers with
the error value, otherwise `rpcrdma_ep_destroy` gets confused and tries
to free them, resulting in an Oops.
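
The pattern can be sketched in userspace as follows. This is not the
xprtrdma code: the struct, its single field and all sketch_* helpers are
hypothetical, and sketch_err_ptr()/sketch_is_err() are simplified
stand-ins for the kernel's ERR_PTR()/IS_ERR():

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO	4095
#define SKETCH_ENOMEM		12

/* Simplified stand-ins for the kernel's ERR_PTR() and IS_ERR(). */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_ep {
	void *cq;	/* hypothetical resource pointer */
};

static int sketch_ep_create(struct sketch_ep *ep, int simulate_failure)
{
	ep->cq = simulate_failure ? sketch_err_ptr(-SKETCH_ENOMEM) : malloc(16);
	if (!ep->cq)
		return -SKETCH_ENOMEM;

	if (sketch_is_err(ep->cq)) {
		int rc = (int)(intptr_t)ep->cq;

		/* The fix: do not leave an error value in a non-NULL pointer. */
		ep->cq = NULL;
		return rc;
	}
	return 0;
}

static void sketch_ep_destroy(struct sketch_ep *ep)
{
	free(ep->cq);	/* free(NULL) is a no-op; an error value is not */
	ep->cq = NULL;
}
```

After a failed create, the destroy path only ever sees NULL, so it is
safe to call unconditionally.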

Signed-off-by: Dan Aloni <[email protected]>
Acked-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

commit | commitdiff | tree

Trond Myklebust [Wed, 2 Feb 2022 23:52:01 +0000 (18:52 -0500)]

NFS: Fix initialisation of nfs_client cl_flags field

For some long forgotten reason, the nfs_client cl_flags field is
initialised in nfs_get_client() instead of being initialised at
allocation time. This quirk was harmless until we moved the call to
nfs_create_rpc_client().

Fixes: dd99e9f98fbf ("NFSv4: Initialise connection to the server in nfs4_alloc_client()")
Cc: [email protected] # 4.8.x
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

commit | commitdiff | tree

Linus Torvalds [Mon, 7 Feb 2022 23:25:50 +0000 (15:25 -0800)]

Merge tag '5.17-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull ksmbd server fixes from Steve French:

- NTLMSSP authentication improvement

- RDMA (smbdirect) fix allowing broader set of NICs to be supported

- improved buffer validation

- additional small fixes, including a posix extensions fix for stable

* tag '5.17-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: add support for key exchange
  ksmbd: reduce smb direct max read/write size
  ksmbd: don't align last entry offset in smb2 query directory
  ksmbd: fix same UniqueId for dot and dotdot entries
  ksmbd: smbd: validate buffer descriptor structures
  ksmbd: fix SMB 3.11 posix extension mount failure

commit | commitdiff | tree

Linus Torvalds [Mon, 7 Feb 2022 20:10:35 +0000 (12:10 -0800)]

Merge tag 'ata-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ata fix from Damien Le Moal:
"A single patch from me, to fix a bug that is causing boot issues in
  the field (reports of problems with Fedora 35).

  The bug affects mostly old-ish drives that have issues with read log
  page command handling"

* tag 'ata-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata-core: Fix ata_dev_config_cpr()

commit | commitdiff | tree

Linus Torvalds [Mon, 7 Feb 2022 19:51:14 +0000 (11:51 -0800)]

Merge tag 'mmc-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Fix support for SD Power off notification

  MMC host:
   - moxart: Fix potential use-after-free on remove path
   - sdhci-of-esdhc: Fix error path when setting dma mask
   - sh_mmcif: Fix potential NULL pointer dereference"

* tag 'mmc-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  moxart: fix potential use-after-free on remove path
  mmc: core: Wait for command setting 'Power Off Notification' bit to complete
  mmc: sh_mmcif: Check for null res pointer
  mmc: sdhci-of-esdhc: Check for error num after setting mask

commit | commitdiff | tree

Linus Torvalds [Mon, 7 Feb 2022 17:55:14 +0000 (09:55 -0800)]

Merge tag 'integrity-v5.17-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
"Fixes for recently found bugs.

  One was found/noticed while reviewing IMA support for fsverity digests
  and signatures. Two of them were found/noticed while working on IMA
  namespacing. Plus two other bugs.

  All of them are for previous kernel releases"

* tag 'integrity-v5.17-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Do not print policy rule with inactive LSM labels
  ima: Allow template selection with ima_template[_fmt]= after ima_hash=
  ima: Remove ima_policy file before directory
  integrity: check the return value of audit_log_start()
  ima: fix reference leak in asymmetric_verify()

commit | commitdiff | tree

Damien Le Moal [Mon, 7 Feb 2022 02:27:53 +0000 (11:27 +0900)]

ata: libata-core: Fix ata_dev_config_cpr()

The concurrent positioning ranges log page 47h is a general purpose log
page and not a subpage of the identify device log. Using
ata_identify_page_supported() to test for concurrent positioning ranges
support is thus wrong. ata_log_supported() must be used.

Furthermore, unlike other advanced ATA features (e.g. NCQ priority),
accesses to the concurrent positioning ranges log page are not gated by
a feature bit from the device IDENTIFY data. Since many older drives
react badly to the READ LOG EXT and/or READ LOG DMA EXT commands issued
to read device log pages, avoid problems with older drives by limiting
the concurrent positioning ranges support detection to drives
implementing at least the ACS-4 ATA standard (major version 11). This
additional condition effectively turns ata_dev_config_cpr() into a nop
for older drives, avoiding problems in the field.
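
The resulting gating logic, reduced to a standalone sketch. The function
and constant names are made up, and the log-support result is passed in
as a flag instead of calling ata_log_supported(); only the "major version
11 = ACS-4" threshold and the 47h log page come from the text above:

```c
#include <stdbool.h>

#define SKETCH_ACS4_MAJOR_VERSION	11u	/* ACS-4, per the commit message */

/*
 * Decide whether to read the concurrent positioning ranges log (47h).
 * Older drives are skipped before any READ LOG command would be issued,
 * which is what turns the probe into a nop for them.
 */
static bool sketch_should_read_cpr_log(unsigned int major_version,
				       bool log_47h_supported)
{
	if (major_version < SKETCH_ACS4_MAJOR_VERSION)
		return false;

	return log_47h_supported;
}
```

The version check comes first on purpose: for a pre-ACS-4 drive the
answer does not depend on the log directory at all.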

Fixes: fe22e1c2f705 ("libata: support concurrent positioning ranges log")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215519
Cc: [email protected]
Reviewed-by: Hannes Reinecke <[email protected]>
Tested-by: Abderraouf Adjal <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 20:20:50 +0000 (12:20 -0800)]

Linux 5.17-rc3

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 18:34:45 +0000 (10:34 -0800)]

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
"Various bug fixes for ext4 fast commit and inline data handling.

  Also fix regression introduced as part of moving to the new mount API"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  fs/ext4: fix comments mentioning i_mutex
  ext4: fix incorrect type issue during replay_del_range
  jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
  ext4: fix potential NULL pointer dereference in ext4_fill_super()
  jbd2: refactor wait logic for transaction updates into a common function
  jbd2: cleanup unused functions declarations from jbd2.h
  ext4: fix error handling in ext4_fc_record_modified_inode()
  ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
  ext4: fix error handling in ext4_restore_inline_data()
  ext4: fast commit may miss file actions
  ext4: fast commit may not fallback for ineligible commit
  ext4: modify the logic of ext4_mb_new_blocks_simple
  ext4: prevent used blocks from being allocated during fast commit replay

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 18:18:23 +0000 (10:18 -0800)]

Merge tag 'perf-tools-fixes-for-v5.17-2022-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Fix display of grouped aliased events in 'perf stat'.

- Add missing branch_sample_type to perf_event_attr__fprintf().

- Apply correct label to user/kernel symbols in branch mode.

- Fix 'perf ftrace' system_wide tracing, it has to be set before
   creating the maps.

- Return error if procfs isn't mounted for PID namespaces when
   synthesizing records for pre-existing processes.

- Set error stream of objdump process for 'perf annotate' TUI, to avoid
   garbling the screen.

- Add missing arm64 support to perf_mmap__read_self(), the kernel part
   got into 5.17.

- Check for NULL pointer before dereference writing debug info about a
   sample.

- Update UAPI copies for asound, perf_event, prctl and kvm headers.

- Fix a typo in bpf_counter_cgroup.c.

* tag 'perf-tools-fixes-for-v5.17-2022-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf ftrace: system_wide collection is not effective by default
  libperf: Add arm64 support to perf_mmap__read_self()
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  perf stat: Fix display of grouped aliased events
  perf tools: Apply correct label to user/kernel symbols in branch mode
  perf bpf: Fix a typo in bpf_counter_cgroup.c
  perf synthetic-events: Return error if procfs isn't mounted for PID namespaces
  perf session: Check for NULL pointer before dereference
  perf annotate: Set error stream of objdump process for TUI
  perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  perf beauty: Make the prctl arg regexp more strict to cope with PR_SET_VMA
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync linux/perf_event.h with the kernel sources
  tools include UAPI: Sync sound/asound.h copy with the kernel sources

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 18:11:14 +0000 (10:11 -0800)]

Merge tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Intel/PT: filters could crash the kernel

- Intel: default disable the PMU for SMM, some new-ish EFI firmware has
   started using CPL3 and the PMU CPL filters don't discriminate against
   SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
   cycles.

- Fixup for perf_event_attr::sig_data

* tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix crash with stop filters in single-range mode
  perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
  selftests/perf_events: Test modification of perf_event_attr::sig_data
  perf: Copy perf_event_attr::sig_data on modification
  x86/perf: Default set FREEZE_ON_SMI for all

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 18:04:43 +0000 (10:04 -0800)]

Merge tag 'objtool_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Borislav Petkov:
"Fix a potential truncated string warning triggered by gcc12"

* tag 'objtool_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix truncated string warning

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 18:00:40 +0000 (10:00 -0800)]

Merge tag 'irq_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Borislav Petkov:
"Remove a bogus warning introduced by the recent PCI MSI irq affinity
  overhaul"

* tag 'irq_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Remove bogus warning in pci_irq_get_affinity()

commit | commitdiff | tree

Linus Torvalds [Sun, 6 Feb 2022 17:57:39 +0000 (09:57 -0800)]

Merge tag 'edac_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:
"Fix altera and xgene EDAC drivers to propagate the correct error code
  from platform_get_irq() so that deferred probing still works"

* tag 'edac_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/xgene: Fix deferred probing
  EDAC/altera: Fix deferred probing

commit | commitdiff | tree

Changbin Du [Thu, 27 Jan 2022 13:20:10 +0000 (21:20 +0800)]

perf ftrace: system_wide collection is not effective by default

The ftrace.target.system_wide must be set before invoking
evlist__create_maps(), otherwise it has no effect.

Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand")
Signed-off-by: Changbin Du <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Rob Herring [Tue, 1 Feb 2022 21:40:56 +0000 (15:40 -0600)]

libperf: Add arm64 support to perf_mmap__read_self()

Add the arm64 variants for read_perf_counter() and read_timestamp().
Unfortunately the counter number is encoded into the instruction, so the
code is a bit verbose to enumerate all possible counters.

Tested-by: Masayoshi Mizuma <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: John Garry <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Wed, 12 Feb 2020 14:04:23 +0000 (11:04 -0300)]

tools include UAPI: Sync sound/asound.h copy with the kernel sources

Picking the changes from:

  06feec6005c9d950 ("ASoC: hdmi-codec: Fix OOB memory accesses")

Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Dmitry Osipenko <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Takashi Iwai <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Ian Rogers [Sat, 5 Feb 2022 01:09:41 +0000 (17:09 -0800)]

perf stat: Fix display of grouped aliased events

An event may have a number of uncore aliases that when added to the
evlist are consecutive.

If there are multiple uncore events in a group then
parse_events__set_leader_for_uncore_aliase will reorder the evlist so
that events on the same PMU are adjacent.

The collect_all_aliases function assumes that aliases are in blocks so
that only the first counter is printed and all others are marked merged.

The reordering for groups breaks the assumption and so all counts are
printed.

This change removes the assumption from collect_all_aliases
that the events are in blocks and instead processes the entire evlist.

Before:

  ```
  $ perf stat -e '{UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE,UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE},duration_time' -a -A -- sleep 1

   Performance counter stats for 'system wide':

  CPU0                  256,866      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 494,413      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      967      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,738      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  285,161      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 429,920      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      955      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,443      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  310,753      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 416,657      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,231      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,573      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  416,067      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 405,966      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,481      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,447      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  312,911      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 408,154      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,086      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,380      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  333,994      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 370,349      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,287      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,335      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  188,107      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 302,423      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      701      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,070      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  307,221      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 383,642      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,036      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,158      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  318,479      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 821,545      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,028      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   2,550      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  227,618      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 372,272      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      903      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,456      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  376,783      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 419,827      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,406      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,453      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  286,583      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 429,956      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      999      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,436      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  313,867      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 370,159      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,114      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,291      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  342,083      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 409,111      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,399      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,684      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  365,828      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 376,037      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,378      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,411      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  382,456      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 621,743      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,232      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,955      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  342,316      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 385,067      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,176      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,268      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  373,588      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 386,163      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,394      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,464      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  381,206      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 546,891      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,266      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,712      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  221,176      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 392,069      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      831      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,456      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  355,401      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 705,595      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,235      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   2,216      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  371,436      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 428,103      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,306      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,442      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  384,352      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 504,200      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,468      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,860      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  228,856      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 287,976      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      832      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,060      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  215,121      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 334,162      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      681      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,026      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  296,179      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 436,083      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,084      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,525      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  262,296      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 416,573      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      986      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,533      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  285,852      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 359,842      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,073      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,326      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  303,379      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 367,222      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,008      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,156      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  273,487      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 425,449      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                      932      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,367      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  297,596      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 414,793      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,140      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,601      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  342,365      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 360,422      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,291      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,342      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  327,196      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 580,858      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,122      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   2,014      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  296,564      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 452,817      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,087      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,694      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  375,002      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 389,393      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,478      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   1,540      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0                  365,213      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36                 594,685      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                    1,401      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                   2,222      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0            1,000,749,060 ns   duration_time

         1.000749060 seconds time elapsed
  ```

After:

  ```
   Performance counter stats for 'system wide':

  CPU0               20,547,434      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU36              45,202,862      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
  CPU0                   82,001      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU36                 159,688      UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
  CPU0            1,000,464,828 ns   duration_time

         1.000464828 seconds time elapsed
  ```
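
The two listings differ in how aliased uncore events are displayed: the first prints one row per underlying counter, the second prints one merged row per CPU and event. A rough sketch of such a merge, assuming the merged value is a plain sum of the individual rows and that rows have the layout shown above; `merge_alias_counts` is a hypothetical helper for illustration, not code from perf:

```python
def merge_alias_counts(lines):
    """Sum perf-stat style rows that share the same (CPU, event) pair.

    Each row is expected to look like:
      "  CPU0    310,753      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE"
    Rows that do not match this layout are skipped.
    """
    merged = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or not parts[0].startswith("CPU"):
            continue
        # First field is the CPU, second the count, last the event name
        # (a unit such as "ns" may sit in between).
        cpu, raw_count, event = parts[0], parts[1], parts[-1]
        try:
            count = int(raw_count.replace(",", ""))
        except ValueError:
            continue
        key = (cpu, event)
        merged[key] = merged.get(key, 0) + count
    return merged


if __name__ == "__main__":
    sample = [
        "  CPU0                  310,753      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
        "  CPU0                  416,067      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
    ]
    for (cpu, event), total in merge_alias_counts(sample).items():
        print(f"{cpu:<8}{total:>16,}      {event}")
```

Note that the "Before" and "After" listings come from separate runs, so summing the rows of one will not reproduce the totals of the other.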

Fixes: 3cdc5c2cb924acb4 ("perf parse-events: Handle uncore event aliases in small groups properly")
Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Asaf Yaffe <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kshipra Bopardikar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vineet Singh <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

German Gomez [Wed, 26 Jan 2022 10:59:26 +0000 (10:59 +0000)]

perf tools: Apply correct label to user/kernel symbols in branch mode

In branch mode, the branch symbols were being displayed with incorrect
cpumode labels. So fix this.

For example, before:
  # perf record -b -a -- sleep 1
  # perf report -b

  Overhead  Command  Source Shared Object  Source Symbol               Target Symbol
     0.08%  swapper  [kernel.kallsyms]     [k] rcu_idle_enter          [k] cpuidle_enter_state
==> 0.08%  cmd0     [kernel.kallsyms]     [.] psi_group_change        [.] psi_group_change
     0.08%  cmd1     [kernel.kallsyms]     [k] psi_group_change        [k] psi_group_change

After:
  # perf report -b

  Overhead  Command  Source Shared Object  Source Symbol               Target Symbol
     0.08%  swapper  [kernel.kallsyms]     [k] rcu_idle_enter          [k] cpuidle_enter_state
     0.08%  cmd0     [kernel.kallsyms]     [k] psi_group_change        [k] psi_group_change
     0.08%  cmd1     [kernel.kallsyms]     [k] psi_group_change        [k] psi_group_change
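
The bracketed labels in the listings above encode the cpumode of each symbol. A simplified sketch of the mapping, assuming each end of a branch is labelled from its own cpumode; the constants mirror the cpumode values in include/uapi/linux/perf_event.h, while `level_label` and `branch_labels` are hypothetical names and perf's actual logic may take more state into account:

```python
# cpumode values (low bits of the perf record "misc" field).
PERF_RECORD_MISC_KERNEL = 1
PERF_RECORD_MISC_USER = 2
PERF_RECORD_MISC_HYPERVISOR = 3
PERF_RECORD_MISC_GUEST_KERNEL = 4
PERF_RECORD_MISC_GUEST_USER = 5

# Label characters as they appear in perf report output.
_LEVEL_CHARS = {
    PERF_RECORD_MISC_KERNEL: "k",
    PERF_RECORD_MISC_USER: ".",
    PERF_RECORD_MISC_GUEST_KERNEL: "g",
    PERF_RECORD_MISC_GUEST_USER: "u",
}


def level_label(cpumode):
    """Return the bracketed label for one address, e.g. "[k]" or "[.]"."""
    # Anything not listed above falls back to "H" in this sketch.
    return "[%s]" % _LEVEL_CHARS.get(cpumode, "H")


def branch_labels(from_cpumode, to_cpumode):
    """Label the source and target of a branch independently."""
    return level_label(from_cpumode), level_label(to_cpumode)


if __name__ == "__main__":
    # A kernel-to-kernel branch should show "[k]" on both sides.
    print(branch_labels(PERF_RECORD_MISC_KERNEL, PERF_RECORD_MISC_KERNEL))
```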

Reviewed-by: James Clark <[email protected]>
Signed-off-by: German Gomez <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Masanari Iida [Sat, 25 Dec 2021 00:55:58 +0000 (09:55 +0900)]

perf bpf: Fix a typo in bpf_counter_cgroup.c

This patch fixes a spelling typo in an error message.

Signed-off-by: Masanari Iida <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Leo Yan [Fri, 24 Dec 2021 12:40:13 +0000 (20:40 +0800)]

perf synthetic-events: Return error if procfs isn't mounted for PID namespaces

When recording, perf retrieves process info by iterating over the nodes
in proc fs.  Suppose we run perf in a non-root PID namespace with the
command:

  # unshare --fork --pid perf record -e cycles -a -- test_program

... in this case, the unshare command creates a child PID namespace and
launches the perf tool in it, but the proc fs is not mounted for the
non-root PID namespace, so the perf tool gathers process info from its
parent PID namespace.

We can use the command below to observe the process nodes under proc fs:

  # unshare --pid --fork ls /proc
1    137   1968  2128  3    342  48  62   78      crypto   kcore        net       uptime
10   138   2 2142  30   35 49  63   8      devices   keys        pagetypeinfo   version
11   139   20 2143  304  36 50  64   82      device-tree  key-users    partitions     vmallocinfo
12   14    2011  22    305  37 51  65   83      diskstats   kmsg        self       vmstat
128  140   2038  23    307  39 52  656  84      driver   kpagecgroup  slabinfo       zoneinfo
129  15    2074  24    309  4 53  67   9      execdomains  kpagecount   softirqs
13   16    2094  241   31   40 54  68   asound     fb   kpageflags   stat
130  164   2096  242   310  41 55  69   buddyinfo  filesystems  loadavg      swaps
131  17    2098  25    317  42 56  70   bus      fs   locks        sys
132  175   21 26    32   43 57  71   cgroups    interrupts   meminfo      sysrq-trigger
133  179   2102  263   329  44 58  75   cmdline    iomem   misc        sysvipc
134  1875  2103  27    330  45 59  76   config.gz  ioports   modules      thread-self
135  19    2117  29    333  46 6   77   consoles   irq   mounts       timer_list
136  1941  2121  298   34   47 60  773  cpuinfo    kallsyms   mtd        tty

The output shows many existing tasks: since the unshare command has not
mounted the proc fs for the newly created PID namespace, it still
accesses the proc fs of the root PID namespace.  This leads to two
prominent issues:

- Firstly, PID values are mismatched between thread info and samples.
  The gathered thread info comes from the proc fs of the root PID
  namespace, but samples record their PIDs from the child PID namespace.

- Secondly, the profiled program 'test_program' returns its forked PID
  number from the child PID namespace, and the perf tool wrongly uses
  this PID number to retrieve the process info via the proc fs of the
  root PID namespace.

To avoid these issues, we need to mount proc fs for the child PID
namespace with the option '--mount-proc' when using the unshare command:

  # unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program

Conversely, when the proc fs of the root PID namespace is used by a
child namespace, the perf tool can detect the multiple PID levels and
nsinfo__is_in_root_namespace() returns false; this patch reports an
error for this case:

  # unshare --fork --pid perf record -e cycles -a -- test_program
  Couldn't synthesize bpf events.
  Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
  Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
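
One way to picture the "multiple PID levels" check is through the NSpid line of /proc/self/status, which lists a task's PID in each nested PID namespace, starting from the namespace that the proc fs was mounted in. The sketch below is an illustration under that assumption and is not perf's C implementation; `nspid_levels` and `proc_matches_pid_namespace` are hypothetical names:

```python
def nspid_levels(status_text):
    """Return the PIDs on the NSpid line of a /proc/<pid>/status dump.

    The leftmost value is the PID as seen from the namespace the proc fs
    was mounted in; further values belong to nested child namespaces.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(tok) for tok in line.split()[1:]]
    # Kernels without NSpid support: nothing to report.
    return []


def proc_matches_pid_namespace(status_text):
    """True if the proc fs appears to belong to the caller's PID namespace.

    More than one NSpid entry means the proc fs comes from an ancestor
    namespace, which is the situation described above. A missing NSpid
    line is treated as "matches" in this sketch.
    """
    return len(nspid_levels(status_text)) <= 1


if __name__ == "__main__":
    with open("/proc/self/status") as f:
        text = f.read()
    if not proc_matches_pid_namespace(text):
        print("proc fs belongs to a parent PID namespace; "
              "consider 'unshare --mount-proc'")
```

Run under `unshare --fork --pid` without `--mount-proc`, the NSpid line would be expected to carry two entries, so the second function returns False.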

Reviewed-by: James Clark <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Ameer Hamza [Tue, 25 Jan 2022 12:11:41 +0000 (17:11 +0500)]

perf session: Check for NULL pointer before dereference

Move NULL pointer check before dereferencing the variable.

Addresses-Coverity: 1497622 ("Dereference before null check")
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Namhyung Kim [Wed, 2 Feb 2022 07:08:25 +0000 (23:08 -0800)]

perf annotate: Set error stream of objdump process for TUI

The stderr should be set to a pipe when using TUI. Otherwise it'd
print to stdout and break TUI windows with an error message.
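
The general technique, giving a child process a pipe for its error stream so that its messages cannot land on the terminal the UI is drawing on, can be sketched as follows. This is a generic Python illustration rather than perf's C code, and `run_captured` is a hypothetical helper:

```python
import subprocess
import sys


def run_captured(argv):
    """Run a command with stdout and stderr both redirected to pipes.

    Nothing the child writes reaches the caller's terminal; the caller
    decides later whether and where to show the captured text.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # The child writes to stderr, but the message is only captured here.
    child = [sys.executable, "-c", "import sys; sys.stderr.write('boom')"]
    rc, out, err = run_captured(child)
    print("exit status:", rc, "captured stderr:", repr(err))
```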

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Anshuman Khandual [Wed, 2 Feb 2022 10:57:23 +0000 (16:27 +0530)]

perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()

This updates the branch sample type with the missing PERF_SAMPLE_BRANCH_TYPE_SAVE.

Suggested-by: James Clark <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: James Clark <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Arnaldo Carvalho de Melo [Sun, 9 May 2021 12:39:02 +0000 (09:39 -0300)]

tools headers UAPI: Sync linux/kvm.h with the kernel sources

To pick the changes in:

  f6c6804c43fa18d3 ("kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/; a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Janosch Frank <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Arnaldo Carvalho de Melo [Sun, 6 Feb 2022 11:28:34 +0000 (08:28 -0300)]

Merge remote-tracking branch 'torvalds/master' into perf/urgent

To check if more kernel API sync is needed and also to see if the perf
build tests continue to pass.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Linus Torvalds [Sat, 5 Feb 2022 18:40:17 +0000 (10:40 -0800)]

Merge tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- documentation fixes related to Xen

- enable x2apic mode when available when running as hardware
   virtualized guest under Xen

- cleanup and fix a corner case of vcpu enumeration when running a
   paravirtualized Xen guest

* tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/Xen: streamline (and fix) PV CPU enumeration
  xen: update missing ioctl magic numers documentation
  Improve docs for IOCTL_GNTDEV_MAP_GRANT_REF
  xen: xenbus_dev.h: delete incorrect file name
  xen/x2apic: enable x2apic mode when supported for HVM

Linus Torvalds [Sat, 5 Feb 2022 17:55:59 +0000 (09:55 -0800)]

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:

   - A couple of fixes when handling an exception while a SError has
     been delivered

   - Workaround for Cortex-A510's single-step erratum

  RISC-V:

   - Make CY, TM, and IR counters accessible in VU mode

   - Fix SBI implementation version

  x86:

   - Report deprecation of x87 features in supported CPUID

   - Preparation for fixing an interrupt delivery race on AMD hardware

   - Sparse fix

  All except POWER and s390:

   - Rework guest entry code to correctly mark noinstr areas and fix
     vtime accounting (for x86, this was already mostly correct but not
     entirely; for ARM, MIPS and RISC-V it wasn't)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
  KVM: x86: Report deprecated x87 features in supported CPUID
  KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
  KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
  KVM: arm64: Avoid consuming a stale esr value when SError occur
  RISC-V: KVM: Fix SBI implementation version
  RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
  kvm/riscv: rework guest entry logic
  kvm/arm64: rework guest entry logic
  kvm/x86: rework guest entry logic
  kvm/mips: rework guest entry logic
  kvm: add guest_state_{enter,exit}_irqoff()
  KVM: x86: Move delivery of non-APICv interrupt into vendor code
  kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h

Linus Torvalds [Sat, 5 Feb 2022 17:21:55 +0000 (09:21 -0800)]

Merge tag 'xfs-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
"I was auditing operations in XFS that clear file privileges, and
  realized that XFS' fallocate implementation drops suid/sgid but
  doesn't clear file capabilities the same way that file writes and
  reflink do.

  There are VFS helpers that do it correctly, so refactor XFS to use
  them. I also noticed that we weren't flushing the log at the correct
  point in the fallocate operation, so that's fixed too.

  Summary:

   - Fix fallocate so that it drops all file privileges when files are
     modified instead of open-coding that incompletely.

   - Fix fallocate to flush the log if the caller wanted synchronous
     file updates"

* tag 'xfs-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: ensure log flush at the end of a synchronous fallocate call
  xfs: move xfs_update_prealloc_flags() to xfs_pnfs.c
  xfs: set prealloc flag in xfs_alloc_file_space()
  xfs: fallocate() should call file_modified()
  xfs: remove XFS_PREALLOC_SYNC
  xfs: reject crazy array sizes being fed to XFS_IOC_GETBMAP*

Linus Torvalds [Sat, 5 Feb 2022 17:13:51 +0000 (09:13 -0800)]

Merge tag 'vfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull vfs fixes from Darrick Wong:
"I was auditing the sync_fs code paths recently and noticed that most
  callers of ->sync_fs ignore its return value (and many implementations
  never return nonzero even if the fs is broken!), which means that
  internal fs errors and corruption are not passed up to userspace
  callers of syncfs(2) or FIFREEZE. Hence fixing the common code and
  XFS, and I'll start working on the ext4/btrfs folks if this is merged.

  Summary:

   - Fix a bug where callers of ->sync_fs (e.g. sync_filesystem and
     syncfs(2)) ignore the return value.

   - Fix a bug where callers of sync_filesystem (e.g. fs freeze) ignore
     the return value.

   - Fix a bug in XFS where xfs_fs_sync_fs never passed back error
     returns"

* tag 'vfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: return errors in xfs_fs_sync_fs
  quota: make dquot_quota_sync return errors from ->sync_fs
  vfs: make sync_filesystem return errors from ->sync_fs
  vfs: make freeze_super abort when sync_filesystem returns error

Linus Torvalds [Sat, 5 Feb 2022 17:04:43 +0000 (09:04 -0800)]

Merge tag 'iomap-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap fix from Darrick Wong:
"A single bugfix for iomap.

  The fix should eliminate occasional complaints about stall warnings
  when a lot of writeback IO completes all at once and we have to then
  go clearing status on a large number of folios.

  Summary:

   - Limit the length of ioend chains in writeback so that we don't trip
     the softlockup watchdog and to limit long tail latency on clearing
     PageWriteback"

* tag 'iomap-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs, iomap: limit individual ioend chain lengths in writeback

Paolo Bonzini [Sat, 5 Feb 2022 05:58:25 +0000 (00:58 -0500)]

Merge tag 'kvmarm-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #2

- A couple of fixes when handling an exception while a SError has been
delivered

- Workaround for Cortex-A510's single-step erratum

Linus Torvalds [Sat, 5 Feb 2022 00:28:11 +0000 (16:28 -0800)]

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Some medium sized bugs in the various drivers. A couple are more
  recent regressions:

   - Fix two panics in hfi1 and two allocation problems

   - Send the IGMP to the correct address in cma

   - Squash a syzkaller bug related to races reading the multicast list

   - Memory leak in siw and cm

   - Fix a corner case spec compliance for HFI/QIB

   - Correct the implementation of fences in siw

   - Error unwind bug in mlx4"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/mlx4: Don't continue event handler after memory allocation failure
  RDMA/siw: Fix broken RDMA Read Fence/Resume logic.
  IB/rdmavt: Validate remote_addr during loopback atomic tests
  IB/cm: Release previously acquired reference counter in the cm_id_priv
  RDMA/siw: Fix refcounting leak in siw_create_qp()
  RDMA/ucma: Protect mc during concurrent multicast leaves
  RDMA/cma: Use correct address when leaving multicast group
  IB/hfi1: Fix tstats alloc and dealloc
  IB/hfi1: Fix AIP early init panic
  IB/hfi1: Fix alloc failure with larger txqueuelen
  IB/hfi1: Fix panic with larger ipoib send_queue_size

Linus Torvalds [Fri, 4 Feb 2022 23:27:45 +0000 (15:27 -0800)]

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Seven fixes, six of which are fairly obvious driver fixes.

  The one core change to the device budget depth is to try to ensure
  that if the default depth is large (which can produce quite a sizeable
  bitmap allocation per device), we give back the memory we don't need
  if there's a queue size reduction in slave_configure (which happens to
  a lot of devices)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: hisi_sas: Fix setting of hisi_sas_slot.is_internal
  scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task
  scsi: pm8001: Fix use-after-free for aborted TMF sas_task
  scsi: pm8001: Fix warning for undescribed param in process_one_iomb()
  scsi: core: Reallocate device's budget map on queue depth change
  scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe
  scsi: pm80xx: Fix double completion for SATA devices
