]> Git Repo - linux.git/log
linux.git
4 years agoMerge tag 'linux-kselftest-kunit-fixes-5.11-rc3' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 9 Jan 2021 01:18:50 +0000 (17:18 -0800)]
Merge tag 'linux-kselftest-kunit-fixes-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "One fix to force the use of the 'tty' console for UML.

  Given that kunit tool requires the console output, explicitly stating
  the dependency makes sense than relying on it being the default"

* tag 'linux-kselftest-kunit-fixes-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Force the use of the 'tty' console for UML

4 years agoMerge tag 'linux-kselftest-next-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Sat, 9 Jan 2021 01:13:52 +0000 (17:13 -0800)]
Merge tag 'linux-kselftest-next-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "Two minor fixes to vDSO test changes in this merge window"

* tag 'linux-kselftest-next-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vDSO: fix -Wformat warning in vdso_test_correctness
  selftests/vDSO: add additional binaries to .gitignore

4 years agoMerge tag 'docs-5.11-3' of git://git.lwn.net/linux
Linus Torvalds [Sat, 9 Jan 2021 01:01:28 +0000 (17:01 -0800)]
Merge tag 'docs-5.11-3' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "A handful of relatively small documentation fixes"

* tag 'docs-5.11-3' of git://git.lwn.net/linux:
  docs: admin-guide: bootconfig: Fix feils to fails
  Documentation/admin-guide: kernel-parameters: hyphenate comma-separated
  docs: binfmt-misc: Fix .rst formatting
  docs: remove mention of ENABLE_MUST_CHECK
  atomic: remove further references to atomic_ops
  Documentation: doc-guide: fixes to sphinx.rst
  docs/mm: concepts.rst: Correct the threshold to low watermark
  Documentation: admin: early_param()s are also listed in kernel-parameters
  docs: Fix reST markup when linking to sections

4 years agoMerge tag 'devprop-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 8 Jan 2021 23:45:47 +0000 (15:45 -0800)]
Merge tag 'devprop-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull device properties framework fixes from Rafael Wysocki:
 "Revert a problematic commit that went in during the 5.10 cycle and
  improve the kerneldoc description of the function affected by it (both
  changes from Bard Liao)"

* tag 'devprop-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: add description of fwnode cases
  Revert "device property: Keep secondary firmware node secondary by type"

4 years agoMerge tag 'acpi-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 8 Jan 2021 23:42:56 +0000 (15:42 -0800)]
Merge tag 'acpi-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These address two build issues and drop confusing text from a couple
  of Kconfig entries.

  Specifics:

   - Drop two local variables that are never read and the code updating
     their values from the x86 suspend-to-idle code (Rafael Wysocki)

   - Add empty stub of an ACPI helper function to avoid build issues
     when CONFIG_ACPI is not set (Shawn Guo)

   - Remove confusing text regarding modules from Kconfig entries that
     correspond to non-modular code (Peter Robinson)"

* tag 'acpi-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: Update Kconfig help text for items that are no longer modular
  ACPI: scan: add stub acpi_create_platform_device() for !CONFIG_ACPI
  ACPI: PM: s2idle: Drop unused local variables and related code

4 years agoMerge tag 'pm-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 8 Jan 2021 23:39:25 +0000 (15:39 -0800)]
Merge tag 'pm-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These address two issues in the intel_pstate driver and one in the
  powernow-k8 cpufreq driver.

  Specifics:

   - Make the powernow-k8 cpufreq driver avoid calling
     cpufreq_cpu_get(), which theoretically may return NULL, to get a
     policy pointer that is known to it already (Colin Ian King)

   - Drop two functions that are not used any more from the intel_pstate
     driver (Lukas Bulwahn)

   - Make intel_pstate check the HWP capabilities to get the maximum
     available P-state in the passive mode to avoid using a stale value
     of it in case of out-of-band updates (Rafael Wysocki)"

* tag 'pm-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: remove obsolete functions
  cpufreq: powernow-k8: pass policy rather than use cpufreq_cpu_get()
  cpufreq: intel_pstate: Use HWP capabilities in intel_cpufreq_adjust_perf()

4 years agoMerge tag 'drm-fixes-2021-01-08' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 8 Jan 2021 23:12:08 +0000 (15:12 -0800)]
Merge tag 'drm-fixes-2021-01-08' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Daniel Vetter:
 "Looks like people are back from the break, usual small pile of fixes
  all over. Next week Dave should be back.

  The only thing pending I'm aware of is a "this shouldn't have become
  uapi" reverts for amdgpu, but they're already on the list and not that
  important really so can wait another week.

  Summary:

   - fix for ttm list corruption in radeon, reported by a few people

   - fixes for amdgpu, i915, msm

   - dma-buf use-after free fix"

* tag 'drm-fixes-2021-01-08' of git://anongit.freedesktop.org/drm/drm: (29 commits)
  drm/msm: Only enable A6xx LLCC code on A6xx
  drm/msm: Add modparam to allow vram carveout
  drm/msm: Call msm_init_vram before binding the gpu
  drm/msm/dp: postpone irq_hpd event during connection pending state
  drm/ttm: unexport ttm_pool_init/fini
  drm/radeon: stop re-init the TTM page pool
  dmabuf: fix use-after-free of dmabuf's file->f_inode
  Revert "drm/amd/display: Fix memory leaks in S3 resume"
  drm/amdgpu/display: drop DCN support for aarch64
  drm/amdgpu: enable ras eeprom support for sienna cichlid
  drm/amdgpu: fix no bad_pages issue after umc ue injection
  drm/amdgpu: fix potential memory leak during navi12 deinitialization
  drm/amd/display: Fix unused variable warning
  drm/amd/pm: improve the fine grain tuning function for RV/RV2/PCO
  drm/amd/pm: fix the failure when change power profile for renoir
  drm/amdgpu: fix a GPU hang issue when remove device
  drm/amdgpu: fix a memory protection fault when remove amdgpu device
  drm/amdgpu: switched to cached noretry setting for vangogh
  drm/amd/display: fix sysfs amdgpu_current_backlight_pwm NULL pointer issue
  drm/amd/pm: updated PM to I2C controller port on sienna cichlid
  ...

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 8 Jan 2021 23:06:02 +0000 (15:06 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86:
   - Fixes for the new scalable MMU
   - Fixes for migration of nested hypervisors on AMD
   - Fix for clang integrated assembler
   - Fix for left shift by 64 (UBSAN)
   - Small cleanups
   - Straggler SEV-ES patch

  ARM:
   - VM init cleanups
   - PSCI relay cleanups
   - Kill CONFIG_KVM_ARM_PMU
   - Fixup __init annotations
   - Fixup reg_to_encoding()
   - Fix spurious PMCR_EL0 access

  Misc:
   - selftests cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (38 commits)
  KVM: x86: __kvm_vcpu_halt can be static
  KVM: SVM: Add support for booting APs in an SEV-ES guest
  KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  KVM: nSVM: mark vmcb as dirty when forcingly leaving the guest mode
  KVM: nSVM: correctly restore nested_run_pending on migration
  KVM: x86/mmu: Clarify TDP MMU page list invariants
  KVM: x86/mmu: Ensure TDP MMU roots are freed after yield
  kvm: check tlbs_dirty directly
  KVM: x86: change in pv_eoi_get_pending() to make code more readable
  MAINTAINERS: Really update email address for Sean Christopherson
  KVM: x86: fix shift out of bounds reported by UBSAN
  KVM: selftests: Implement perf_test_util more conventionally
  KVM: selftests: Use vm_create_with_vcpus in create_vm
  KVM: selftests: Factor out guest mode code
  KVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c
  KVM: SVM: Add register operand to vmsave call in sev_es_vcpu_load
  KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
  KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
  KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
  KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
  ...

4 years agoMerge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 8 Jan 2021 22:55:41 +0000 (14:55 -0800)]
Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull iommu fixes from Will Deacon:
 "This is mainly all Intel VT-D stuff, but there are some fixes for AMD
  and ARM as well.

  We've also got the revert I promised during the merge window, which
  removes a temporary hack to accomodate i915 while we transitioned the
  Intel IOMMU driver over to the common DMA-IOMMU API.

  Finally, there are still a couple of other VT-D fixes floating around,
  so I expect to send you another batch of fixes next week.

  Summary:

   - Fix VT-D TLB invalidation for subdevices

   - Fix VT-D use-after-free on subdevice detach

   - Fix VT-D locking so that IRQs are disabled during SVA bind/unbind

   - Fix VT-D address alignment when flushing IOTLB

   - Fix memory leak in VT-D IRQ remapping failure path

   - Revert temporary i915 sglist hack now that it is no longer required

   - Fix sporadic boot failure with Arm SMMU on Qualcomm SM8150

   - Fix NULL dereference in AMD IRQ remapping code with remapping disabled

   - Fix accidental enabling of irqs on AMD resume-from-suspend path

   - Fix some typos in comments"

* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
  iommu/vt-d: Fix general protection fault in aux_detach_device()
  iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
  iommu/arm-smmu-qcom: Initialize SCTLR of the bypass context
  iommu/vt-d: Fix lockdep splat in sva bind()/unbind()
  Revert "iommu: Add quirk for Intel graphic devices in map_sg"
  iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb()
  iommu/amd: Stop irq_remapping_select() matching when remapping is disabled
  iommu/amd: Set iommu->int_enabled consistently when interrupts are set up
  iommu/intel: Fix memleak in intel_irq_remapping_alloc
  iommu/iova: fix 'domain' typos

4 years agoALSA: usb-audio: Fix implicit feedback sync setup for Pioneer devices
Takashi Iwai [Fri, 8 Jan 2021 07:52:19 +0000 (08:52 +0100)]
ALSA: usb-audio: Fix implicit feedback sync setup for Pioneer devices

Pioneer devices have both playback and capture streams sharing the
same iface/altsetting, and those need to be paired as implicit
feedback.  Instead of a half-baked (and broken) static quirk entry,
set up more generically for those devices by checking the number of
endpoints and the attribute of the secondary EP.

Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Reported-by: František Kučera <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoALSA: usb-audio: Annotate the endpoint index in audioformat
Takashi Iwai [Fri, 8 Jan 2021 07:52:18 +0000 (08:52 +0100)]
ALSA: usb-audio: Annotate the endpoint index in audioformat

There are devices that have multiple endpoints sharing the same
iface/altset not only for sync but also for the actual streams, and
the audioformat for such an endpoint needs to be handled with the
proper endpoint index; otherwise it confuses the endpoint management.

This patch extends the audioformat to annotate the endpoint index, and
put the proper ep_idx=1 to Pioneer device quirk entries accordingly.

Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoALSA: usb-audio: Avoid unnecessary interface re-setup
Takashi Iwai [Fri, 8 Jan 2021 07:52:17 +0000 (08:52 +0100)]
ALSA: usb-audio: Avoid unnecessary interface re-setup

The current endpoint handling assumed (more or less) a unique 1:1
relation between the endpoint and the iface/altset.  The exception was
the sync EP without the implicit feedback which has usually the
secondary EP of the same altset.  This works fine for most devices,
but it turned out that some unusual devices like Pinoeer's ones have
both playback and capture endpoints in the same iface/altsetting and
use both for the implicit feedback mode.  For handling such a case, we
need to extend the endpoint management to take the shared interface
into account.

This patch does that: it adds a new object snd_usb_iface_ref for
managing the reference counts of the each USB interface that is used
by each endpoint.  The interface setup is performed only once for the
(sharing) endpoints, and the doubly initialization is avoided.

Along with this, the resource release of endpoints and interface
refcounts are put into a single function, snd_usb_endpoint_free_all()
instead of looping in the caller side.

Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoALSA: usb-audio: Choose audioformat of a counter-part substream
Takashi Iwai [Fri, 8 Jan 2021 07:52:16 +0000 (08:52 +0100)]
ALSA: usb-audio: Choose audioformat of a counter-part substream

The implicit feedback mode needs to handle two endpoints and the
choice of the audioformat object for the sync EP is important since
this determines the compatibility of the hw_params.  The current code
uses the same audioformat object if both the main EP and the sync EP
point to the same iface/altsetting.  This was done in consideration of
the non-implicit-fb sync EP handling, and it doesn't match well with
the cases where actually to endpoints are defined in the sameiface /
altsetting like a few Pioneer devices.

Modify snd_usb_find_implicit_fb_sync_format() to pick up the
audioformat that is assigned in the counter-part substreams primarily,
so that the actual capture stream can be opened properly.  We keep the
same audioformat object only as a fallback in case nothing found,
though.

Fixes: 9fddc15e8039 ("ALSA: usb-audio: Factor out the implicit feedback quirk code")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoALSA: usb-audio: Fix the missing endpoints creations for quirks
Takashi Iwai [Fri, 8 Jan 2021 07:52:15 +0000 (08:52 +0100)]
ALSA: usb-audio: Fix the missing endpoints creations for quirks

The recent change in the endpoint management moved the endpoint object
creation from the stream open time to the parser of the audio
descriptor.  It works fine for the standard audio, but it overlooked
the other places that create audio streams via quirks
(QUIRK_AUDIO_FIXED_ENDPOINT) like the reported a few Pioneer devices;
those call snd_usb_add_audio_stream() manually, hence they miss the
endpoints, eventually resulting in the error at opening streams.
Moreover, now the sync EP setup was moved to the explicit call of
snd_usb_audioformat_set_sync_ep(), and this needs to be added for
those places, too.

This patch addresses those regressions for quirks.  It adds a local
helper function add_audio_stream_from_fixed_fmt(), which does the all
needed tasks, and replaces the calls of snd_usb_add_audio_stream()
with this new function.

Fixes: 54cb31901b83 ("ALSA: usb-audio: Create endpoint objects at parsing phase")
Reported-by: František Kučera <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoMerge tag 'arm-fixes-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 8 Jan 2021 22:13:54 +0000 (14:13 -0800)]
Merge tag 'arm-fixes-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "These are a small number of bug fixes that all came in before or
  during the merge window, most for the omap platform:

   - One boot regression fix for Nokia N9 (OMAP3).

   - Two small defconfig changes for omap2, to reflect changes in
     drivers

   - Warning fixes for DT issues on omap2, picoxcell and bitmap SoCs.

     The picoxcell platform will be removed in v5.12, but fixing it
     first makes it easier to backport to the fix to stable kernels and
     get a clean build with new dtc versions"

* tag 'arm-fixes-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: picoxcell: fix missing interrupt-parent properties
  ARM: dts: ux500/golden: Set display max brightness
  arm64: dts: bitmain: Use generic "ngpios" rather than "snps,nr-gpios"
  ARM: omap2: pmic-cpcap: fix maximum voltage to be consistent with defaults on xt875
  ARM: omap2plus_defconfig: enable SPI GPIO
  ARM: OMAP2+: omap_device: fix idling of devices during probe
  ARM: dts: OMAP3: disable AES on N950/N9
  ARM: omap2plus_defconfig: drop unused POWER_AVS option

4 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 8 Jan 2021 22:11:34 +0000 (14:11 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Clean-ups following the merging window: remove unused variable,
   duplicate includes, superfluous barrier, move some inline asm to
   separate functions.

 - Disable top-byte-ignore on kernel code addresses with KASAN/MTE
   enabled (already done when MTE is disabled).

 - Fix ARCH_LOW_ADDRESS_LIMIT definition with CONFIG_ZONE_DMA disabled.

 - Compiler/linker flags: link with "-z norelno", discard .eh_frame_hdr
   instead of --no-eh-frame-hdr.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Move PSTATE.TCO setting to separate functions
  arm64: kasan: Set TCR_EL1.TBID1 when KASAN_HW_TAGS is enabled
  arm64: vdso: disable .eh_frame_hdr via /DISCARD/ instead of --no-eh-frame-hdr
  arm64: traps: remove duplicate include statement
  arm64: link with -z norelro for LLD or aarch64-elf
  arm64: mm: Fix ARCH_LOW_ADDRESS_LIMIT when !CONFIG_ZONE_DMA
  arm64: mte: remove an ISB on kernel exit
  arm64/smp: Remove unused irq variable in arch_show_interrupts()

4 years agoARC: [hsdk]: Enable FPU_SAVE_RESTORE
Vineet Gupta [Fri, 8 Jan 2021 21:46:55 +0000 (13:46 -0800)]
ARC: [hsdk]: Enable FPU_SAVE_RESTORE

HSDK has hardware floating point and the common use case is with
glibc+hf so enable that as default.

Signed-off-by: Vineet Gupta <[email protected]>
4 years agodm integrity: fix flush with external metadata device
Mikulas Patocka [Fri, 8 Jan 2021 16:15:56 +0000 (11:15 -0500)]
dm integrity: fix flush with external metadata device

With external metadata device, flush requests are not passed down to the
data device.

Fix this by submitting the flush request in dm_integrity_flush_buffers. In
order to not degrade performance, we overlap the data device flush with
the metadata device flush.

Reported-by: Lukas Straub <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>
4 years agodm: eliminate potential source of excessive kernel log noise
Mike Snitzer [Wed, 6 Jan 2021 23:19:05 +0000 (18:19 -0500)]
dm: eliminate potential source of excessive kernel log noise

There wasn't ever a real need to log an error in the kernel log for
ioctls issued with insufficient permissions. Simply return an error
and if an admin/user is sufficiently motivated they can enable DM's
dynamic debugging to see an explanation for why the ioctls were
disallowed.

Reported-by: Nir Soffer <[email protected]>
Fixes: e980f62353c6 ("dm: don't allow ioctls to targets that don't map to whole devices")
Signed-off-by: Mike Snitzer <[email protected]>
4 years agoMerge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 8 Jan 2021 20:12:30 +0000 (12:12 -0800)]
Merge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull more networking fixes from Jakub Kicinski:
 "Slightly lighter pull request to get back into the Thursday cadence.

  Current release - always broken:

   - can: mcp251xfd: fix Tx/Rx ring buffer driver race conditions

   - dsa: hellcreek: fix led_classdev build errors

  Previous releases - regressions:

   - ipv6: fib: flush exceptions when purging route to avoid netdev
     reference leak

   - ip_tunnels: fix pmtu check in nopmtudisc mode

   - ip: always refragment ip defragmented packets to avoid MTU issues
     when forwarding through tunnels, correct "packet too big" message
     is prohibitively tricky to generate

   - s390/qeth: fix locking for discipline setup / removal and during
     recovery to prevent both deadlocks and races

   - mlx5: Use port_num 1 instead of 0 when delete a RoCE address

  Previous releases - always broken:

   - cdc_ncm: correct overhead calculation in delayed_ndp_size to
     prevent out of bound accesses with Huawei 909s-120 LTE module

   - fix stmmac dwmac-sun8i suspend/resume:
           - PHY being left powered off
           - MAC syscon configuration being reset
           - reference to the reset controller being improperly dropped

   - qrtr: fix null-ptr-deref in qrtr_ns_remove

   - can: tcan4x5x: fix bittiming const, use common bittiming from m_can
     driver

   - mlx5e: CT: Use per flow counter when CT flow accounting is enabled

   - mlx5e: Fix SWP offsets when vlan inserted by driver

  Misc:

   - bpf: Fix a task_iter bug caused by a bpf -> net merge conflict
     resolution

  And the usual many fixes to various error paths"

* tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
  s390/qeth: fix L2 header access in qeth_l3_osa_features_check()
  s390/qeth: fix locking for discipline setup / removal
  s390/qeth: fix deadlock during recovery
  selftests: fib_nexthops: Fix wrong mausezahn invocation
  nexthop: Bounce NHA_GATEWAY in FDB nexthop groups
  nexthop: Unlink nexthop group entry in error path
  nexthop: Fix off-by-one error in error path
  octeontx2-af: fix memory leak of lmac and lmac->name
  chtls: Fix chtls resources release sequence
  chtls: Added a check to avoid NULL pointer dereference
  chtls: Replace skb_dequeue with skb_peek
  chtls: Avoid unnecessary freeing of oreq pointer
  chtls: Fix panic when route to peer not configured
  chtls: Remove invalid set_tcb call
  chtls: Fix hardware tid leak
  net: ip: always refragment ip defragmented packets
  net: fix pmtu check in nopmtudisc mode
  selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking
  docs: octeontx2: tune rst markup
  ...

4 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 8 Jan 2021 20:05:11 +0000 (12:05 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes a functional bug in arm/chacha-neon as well as a potential
  buffer overflow in ecdh"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: ecdh - avoid buffer overflow in ecdh_set_secret()
  crypto: arm/chacha-neon - add missing counter increment

4 years agopoll: fix performance regression due to out-of-line __put_user()
Linus Torvalds [Thu, 7 Jan 2021 17:43:54 +0000 (09:43 -0800)]
poll: fix performance regression due to out-of-line __put_user()

The kernel test robot reported a -5.8% performance regression on the
"poll2" test of will-it-scale, and bisected it to commit d55564cfc222
("x86: Make __put_user() generate an out-of-line call").

I didn't expect an out-of-line __put_user() to matter, because no normal
core code should use that non-checking legacy version of user access any
more.  But I had overlooked the very odd poll() usage, which does a
__put_user() to update the 'revents' values of the poll array.

Now, Al Viro correctly points out that instead of updating just the
'revents' field, it would be much simpler to just copy the _whole_
pollfd entry, and then we could just use "copy_to_user()" on the whole
array of entries, the same way we use "copy_from_user()" a few lines
earlier to get the original values.

But that is not what we've traditionally done, and I worry that threaded
applications might be concurrently modifying the other fields of the
pollfd array.  So while Al's suggestion is simpler - and perhaps worth
trying in the future - this instead keeps the "just update revents"
model.

To fix the performance regression, use the modern "unsafe_put_user()"
instead of __put_user(), with the proper "user_write_access_begin()"
guarding in place. This improves code generation enormously.

Link: https://lore.kernel.org/lkml/20210107134723.GA28532@xsang-OptiPlex-9020/
Reported-by: kernel test robot <[email protected]>
Tested-by: Oliver Sang <[email protected]>
Cc: Al Viro <[email protected]>
Cc: David Laight <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoRevert "init/console: Use ttynull as a fallback when there is no console"
Petr Mladek [Fri, 8 Jan 2021 11:48:47 +0000 (12:48 +0100)]
Revert "init/console: Use ttynull as a fallback when there is no console"

This reverts commit 757055ae8dedf5333af17b3b5b4b70ba9bc9da4e.

The commit caused that ttynull was used as the default console
on several systems[1][2][3]. As a result, the console was
blank even when a better alternative existed.

It happened when there was no console configured
on the command line and ttynull_init() was the first initcall
calling register_console().

Or it happened when /dev/ did not exist when console_on_rootfs()
was called. It was not able to open /dev/console even though
a console driver was registered. It tried to add ttynull console
but it obviously did not help. But ttynull became the preferred
console and was used by /dev/console when it was available later.

The commit tried to fix a historical problem that have been there
for ages. The primary motivation was the commit 3cffa06aeef7ece30f6
("printk/console: Allow to disable console output by using console=""
 or console=null"). It provided a clean solution for a workaround
 that was widely used and worked only by chance.

This revert causes that the console="" or console=null command line
options will again work only by chance. These options will cause that
a particular console will be preferred and the default (tty) ones
will not get enabled. There will be no console registered at
all. As a result there won't be stdin, stdout, and stderr for
the init process. But it worked exactly this way even before.

The proper solution has to fulfill many conditions:

  + Register ttynull only when explicitly required or as
    the ultimate fallback.

  + ttynull should get associated with /dev/console but it must
    not become preferred console when used as a fallback.
    Especially, it must still be possible to replace it
    by a better console later.

Such a change requires clean up of the register_console() code.
Otherwise, it would be even harder to follow. Especially, the use
of has_preferred_console and CON_CONSDEV flag is tricky. The clean
up is risky. The ordering of consoles is not well defined. And
any changes tend to break existing user settings.

Do the revert at the least risky solution for now.

[1] https://lore.kernel.org/linux-kselftest/20201221144302[email protected]/
[2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e191@synopsys.com/
[3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128[email protected]/

Reported-by: Andy Shevchenko <[email protected]>
Reported-by: Vineet Gupta <[email protected]>
Reported-by: Thomas Meyer <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoMerge branches 'acpi-scan' and 'acpi-misc'
Rafael J. Wysocki [Fri, 8 Jan 2021 17:15:44 +0000 (18:15 +0100)]
Merge branches 'acpi-scan' and 'acpi-misc'

* acpi-scan:
  ACPI: scan: add stub acpi_create_platform_device() for !CONFIG_ACPI

* acpi-misc:
  ACPI: Update Kconfig help text for items that are no longer modular

4 years agoHID: Ignore battery for Elan touchscreen on ASUS UX550
Seth Miller [Tue, 5 Jan 2021 04:58:12 +0000 (22:58 -0600)]
HID: Ignore battery for Elan touchscreen on ASUS UX550

Battery status is being reported for the Elan touchscreen on ASUS
UX550 laptops despite not having a batter. It always shows either 0 or
1%.

Signed-off-by: Seth Miller <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
4 years agobtrfs: shrink delalloc pages instead of full inodes
Josef Bacik [Thu, 7 Jan 2021 22:08:30 +0000 (17:08 -0500)]
btrfs: shrink delalloc pages instead of full inodes

Commit 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in
shrink_delalloc") cleaned up how we do delalloc shrinking by utilizing
some infrastructure we have in place to flush inodes that we use for
device replace and snapshot.  However this introduced a pretty serious
performance regression.  To reproduce the user untarred the source
tarball of Firefox (360MiB xz compressed/1.5GiB uncompressed), and would
see it take anywhere from 5 to 20 times as long to untar in 5.10
compared to 5.9. This was observed on fast devices (SSD and better) and
not on HDD.

The root cause is because before we would generally use the normal
writeback path to reclaim delalloc space, and for this we would provide
it with the number of pages we wanted to flush.  The referenced commit
changed this to flush that many inodes, which drastically increased the
amount of space we were flushing in certain cases, which severely
affected performance.

We cannot revert this patch unfortunately because of 3d45f221ce62
("btrfs: fix deadlock when cloning inline extent and low on free
metadata space") which requires the ability to skip flushing inodes that
are being cloned in certain scenarios, which means we need to keep using
our flushing infrastructure or risk re-introducing the deadlock.

Instead to fix this problem we can go back to providing
btrfs_start_delalloc_roots with a number of pages to flush, and then set
up a writeback_control and utilize sync_inode() to handle the flushing
for us.  This gives us the same behavior we had prior to the fix, while
still allowing us to avoid the deadlock that was fixed by Filipe.  I
redid the users original test and got the following results on one of
our test machines (256GiB of ram, 56 cores, 2TiB Intel NVMe drive)

  5.9 0m54.258s
  5.10 1m26.212s
  5.10+patch 0m38.800s

5.10+patch is significantly faster than plain 5.9 because of my patch
series "Change data reservations to use the ticketing infra" which
contained the patch that introduced the regression, but generally
improved the overall ENOSPC flushing mechanisms.

Additional testing on consumer-grade SSD (8GiB ram, 8 CPU) confirm
the results:

  5.10.5            4m00s
  5.10.5+patch      1m08s
  5.11-rc2     5m14s
  5.11-rc2+patch    1m30s

Reported-by: René Rebe <[email protected]>
Fixes: 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc")
CC: [email protected] # 5.10
Signed-off-by: Josef Bacik <[email protected]>
Tested-by: David Sterba <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add my test results ]
Signed-off-by: David Sterba <[email protected]>
4 years agohwmon: (amd_energy) fix allocation of hwmon_channel_info config
David Arcari [Thu, 7 Jan 2021 14:47:07 +0000 (09:47 -0500)]
hwmon: (amd_energy) fix allocation of hwmon_channel_info config

hwmon, specifically hwmon_num_channel_attrs, expects the config
array in the hwmon_channel_info structure to be terminated by
a zero entry.  amd_energy does not honor this convention.  As
result, a KASAN warning is possible.  Fix this by adding an
additional entry and setting it to zero.

Fixes: 8abee9566b7e ("hwmon: Add amd_energy driver to report energy counters")
Signed-off-by: David Arcari <[email protected]>
Cc: Naveen Krishna Chatradhi <[email protected]>
Cc: Jean Delvare <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: David Arcari <[email protected]>
Acked-by: Naveen Krishna Chatradhi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
4 years agoARM: picoxcell: fix missing interrupt-parent properties
Arnd Bergmann [Wed, 30 Dec 2020 15:20:05 +0000 (16:20 +0100)]
ARM: picoxcell: fix missing interrupt-parent properties

dtc points out that the interrupts for some devices are not parsable:

picoxcell-pc3x2.dtsi:45.19-49.5: Warning (interrupts_property): /paxi/gem@30000: Missing interrupt-parent
picoxcell-pc3x2.dtsi:51.21-55.5: Warning (interrupts_property): /paxi/dmac@40000: Missing interrupt-parent
picoxcell-pc3x2.dtsi:57.21-61.5: Warning (interrupts_property): /paxi/dmac@50000: Missing interrupt-parent
picoxcell-pc3x2.dtsi:233.21-237.5: Warning (interrupts_property): /rwid-axi/axi2pico@c0000000: Missing interrupt-parent

There are two VIC instances, so it's not clear which one needs to be
used. I found the BSP sources that reference VIC0, so use that:

https://github.com/r1mikey/meta-picoxcell/blob/master/recipes-kernel/linux/linux-picochip-3.0/0001-picoxcell-support-for-Picochip-picoXcell-SoC.patch

Acked-by: Jamie Iles <[email protected]>
Link: https://lore.kernel.org/r/[email protected]'
Signed-off-by: Arnd Bergmann <[email protected]>
4 years agoblk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED
John Garry [Fri, 8 Jan 2021 08:55:37 +0000 (16:55 +0800)]
blk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED

Showing the hctx flags for when BLK_MQ_F_TAG_HCTX_SHARED is set gives
something like:

root@debian:/home/john# more /sys/kernel/debug/block/sda/hctx0/flags
alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED|3

Add the decoding for that flag.

Fixes: 32bc15afed04b ("blk-mq: Facilitate a shared sbitmap per tagset")
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoblock/rnbd-clt: avoid module unload race with close confirmation
Jack Wang [Fri, 8 Jan 2021 14:36:34 +0000 (15:36 +0100)]
block/rnbd-clt: avoid module unload race with close confirmation

We had kernel panic, it is caused by unload module and last
close confirmation.

call trace:
[1196029.743127]  free_sess+0x15/0x50 [rtrs_client]
[1196029.743128]  rtrs_clt_close+0x4c/0x70 [rtrs_client]
[1196029.743129]  ? rnbd_clt_unmap_device+0x1b0/0x1b0 [rnbd_client]
[1196029.743130]  close_rtrs+0x25/0x50 [rnbd_client]
[1196029.743131]  rnbd_client_exit+0x93/0xb99 [rnbd_client]
[1196029.743132]  __x64_sys_delete_module+0x190/0x260

And in the crashdump confirmation kworker is also running.
PID: 6943   TASK: ffff9e2ac8098000  CPU: 4   COMMAND: "kworker/4:2"
 #0 [ffffb206cf337c30] __schedule at ffffffff9f93f891
 #1 [ffffb206cf337cc8] schedule at ffffffff9f93fe98
 #2 [ffffb206cf337cd0] schedule_timeout at ffffffff9f943938
 #3 [ffffb206cf337d50] wait_for_completion at ffffffff9f9410a7
 #4 [ffffb206cf337da0] __flush_work at ffffffff9f08ce0e
 #5 [ffffb206cf337e20] rtrs_clt_close_conns at ffffffffc0d5f668 [rtrs_client]
 #6 [ffffb206cf337e48] rtrs_clt_close at ffffffffc0d5f801 [rtrs_client]
 #7 [ffffb206cf337e68] close_rtrs at ffffffffc0d26255 [rnbd_client]
 #8 [ffffb206cf337e78] free_sess at ffffffffc0d262ad [rnbd_client]
 #9 [ffffb206cf337e88] rnbd_clt_put_dev at ffffffffc0d266a7 [rnbd_client]

The problem is both code path try to close same session, which lead to
panic.

To fix it, just skip the sess if the refcount already drop to 0.

Fixes: f7a7a5c228d4 ("block/rnbd: client: main functionality")
Signed-off-by: Jack Wang <[email protected]>
Reviewed-by: Gioh Kim <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoblock/rnbd: Adding name to the Contributors List
Swapnil Ingle [Fri, 8 Jan 2021 14:36:33 +0000 (15:36 +0100)]
block/rnbd: Adding name to the Contributors List

Adding name to the Contributors List

Signed-off-by: Swapnil Ingle <[email protected]>
Acked-by: Jack Wang <[email protected]>
Acked-by: Danil Kipnis <[email protected]>
Signed-off-by: Jack Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoblock/rnbd-clt: Fix sg table use after free
Guoqing Jiang [Fri, 8 Jan 2021 14:36:32 +0000 (15:36 +0100)]
block/rnbd-clt: Fix sg table use after free

Since dynamically allocate sglist is used for rnbd_iu, we can't free sg
table after send_usr_msg since the callback function (cqe.done) could
still access the sglist.

Otherwise KASAN reports UAF issue:

[ 4856.600257] BUG: KASAN: use-after-free in dma_direct_unmap_sg+0x53/0x290
[ 4856.600772] Read of size 4 at addr ffff888206af3a98 by task swapper/1/0

[ 4856.601729] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G        W         5.10.0-pserver #5.10.0-1+feature+linux+next+20201214.1025+0910d71
[ 4856.601748] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[ 4856.601766] Call Trace:
[ 4856.601785]  <IRQ>
[ 4856.601822]  dump_stack+0x99/0xcb
[ 4856.601856]  ? dma_direct_unmap_sg+0x53/0x290
[ 4856.601888]  print_address_description.constprop.7+0x1e/0x230
[ 4856.601913]  ? freeze_kernel_threads+0x73/0x73
[ 4856.601965]  ? mark_held_locks+0x29/0xa0
[ 4856.602019]  ? dma_direct_unmap_sg+0x53/0x290
[ 4856.602039]  ? dma_direct_unmap_sg+0x53/0x290
[ 4856.602079]  kasan_report.cold.9+0x37/0x7c
[ 4856.602188]  ? mlx5_ib_post_recv+0x430/0x520 [mlx5_ib]
[ 4856.602209]  ? dma_direct_unmap_sg+0x53/0x290
[ 4856.602256]  dma_direct_unmap_sg+0x53/0x290
[ 4856.602366]  complete_rdma_req+0x188/0x4b0 [rtrs_client]
[ 4856.602451]  ? rtrs_clt_close+0x80/0x80 [rtrs_client]
[ 4856.602535]  ? mlx5_ib_poll_cq+0x48b/0x16e0 [mlx5_ib]
[ 4856.602589]  ? radix_tree_insert+0x3a0/0x3a0
[ 4856.602610]  ? do_raw_spin_lock+0x119/0x1d0
[ 4856.602647]  ? rwlock_bug.part.1+0x60/0x60
[ 4856.602740]  rtrs_clt_rdma_done+0x3f7/0x670 [rtrs_client]
[ 4856.602804]  ? rtrs_clt_rdma_cm_handler+0xda0/0xda0 [rtrs_client]
[ 4856.602857]  ? check_flags.part.31+0x6c/0x1f0
[ 4856.602927]  ? rcu_read_lock_sched_held+0xaf/0xe0
[ 4856.602963]  ? rcu_read_lock_bh_held+0xc0/0xc0
[ 4856.603137]  __ib_process_cq+0x10a/0x350 [ib_core]
[ 4856.603309]  ib_poll_handler+0x41/0x1c0 [ib_core]
[ 4856.603358]  irq_poll_softirq+0xe6/0x280
[ 4856.603392]  ? lockdep_hardirqs_on_prepare+0x111/0x210
[ 4856.603446]  __do_softirq+0x10d/0x646
[ 4856.603540]  asm_call_irq_on_stack+0x12/0x20
[ 4856.603563]  </IRQ>

[ 4856.605096] Allocated by task 8914:
[ 4856.605510]  kasan_save_stack+0x19/0x40
[ 4856.605532]  __kasan_kmalloc.constprop.7+0xc1/0xd0
[ 4856.605552]  __kmalloc+0x155/0x320
[ 4856.605574]  __sg_alloc_table+0x155/0x1c0
[ 4856.605594]  sg_alloc_table+0x1f/0x50
[ 4856.605620]  send_msg_sess_info+0x119/0x2e0 [rnbd_client]
[ 4856.605646]  remap_devs+0x71/0x210 [rnbd_client]
[ 4856.605676]  init_sess+0xad8/0xe10 [rtrs_client]
[ 4856.605706]  rtrs_clt_reconnect_work+0xd6/0x170 [rtrs_client]
[ 4856.605728]  process_one_work+0x521/0xa90
[ 4856.605748]  worker_thread+0x65/0x5b0
[ 4856.605769]  kthread+0x1f2/0x210
[ 4856.605789]  ret_from_fork+0x22/0x30

[ 4856.606159] Freed by task 8914:
[ 4856.606559]  kasan_save_stack+0x19/0x40
[ 4856.606580]  kasan_set_track+0x1c/0x30
[ 4856.606601]  kasan_set_free_info+0x1b/0x30
[ 4856.606622]  __kasan_slab_free+0x108/0x150
[ 4856.606642]  slab_free_freelist_hook+0x64/0x190
[ 4856.606661]  kfree+0xe2/0x650
[ 4856.606681]  __sg_free_table+0xa4/0x100
[ 4856.606707]  send_msg_sess_info+0x1d6/0x2e0 [rnbd_client]
[ 4856.606733]  remap_devs+0x71/0x210 [rnbd_client]
[ 4856.606763]  init_sess+0xad8/0xe10 [rtrs_client]
[ 4856.606792]  rtrs_clt_reconnect_work+0xd6/0x170 [rtrs_client]
[ 4856.606813]  process_one_work+0x521/0xa90
[ 4856.606833]  worker_thread+0x65/0x5b0
[ 4856.606853]  kthread+0x1f2/0x210
[ 4856.606872]  ret_from_fork+0x22/0x30

The solution is to free iu's sgtable after the iu is not used anymore.
And also move sg_alloc_table into rnbd_get_iu accordingly.

Fixes: 5a1328d0c3a7 ("block/rnbd-clt: Dynamically allocate sglist for rnbd_iu")
Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Jack Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoblock/rnbd-srv: Fix use after free in rnbd_srv_sess_dev_force_close
Jack Wang [Fri, 8 Jan 2021 14:36:31 +0000 (15:36 +0100)]
block/rnbd-srv: Fix use after free in rnbd_srv_sess_dev_force_close

KASAN detect following BUG:
[  778.215311] ==================================================================
[  778.216696] BUG: KASAN: use-after-free in rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.219037] Read of size 8 at addr ffff88b1d6516c28 by task tee/8842

[  778.220500] CPU: 37 PID: 8842 Comm: tee Kdump: loaded Not tainted 5.10.0-pserver #5.10.0-1+feature+linux+next+20201214.1025+0910d71
[  778.220529] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[  778.220555] Call Trace:
[  778.220609]  dump_stack+0x99/0xcb
[  778.220667]  ? rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.220715]  print_address_description.constprop.7+0x1e/0x230
[  778.220750]  ? freeze_kernel_threads+0x73/0x73
[  778.220896]  ? rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.220932]  ? rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.220994]  kasan_report.cold.9+0x37/0x7c
[  778.221066]  ? kobject_put+0x80/0x270
[  778.221102]  ? rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.221184]  rnbd_srv_sess_dev_force_close+0x38/0x60 [rnbd_server]
[  778.221240]  rnbd_srv_dev_session_force_close_store+0x6a/0xc0 [rnbd_server]
[  778.221304]  ? sysfs_file_ops+0x90/0x90
[  778.221353]  kernfs_fop_write+0x141/0x240
[  778.221451]  vfs_write+0x142/0x4d0
[  778.221553]  ksys_write+0xc0/0x160
[  778.221602]  ? __ia32_sys_read+0x50/0x50
[  778.221684]  ? lockdep_hardirqs_on_prepare+0x13d/0x210
[  778.221718]  ? syscall_enter_from_user_mode+0x1c/0x50
[  778.221821]  do_syscall_64+0x33/0x40
[  778.221862]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  778.221896] RIP: 0033:0x7f4affdd9504
[  778.221928] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[  778.221956] RSP: 002b:00007fffebb36b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  778.222011] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f4affdd9504
[  778.222038] RDX: 0000000000000002 RSI: 00007fffebb36c50 RDI: 0000000000000003
[  778.222066] RBP: 00007fffebb36c50 R08: 0000556a151aa600 R09: 00007f4affeb1540
[  778.222094] R10: fffffffffffffc19 R11: 0000000000000246 R12: 0000556a151aa520
[  778.222121] R13: 0000000000000002 R14: 00007f4affea6760 R15: 0000000000000002

[  778.222764] Allocated by task 3212:
[  778.223285]  kasan_save_stack+0x19/0x40
[  778.223316]  __kasan_kmalloc.constprop.7+0xc1/0xd0
[  778.223347]  kmem_cache_alloc_trace+0x186/0x350
[  778.223382]  rnbd_srv_rdma_ev+0xf16/0x1690 [rnbd_server]
[  778.223422]  process_io_req+0x4d1/0x670 [rtrs_server]
[  778.223573]  __ib_process_cq+0x10a/0x350 [ib_core]
[  778.223709]  ib_cq_poll_work+0x31/0xb0 [ib_core]
[  778.223743]  process_one_work+0x521/0xa90
[  778.223773]  worker_thread+0x65/0x5b0
[  778.223802]  kthread+0x1f2/0x210
[  778.223833]  ret_from_fork+0x22/0x30

[  778.224296] Freed by task 8842:
[  778.224800]  kasan_save_stack+0x19/0x40
[  778.224829]  kasan_set_track+0x1c/0x30
[  778.224860]  kasan_set_free_info+0x1b/0x30
[  778.224889]  __kasan_slab_free+0x108/0x150
[  778.224919]  slab_free_freelist_hook+0x64/0x190
[  778.224947]  kfree+0xe2/0x650
[  778.224982]  rnbd_destroy_sess_dev+0x2fa/0x3b0 [rnbd_server]
[  778.225011]  kobject_put+0xda/0x270
[  778.225046]  rnbd_srv_sess_dev_force_close+0x30/0x60 [rnbd_server]
[  778.225081]  rnbd_srv_dev_session_force_close_store+0x6a/0xc0 [rnbd_server]
[  778.225111]  kernfs_fop_write+0x141/0x240
[  778.225140]  vfs_write+0x142/0x4d0
[  778.225169]  ksys_write+0xc0/0x160
[  778.225198]  do_syscall_64+0x33/0x40
[  778.225227]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  778.226506] The buggy address belongs to the object at ffff88b1d6516c00
                which belongs to the cache kmalloc-512 of size 512
[  778.227464] The buggy address is located 40 bytes inside of
                512-byte region [ffff88b1d6516c00ffff88b1d6516e00)

The problem is in the sess_dev release function we call
rnbd_destroy_sess_dev, and could free the sess_dev already, but we still
set the keep_id in rnbd_srv_sess_dev_force_close, which lead to use
after free.

To fix it, move the keep_id before the sysfs removal, and cache the
rnbd_srv_session for lock accessing,

Fixes: 786998050cbc ("block/rnbd-srv: close a mapped device from server side.")
Signed-off-by: Jack Wang <[email protected]>
Reviewed-by: Guoqing Jiang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoblock/rnbd: Select SG_POOL for RNBD_CLIENT
Jack Wang [Fri, 8 Jan 2021 14:36:30 +0000 (15:36 +0100)]
block/rnbd: Select SG_POOL for RNBD_CLIENT

lkp reboot following build error:
 drivers/block/rnbd/rnbd-clt.c: In function 'rnbd_softirq_done_fn':
>> drivers/block/rnbd/rnbd-clt.c:387:2: error: implicit declaration of function 'sg_free_table_chained' [-Werror=implicit-function-declaration]
     387 |  sg_free_table_chained(&iu->sgt, RNBD_INLINE_SG_CNT);
         |  ^~~~~~~~~~~~~~~~~~~~~

The reason is CONFIG_SG_POOL is not enabled in the config, to
avoid such failure, select SG_POOL in Kconfig for RNBD_CLIENT.

Fixes: 5a1328d0c3a7 ("block/rnbd-clt: Dynamically allocate sglist for rnbd_iu")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Jack Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoHID: logitech-dj: add the G602 receiver
Filipe Laíns [Mon, 4 Jan 2021 20:47:17 +0000 (20:47 +0000)]
HID: logitech-dj: add the G602 receiver

Tested. The device gets correctly exported to userspace and I can see
mouse and keyboard events.

Signed-off-by: Filipe Laíns <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
4 years agoKVM: x86: __kvm_vcpu_halt can be static
Paolo Bonzini [Fri, 8 Jan 2021 10:54:44 +0000 (05:54 -0500)]
KVM: x86: __kvm_vcpu_halt can be static

Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoMerge tag 'kvmarm-fixes-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Fri, 8 Jan 2021 10:02:40 +0000 (05:02 -0500)]
Merge tag 'kvmarm-fixes-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.11, take #1

- VM init cleanups
- PSCI relay cleanups
- Kill CONFIG_KVM_ARM_PMU
- Fixup __init annotations
- Fixup reg_to_encoding()
- Fix spurious PMCR_EL0 access

4 years agoMerge tag 'drm-misc-fixes-2021-01-08' of git://anongit.freedesktop.org/drm/drm-misc...
Daniel Vetter [Fri, 8 Jan 2021 09:39:17 +0000 (10:39 +0100)]
Merge tag 'drm-misc-fixes-2021-01-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

* dma-buf: fix a use-after-free
* radeon: don't init the TTM page pool manually
* ttm: unexport ttm_pool_{init,fini}()

Signed-off-by: Daniel Vetter <[email protected]>
From: Thomas Zimmermann <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/X/gnKs52t8xUuAlE@linux-uq9g
4 years agoMerge tag 'drm-msm-fixes-2021-01-07' of https://gitlab.freedesktop.org/drm/msm into...
Daniel Vetter [Fri, 8 Jan 2021 08:53:02 +0000 (09:53 +0100)]
Merge tag 'drm-msm-fixes-2021-01-07' of https://gitlab.freedesktop.org/drm/msm into drm-fixes

A few misc fixes from Rob, mostly fallout from the locking rework that
landed in the merge window, plus a few smaller things.

Signed-off-by: Daniel Vetter <[email protected]>
From: Rob Clark <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtWMhzyD6kejmViZeZ+zfJxRvfq-R2t_zA+DcDiTxsYRQ@mail.gmail.com
4 years agox86/resctrl: Don't move a task to the same resource group
Fenghua Yu [Thu, 17 Dec 2020 22:31:19 +0000 (14:31 -0800)]
x86/resctrl: Don't move a task to the same resource group

Shakeel Butt reported in [1] that a user can request a task to be moved
to a resource group even if the task is already in the group. It just
wastes time to do the move operation which could be costly to send IPI
to a different CPU.

Add a sanity check to ensure that the move operation only happens when
the task is not already in the resource group.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
4 years agox86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR
Fenghua Yu [Thu, 17 Dec 2020 22:31:18 +0000 (14:31 -0800)]
x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR

Currently, when moving a task to a resource group the PQR_ASSOC MSR is
updated with the new closid and rmid in an added task callback. If the
task is running, the work is run as soon as possible. If the task is not
running, the work is executed later in the kernel exit path when the
kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
is running is the right thing to do. Queueing work for a task that is
not running is unnecessary (the PQR_ASSOC MSR is already updated when
the task is scheduled in) and causing system resource waste with the way
in which it is implemented: Work to update the PQR_ASSOC register is
queued every time the user writes a task id to the "tasks" file, even if
the task already belongs to the resource group.

This could result in multiple pending work items associated with a
single task even if they are all identical and even though only a single
update with most recent values is needed. Specifically, even if a task
is moved between different resource groups while it is sleeping then it
is only the last move that is relevant but yet a work item is queued
during each move.

This unnecessary queueing of work items could result in significant
system resource waste, especially on tasks sleeping for a long time.
For example, as demonstrated by Shakeel Butt in [1] writing the same
task id to the "tasks" file can quickly consume significant memory. The
same problem (wasted system resources) occurs when moving a task between
different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue
with the way in which the queueing of work is done in that the task_struct
update is currently done after the work is queued, resulting in a race with
the register update possibly done before the data needed by the update is
available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way
right after the new closid and rmid are ready during the task movement,
only if the task is running. If a moved task is not running nothing
is done since the PQR_ASSOC MSR will be updated next time the task is
scheduled. This is the same way used to update the register when tasks
are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433[email protected]

 [ bp: Massage commit message and drop the two update_task_closid_rmid()
   variants. ]

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <[email protected]>
Reported-by: Valentin Schneider <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: James Morse <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
4 years agoblock: pre-initialize struct block_device in bdev_alloc_inode
Christoph Hellwig [Thu, 7 Jan 2021 18:36:40 +0000 (19:36 +0100)]
block: pre-initialize struct block_device in bdev_alloc_inode

bdev_evict_inode and bdev_free_inode are also called for the root inode
of bdevfs, for which bdev_alloc is never called.  Move the zeroing o
f struct block_device and the initialization of the bd_bdi field into
bdev_alloc_inode to make sure they are initialized for the root inode
as well.

Fixes: e6cb53827ed6 ("block: initialize struct block_device in bdev_alloc")
Reported-by: Alexey Kardashevskiy <[email protected]>
Tested-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
4 years agoMerge tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Fri, 8 Jan 2021 03:13:29 +0000 (19:13 -0800)]
Merge tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-01-07

* tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups
  net/mlx5e: Fix two double free cases
  net/mlx5: Release devlink object if adev fails
  net/mlx5e: ethtool, Fix restriction of autoneg with 56G
  net/mlx5e: In skb build skip setting mark in switchdev mode
  net/mlx5: E-Switch, fix changing vf VLANID
  net/mlx5e: Fix SWP offsets when vlan inserted by driver
  net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled
  net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address
  net/mlx5e: Add missing capability check for uplink follow
  net/mlx5: Check if lag is supported before creating one
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonet: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
Aleksander Jan Bajkowski [Thu, 7 Jan 2021 19:58:18 +0000 (20:58 +0100)]
net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE

Exclude RMII from modes that report 1 GbE support. Reduced MII supports
up to 100 MbE.

Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Aleksander Jan Bajkowski <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoMerge branch 's390-qeth-fixes-2021-01-07'
Jakub Kicinski [Fri, 8 Jan 2021 02:54:08 +0000 (18:54 -0800)]
Merge branch 's390-qeth-fixes-2021-01-07'

Julian Wiedmann says:

====================
s390/qeth: fixes 2021-01-07

This brings two locking fixes for the device control path.
Also one fix for a path where our .ndo_features_check() attempts to
access a non-existent L2 header.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agos390/qeth: fix L2 header access in qeth_l3_osa_features_check()
Julian Wiedmann [Thu, 7 Jan 2021 17:24:42 +0000 (18:24 +0100)]
s390/qeth: fix L2 header access in qeth_l3_osa_features_check()

ip_finish_output_gso() may call .ndo_features_check() even before the
skb has a L2 header. This conflicts with qeth_get_ip_version()'s attempt
to inspect the L2 header via vlan_eth_hdr().

Switch to vlan_get_protocol(), as already used further down in the
common qeth_features_check() path.

Fixes: f13ade199391 ("s390/qeth: run non-offload L3 traffic over common xmit path")
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agos390/qeth: fix locking for discipline setup / removal
Julian Wiedmann [Thu, 7 Jan 2021 17:24:41 +0000 (18:24 +0100)]
s390/qeth: fix locking for discipline setup / removal

Due to insufficient locking, qeth_core_set_online() and
qeth_dev_layer2_store() can run in parallel, both attempting to load &
setup the discipline (and stepping on each other toes along the way).
A similar race can also occur between qeth_core_remove_device() and
qeth_dev_layer2_store().

Access to .discipline is meant to be protected by the discipline_mutex,
so add/expand the locking in qeth_core_remove_device() and
qeth_core_set_online().
Adjust the locking in qeth_l*_remove_device() accordingly, as it's now
handled by the callers in a consistent manner.

Based on an initial patch by Ursula Braun.

Fixes: 9dc48ccc68b9 ("qeth: serialize sysfs-triggered device configurations")
Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Alexandra Winter <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agos390/qeth: fix deadlock during recovery
Julian Wiedmann [Thu, 7 Jan 2021 17:24:40 +0000 (18:24 +0100)]
s390/qeth: fix deadlock during recovery

When qeth_dev_layer2_store() - holding the discipline_mutex - waits
inside qeth_l*_remove_device() for a qeth_do_reset() thread to complete,
we can hit a deadlock if qeth_do_reset() concurrently calls
qeth_set_online() and thus tries to aquire the discipline_mutex.

Move the discipline_mutex locking outside of qeth_set_online() and
qeth_set_offline(), and turn the discipline into a parameter so that
callers understand the dependency.

To fix the deadlock, we can now relax the locking:
As already established, qeth_l*_remove_device() waits for
qeth_do_reset() to complete. So qeth_do_reset() itself is under no risk
of having card->discipline ripped out while it's running, and thus
doesn't need to take the discipline_mutex.

Fixes: 9dc48ccc68b9 ("qeth: serialize sysfs-triggered device configurations")
Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Alexandra Winter <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoMerge branch 'nexthop-various-fixes'
Jakub Kicinski [Fri, 8 Jan 2021 02:47:21 +0000 (18:47 -0800)]
Merge branch 'nexthop-various-fixes'

Ido Schimmel says:

====================
nexthop: Various fixes

This series contains various fixes for the nexthop code. The bugs were
uncovered during the development of resilient nexthop groups.

Patches #1-#2 fix the error path of nexthop_create_group(). I was not
able to trigger these bugs with current code, but it is possible with
the upcoming resilient nexthop groups code which adds a user
controllable memory allocation further in the function.

Patch #3 fixes wrong validation of netlink attributes.

Patch #4 fixes wrong invocation of mausezahn in a selftest.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoselftests: fib_nexthops: Fix wrong mausezahn invocation
Ido Schimmel [Thu, 7 Jan 2021 14:48:24 +0000 (16:48 +0200)]
selftests: fib_nexthops: Fix wrong mausezahn invocation

For IPv6 traffic, mausezahn needs to be invoked with '-6'. Otherwise an
error is returned:

 # ip netns exec me mausezahn veth1 -B 2001:db8:101::2 -A 2001:db8:91::1 -c 0 -t tcp "dp=1-1023, flags=syn"
 Failed to set source IPv4 address. Please check if source is set to a valid IPv4 address.
  Invalid command line parameters!

Fixes: 7c741868ceab ("selftests: Add torture tests to nexthop tests")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonexthop: Bounce NHA_GATEWAY in FDB nexthop groups
Petr Machata [Thu, 7 Jan 2021 14:48:23 +0000 (16:48 +0200)]
nexthop: Bounce NHA_GATEWAY in FDB nexthop groups

The function nh_check_attr_group() is called to validate nexthop groups.
The intention of that code seems to have been to bounce all attributes
above NHA_GROUP_TYPE except for NHA_FDB. However instead it bounces all
these attributes except when NHA_FDB attribute is present--then it accepts
them.

NHA_FDB validation that takes place before, in rtm_to_nh_config(), already
bounces NHA_OIF, NHA_BLACKHOLE, NHA_ENCAP and NHA_ENCAP_TYPE. Yet further
back, NHA_GROUPS and NHA_MASTER are bounced unconditionally.

But that still leaves NHA_GATEWAY as an attribute that would be accepted in
FDB nexthop groups (with no meaning), so long as it keeps the address
family as unspecified:

 # ip nexthop add id 1 fdb via 127.0.0.1
 # ip nexthop add id 10 fdb via default group 1

The nexthop code is still relatively new and likely not used very broadly,
and the FDB bits are newer still. Even though there is a reproducer out
there, it relies on an improbable gateway arguments "via default", "via
all" or "via any". Given all this, I believe it is OK to reformulate the
condition to do the right thing and bounce NHA_GATEWAY.

Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops")
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonexthop: Unlink nexthop group entry in error path
Ido Schimmel [Thu, 7 Jan 2021 14:48:22 +0000 (16:48 +0200)]
nexthop: Unlink nexthop group entry in error path

In case of error, remove the nexthop group entry from the list to which
it was previously added.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonexthop: Fix off-by-one error in error path
Ido Schimmel [Thu, 7 Jan 2021 14:48:21 +0000 (16:48 +0200)]
nexthop: Fix off-by-one error in error path

A reference was not taken for the current nexthop entry, so do not try
to put it in the error path.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoocteontx2-af: fix memory leak of lmac and lmac->name
Colin Ian King [Thu, 7 Jan 2021 12:39:16 +0000 (12:39 +0000)]
octeontx2-af: fix memory leak of lmac and lmac->name

Currently the error return paths don't kfree lmac and lmac->name
leading to some memory leaks.  Fix this by adding two error return
paths that kfree these objects

Addresses-Coverity: ("Resource leak")
Fixes: 1463f382f58d ("octeontx2-af: Add support for CGX link management")
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoriscv: Enable interrupts during syscalls with M-Mode
Damien Le Moal [Sun, 13 Dec 2020 13:50:36 +0000 (22:50 +0900)]
riscv: Enable interrupts during syscalls with M-Mode

When running is M-Mode (no MMU config), MPIE does not get set. This
results in all syscalls being executed with interrupts disabled as
handle_exception never sets SR_IE as it always sees SR_PIE being
cleared. Fix this by always force enabling interrupts in
handle_syscall when CONFIG_RISCV_M_MODE is enabled.

Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
4 years agoriscv: Fix sifive serial driver
Damien Le Moal [Sun, 13 Dec 2020 13:50:35 +0000 (22:50 +0900)]
riscv: Fix sifive serial driver

Setup the port uartclk in sifive_serial_probe() so that the base baud
rate is correctly printed during device probe instead of always showing
"0".  I.e. the probe message is changed from

38000000.serial: ttySIF0 at MMIO 0x38000000 (irq = 1,
base_baud = 0) is a SiFive UART v0

to the correct:

38000000.serial: ttySIF0 at MMIO 0x38000000 (irq = 1,
base_baud = 115200) is a SiFive UART v0

Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Palmer Dabbelt <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
4 years agoriscv: Fix kernel time_init()
Damien Le Moal [Sun, 13 Dec 2020 13:50:34 +0000 (22:50 +0900)]
riscv: Fix kernel time_init()

If of_clk_init() is not called in time_init(), clock providers defined
in the system device tree are not initialized, resulting in failures for
other devices to initialize due to missing clocks.
Similarly to other architectures and to the default kernel time_init()
implementation, call of_clk_init() before executing timer_probe() in
time_init().

Signed-off-by: Damien Le Moal <[email protected]>
Acked-by: Stephen Boyd <[email protected]>
Reviewed-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
4 years agoriscv: return -ENOSYS for syscall -1
Andreas Schwab [Mon, 21 Dec 2020 22:52:00 +0000 (23:52 +0100)]
riscv: return -ENOSYS for syscall -1

Properly return -ENOSYS for syscall -1 instead of leaving the return value
uninitialized.  This fixes the strace teststuite.

Fixes: 5340627e3fe0 ("riscv: add support for SECCOMP and SECCOMP_FILTER")
Cc: [email protected]
Signed-off-by: Andreas Schwab <[email protected]>
Reviewed-by: Tycho Andersen <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
4 years agoMerge branch 'bug-fixes-for-chtls-driver'
Jakub Kicinski [Fri, 8 Jan 2021 01:06:05 +0000 (17:06 -0800)]
Merge branch 'bug-fixes-for-chtls-driver'

Ayush Sawal says:

====================
Bug fixes for chtls driver

patch 1: Fix hardware tid leak.
patch 2: Remove invalid set_tcb call.
patch 3: Fix panic when route to peer not configured.
patch 4: Avoid unnecessary freeing of oreq pointer.
patch 5: Replace skb_dequeue with skb_peek.
patch 6: Added a check to avoid NULL pointer dereference patch.
patch 7: Fix chtls resources release sequence.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Fix chtls resources release sequence
Ayush Sawal [Wed, 6 Jan 2021 04:29:12 +0000 (09:59 +0530)]
chtls: Fix chtls resources release sequence

CPL_ABORT_RPL is sent after releasing the resources by calling
chtls_release_resources(sk); and chtls_conn_done(sk);
eventually causing kernel panic. Fixing it by calling release
in appropriate order.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Added a check to avoid NULL pointer dereference
Ayush Sawal [Wed, 6 Jan 2021 04:29:11 +0000 (09:59 +0530)]
chtls: Added a check to avoid NULL pointer dereference

In case of server removal lookup_stid() may return NULL pointer, which
is used as listen_ctx. So added a check before accessing this pointer.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Replace skb_dequeue with skb_peek
Ayush Sawal [Wed, 6 Jan 2021 04:29:10 +0000 (09:59 +0530)]
chtls: Replace skb_dequeue with skb_peek

The skb is unlinked twice, one in __skb_dequeue in function
chtls_reset_synq() and another in cleanup_syn_rcv_conn().
So in this patch using skb_peek() instead of __skb_dequeue(),
so that unlink will be handled only in cleanup_syn_rcv_conn().

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Avoid unnecessary freeing of oreq pointer
Ayush Sawal [Wed, 6 Jan 2021 04:29:09 +0000 (09:59 +0530)]
chtls: Avoid unnecessary freeing of oreq pointer

In chtls_pass_accept_request(), removing the chtls_reqsk_free()
call to avoid oreq freeing twice. Here oreq is the pointer to
struct request_sock.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Fix panic when route to peer not configured
Ayush Sawal [Wed, 6 Jan 2021 04:29:08 +0000 (09:59 +0530)]
chtls: Fix panic when route to peer not configured

If route to peer is not configured, we might get non tls
devices from dst_neigh_lookup() which is invalid, adding a
check to avoid it.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Remove invalid set_tcb call
Ayush Sawal [Wed, 6 Jan 2021 04:29:07 +0000 (09:59 +0530)]
chtls: Remove invalid set_tcb call

At the time of SYN_RECV, connection information is not
initialized at FW, updating tcb flag over uninitialized
connection causes adapter crash. We don't need to
update the flag during SYN_RECV state, so avoid this.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agochtls: Fix hardware tid leak
Ayush Sawal [Wed, 6 Jan 2021 04:29:06 +0000 (09:59 +0530)]
chtls: Fix hardware tid leak

send_abort_rpl() is not calculating cpl_abort_req_rss offset and
ends up sending wrong TID with abort_rpl WR causng tid leaks.
Replaced send_abort_rpl() with chtls_send_abort_rpl() as it is
redundant.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoMerge tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jan 2021 00:03:19 +0000 (16:03 -0800)]
Merge tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc-plugins fix from Kees Cook:
 "Bump c++ standard version for latest GCC versions (Valdis Kletnieks)"

* tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-plugins: fix gcc 11 indigestion with plugins...

4 years agoKVM: SVM: Add support for booting APs in an SEV-ES guest
Tom Lendacky [Mon, 4 Jan 2021 20:20:01 +0000 (14:20 -0600)]
KVM: SVM: Add support for booting APs in an SEV-ES guest

Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
where the guest vCPU register state is updated and then the vCPU is VMRUN
to begin execution of the AP. For an SEV-ES guest, this won't work because
the guest register state is encrypted.

Following the GHCB specification, the hypervisor must not alter the guest
register state, so KVM must track an AP/vCPU boot. Should the guest want
to park the AP, it must use the AP Reset Hold exit event in place of, for
example, a HLT loop.

First AP boot (first INIT-SIPI-SIPI sequence):
  Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
  support. It is up to the guest to transfer control of the AP to the
  proper location.

Subsequent AP boot:
  KVM will expect to receive an AP Reset Hold exit event indicating that
  the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
  awaken it. When the AP Reset Hold exit event is received, KVM will place
  the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
  sequence, KVM will make the vCPU runnable. It is again up to the guest
  to then transfer control of the AP to the proper location.

  To differentiate between an actual HLT and an AP Reset Hold, a new MP
  state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
  placed in upon receiving the AP Reset Hold exit event. Additionally, to
  communicate the AP Reset Hold exit event up to userspace (if needed), a
  new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.

A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
a new function that, for non SEV-ES guests, invokes the original SIPI
delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
implements the logic above.

Signed-off-by: Tom Lendacky <[email protected]>
Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
Maxim Levitsky [Thu, 7 Jan 2021 09:38:51 +0000 (11:38 +0200)]
KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit

It is possible to exit the nested guest mode, entered by
svm_set_nested_state prior to first vm entry to it (e.g due to pending event)
if the nested run was not pending during the migration.

In this case we must not switch to the nested msr permission bitmap.
Also add a warning to catch similar cases in the future.

Fixes: a7d5c7ce41ac1 ("KVM: nSVM: delay MSR permission processing to first nested VM run")
Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20210107093854[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: nSVM: mark vmcb as dirty when forcingly leaving the guest mode
Maxim Levitsky [Thu, 7 Jan 2021 09:38:54 +0000 (11:38 +0200)]
KVM: nSVM: mark vmcb as dirty when forcingly leaving the guest mode

We overwrite most of vmcb fields while doing so, so we must
mark it as dirty.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20210107093854[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: nSVM: correctly restore nested_run_pending on migration
Maxim Levitsky [Thu, 7 Jan 2021 09:38:52 +0000 (11:38 +0200)]
KVM: nSVM: correctly restore nested_run_pending on migration

The code to store it on the migration exists, but no code was restoring it.

One of the side effects of fixing this is that L1->L2 injected events
are no longer lost when migration happens with nested run pending.

Signed-off-by: Maxim Levitsky <[email protected]>
Message-Id: <20210107093854[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86/mmu: Clarify TDP MMU page list invariants
Ben Gardon [Thu, 7 Jan 2021 00:19:35 +0000 (16:19 -0800)]
KVM: x86/mmu: Clarify TDP MMU page list invariants

The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain
pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any
pages with a non-zero root_count and tdp_mmu_roots should only contain
pages with a positive root_count, unless a thread holds the MMU lock and
is in the process of modifying the list. Various functions expect these
invariants to be maintained, but they are not explictily documented. Add
to the comments on both fields to document the above invariants.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20210107001935.3732070[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86/mmu: Ensure TDP MMU roots are freed after yield
Ben Gardon [Thu, 7 Jan 2021 00:19:34 +0000 (16:19 -0800)]
KVM: x86/mmu: Ensure TDP MMU roots are freed after yield

Many TDP MMU functions which need to perform some action on all TDP MMU
roots hold a reference on that root so that they can safely drop the MMU
lock in order to yield to other threads. However, when releasing the
reference on the root, there is a bug: the root will not be freed even
if its reference count (root_count) is reduced to 0.

To simplify acquiring and releasing references on TDP MMU root pages, and
to ensure that these roots are properly freed, move the get/put operations
into another TDP MMU root iterator macro.

Moving the get/put operations into an iterator macro also helps
simplify control flow when a root does need to be freed. Note that using
the list_for_each_entry_safe macro would not have been appropriate in
this situation because it could keep a pointer to the next root across
an MMU lock release + reacquire, during which time that root could be
freed.

Reported-by: Maciej S. Szmigiero <[email protected]>
Suggested-by: Paolo Bonzini <[email protected]>
Fixes: faaf05b00aec ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
Fixes: 063afacd8730 ("kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU")
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20210107001935.3732070[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agokvm: check tlbs_dirty directly
Lai Jiangshan [Thu, 17 Dec 2020 15:41:18 +0000 (23:41 +0800)]
kvm: check tlbs_dirty directly

In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as:
        need_tlb_flush |= kvm->tlbs_dirty;
with need_tlb_flush's type being int and tlbs_dirty's type being long.

It means that tlbs_dirty is always used as int and the higher 32 bits
is useless.  We need to check tlbs_dirty in a correct way and this
change checks it directly without propagating it to need_tlb_flush.

Note: it's _extremely_ unlikely this neglecting of higher 32 bits can
cause problems in practice.  It would require encountering tlbs_dirty
on a 4 billion count boundary, and KVM would need to be using shadow
paging or be running a nested guest.

Cc: [email protected]
Fixes: a4ee1ca4a36e ("KVM: MMU: delay flush all tlbs on sync_page path")
Signed-off-by: Lai Jiangshan <[email protected]>
Message-Id: <20201217154118[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86: change in pv_eoi_get_pending() to make code more readable
Stephen Zhang [Fri, 18 Dec 2020 07:51:37 +0000 (15:51 +0800)]
KVM: x86: change in pv_eoi_get_pending() to make code more readable

Signed-off-by: Stephen Zhang <[email protected]>
Message-Id: <1608277897[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Thu, 7 Jan 2021 23:10:26 +0000 (15:10 -0800)]
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Alexei Starovoitov says:

====================
pull-request: bpf 2021-01-07

We've added 4 non-merge commits during the last 10 day(s) which contain
a total of 4 files changed, 14 insertions(+), 7 deletions(-).

The main changes are:

1) Fix task_iter bug caused by the merge conflict resolution, from Yonghong.

2) Fix resolve_btfids for multiple type hierarchies, from Jiri.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpftool: Fix compilation failure for net.o with older glibc
  tools/resolve_btfids: Warn when having multiple IDs for single type
  bpf: Fix a task_iter bug caused by a merge conflict resolution
  selftests/bpf: Fix a compile error for BPF_F_BPRM_SECUREEXEC
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoMAINTAINERS: Really update email address for Sean Christopherson
Sean Christopherson [Wed, 6 Jan 2021 18:29:16 +0000 (10:29 -0800)]
MAINTAINERS: Really update email address for Sean Christopherson

Use my @google.com address in MAINTAINERS, somehow only the .mailmap
entry was added when the original update patch was applied.

Fixes: c2b1209d852f ("MAINTAINERS: Update email address for Sean Christopherson")
Cc: [email protected]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20210106182916[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86: fix shift out of bounds reported by UBSAN
Paolo Bonzini [Tue, 22 Dec 2020 10:20:43 +0000 (05:20 -0500)]
KVM: x86: fix shift out of bounds reported by UBSAN

Since we know that e >= s, we can reassociate the left shift,
changing the shifted number from 1 to 2 in exchange for
decreasing the right hand side by 1.

Reported-by: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: selftests: Implement perf_test_util more conventionally
Andrew Jones [Fri, 18 Dec 2020 14:17:34 +0000 (15:17 +0100)]
KVM: selftests: Implement perf_test_util more conventionally

It's not conventional C to put non-inline functions in header
files. Create a source file for the functions instead. Also
reduce the amount of globals and rename the functions to
something less generic.

Reviewed-by: Ben Gardon <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201218141734[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: selftests: Use vm_create_with_vcpus in create_vm
Andrew Jones [Fri, 18 Dec 2020 14:17:33 +0000 (15:17 +0100)]
KVM: selftests: Use vm_create_with_vcpus in create_vm

Reviewed-by: Ben Gardon <[email protected]>
Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201218141734[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: selftests: Factor out guest mode code
Andrew Jones [Fri, 18 Dec 2020 14:17:32 +0000 (15:17 +0100)]
KVM: selftests: Factor out guest mode code

demand_paging_test, dirty_log_test, and dirty_log_perf_test have
redundant guest mode code. Factor it out.

Also, while adding a new include, remove the ones we don't need.

Reviewed-by: Ben Gardon <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201218141734[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c
Uros Bizjak [Sun, 20 Dec 2020 20:03:39 +0000 (21:03 +0100)]
KVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c

Commit 16809ecdc1e8a moved __svm_vcpu_run the prototype to svm.h,
but forgot to remove the original from svm.c.

Fixes: 16809ecdc1e8a ("KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests")
Cc: Tom Lendacky <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
Message-Id: <20201220200339[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: SVM: Add register operand to vmsave call in sev_es_vcpu_load
Nathan Chancellor [Sat, 19 Dec 2020 06:37:11 +0000 (23:37 -0700)]
KVM: SVM: Add register operand to vmsave call in sev_es_vcpu_load

When using LLVM's integrated assembler (LLVM_IAS=1) while building
x86_64_defconfig + CONFIG_KVM=y + CONFIG_KVM_AMD=y, the following build
error occurs:

 $ make LLVM=1 LLVM_IAS=1 arch/x86/kvm/svm/sev.o
 arch/x86/kvm/svm/sev.c:2004:15: error: too few operands for instruction
         asm volatile(__ex("vmsave") : : "a" (__sme_page_pa(sd->save_area)) : "memory");
                      ^
 arch/x86/kvm/svm/sev.c:28:17: note: expanded from macro '__ex'
 #define __ex(x) __kvm_handle_fault_on_reboot(x)
                 ^
 ./arch/x86/include/asm/kvm_host.h:1646:10: note: expanded from macro '__kvm_handle_fault_on_reboot'
         "666: \n\t"                                                     \
                 ^
 <inline asm>:2:2: note: instantiated into assembly here
         vmsave
         ^
 1 error generated.

This happens because LLVM currently does not support calling vmsave
without the fixed register operand (%rax for 64-bit and %eax for
32-bit). This will be fixed in LLVM 12 but the kernel currently supports
LLVM 10.0.1 and newer so this needs to be handled.

Add the proper register using the _ASM_AX macro, which matches the
vmsave call in vmenter.S.

Fixes: 861377730aa9 ("KVM: SVM: Provide support for SEV-ES vCPU loading")
Link: https://reviews.llvm.org/D93524
Link: https://github.com/ClangBuiltLinux/linux/issues/1216
Signed-off-by: Nathan Chancellor <[email protected]>
Message-Id: <20201219063711.3526947[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoMerge branch 'kvm-master' into kvm-next
Paolo Bonzini [Thu, 7 Jan 2021 23:06:52 +0000 (18:06 -0500)]
Merge branch 'kvm-master' into kvm-next

Fixes to get_mmio_spte, destined to 5.10 stable branch.

4 years agoKVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()
Sean Christopherson [Fri, 18 Dec 2020 00:31:39 +0000 (16:31 -0800)]
KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte()

Check only the terminal leaf for a "!PRESENT || MMIO" SPTE when looking
for reserved bits on valid, non-MMIO SPTEs.  The get_walk() helpers
terminate their walks if a not-present or MMIO SPTE is encountered, i.e.
the non-terminal SPTEs have already been verified to be regular SPTEs.
This eliminates an extra check-and-branch in a relatively hot loop.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20201218003139.2167891[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86/mmu: Use raw level to index into MMIO walks' sptes array
Sean Christopherson [Fri, 18 Dec 2020 00:31:38 +0000 (16:31 -0800)]
KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array

Bump the size of the sptes array by one and use the raw level of the
SPTE to index into the sptes array.  Using the SPTE level directly
improves readability by eliminating the need to reason out why the level
is being adjusted when indexing the array.  The array is on the stack
and is not explicitly initialized; bumping its size is nothing more than
a superficial adjustment to the stack frame.

Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20201218003139.2167891[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
Sean Christopherson [Fri, 18 Dec 2020 00:31:37 +0000 (16:31 -0800)]
KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE

Get the so called "root" level from the low level shadow page table
walkers instead of manually attempting to calculate it higher up the
stack, e.g. in get_mmio_spte().  When KVM is using PAE shadow paging,
the starting level of the walk, from the callers perspective, is not
the CR3 root but rather the PDPTR "root".  Checking for reserved bits
from the CR3 root causes get_mmio_spte() to consume uninitialized stack
data due to indexing into sptes[] for a level that was not filled by
get_walk().  This can result in false positives and/or negatives
depending on what garbage happens to be on the stack.

Opportunistically nuke a few extra newlines.

Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Reported-by: Richard Herbert <[email protected]>
Cc: Ben Gardon <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20201218003139.2167891[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoKVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
Sean Christopherson [Fri, 18 Dec 2020 00:31:36 +0000 (16:31 -0800)]
KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()

Return -1 from the get_walk() helpers if the shadow walk doesn't fill at
least one spte, which can theoretically happen if the walk hits a
not-present PDPTR.  Returning the root level in such a case will cause
get_mmio_spte() to return garbage (uninitialized stack data).  In
practice, such a scenario should be impossible as KVM shouldn't get a
reserved-bit page fault with a not-present PDPTR.

Note, using mmu->root_level in get_walk() is wrong for other reasons,
too, but that's now a moot point.

Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Cc: Ben Gardon <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20201218003139.2167891[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
4 years agoMerge branch 'net-fix-netfilter-defrag-ip-tunnel-pmtu-blackhole'
Jakub Kicinski [Thu, 7 Jan 2021 22:42:36 +0000 (14:42 -0800)]
Merge branch 'net-fix-netfilter-defrag-ip-tunnel-pmtu-blackhole'

Florian Westphal says:

====================
net: fix netfilter defrag/ip tunnel pmtu blackhole

Christian Perle reported a PMTU blackhole due to unexpected interaction
between the ip defragmentation that comes with connection tracking and
ip tunnels.

Unfortunately setting 'nopmtudisc' on the tunnel breaks the test
scenario even without netfilter.

Christinas setup looks like this:
     +--------+       +---------+       +--------+
     |Router A|-------|Wanrouter|-------|Router B|
     |        |.IPIP..|         |..IPIP.|        |
     +--------+       +---------+       +--------+
          /             mtu 1400           \
         /                                  \
 +--------+                                  +--------+
 |Client A|                                  |Client B|
 +--------+                                  +--------+

MTU is 1500 everywhere, except on Router A to Wanrouter and
Wanrouter to Router B.

Router A and Router B use IPIP tunnel interfaces to tunnel traffic
between Client A and Client B over WAN.

Client A sends a 1400 byte UDP datagram to Client B.
This packet gets encapsulated in the IPIP tunnel.

This works, packet is received on client B.

When conntrack (or anything else that forces ip defragmentation) is
enabled on Router A, the packet gets dropped on Router A after
encapsulation because they exceed the link MTU.

Setting the 'nopmtudisc' flag on the IPIP tunnel makes things worse,
no packets pass even in the no-netfilter scenario.

Patch one is a reproducer script for selftest infra.

Patch two is a fix for 'nopmtudisc' behaviour so ip_tunnel will send
an icmp error to Client A.  This allows 'nopmtudisc' tunnel to forward
the UDP datagrams.

Patch three enables ip refragmentation for all reassembled packets, just
like ipv6.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonet: ip: always refragment ip defragmented packets
Florian Westphal [Tue, 5 Jan 2021 23:15:23 +0000 (00:15 +0100)]
net: ip: always refragment ip defragmented packets

Conntrack reassembly records the largest fragment size seen in IPCB.
However, when this gets forwarded/transmitted, fragmentation will only
be forced if one of the fragmented packets had the DF bit set.

In that case, a flag in IPCB will force fragmentation even if the
MTU is large enough.

This should work fine, but this breaks with ip tunnels.
Consider client that sends a UDP datagram of size X to another host.

The client fragments the datagram, so two packets, of size y and z, are
sent. DF bit is not set on any of these packets.

Middlebox netfilter reassembles those packets back to single size-X
packet, before routing decision.

packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit
isn't set.  At output time, ip refragmentation is skipped as well
because x is still smaller than the mtu of the output device.

If ttransmit device is an ip tunnel, the packet size increases to
x+overhead.

Also, tunnel might be configured to force DF bit on outer header.

In this case, packet will be dropped (exceeds MTU) and an ICMP error is
generated back to sender.

But sender already respects the announced MTU, all the packets that
it sent did fit the announced mtu.

Force refragmentation as per original sizes unconditionally so ip tunnel
will encapsulate the fragments instead.

The only other solution I see is to place ip refragmentation in
the ip_tunnel code to handle this case.

Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet")
Reported-by: Christian Perle <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agonet: fix pmtu check in nopmtudisc mode
Florian Westphal [Tue, 5 Jan 2021 23:15:22 +0000 (00:15 +0100)]
net: fix pmtu check in nopmtudisc mode

For some reason ip_tunnel insist on setting the DF bit anyway when the
inner header has the DF bit set, EVEN if the tunnel was configured with
'nopmtudisc'.

This means that the script added in the previous commit
cannot be made to work by adding the 'nopmtudisc' flag to the
ip tunnel configuration. Doing so breaks connectivity even for the
without-conntrack/netfilter scenario.

When nopmtudisc is set, the tunnel will skip the mtu check, so no
icmp error is sent to client. Then, because inner header has DF set,
the outer header gets added with DF bit set as well.

IP stack then sends an error to itself because the packet exceeds
the device MTU.

Fixes: 23a3647bc4f93 ("ip_tunnels: Use skb-len to PMTU check.")
Cc: Stefano Brivio <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoselftests: netfilter: add selftest for ipip pmtu discovery with enabled connection...
Florian Westphal [Tue, 5 Jan 2021 23:15:21 +0000 (00:15 +0100)]
selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking

Convert Christians bug description into a reproducer.

Cc: Shuah Khan <[email protected]>
Reported-by: Christian Perle <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoARC: unbork 5.11 bootup: fix snafu in _TIF_NOTIFY_SIGNAL handling
Vineet Gupta [Wed, 6 Jan 2021 20:34:36 +0000 (12:34 -0800)]
ARC: unbork 5.11 bootup: fix snafu in _TIF_NOTIFY_SIGNAL handling

Linux 5.11.rcX was failing to boot on ARC HSDK board. Turns out we have
a couple of issues, this being the first one, and I'm to blame as I
didn't pay attention during review.

TIF_NOTIFY_SIGNAL support requires checking multiple TIF_* bits in
kernel return code path. Old code only needed to check a single bit so
BBIT0 <TIF_SIGPENDING> worked. New code needs to check multiple bits so
AND <bit-mask> instruction. So needs to use bit mask variant _TIF_SIGPENDING

Cc: Jens Axboe <[email protected]>
Fixes: 53855e12588743ea128 ("arc: add support for TIF_NOTIFY_SIGNAL")
Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/34
Signed-off-by: Vineet Gupta <[email protected]>
4 years agodocs: admin-guide: bootconfig: Fix feils to fails
Bhaskar Chowdhury [Thu, 7 Jan 2021 12:56:10 +0000 (18:26 +0530)]
docs: admin-guide: bootconfig: Fix feils to fails

s/feils/fails/p

Signed-off-by: Bhaskar Chowdhury <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
4 years agoDocumentation/admin-guide: kernel-parameters: hyphenate comma-separated
Randy Dunlap [Fri, 1 Jan 2021 04:08:31 +0000 (20:08 -0800)]
Documentation/admin-guide: kernel-parameters: hyphenate comma-separated

Hyphenate "comma separated" when it is used as a compound adjective.
hyphenated.

Signed-off-by: Randy Dunlap <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
4 years agodocs: binfmt-misc: Fix .rst formatting
Jonathan Neuschäfer [Fri, 1 Jan 2021 21:14:47 +0000 (22:14 +0100)]
docs: binfmt-misc: Fix .rst formatting

"name below" is not part of the /proc path and should not be formatted
in monospace.

"doesn``t" is rendered in HTML with a double backtick. Revert it back to
"doesn't".

Signed-off-by: Jonathan Neuschäfer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
4 years agodocs: remove mention of ENABLE_MUST_CHECK
Miguel Ojeda [Tue, 5 Jan 2021 05:58:15 +0000 (06:58 +0100)]
docs: remove mention of ENABLE_MUST_CHECK

We removed ENABLE_MUST_CHECK in 196793946264 ("Compiler Attributes:
remove CONFIG_ENABLE_MUST_CHECK"), so let's remove docs' mentions.

At the same time, fix the outdated text related to
ENABLE_WARN_DEPRECATED that wasn't removed in 3337d5cfe5e08
("configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED").

Finally, reflow the paragraph.

Signed-off-by: Miguel Ojeda <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
4 years agodocs: octeontx2: tune rst markup
Lukas Bulwahn [Wed, 6 Jan 2021 16:17:35 +0000 (17:17 +0100)]
docs: octeontx2: tune rst markup

Commit 80b9414832a1 ("docs: octeontx2: Add Documentation for NPA health
reporters") added new documentation with improper formatting for rst, and
caused a few new warnings for make htmldocs in octeontx2.rst:169--202.

Tune markup and formatting for better presentation in the HTML view.

Signed-off-by: Lukas Bulwahn <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Acked-by: George Cherian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 years agoRDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd()
Tom Rix [Wed, 30 Dec 2020 02:46:53 +0000 (18:46 -0800)]
RDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd()

In ocrdma_dealloc_ucontext_pd() uctx->cntxt_pd is assigned to the variable
pd and then after uctx->cntxt_pd is freed, the variable pd is passed to
function _ocrdma_dealloc_pd() which dereferences pd directly or through
its call to ocrdma_mbx_dealloc_pd().

Reorder the free using the variable pd.

Cc: [email protected]
Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
4 years agonet/mlx5e: Fix memleak in mlx5e_create_l2_table_groups
Dinghao Liu [Mon, 21 Dec 2020 11:27:31 +0000 (19:27 +0800)]
net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups

When mlx5_create_flow_group() fails, ft->g should be
freed just like when kvzalloc() fails. The caller of
mlx5e_create_l2_table_groups() does not catch this
issue on failure, which leads to memleak.

Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table")
Signed-off-by: Dinghao Liu <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
This page took 0.144709 seconds and 4 git commands to generate.