Git Repo - linux.git/log

Merge tag 'kvm-s390-master-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix for running nested uner z/VM

There are circumstances when running nested under z/VM that would trigger a
WARN_ON_ONCE. Remove the WARN_ON_ONCE. Long term we certainly want to make this
code more robust and flexible, but just returning instead of WARNING makes
guest bootable again.

KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly

KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared
as supported. My wild guess is that userspaces like QEMU are using "#ifdef
KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be
wrong because the compilation host may not be the runtime host.

The userspace might still want to keep the old "#ifdef" though to not break the
guest debug on old kernels.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20200505154750 [email protected]>
[Do the same for PPC and s390. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Fix build for evmcs.h

I got this error when building kvm selftests:

/usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: multiple definition of `current_evmcs'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: first defined here
/usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: multiple definition of `current_vp_assist'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: first defined here

I think it's because evmcs.h is included both in a test file and a lib file so
the structs have multiple declarations when linking. After all it's not a good
habit to declare structs in the header files.

Cc: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20200504220607 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits

Using CPUID data can be useful for the processor compatibility
check, but that's it. Using it to compute guest-reserved bits
can have both false positives (such as LA57 and UMIP which we
are already handling) and false negatives: in particular, with
this patch we don't allow anymore a KVM guest to set CR4.PKE
when CR4.PKE is clear on the host.

Fixes: b9dd21e104bc ("KVM: x86: simplify handling of PKRU")
Reported-by: Jim Mattson <[email protected]>
Tested-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path

Clear CF and ZF in the VM-Exit path after doing __FILL_RETURN_BUFFER so
that KVM doesn't interpret clobbered RFLAGS as a VM-Fail. Filling the
RSB has always clobbered RFLAGS, its current incarnation just happens
clear CF and ZF in the processs. Relying on the macro to clear CF and
ZF is extremely fragile, e.g. commit 089dd8e53126e ("x86/speculation:
Change FILL_RETURN_BUFFER to work with objtool") tweaks the loop such
that the ZF flag is always set.

Reported-by: Qian Cai <[email protected]>
Cc: Rick Edgecombe <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Fixes: f2fde6a5bcfcf ("KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit")
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20200506035355 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

docs/virt/kvm: Document configuring and running nested guests

This is a rewrite of this[1] Wiki page with further enhancements. The
doc also includes a section on debugging problems in nested
environments, among other improvements.

[1] https://www.linux-kvm.org/page/Nested_Guests

Signed-off-by: Kashyap Chamarthy <[email protected]>
Message-Id: <20200505112839 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

RISC-V: Remove unused code from STRICT_KERNEL_RWX

This patch removes the unused functions set_kernel_text_rw/ro.
Currently, it is not being invoked from anywhere and no other architecture
(except arm) uses this code. Even in ARM, these functions are not invoked
from anywhere currently.

Fixes: d27c3c90817e ("riscv: add STRICT_KERNEL_RWX support")
Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Zong Li <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

Merge tag 'platform-drivers-x86-v5.7-2' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fixes from Andy Shevchenko:

- Avoid loading asus-nb-wmi module on selected laptop models

- Fix S0ix debug support for Jasper Lake PMC

- Few fixes which have been reported by Hulk bot and others

* tag 'platform-drivers-x86-v5.7-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: thinkpad_acpi: Remove always false 'value < 0' statement
  platform/x86: intel_pmc_core: avoid unused-function warnings
  platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TA
  platform/x86: intel_pmc_core: Change Jasper Lake S0ix debug reg map back to ICL
  platform/x86/intel-uncore-freq: make uncore_root_kobj static
  platform/x86: wmi: Make two functions static
  platform/x86: surface3_power: Fix a NULL vs IS_ERR() check in probe

neigh: send protocol value in neighbor create notification

When a new neighbor entry has been added, event is generated but it does not
include protocol, because its value is assigned after the event notification
routine has run, so move protocol assignment code earlier.

Fixes: df9b0e30d44c ("neighbor: Add protocol attribute")
Cc: David Ahern <[email protected]>
Signed-off-by: Roman Mashak <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/amd/display: Prevent dpcd reads with passive dongles

[why]
During hotplug, a DP port may be connected to the sink through
passive adapter which does not support DPCD reads. Issuing reads
without checking for this condition will result in errors

[how]
Ensure the link is in aux_mode before initiating operation that result
in a DPCD read.

Signed-off-by: Aurabindo Pillai <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Acked-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: fix counter in wait_for_no_pipes_pending

[Why]
Wait counter is not being reset for each pipe.

[How]
Move counter reset into pipe loop scope.

Signed-off-by: Roman Li <[email protected]>
Reviewed-by: Zhan Liu <[email protected]>
Acked-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Update DCN2.1 DV Code Revision

[WHY & HOW]
There is a problem in hscale_pixel_rate, the bug
causes DCN to be more optimistic (more likely to underflow)
in upscale cases during prefetch.
This commit ports the fix from DV code to address these issues.

Signed-off-by: Sung Lee <[email protected]>
Reviewed-by: Dmytro Laktyushkin <[email protected]>
Acked-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

io_uring: handle -EFAULT properly in io_uring_setup()

If copy_to_user() in io_uring_setup() failed, we'll leak many kernel
resources, which will be recycled until process terminates. This bug
can be reproduced by using mprotect to set params to PROT_READ. To fix
this issue, refactor io_uring_create() a bit to add a new 'struct
io_uring_params __user *params' parameter and move the copy_to_user()
in io_uring_setup() to io_uring_setup(), if copy_to_user() failed,
we can free kernel resource properly.

Suggested-by: Jens Axboe <[email protected]>
Signed-off-by: Xiaoguang Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

net: broadcom: fix a mistake about ioremap resource

Commit d7a5502b0bb8b ("net: broadcom: convert to
devm_platform_ioremap_resource_byname()") will broke this driver.
idm_base and nicpm_base were optional, after this change, they are
mandatory. it will probe fails with -22 when the dtb doesn't have them
defined. so revert part of this commit and make idm_base and nicpm_base
as optional.

Fixes: d7a5502b0bb8bde ("net: broadcom: convert to devm_platform_ioremap_resource_byname()")
Reported-by: Jonathan Richardson <[email protected]>
Cc: Scott Branden <[email protected]>
Cc: Ray Jui <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm: Fix HDCP failures when SRM fw is missing

The SRM cleanup in 79643fddd6eb2 ("drm/hdcp: optimizing the srm
handling") inadvertently altered the behavior of HDCP auth when
the SRM firmware is missing. Before that patch, missing SRM was
interpreted as the device having no revoked keys. With that patch,
if the SRM fw file is missing we reject _all_ keys.

This patch fixes that regression by returning success if the file
cannot be found. It also checks the return value from request_srm such
that we won't end up trying to parse the ksv list if there is an error
fetching it.

Fixes: 79643fddd6eb ("drm/hdcp: optimizing the srm handling")
Cc: [email protected]
Cc: Ramalingam C <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: [email protected]
Reviewed-by: Ramalingam C <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Changes in v2:
-Noticed a couple other things to clean up
Reviewed-by: Ramalingam C <[email protected]>

iocost: protect iocg->abs_vdebt with iocg->waitq.lock

abs_vdebt is an atomic_64 which tracks how much over budget a given cgroup
is and controls the activation of use_delay mechanism. Once a cgroup goes
over budget from forced IOs, it has to pay it back with its future budget.
The progress guarantee on debt paying comes from the iocg being active -
active iocgs are processed by the periodic timer, which ensures that as time
passes the debts dissipate and the iocg returns to normal operation.

However, both iocg activation and vdebt handling are asynchronous and a
sequence like the following may happen.

1. The iocg is in the process of being deactivated by the periodic timer.

2. A bio enters ioc_rqos_throttle(), calls iocg_activate() which returns
   without anything because it still sees that the iocg is already active.

3. The iocg is deactivated.

4. The bio from #2 is over budget but needs to be forced. It increases
   abs_vdebt and goes over the threshold and enables use_delay.

5. IO control is enabled for the iocg's subtree and now IOs are attributed
   to the descendant cgroups and the iocg itself no longer issues IOs.

This leaves the iocg with stuck abs_vdebt - it has debt but inactive and no
further IOs which can activate it. This can end up unduly punishing all the
descendants cgroups.

The usual throttling path has the same issue - the iocg must be active while
throttled to ensure that future event will wake it up - and solves the
problem by synchronizing the throttling path with a spinlock. abs_vdebt
handling is another form of overage handling and shares a lot of
characteristics including the fact that it isn't in the hottest path.

This patch fixes the above and other possible races by strictly
synchronizing abs_vdebt and use_delay handling with iocg->waitq.lock.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Vlad Dmitriev <[email protected]>
Cc: [email protected] # v5.4+
Fixes: e1518f63f246 ("blk-iocost: Don't let merges push vtime into the future")
Signed-off-by: Jens Axboe <[email protected]>

bus: mhi: core: Fix channel device name conflict

When multiple instances of the same MHI product are present in a system,
we can see a splat from mhi_create_devices() - "sysfs: cannot create
duplicate filename".

This is because the device names assigned to the MHI channel devices are
non-unique.  They consist of the channel's name, and the channel's pipe
id.  For identical products, each instance is going to have the same
set of channel (both in name and pipe id).

To fix this, we prepend the device name of the parent device that the
MHI channels belong to.  Since different instances of the same product
should have unique device names, this makes the MHI channel devices for
each product also unique.

Additionally, remove the pipe id from the MHI channel device name.  This
is an internal detail to the MHI product that provides little value, and
imposes too much device specific internal details to userspace.  It is
expected that channel with a specific name (ie "SAHARA") has a specific
client, and it does not matter what pipe id that channel is enumerated on.
The pipe id is an internal detail between the MHI bus, and the hardware.
The client is not expected to make decisions based on the pipe id, and to
do so would require the client to have intimate knowledge of the hardware,
which is inappropiate as it may violate the layering provided by the MHI
bus.  The limitation of doing this is that each product may only have one
instance of a channel by a unique name.  This limitation is appropriate
given the usecases of MHI channels.

Signed-off-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Hemant Kumar <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

bus: mhi: core: Fix typo in comment

There is a typo - "runtimet" should be "runtime". Fix it.

Signed-off-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Hemant Kumar <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

bus: mhi: core: Offload register accesses to the controller

When reading or writing MHI registers, the core assumes that the physical
link is a memory mapped PCI link. This assumption may not hold for all
MHI devices. The controller knows what is the physical link (ie PCI, I2C,
SPI, etc), and therefore knows the proper methods to access that link.
The controller can also handle link specific error scenarios, such as
reading -1 when the PCI link went down.

Therefore, it is appropriate that the MHI core requests the controller to
make register accesses on behalf of the core, which abstracts the core
from link specifics, and end up removing an unnecessary assumption.

Signed-off-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Hemant Kumar <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

bus: mhi: core: Remove link_status() callback

If the MHI core detects invalid data due to a PCI read, it calls into
the controller via link_status() to double check that the link is infact
down.  All in all, this is pretty pointless, and racy.  There are no good
reasons for this, and only drawbacks.

Its pointless because chances are, the controller is going to do the same
thing to determine if the link is down - attempt a PCI access and compare
the result.  This does not make the link status decision any smarter.

Its racy because its possible that the link was down at the time of the
MHI core access, but then recovered before the controller access.  In this
case, the controller will indicate the link is not down, and the MHI core
will precede to use a bad value as the MHI core does not attempt to retry
the access.

Retrying the access in the MHI core is a bad idea because again, it is
racy - what if the link is down again?  Furthermore, there may be some
higher level state associated with the link status, that is now invalid
because the link went down.

The only reason why the MHI core could see "invalid" data when doing a PCI
access, that is actually valid, is if the register actually contained the
PCI spec defined sentinel for an invalid access.  In this case, it is
arguable that the MHI implementation broken, and should be fixed, not
worked around.

Therefore, remove the link_status() callback before anyone attempts to
implement it.

Signed-off-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Reviewed-by: Hemant Kumar <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails

Powerdown is necessary if mhi_sync_power_up fails due to a timeout, to
clean up the resources. Otherwise a BUG could be triggered when
attempting to clean up MSIs because the IRQ is still active from a
request_irq().

Signed-off-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Hemant Kumar <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

bus: mhi: Fix parsing of mhi_flags

With the current parsing of mhi_flags, the following statement always
return false:

eob = !!(flags & MHI_EOB);

This is due to the fact that 'enum mhi_flags' starts with index 0 and we
are using direct AND operation to extract each bit. Fix this by using
BIT() macros for defining the flags so that the reset of the code need not
be touched.

Fixes: 189ff97cca53 ("bus: mhi: core: Add support for data transfer")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

mei: me: disable mei interface on LBG servers.

Disable the MEI driver on LBG SPS (server) platforms, some corner
flows such as recovery mode does not work, and the driver
doesn't have working use cases.

Cc: <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/amd: Do not flush Device Table in iommu_map_page()

The flush of the Device Table Entries for the domain has already
happened in increase_address_space(), if necessary. Do no flush them
again in iommu_map_page().

Signed-off-by: Joerg Roedel <[email protected]>
Tested-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

iommu/amd: Update Device Table in increase_address_space()

The Device Table needs to be updated before the new page-table root
can be published in domain->pt_root. Otherwise a concurrent call to
fetch_pte might fetch a PTE which is not reachable through the Device
Table Entry.

Fixes: 92d420ec028d ("iommu/amd: Relax locking in dma_ops path")
Reported-by: Qian Cai <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Tested-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

iommu/amd: Call domain_flush_complete() in update_domain()

The update_domain() function is expected to also inform the hardware
about domain changes. This needs a COMPLETION_WAIT command to be sent
to all IOMMUs which use the domain.

Signed-off-by: Joerg Roedel <[email protected]>
Tested-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

iommu/amd: Do not loop forever when trying to increase address space

When increase_address_space() fails to allocate memory, alloc_pte()
will call it again until it succeeds. Do not loop forever while trying
to increase the address space and just return an error instead.

Signed-off-by: Joerg Roedel <[email protected]>
Tested-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

iommu/amd: Fix race in increase_address_space()/fetch_pte()

The 'pt_root' and 'mode' struct members of 'struct protection_domain'
need to be get/set atomically, otherwise the page-table of the domain
can get corrupted.

Merge the fields into one atomic64_t struct member which can be
get/set atomically.

Fixes: 92d420ec028d ("iommu/amd: Relax locking in dma_ops path")
Reported-by: Qian Cai <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Tested-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

platform/x86: thinkpad_acpi: Remove always false 'value < 0' statement

Since 'value' is declared as unsigned long, the following statement is
always false.
value < 0

So let's remove it.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Xiongfeng Wang <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>

platform/x86: intel_pmc_core: avoid unused-function warnings

When both CONFIG_DEBUG_FS and CONFIG_PM_SLEEP are disabled, the
functions that got moved out of the #ifdef section now cause
a warning:

drivers/platform/x86/intel_pmc_core.c:654:13: error: 'pmc_core_lpm_display' defined but not used [-Werror=unused-function]
  654 | static void pmc_core_lpm_display(struct pmc_dev *pmcdev, struct device *dev,
      |             ^~~~~~~~~~~~~~~~~~~~
drivers/platform/x86/intel_pmc_core.c:617:13: error: 'pmc_core_slps0_display' defined but not used [-Werror=unused-function]
  617 | static void pmc_core_slps0_display(struct pmc_dev *pmcdev, struct device *dev,
      |             ^~~~~~~~~~~~~~~~~~~~~~

Rather than add even more #ifdefs here, remove them entirely and
let the compiler work it out, it can actually get rid of all the
debugfs calls without problems as long as the struct member is
there.

The two PM functions just need a __maybe_unused annotations to avoid
another warning instead of the #ifdef.

Fixes: aae43c2bcdc1 ("platform/x86: intel_pmc_core: Relocate pmc_core_*_display() to outside of CONFIG_DEBUG_FS")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>

platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TA

asus-nb-wmi does not add any extra functionality on these Asus
Transformer books. They have detachable keyboards, so the hotkeys are
send through a HID device (and handled by the hid-asus driver) and also
the rfkill functionality is not used on these devices.

Besides not adding any extra functionality, initializing the WMI interface
on these devices actually has a negative side-effect. For some reason
the \_SB.ATKD.INIT() function which asus_wmi_platform_init() calls drives
GPO2 (INT33FC:02) pin 8, which is connected to the front facing webcam LED,
high and there is no (WMI or other) interface to drive this low again
causing the LED to be permanently on, even during suspend.

This commit adds a blacklist of DMI system_ids on which not to load the
asus-nb-wmi and adds these Transformer books to this list. This fixes
the webcam LED being permanently on under Linux.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>

platform/x86: intel_pmc_core: Change Jasper Lake S0ix debug reg map back to ICL

Jasper Lake uses Icelake PCH IPs and the S0ix debug interfaces are same as
Icelake. It uses SLP_S0_DBG register latch/read interface from Icelake
generation. It doesn't use Tiger Lake LPM debug registers. Change the
Jasper Lake S0ix debug interface to use the ICL reg map.

Fixes: 16292bed9c56 ("platform/x86: intel_pmc_core: Add Atom based Jasper Lake (JSL) platform support")
Signed-off-by: Archana Patni <[email protected]>
Acked-by: David E. Box <[email protected]>
Tested-by: Divagar Mohandass <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>

usb: typec: mux: intel: Handle alt mode HPD_HIGH

According to the PMC Type C Subsystem (TCSS) Mux programming guide rev
0.6, when a device is transitioning to DP Alternate Mode state, if the
HPD_STATE (bit 7) field in the status update command VDO is set to
HPD_HIGH, the HPD_HIGH field in the Alternate Mode request “mode_data”
field (bit 14) should also be set. Ensure the bit is correctly handled
while issuing the Alternate Mode request.

Signed-off-by: Prashant Malani <[email protected]>
Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control")
Acked-by: Heikki Krogerus <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: usbfs: correct kernel->user page attribute mismatch

On some architectures (e.g. arm64) requests for
IO coherent memory may use non-cachable attributes if
the relevant device isn't cache coherent. If these
pages are then remapped into userspace as cacheable,
they may not be coherent with the non-cacheable mappings.

In particular this happens with libusb, when it attempts
to create zero-copy buffers for use by rtl-sdr
(https://github.com/osmocom/rtl-sdr/). On low end arm
devices with non-coherent USB ports, the application will
be unexpectedly killed, while continuing to work fine on
arm machines with coherent USB controllers.

This bug has been discovered/reported a few times over
the last few years. In the case of rtl-sdr a compile time
option to enable/disable zero copy was implemented to
work around it.

Rather than relaying on application specific workarounds,
dma_mmap_coherent() can be used instead of remap_pfn_range().
The page cache/etc attributes will then be correctly set in
userspace to match the kernel mapping.

Signed-off-by: Jeremy Linton <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: intel_pmc_mux: Fix the property names

The device property names for the port index number are
"usb2-port-number" and "usb3-port-number", not "usb2-port"
and "usb3-port".

Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control")
Signed-off-by: Heikki Krogerus <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: core: Fix misleading driver bug report

The syzbot fuzzer found a race between URB submission to endpoint 0
and device reset.  Namely, during the reset we call usb_ep0_reinit()
because the characteristics of ep0 may have changed (if the reset
follows a firmware update, for example).  While usb_ep0_reinit() is
running there is a brief period during which the pointers stored in
udev->ep_in[0] and udev->ep_out[0] are set to NULL, and if an URB is
submitted to ep0 during that period, usb_urb_ep_type_check() will
report it as a driver bug.  In the absence of those pointers, the
routine thinks that the endpoint doesn't exist.  The log message looks
like this:

------------[ cut here ]------------
usb 2-1: BOGUS urb xfer, pipe 2 != type 2
WARNING: CPU: 0 PID: 9241 at drivers/usb/core/urb.c:478
usb_submit_urb+0x1188/0x1460 drivers/usb/core/urb.c:478

Now, although submitting an URB while the device is being reset is a
questionable thing to do, it shouldn't count as a driver bug as severe
as submitting an URB for an endpoint that doesn't exist.  Indeed,
endpoint 0 always exists, even while the device is in its unconfigured
state.

To prevent these misleading driver bug reports, this patch updates
usb_disable_endpoint() to avoid clearing the ep_in[] and ep_out[]
pointers when the endpoint being disabled is ep0.  There's no danger
of leaving a stale pointer in place, because the usb_host_endpoint
structure being pointed to is stored permanently in udev->ep0; it
doesn't get deallocated until the entire usb_device structure does.

Reported-and-tested-by: [email protected]
Signed-off-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: gasket: Check the return value of gasket_get_bar_index()

Check the return value of gasket_get_bar_index function as it can return
a negative one (-EINVAL). If this happens, a negative index is used in
the "gasket_dev->bar_data" array.

Addresses-Coverity-ID: 1438542 ("Negative array index read")
Fixes: 9a69f5087ccc2 ("drivers/staging: Gasket driver framework + Apex driver")
Signed-off-by: Oscar Carter <[email protected]>
Cc: stable <[email protected]>
Reviewed-by: Richard Yeh <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: ks7010: remove me from CC list

I lost interest in this driver years ago because I could't keep up with
testing the incoming janitorial patches. So, drop me from CC.

Signed-off-by: Wolfram Sang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction

In LPAR we will only get an intercept for FC==3 for the PQAP
instruction. Running nested under z/VM can result in other intercepts as
well as ECA_APIE is an effective bit: If one hypervisor layer has
turned this bit off, the end result will be that we will get intercepts for
all function codes. Usually the first one will be a query like PQAP(QCI).
So the WARN_ON_ONCE is not right. Let us simply remove it.

Cc: Pierre Morel <[email protected]>
Cc: Tony Krowiak <[email protected]>
Cc: [email protected] # v5.3+
Fixes: e5282de93105 ("s390: ap: kvm: add PQAP interception for AQIC")
Link: https://lore.kernel.org/kvm/[email protected]
Reported-by: Qian Cai <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- Wacom driver functional and regression fixes from Jason Gerecke

- race condition fix in usbhid, found by syzbot and fixed by Alan Stern

- a few device-specific quirks and ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: quirks: Add HID_QUIRK_NO_INIT_REPORTS quirk for Dell K12A keyboard-dock
  HID: mcp2221: add gpiolib dependency
  HID: i2c-hid: reset Synaptics SYNA2393 on resume
  HID: wacom: Report 2nd-gen Intuos Pro S center button status over BT
  HID: usbhid: Fix race between usbhid_close() and usbhid_stop()
  Revert "HID: wacom: generic: read the number of expected touches on a per collection basis"
  HID: alps: ALPS_1657 is too specific; use U1_UNICORN_LEGACY instead
  HID: alps: Add AUI1657 device ID
  HID: logitech: Add support for Logitech G11 extra keys
  HID: multitouch: add eGalaxTouch P80H84 support
  HID: wacom: Read HID_DG_CONTACTMAX directly for non-generic devices

riscv: force __cpu_up_ variables to put in data section

Put __cpu_up_stack_pointer and __cpu_up_task_pointer in data section.
Currently, these two variables are put in bss section, there is a
potential risk that secondary harts get the uninitialized value before
main hart finishing the bss clearing. In this case, all secondary
harts would pass the waiting loop and enable the MMU before main hart
set up the page table.

This issue happens on random booting of multiple harts, which means
it will manifest for BBL and OpenSBI v0.6 (or older version). In OpenSBI
v0.7 (or higher version), we have HSM extension so all the secondary harts
are brought-up by Linux kernel in an orderly fashion. This means we don't
need this change for OpenSBI v0.7 (or higher version).

Signed-off-by: Zong Li <[email protected]>
Reviewed-by: Greentime Hu <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Atish Patra <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: add Linux note to vdso

The Linux note in the vdso allows glibc to check the running kernel
version without having to issue the uname syscall.

Signed-off-by: Andreas Schwab <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: set max_pfn to the PFN of the last page

The current max_pfn equals to zero. In this case, I found it caused users
cannot get some page information through /proc such as kpagecount in v5.6
kernel because of new sanity checks. The following message is displayed by
stress-ng test suite with the command "stress-ng --verbose --physpage 1 -t
1" on HiFive unleashed board.

# stress-ng --verbose --physpage 1 -t 1
stress-ng: debug: [109] 4 processors online, 4 processors configured
stress-ng: info: [109] dispatching hogs: 1 physpage
stress-ng: debug: [109] cache allocate: reducing cache level from L3 (too high) to L0
stress-ng: debug: [109] get_cpu_cache: invalid cache_level: 0
stress-ng: info: [109] cache allocate: using built-in defaults as no suitable cache found
stress-ng: debug: [109] cache allocate: default cache size: 2048K
stress-ng: debug: [109] starting stressors
stress-ng: debug: [109] 1 stressor spawned
stress-ng: debug: [110] stress-ng-physpage: started [110] (instance 0)
stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd34de000 in /proc/kpagecount, errno=0 (Success)
stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd32db078 in /proc/kpagecount, errno=0 (Success)
...
stress-ng: error: [110] stress-ng-physpage: cannot read page count for address 0x3fd32db078 in /proc/kpagecount, errno=0 (Success)
stress-ng: debug: [110] stress-ng-physpage: exited [110] (instance 0)
stress-ng: debug: [109] process [110] terminated
stress-ng: info: [109] successful run completed in 1.00s
#

After applying this patch, the kernel can pass the test.

# stress-ng --verbose --physpage 1 -t 1
stress-ng: debug: [104] 4 processors online, 4 processors configured stress-ng: info: [104] dispatching hogs: 1 physpage
stress-ng: info: [104] cache allocate: using defaults, can't determine cache details from sysfs
stress-ng: debug: [104] cache allocate: default cache size: 2048K
stress-ng: debug: [104] starting stressors
stress-ng: debug: [104] 1 stressor spawned
stress-ng: debug: [105] stress-ng-physpage: started [105] (instance 0) stress-ng: debug: [105] stress-ng-physpage: exited [105] (instance 0) stress-ng: debug: [104] process [105] terminated
stress-ng: info: [104] successful run completed in 1.01s
#

Cc: [email protected]
Signed-off-by: Vincent Chen <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Yash Shah <[email protected]>
Tested-by: Yash Shah <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

RISC-V: Remove N-extension related defines

The RISC-V N-extension is still in draft state hence remove
N-extension related defines from asm/csr.h.

Signed-off-by: Anup Patel <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

RISC-V: Add bitmap reprensenting ISA features common across CPUs

This patch adds riscv_isa bitmap which represents Host ISA features
common across all Host CPUs. The riscv_isa is not same as elf_hwcap
because elf_hwcap will only have ISA features relevant for user-space
apps whereas riscv_isa will have ISA features relevant to both kernel
and user-space apps.

One of the use-case for riscv_isa bitmap is in KVM hypervisor where
we will use it to do following operations:

1. Check whether hypervisor extension is available
2. Find ISA features that need to be virtualized (e.g. floating
point support, vector extension, etc.)

Signed-off-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Alexander Graf <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

RISC-V: Export riscv_cpuid_to_hartid_mask() API

The riscv_cpuid_to_hartid_mask() API should be exported to allow
building KVM RISC-V as loadable module.

Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

nfp: abm: fix a memory leak bug

In function nfp_abm_vnic_set_mac, pointer nsp is allocated by nfp_nsp_open.
But when nfp_nsp_has_hwinfo_lookup fail, the pointer is not released,
which can lead to a memory leak bug. Fix this issue by adding
nfp_nsp_close(nsp) in the error path.

Fixes: f6e71efdf9fb1 ("nfp: abm: look up MAC addresses via management FW")
Signed-off-by: Qiushi Wu <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

atm: fix a memory leak of vcc->user_back

In lec_arp_clear_vccs() only entry->vcc is freed, but vcc
could be installed on entry->recv_vcc too in lec_vcc_added().

This fixes the following memory leak:

unreferenced object 0xffff8880d9266b90 (size 16):
  comm "atm2", pid 425, jiffies 4294907980 (age 23.488s)
  hex dump (first 16 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 6b 6b 6b a5  ............kkk.
  backtrace:
    [<(____ptrval____)>] kmem_cache_alloc_trace+0x10e/0x151
    [<(____ptrval____)>] lane_ioctl+0x4b3/0x569
    [<(____ptrval____)>] do_vcc_ioctl+0x1ea/0x236
    [<(____ptrval____)>] svc_ioctl+0x17d/0x198
    [<(____ptrval____)>] sock_do_ioctl+0x47/0x12f
    [<(____ptrval____)>] sock_ioctl+0x2f9/0x322
    [<(____ptrval____)>] vfs_ioctl+0x1e/0x2b
    [<(____ptrval____)>] ksys_ioctl+0x61/0x80
    [<(____ptrval____)>] __x64_sys_ioctl+0x16/0x19
    [<(____ptrval____)>] do_syscall_64+0x57/0x65
    [<(____ptrval____)>] entry_SYSCALL_64_after_hwframe+0x49/0xb3

Cc: Gengming Liu <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

atm: fix a UAF in lec_arp_clear_vccs()

Gengming reported a UAF in lec_arp_clear_vccs(),
where we add a vcc socket to an entry in a per-device
list but free the socket without removing it from the
list when vcc->dev is NULL.

We need to call lec_vcc_close() to search and remove
those entries contain the vcc being destroyed. This can
be done by calling vcc->push(vcc, NULL) unconditionally
in vcc_destroy_socket().

Another issue discovered by Gengming's reproducer is
the vcc->dev may point to the static device lecatm_dev,
for which we don't need to register/unregister device,
so we can just check for vcc->dev->ops->owner.

Reported-by: Gengming Liu <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: gmac5+: fix potential integer overflow on 32 bit multiply

The multiplication of cfg->ctr[1] by 1000000000 is performed using a
32 bit multiplication (since cfg->ctr[1] is a u32) and this can lead
to a potential overflow. Fix this by making the constant a ULL to
ensure a 64 bit multiply occurs.

Fixes: 504723af0d85 ("net: stmmac: Add basic EST support for GMAC5+")
Addresses-Coverity: ("Unintentional integer overflow")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: fix tcm_parent in tc filter dump

When we tell kernel to dump filters from root (ffff:ffff),
those filters on ingress (ffff:0000) are matched, but their
true parents must be dumped as they are. However, kernel
dumps just whatever we tell it, that is either ffff:ffff
or ffff:0000:

$ nl-cls-list --dev=dummy0 --parent=root
cls basic dev dummy0 id none parent root prio 49152 protocol ip match-all
cls basic dev dummy0 id :1 parent root prio 49152 protocol ip match-all
$ nl-cls-list --dev=dummy0 --parent=ffff:
cls basic dev dummy0 id none parent ffff: prio 49152 protocol ip match-all
cls basic dev dummy0 id :1 parent ffff: prio 49152 protocol ip match-all

This is confusing and misleading, more importantly this is
a regression since 4.15, so the old behavior must be restored.

And, when tc filters are installed on a tc class, the parent
should be the classid, rather than the qdisc handle. Commit
edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto")
removed the classid we save for filters, we can just restore
this classid in tcf_block.

Steps to reproduce this:
ip li set dev dummy0 up
tc qd add dev dummy0 ingress
tc filter add dev dummy0 parent ffff: protocol arp basic action pass
tc filter show dev dummy0 root

Before this patch:
filter protocol arp pref 49152 basic
filter protocol arp pref 49152 basic handle 0x1
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1

After this patch:
filter parent ffff: protocol arp pref 49152 basic
filter parent ffff: protocol arp pref 49152 basic handle 0x1
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1

Fixes: a10fa20101ae ("net: sched: propagate q and parent from caller down to tcf_fill_node")
Fixes: edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto")
Cc: Jamal Hadi Salim <[email protected]>
Cc: Jiri Pirko <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4/chcr: avoid -Wreturn-local-addr warning

gcc-10 warns about functions that return a pointer to a stack
variable. In chcr_write_cpl_set_tcb_ulp(), this does not actually
happen, but it's too hard to see for the compiler:

drivers/crypto/chelsio/chcr_ktls.c: In function 'chcr_write_cpl_set_tcb_ulp.constprop':
drivers/crypto/chelsio/chcr_ktls.c:760:9: error: function may return address of local variable [-Werror=return-local-addr]
  760 |  return pos;
      |         ^~~
drivers/crypto/chelsio/chcr_ktls.c:712:5: note: declared here
  712 |  u8 buf[48] = {0};
      |     ^~~

Split the middle part of the function out into a helper to make
it easier to understand by both humans and compilers, which avoids
the warning.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: fix cancelling of TX timer on dev_close()

With the introduction of TX coalescing, .ndo_start_xmit now potentially
starts the TX completion timer. So only kill the timer _after_ TX has
been disabled.

Fixes: ee1e52d1e4bb ("s390/qeth: add TX IRQ coalescing support for IQD devices")
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'gcc-plugins-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc-plugins fixes from Kees Cook:
"GCC 10 fixes for gcc-plugins:

   - Adjust caller of cgraph_create_edge for GCC 10 argument usage

   - Update common headers to build under GCC 10 (Frédéric Pierret)"

* tag 'gcc-plugins-v5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-common.h: Update for GCC 10
  gcc-plugins/stackleak: Avoid assignment for unused macro argument

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"A couple of bug fixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: vsock: kick send_pkt worker once device is started
virtio-blk: handle block_device_operations callbacks after hot unplug

net: enetc: fix an issue about leak system resources

the related system resources were not released when enetc_hw_alloc()
return error in the enetc_pci_mdio_probe(), add iounmap() for error
handling label "err_hw_alloc" to fix it.

Fixes: 6517798dd3432a ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
Cc: Andy Shevchenko <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'flexible-array-member-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull flex-array reverts from Gustavo Silva:
"This reverts flexible array changes in include/uapi/

  These structures can get embedded in other structures in user-space
  and cause all sorts of warnings and problems[1]. So, we better don't
  take any chances and keep the zero-length arrays in place for now"

[1] https://lore.kernel.org/lkml/20200424121553 [email protected]/

* tag 'flexible-array-member-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  uapi: revert flexible-array conversions

net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()

When ENOSPC is set the idx is still valid and gets set to the global
MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that
ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:

drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be
used uninitialized in this function [-Wmaybe-uninitialized]
2552 | priv->def_counter[port] = idx;

Also, when ENOSPC is returned mlx4_allocate_default_counters should not
fail.

Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port")
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: Fix reporter's recovery condition

Devlink health core conditions the reporter's recovery with the
expiration of the grace period. This is not relevant for the first
recovery. Explicitly demand that the grace period will only apply to
recoveries other than the first.

Fixes: c8e1da0bf923 ("devlink: Add health report functionality")
Signed-off-by: Aya Levin <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

stmmac: fix pointer check after utilization in stmmac_interrupt

The paranoidal pointer check in IRQ handler looks very strange - it
really protects us only against bogus drivers which request IRQ line
with null pointer dev_id. However, the code fragment is incorrect
because the dev pointer is used before the actual check which leads
to undefined behavior. Remove the check to avoid confusing people
with incorrect code.

Signed-off-by: Maxim Petrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix partial topology connection closure

When an application connects to the TIPC topology server and subscribes
to some services, a new connection is created along with some objects -
'tipc_subscription' to store related data correspondingly...
However, there is one omission in the connection handling that when the
connection or application is orderly shutdown (e.g. via SIGQUIT, etc.),
the connection is not closed in kernel, the 'tipc_subscription' objects
are not freed too.
This results in:
- The maximum number of subscriptions (65535) will be reached soon, new
subscriptions will be rejected;
- TIPC module cannot be removed (unless the objects are somehow forced
to release first);

The commit fixes the issue by closing the connection if the 'recvmsg()'
returns '0' i.e. when the peer is shutdown gracefully. It also includes
the other unexpected cases.

Acked-by: Jon Maloy <[email protected]>
Acked-by: Ying Xue <[email protected]>
Signed-off-by: Tuong Lien <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

MAINTAINERS: remove myself as ceph co-maintainer

Jeff, Ilya, and Dongsheng are doing all of the Ceph maintainance
these days.

[ idryomov: Remove Sage's git tree too, it hasn't been pushed to
in years. ]

Signed-off-by: Sage Weil <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

net: dsa: Do not make user port errors fatal

Prior to 1d27732f411d ("net: dsa: setup and teardown ports"), we would
not treat failures to set-up an user port as fatal, but after this
commit we would, which is a regression for some systems where interfaces
may be declared in the Device Tree, but the underlying hardware may not
be present (pluggable daughter cards for instance).

Fixes: 1d27732f411d ("net: dsa: setup and teardown ports")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ceph: fix double unlock in handle_cap_export()

If the ceph_mdsc_open_export_target_session() return fails, it will
do a "goto retry", but the session mutex has already been unlocked.
Re-lock the mutex in that case to ensure that we don't unlock it
twice.

Signed-off-by: Wu Bo <[email protected]>
Reviewed-by: "Yan, Zheng" <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: fix special error code in ceph_try_get_caps()

There are 3 speical error codes: -EAGAIN/-EFBIG/-ESTALE.
After calling try_get_cap_refs, ceph_try_get_caps test for the
-EAGAIN twice. Ensure that it tests for -ESTALE instead.

Signed-off-by: Wu Bo <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: fix endianness bug when handling MDS session feature bits

Eduard reported a problem mounting cephfs on s390 arch. The feature
mask sent by the MDS is little-endian, so we need to convert it
before storing and testing against it.

Cc: [email protected]
Reported-and-Tested-by: Eduard Shishkin <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: "Yan, Zheng" <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

tty: xilinx_uartps: Fix missing id assignment to the console

When serial console has been assigned to ttyPS1 (which is serial1 alias)
console index is not updated property and pointing to index -1 (statically
initialized) which ends up in situation where nothing has been printed on
the port.

The commit 18cc7ac8a28e ("Revert "serial: uartps: Register own uart console
and driver structures"") didn't contain this line which was removed by
accident.

Fixes: 18cc7ac8a28e ("Revert "serial: uartps: Register own uart console and driver structures"")
Signed-off-by: Shubhrajyoti Datta <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Michal Simek <[email protected]>
Link: https://lore.kernel.org/r/ed3111533ef5bd342ee5ec504812240b870f0853.1588602446.git.michal.simek@xilinx.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>

uapi: revert flexible-array conversions

These structures can get embedded in other structures in user-space
and cause all sorts of warnings and problems. So, we better don't take
any chances and keep the zero-length arrays in place for now.

Signed-off-by: Gustavo A. R. Silva <[email protected]>

kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts

Commit f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI") introduces
the following infinite loop:

BUG: stack guard page was hit at 000000008f595917 \
(stack is 00000000bdefe5a4..00000000ae2b06f5)
kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI
RIP: 0010:kvm_set_irq+0x51/0x160 [kvm]
Call Trace:
irqfd_resampler_ack+0x32/0x90 [kvm]
kvm_notify_acked_irq+0x62/0xd0 [kvm]
kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
ioapic_set_irq+0x20e/0x240 [kvm]
kvm_ioapic_set_irq+0x5c/0x80 [kvm]
kvm_set_irq+0xbb/0x160 [kvm]
? kvm_hv_set_sint+0x20/0x20 [kvm]
irqfd_resampler_ack+0x32/0x90 [kvm]
kvm_notify_acked_irq+0x62/0xd0 [kvm]
kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
ioapic_set_irq+0x20e/0x240 [kvm]
kvm_ioapic_set_irq+0x5c/0x80 [kvm]
kvm_set_irq+0xbb/0x160 [kvm]
? kvm_hv_set_sint+0x20/0x20 [kvm]
....

The re-entrancy happens because the irq state is the OR of
the interrupt state and the resamplefd state.  That is, we don't
want to show the state as 0 until we've had a chance to set the
resamplefd.  But if the interrupt has _not_ gone low then
ioapic_set_irq is invoked again, causing an infinite loop.

This can only happen for a level-triggered interrupt, otherwise
irqfd_inject would immediately set the KVM_USERSPACE_IRQ_SOURCE_ID high
and then low.  Fortunately, in the case of level-triggered interrupts the VMEXIT already happens because
TMR is set.  Thus, fix the bug by restricting the lazy invocation
of the ack notifier to edge-triggered interrupts, the only ones that
need it.

Tested-by: Suravee Suthikulpanit <[email protected]>
Reported-by: [email protected]
Suggested-by: Paolo Bonzini <[email protected]>
Link: https://www.spinics.net/lists/kvm/msg213512.html
Fixes: f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207489
Signed-off-by: Paolo Bonzini <[email protected]>

USB: serial: qcserial: Add DW5816e support

Add support for Dell Wireless 5816e to drivers/usb/serial/qcserial.c

Signed-off-by: Matt Jolly <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>

KVM: x86: Fixes posted interrupt check for IRQs delivery modes

Current logic incorrectly uses the enum ioapic_irq_destination_types
to check the posted interrupt destination types. However, the value was
set using APIC_DM_XXX macros, which are left-shifted by 8 bits.

Fixes by using the APIC_DM_FIXED and APIC_DM_LOWEST instead.

Fixes: (fdcf75621375 'KVM: x86: Disable posted interrupts for non-standard IRQs delivery modes')
Cc: Alexander Graf <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
Message-Id: <1586239989 [email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Tested-by: Maxim Levitsky <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

gcc-10 warnings: fix low-hanging fruit

Due to a bug-report that was compiler-dependent, I updated one of my
machines to gcc-10. That shows a lot of new warnings. Happily they
seem to be mostly the valid kind, but it's going to cause a round of
churn for getting rid of them..

This is the really low-hanging fruit of removing a couple of zero-sized
arrays in some core code. We have had a round of these patches before,
and we'll have many more coming, and there is nothing special about
these except that they were particularly trivial, and triggered more
warnings than most.

Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'kvmarm-fixes-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for Linux 5.7, take #2

- Fix compilation with Clang
- Correctly initialize GICv4.1 in the absence of a virtual ITS
- Move SP_EL0 save/restore to the guest entry/exit code
- Handle PC wrap around on 32bit guests, and narrow all 32bit
registers on userspace access

Merge tag 'kvmarm-fixes-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for Linux 5.7, take #1

- Prevent the userspace API from interacting directly with the HW
  stage of the virtual GIC
- Fix a couple of vGIC memory leaks
- Tighten the rules around the use of the 32bit PSCI functions
  for 64bit guest, as well as the opposite situation (matches the
  specification)

KVM: SVM: fill in kvm_run->debug.arch.dr[67]

The corresponding code was added for VMX in commit 42dbaa5a057
("KVM: x86: Virtualize debug registers, 2008-12-15) but never for AMD.
Fix this.

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning

Use BUG() in the impossible-to-hit default case when switching on the
scope of INVEPT to squash a warning with clang 11 due to clang treating
the BUG_ON() as conditional.

  >> arch/x86/kvm/vmx/nested.c:5246:3: warning: variable 'roots_to_free'
     is used uninitialized whenever 'if' condition is false
     [-Wsometimes-uninitialized]
                   BUG_ON(1);

Reported-by: kbuild test robot <[email protected]>
Fixes: ce8fe7b77bd8 ("KVM: nVMX: Free only the affected contexts when emulating INVEPT")
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20200504153506 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()

The prepare_to_wait() and finish_wait() calls in io_uring_cancel_files()
are mismatched. Currently I don't see any issues related this bug, just
find it by learning codes.

Signed-off-by: Xiaoguang Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

sun6i: dsi: fix gcc-4.8

Older compilers warn about initializers with incorrect curly
braces:

drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c: In function 'sun6i_dsi_encoder_enable':
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:720:8: error: missing braces around initializer [-Werror=missing-braces]
union phy_configure_opts opts = { 0 };
^
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:720:8: error: (near initialization for 'opts.mipi_dphy') [-Werror=missing-braces]

Use the GNU empty initializer extension to avoid this.

Fixes: bb3b6fcb6849 ("sun6i: dsi: Convert to generic phy handling")
Reviewed-by: Paul Kocialkowski <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm: ingenic-drm: add MODULE_DEVICE_TABLE

so that the driver can load by matching the device tree
if compiled as module.

Cc: [email protected] # v5.3+
Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: H. Nikolaus Schaller <[email protected]>
Signed-off-by: Paul Cercueil <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/1694a29b7a3449b6b662cec33d1b33f2ee0b174a.1588574111.git.hns@goldelico.com

vt: fix unicode console freeing with a common interface

By directly using kfree() in different places we risk missing one if
it is switched to using vfree(), especially if the corresponding
vmalloc() is hidden away within a common abstraction.

Oh wait, that's exactly what happened here.

So let's fix this by creating a common abstraction for the free case
as well.

Signed-off-by: Nicolas Pitre <[email protected]>
Reported-by: [email protected]
Fixes: 9a98e7a80f95 ("vt: don't use kmalloc() for the unicode screen buffer")
Cc: <[email protected]>
Reviewed-by: Sam Ravnborg <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart"

This reverts commit 580d952e44de5509c69c8f9346180ecaa78ebeec ("tty:
serial: bcm63xx: fix missing clk_put() in bcm63xx_uart") because we
should not be doing a clk_put() if we were not successful in getting a
valid clock reference via clk_get() in the first place.

Fixes: 580d952e44de ("tty: serial: bcm63xx: fix missing clk_put() in bcm63xx_uart")
Signed-off-by: Florian Fainelli <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: quirks: Add HID_QUIRK_NO_INIT_REPORTS quirk for Dell K12A keyboard-dock

Add a HID_QUIRK_NO_INIT_REPORTS quirk for the Dell K12A keyboard-dock,
which can be used with various Dell Venue 11 models.

Without this quirk the keyboard/touchpad combo works fine when connected
at boot, but when hotplugged 9 out of 10 times it will not work properly.
Adding the quirk fixes this.

Cc: Mario Limonciello <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

drm/virtio: create context before RESOURCE_CREATE_2D in 3D mode

If 3D is enabled, but userspace requests a dumb buffer, we will
call CTX_ATTACH_RESOURCE before actually creating the context.

Fixes: 72b48ae800da ("drm/virtio: enqueue virtio_gpu_create_context after the first 3D ioctl")
Signed-off-by: Gurchetan Singh <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

net: macb: fix an issue about leak related system resources

A call of the function macb_init() can fail in the function
fu540_c000_init. The related system resources were not released
then. use devm_platform_ioremap_resource() to replace ioremap()
to fix it.

Fixes: c218ad559020ff9 ("macb: Add support for SiFive FU540-C000")
Cc: Andy Shevchenko <[email protected]>
Reviewed-by: Yash Shah <[email protected]>
Suggested-by: Nicolas Ferre <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: usb: qmi_wwan: add support for DW5816e

Add support for Dell Wireless 5816e to drivers/net/usb/qmi_wwan.c

Signed-off-by: Matt Jolly <[email protected]>
Acked-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: sch_skbprio: add message validation to skbprio_change()

Do not assume the attribute has the right size.

Fixes: aea5f654e6b7 ("net/sched: add skbprio scheduler")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Linux 5.7-rc4

Merge tag 'for-5.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull more btrfs fixes from David Sterba:
"A few more stability fixes, minor build warning fixes and git url
  fixup:

   - fix partial loss of prealloc extent past i_size after fsync

   - fix potential deadlock due to wrong transaction handle passing via
     journal_info

   - fix gcc 4.8 struct intialization warning

   - update git URL in MAINTAINERS entry"

* tag 'for-5.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  MAINTAINERS: btrfs: fix git repo URL
  btrfs: fix gcc-4.8 build warning for struct initializer
  btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info
  btrfs: fix partial loss of prealloc extent past i_size after fsync

Merge tag 'iommu-fixes-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:

- Fix a memory leak when dev_iommu gets freed and a sub-pointer does
   not

- Build dependency fixes for Mediatek, spapr_tce, and Intel IOMMU
   driver

- Export iommu_group_get_for_dev() only for GPLed modules

- Fix AMD IOMMU interrupt remapping when x2apic is enabled

- Fix error path in the QCOM IOMMU driver probe function

* tag 'iommu-fixes-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/qcom: Fix local_base status check
  iommu: Properly export iommu_group_get_for_dev()
  iommu/vt-d: Use right Kconfig option name
  iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system
  iommu: spapr_tce: Disable compile testing to fix build on book3s_32 config
  iommu/mediatek: Fix MTK_IOMMU dependencies
  iommu: Fix the memory leak in dev_iommu_free()

MAINTAINERS: btrfs: fix git repo URL

The git repo listed for btrfs hasn't been updated in over a year.
List the current one instead.

Signed-off-by: Eric Biggers <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES

Fix the following warnings seen with !CONFIG_MODULES:

  arch/x86/kernel/unwind_orc.c:29:26: warning: 'cur_orc_table' defined but not used [-Wunused-variable]
     29 | static struct orc_entry *cur_orc_table = __start_orc_unwind;
        |                          ^~~~~~~~~~~~~
  arch/x86/kernel/unwind_orc.c:28:13: warning: 'cur_orc_ip_table' defined but not used [-Wunused-variable]
     28 | static int *cur_orc_ip_table = __start_orc_unwind_ip;
        |             ^~~~~~~~~~~~~~~~

Fixes: 153eb2223c79 ("x86/unwind/orc: Convert global variables to static")
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linux Next Mailing List <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/20200428071640.psn5m7eh3zt2in4v@treble

Merge tag 'pm-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:

- prevent the intel_pstate driver from printing excessive diagnostic
   messages in some cases (Chris Wilson)

- make the hibernation restore kernel freeze kernel threads as well as
   user space tasks (Dexuan Cui)

- fix the ACPI device PM disagnostic messages to include the correct
   power state name (Kai-Heng Feng).

* tag 'pm-5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: ACPI: Output correct message on target power state
  PM: hibernate: Freeze kernel threads in software_resume()
  cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once

Merge branches 'pm-cpufreq' and 'pm-sleep'

* pm-cpufreq:
cpufreq: intel_pstate: Only mention the BIOS disabling turbo mode once

* pm-sleep:
PM: hibernate: Freeze kernel threads in software_resume()

Merge tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap fix from Darrick Wong:
"Hoist the check for an unrepresentable FIBMAP return value into
  ioctl_fibmap.

  The internal kernel function can handle 64-bit values (and is needed
  to fix a regression on ext4 + jbd2). It is only the userspace ioctl
  that is so old that it cannot deal"

* tag 'iomap-5.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fibmap: Warn and return an error in case of block > INT_MAX

Merge tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:

  Stable fixes:
   - fix handling of backchannel binding in BIND_CONN_TO_SESSION

  Bugfixes:
   - Fix a credential use-after-free issue in pnfs_roc()
   - Fix potential posix_acl refcnt leak in nfs3_set_acl
   - defer slow parts of rpc_free_client() to a workqueue
   - Fix an Oopsable race in __nfs_list_for_each_server()
   - Fix trace point use-after-free race
   - Regression: the RDMA client no longer responds to server disconnect
     requests
   - Fix return values of xdr_stream_encode_item_{present, absent}
   - _pnfs_return_layout() must always wait for layoutreturn completion

  Cleanups:
   - Remove unreachable error conditions"

* tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix a race in __nfs_list_for_each_server()
  NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION
  SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
  NFSv4: Remove unreachable error condition due to rpc_run_task()
  SUNRPC: Remove unreachable error condition
  xprtrdma: Fix use of xdr_stream_encode_item_{present, absent}
  xprtrdma: Fix trace point use-after-free race
  xprtrdma: Restore wake-up-all to rpcrdma_cm_event_handler()
  nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl
  NFS/pnfs: Fix a credential use-after-free issue in pnfs_roc()
  NFS/pnfs: Ensure that _pnfs_return_layout() waits for layoutreturn completion

Merge tag 'dmaengine-fix-5.7-rc4' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
"Core:
   - Documentation typo fixes
   - fix the channel indexes
   - dmatest: fixes for process hang and iterations

  Drivers:
   - hisilicon: build error fix without PCI_MSI
   - ti-k3: deadlock fix
   - uniphier-xdmac: fix for reg region
   - pch: fix data race
   - tegra: fix clock state"

* tag 'dmaengine-fix-5.7-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: dmatest: Fix process hang when reading 'wait' parameter
  dmaengine: dmatest: Fix iteration non-stop logic
  dmaengine: tegra-apb: Ensure that clock is enabled during of DMA synchronization
  dmaengine: fix channel index enumeration
  dmaengine: mmp_tdma: Reset channel error on release
  dmaengine: mmp_tdma: Do not ignore slave config validation errors
  dmaengine: pch_dma.c: Avoid data race between probe and irq handler
  dt-bindings: dma: uniphier-xdmac: switch to single reg region
  include/linux/dmaengine: Typos fixes in API documentation
  dmaengine: xilinx_dma: Add missing check for empty list
  dmaengine: ti: k3-psil: fix deadlock on error path
  dmaengine: hisilicon: Fix build error without PCI_MSI

vhost: vsock: kick send_pkt worker once device is started

Ning Bo reported an abnormal 2-second gap when booting Kata container [1].
The unconditional timeout was caused by VSOCK_DEFAULT_CONNECT_TIMEOUT of
connecting from the client side. The vhost vsock client tries to connect
an initializing virtio vsock server.

The abnormal flow looks like:
host-userspace           vhost vsock                       guest vsock
==============           ===========                       ============
connect()     -------->  vhost_transport_send_pkt_work()   initializing
   |                     vq->private_data==NULL
   |                     will not be queued
   V
schedule_timeout(2s)
                         vhost_vsock_start()  <---------   device ready
                         set vq->private_data

wait for 2s and failed
connect() again          vq->private_data!=NULL         recv connecting pkt

Details:
1. Host userspace sends a connect pkt, at that time, guest vsock is under
   initializing, hence the vhost_vsock_start has not been called. So
   vq->private_data==NULL, and the pkt is not been queued to send to guest
2. Then it sleeps for 2s
3. After guest vsock finishes initializing, vq->private_data is set
4. When host userspace wakes up after 2s, send connecting pkt again,
   everything is fine.

As suggested by Stefano Garzarella, this fixes it by additional kicking the
send_pkt worker in vhost_vsock_start once the virtio device is started. This
makes the pending pkt sent again.

After this patch, kata-runtime (with vsock enabled) boot time is reduced
from 3s to 1s on a ThunderX2 arm64 server.

[1] https://github.com/kata-containers/runtime/issues/1917

Reported-by: Ning Bo <[email protected]>
Suggested-by: Stefano Garzarella <[email protected]>
Signed-off-by: Jia He <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>

virtio-blk: handle block_device_operations callbacks after hot unplug

A userspace process holding a file descriptor to a virtio_blk device can
still invoke block_device_operations after hot unplug.  This leads to a
use-after-free accessing vblk->vdev in virtblk_getgeo() when
ioctl(HDIO_GETGEO) is invoked:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
  IP: [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
  PGD 800000003a92f067 PUD 3a930067 PMD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  task: ffff9be5fbfb8000 ti: ffff9be5fa890000 task.ti: ffff9be5fa890000
  RIP: 0010:[<ffffffffc00e5450>]  [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
  RSP: 0018:ffff9be5fa893dc8  EFLAGS: 00010246
  RAX: ffff9be5fc3f3400 RBX: ffff9be5fa893e30 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9be5fbc10b40
  RBP: ffff9be5fa893dc8 R08: 0000000000000301 R09: 0000000000000301
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff9be5fdc24680
  R13: ffff9be5fbc10b40 R14: ffff9be5fbc10480 R15: 0000000000000000
  FS:  00007f1bfb968740(0000) GS:ffff9be5ffc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000090 CR3: 000000003a894000 CR4: 0000000000360ff0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   [<ffffffffc016ac37>] virtblk_getgeo+0x47/0x110 [virtio_blk]
   [<ffffffff8d3f200d>] ? handle_mm_fault+0x39d/0x9b0
   [<ffffffff8d561265>] blkdev_ioctl+0x1f5/0xa20
   [<ffffffff8d488771>] block_ioctl+0x41/0x50
   [<ffffffff8d45d9e0>] do_vfs_ioctl+0x3a0/0x5a0
   [<ffffffff8d45dc81>] SyS_ioctl+0xa1/0xc0

A related problem is that virtblk_remove() leaks the vd_index_ida index
when something still holds a reference to vblk->disk during hot unplug.
This causes virtio-blk device names to be lost (vda, vdb, etc).

Fix these issues by protecting vblk->vdev with a mutex and reference
counting vblk so the vd_index_ida index can be removed in all cases.

Fixes: 48e4043d4529 ("virtio: add virtio disk geometry feature")
Reported-by: Lance Digby <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>

Merge tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

- copy_*_user validity check for new vfio_dma_rw interface (Yan Zhao)

- Fix a potential math overflow (Yan Zhao)

- Use follow_pfn() for calculating PFNMAPs (Sean Christopherson)

* tag 'vfio-v5.7-rc4' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()
  vfio: avoid possible overflow in vfio_iommu_type1_pin_pages
  vfio: checking of validity of user vaddr in vfio_dma_rw

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
"Add -fasynchronous-unwind-tables to the vDSO CFLAGS"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vdso: Add -fasynchronous-unwind-tables to cflags