Jason Gunthorpe [Tue, 1 Oct 2019 15:38:21 +0000 (12:38 -0300)]
RDMA/mlx5: Add missing synchronize_srcu() for MW cases
While MR uses live as the SRCU 'update', the MW case uses the xarray
directly: xa_erase() causes the MW to become inaccessible to the pagefault
thread.
Thus whenever a MW is removed from the xarray we must synchronize_srcu()
before freeing it.
This must be done before freeing the mkey as re-use of the mkey while the
pagefault thread is using the stale mkey is undesirable.
Add the missing synchronize_srcu() calls to the MW and DEVX indirect mkey
paths and delete the bogus protection against double destroy in
mlx5_core_destroy_mkey().
Jason Gunthorpe [Tue, 1 Oct 2019 15:38:20 +0000 (12:38 -0300)]
RDMA/mlx5: Put live in the correct place for ODP MRs
live is used to signal to the pagefault thread that the MR is initialized
and ready for use. It should be after the umem is assigned and all other
setup is completed. This prevents races (at least) of the form:
    CPU0                                     CPU1
mlx5_ib_alloc_implicit_mr()
 implicit_mr_alloc()
  live = 1
 imr->umem = umem
                                         num_pending_prefetch_inc()
                                           if (live)
                                             atomic_inc(num_pending_prefetch)
 atomic_set(num_pending_prefetch,0) // Overwrites other thread's store
Further, live is being used with SRCU as the 'update' in an
acquire/release fashion, so it can not be read and written raw.
Move all live = 1's to after MR initialization is completed and use
smp_store_release/smp_load_acquire() for manipulating it.
Add a missing live = 0 when an implicit MR child is deleted, before
queuing work to do synchronize_srcu().
The barriers in update_odp_mr() were a broken attempt to create an
acquire/release, but were not even applied consistently and missed the
point; delete them as well.
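The publish/observe pattern described above can be sketched in userspace
with C11 atomics. This is only an illustration under stated assumptions:
struct demo_mr and both helpers are hypothetical names, and the
release/acquire operations stand in for the kernel's smp_store_release()
and smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an ODP MR; not the mlx5 structure. */
struct demo_mr {
	void *umem;		/* must be assigned before live is published */
	atomic_int num_pending;	/* models num_pending_prefetch */
	atomic_int live;	/* 1 once the MR is fully initialized */
};

/* Writer: finish all setup first, then publish with release semantics. */
static void demo_mr_publish(struct demo_mr *mr, void *umem)
{
	assert(umem != NULL);
	mr->umem = umem;
	atomic_store_explicit(&mr->num_pending, 0, memory_order_relaxed);
	/* Userspace analogue of smp_store_release(&mr->live, 1). */
	atomic_store_explicit(&mr->live, 1, memory_order_release);
}

/* Reader: touch the MR only if the acquire load observes live == 1. */
static int demo_mr_prefetch_inc(struct demo_mr *mr)
{
	/* Userspace analogue of smp_load_acquire(&mr->live). */
	if (!atomic_load_explicit(&mr->live, memory_order_acquire))
		return 0;
	atomic_fetch_add_explicit(&mr->num_pending, 1, memory_order_relaxed);
	return 1;
}
```

Because the counter is reset before live is published, a reader that sees
live == 1 can no longer have its increment overwritten by the initializer.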
Jason Gunthorpe [Tue, 1 Oct 2019 15:38:19 +0000 (12:38 -0300)]
RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
During destroy, setting live = 0 and then calling synchronize_srcu()
prevents num_pending_prefetch from incrementing and also ensures that all
work holding that count is queued on the WQ. Testing before causes races of
the form:
Jason Gunthorpe [Tue, 1 Oct 2019 15:38:18 +0000 (12:38 -0300)]
RDMA/odp: Lift umem_mutex out of ib_umem_odp_unmap_dma_pages()
This fixes a race of the form:
    CPU0                                  CPU1
mlx5_ib_invalidate_range()            mlx5_ib_invalidate_range()
 // This one actually makes npages == 0
 ib_umem_odp_unmap_dma_pages()
 if (npages == 0 && !dying)
                                       // This one does nothing
                                       ib_umem_odp_unmap_dma_pages()
                                       if (npages == 0 && !dying)
   dying = 1;
                                         dying = 1;
   schedule_work(&umem_odp->work);
                                         // Double schedule of the same work
                                         schedule_work(&umem_odp->work); // BOOM
npages and dying must be read and written under the umem_mutex lock.
Since whenever ib_umem_odp_unmap_dma_pages() is called mlx5 must also call
mlx5_ib_update_xlt, and both need to be done in the same locking region,
hoist the lock out of unmap.
This avoids an expensive double critical section in
mlx5_ib_invalidate_range().
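A minimal userspace sketch of the hoisted lock, assuming a pthread mutex in
place of umem_mutex and a counter in place of schedule_work(). All demo_*
names are hypothetical; only npages, dying and umem_mutex mirror the text.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an ODP umem. */
struct demo_umem {
	pthread_mutex_t umem_mutex;
	unsigned long npages;
	bool dying;
	int work_scheduled;	/* counts how often the work was queued */
};

/* Caller must hold umem_mutex, as after the lock is hoisted out of unmap. */
static void demo_unmap_pages_locked(struct demo_umem *u)
{
	u->npages = 0;
}

/* The unmap and the npages/dying test share one critical section. */
static void demo_invalidate_range(struct demo_umem *u)
{
	assert(u != NULL);
	pthread_mutex_lock(&u->umem_mutex);
	demo_unmap_pages_locked(u);
	if (u->npages == 0 && !u->dying) {
		u->dying = true;
		u->work_scheduled++;	/* stands in for schedule_work() */
	}
	pthread_mutex_unlock(&u->umem_mutex);
}
```

With the test and the store to dying inside the same locked region, a
second invalidation sees dying already set and does not queue the work
again.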
Jason Gunthorpe [Tue, 1 Oct 2019 15:38:17 +0000 (12:38 -0300)]
RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MR
mlx5_ib_update_xlt() must be protected against parallel free of the MR it
is accessing. It must also be called single-threaded while updating the
HW. Otherwise we can have races of the form:
mlx5_ib_post_send_wait() // Replaces VALID with ZAP
This can be solved by putting both the SRCU and the umem_mutex lock around
every call to mlx5_ib_update_xlt(). This ensures that the content of the
interval tree relevant to mlx5_odp_populate_klm() (i.e. mr->parent == mr)
will not change while it is running, and thus the posted WRs to update the
KLM will always reflect the correct information.
The race above is resolved either by CPU1 waiting until CPU0 completes the
ZAP, or by CPU0 running after the add and storing VALID instead.
The pagefault path adding children already holds the umem_mutex and SRCU,
so the only missed lock is during MR destruction.
Jason Gunthorpe [Tue, 1 Oct 2019 15:38:16 +0000 (12:38 -0300)]
RDMA/mlx5: Do not allow rereg of a ODP MR
This code is completely broken: the umem of an ODP MR simply cannot be
discarded without a lot more locking, nor can an ODP mkey be blithely
destroyed via destroy_mkey().
Mohamad Heib [Wed, 2 Oct 2019 12:21:27 +0000 (15:21 +0300)]
IB/core: Fix wrong iterating on ports
rdma_for_each_port() already increments the iterator it receives.
Therefore, after the first iteration the iterator is increased by 2 on
each pass, which eventually causes wrong queries and possible traces.
Fix the above by removing the old redundant increment that was used
before the rdma_for_each_port() macro was introduced.
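The effect of the leftover increment can be reproduced with a small
sketch. demo_for_each_port() is a hypothetical macro that only imitates the
shape of rdma_for_each_port() (a for loop that advances the iterator
itself); the port range is passed in directly instead of coming from an
ib_device.

```c
#include <assert.h>

/* Hypothetical macro shaped like rdma_for_each_port(): it advances iter. */
#define demo_for_each_port(first, last, iter) \
	for ((iter) = (first); (iter) <= (last); (iter)++)

/* Buggy form: a leftover manual increment skips every second port. */
static unsigned int demo_count_ports_buggy(unsigned int first,
					   unsigned int last)
{
	unsigned int port, visited = 0;

	assert(first <= last);
	demo_for_each_port(first, last, port) {
		visited++;
		port++;	/* redundant: the macro already increments */
	}
	return visited;
}

/* Fixed form: the macro alone drives the iterator. */
static unsigned int demo_count_ports_fixed(unsigned int first,
					   unsigned int last)
{
	unsigned int port, visited = 0;

	assert(first <= last);
	demo_for_each_port(first, last, port)
		visited++;
	return visited;
}
```

For ports 1 through 4 the buggy loop visits only ports 1 and 3, while the
fixed loop visits all four.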
Paul Burton [Fri, 4 Oct 2019 17:41:02 +0000 (17:41 +0000)]
MIPS: fw/arc: Remove unused addr variable
The addr variable in prom_free_prom_memory() has been unused since
commit 0df1007677d5 ("MIPS: fw: Record prom memory"), leading to a
compiler warning:
Linus Torvalds [Fri, 4 Oct 2019 18:17:51 +0000 (11:17 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM and x86 bugfixes of all kinds.
The most visible one is that migrating a nested hypervisor has always
been busted on Broadwell and newer processors, and that has finally
been fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: x86: omit "impossible" pmu MSRs from MSR list
KVM: nVMX: Fix consistency check on injected exception error code
KVM: x86: omit absent pmu MSRs from MSR list
selftests: kvm: Fix libkvm build error
kvm: vmx: Limit guest PMCs to those supported on the host
kvm: x86, powerpc: do not allow clearing largepages debugfs entry
KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure
KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF
selftests: kvm: add test for dirty logging inside nested guests
KVM: x86: fix nested guest live migration with PML
KVM: x86: assign two bits to track SPTE kinds
KVM: x86: Expose XSAVEERPTR to the guest
kvm: x86: Enumerate support for CLZERO instruction
kvm: x86: Use AMD CPUID semantics for AMD vCPUs
kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH
KVM: X86: Fix userspace set invalid CR4
kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func
KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns
KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH
arm64: KVM: Kill hyp_alternate_select()
...
Greg KH [Tue, 1 Oct 2019 16:56:11 +0000 (18:56 +0200)]
RDMA/cxgb4: Do not dma memory off of the stack
Nicolas pointed out that the cxgb4 driver is doing dma off of the stack,
which is generally considered a very bad thing. On some architectures it
could be a security problem, but odds are none of them actually run this
driver, so it's just a "normal" bug.
Resolve this by allocating the memory for a message off of the heap
instead of the stack. kmalloc() always will give us a proper memory
location that DMA will work correctly from.
Linus Torvalds [Fri, 4 Oct 2019 18:13:09 +0000 (11:13 -0700)]
Merge tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes and cleanups from Juergen Gross:
- a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some
cases, plus a follow-on cleanup series for that driver
- a patch for introducing non-blocking EFI callbacks in Xen's EFI
driver, plus a cleanup patch for Xen EFI handling merging the x86 and
ARM arch specific initialization into the Xen EFI driver
- a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning
up after a user process has died
- a fix for Xen on ARM after removal of ZONE_DMA
- a cleanup patch for avoiding build warnings for Xen on ARM
* tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: fix self-deadlock after killing user process
xen/efi: have a common runtime setup function
arm: xen: mm: use __GPF_DMA32 for arm64
xen/balloon: Clear PG_offline in balloon_retrieve()
xen/balloon: Mark pages PG_offline in balloon_append()
xen/balloon: Drop __balloon_append()
xen/balloon: Set pages PageOffline() in balloon_add_region()
ARM: xen: unexport HYPERVISOR_platform_op function
xen/efi: Set nonblocking callbacks
RDMA/core: Fix an error handling path in 'res_get_common_doit()'
According to surrounding error paths, it is likely that 'goto err_get;' is
expected here. Otherwise, a call to 'rdma_restrack_put(res);' would be
missing.
Linus Torvalds [Fri, 4 Oct 2019 17:36:31 +0000 (10:36 -0700)]
Merge tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull copy_struct_from_user() helper from Christian Brauner:
"This contains the copy_struct_from_user() helper which got split out
from the openat2() patchset. It is a generic interface designed to
copy a struct from userspace.
The helper will be especially useful for structs versioned by size of
which we have quite a few. This allows for backwards compatibility,
i.e. an extended struct can be passed to an older kernel, or a legacy
struct can be passed to a newer kernel. For the first case (extended
struct, older kernel) the new fields in an extended struct can be set
to zero and the struct safely passed to an older kernel.
The most obvious benefit is that this helper lets us get rid of
duplicate code present in at least sched_setattr(), perf_event_open(),
and clone3(). More importantly it will also help to ensure that users
implementing versioning-by-size end up with the same core semantics.
This point is especially crucial since we have at least one case where
versioning-by-size is used but with slightly different semantics:
sched_setattr(), perf_event_open(), and clone3() all do similar
checks to copy_struct_from_user() while rt_sigprocmask(2) always
rejects differently-sized struct arguments.
With this pull request we also switch over sched_setattr(),
perf_event_open(), and clone3() to use the new helper"
* tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
usercopy: Add parentheses around assignment in test_copy_struct_from_user
perf_event_open: switch to copy_struct_from_user()
sched_setattr: switch to copy_struct_from_user()
clone3: switch to copy_struct_from_user()
lib: introduce copy_struct_from_user() helper
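The versioning-by-size rules described in the pull message can be modelled
in plain userspace C. This is a sketch, not the kernel helper:
demo_copy_struct() is a hypothetical function that works on ordinary
memory instead of a userspace pointer, and it assumes the rule that a
larger struct is accepted only when its unknown trailing bytes are zero
(rejected with -E2BIG otherwise).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of size-versioned struct copying.
 * ksize: size the receiver knows about; usize: size the caller passed.
 */
static int demo_copy_struct(void *dst, size_t ksize,
			    const void *src, size_t usize)
{
	size_t size = usize < ksize ? usize : ksize;
	const unsigned char *s = src;
	size_t i;

	assert(dst != NULL && src != NULL);

	/* Newer caller, older receiver: unknown trailing fields must be 0. */
	for (i = ksize; i < usize; i++) {
		if (s[i] != 0)
			return -E2BIG;
	}

	/* Older caller, newer receiver: zero the fields it did not send. */
	if (usize < ksize)
		memset((unsigned char *)dst + size, 0, ksize - size);

	memcpy(dst, src, size);
	return 0;
}
```

A legacy struct passed to a newer receiver gets its missing fields
zero-filled, and an extended struct passed to an older receiver is accepted
as long as the extra fields are zero.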
ib_device_get_netdev() does not have a netdev associated
with the ibdev and thus fails.
Use ib_device_set_netdev() to associate netdev to ibdev
in i40iw before IB device registration.
Linus Torvalds [Fri, 4 Oct 2019 17:18:56 +0000 (10:18 -0700)]
Merge tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull clone3/pidfd fixes from Christian Brauner:
"This contains a couple of fixes:
- Fix pidfd selftest compilation (Shuah Khan)
Due to a false linking instruction in the Makefile, compilation for
the pidfd selftests would fail on some systems.
- Fix compilation for glibc on RISC-V systems (Seth Forshee)
In some scenarios linux/uapi/linux/sched.h is included where
__ASSEMBLY__ is defined causing a build failure because struct
clone_args was not guarded by an #ifndef __ASSEMBLY__.
- Add missing clone3() and struct clone_args kernel-doc (Christian Brauner)
clone3() and struct clone_args were missing kernel-docs. (The goal
is to use kernel-doc for any function or type where it's worth it.)
For struct clone_args this also contains a comment about the fact
that it's versioned by size"
* tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
sched: add kernel-doc for struct clone_args
fork: add kernel-doc for clone3
selftests: pidfd: Fix undefined reference to pthread_create()
sched: Add __ASSEMBLY__ guards around struct clone_args
Linus Torvalds [Fri, 4 Oct 2019 17:12:37 +0000 (10:12 -0700)]
Merge tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Been offline for 3 days, got back and had some fixes queued up.
Nothing too major, the i915 dp-mst fix is important, and amdgpu has a
bulk move speedup fix and some regressions, but nothing too insane for
an rc2 pull. The intel fixes are also 2 weeks worth, they missed the
boat last week.
core:
- writeback fixes
i915:
- Fix DP-MST crtc_mask
- Fix dsc bpp calculations
- Fix g4x sprite scaling stride check with GTT remapping
- Fix concurrency in cases where requests were getting retired at the
same time as they were resubmitted to HW
- Fix gen9 display resolutions by setting the right max plane width
- Fix GPU hang on preemption
- Mark contents as dirty on a write fault. This was breaking cursor
sprite with dumb buffers.
komeda:
- memory leak fix
tilcdc:
- include fix
amdgpu:
- Enable bulk moves
- Power metrics fixes for Navi
- Fix S4 regression
- Add query for tcc disabled mask
- Fix several leaks in error paths
- randconfig fixes
- clang fixes"
* tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drm: (21 commits)
Revert "drm/i915: Fix DP-MST crtc_mask"
drm/omap: fix max fclk divider for omap36xx
drm/i915: Fix g4x sprite scaling stride check with GTT remapping
drm/i915/dp: Fix dsc bpp calculations, v5.
drm/amd/display: fix dcn21 Makefile for clang
drm/amd/display: hide an unused variable
drm/amdgpu: display_mode_vba_21: remove uint typedef
drm/amdgpu: hide another #warning
drm/amdgpu: make pmu support optional, again
drm/amd/display: memory leak
drm/amdgpu: fix multiple memory leaks in acp_hw_init
drm/amdgpu: return tcc_disabled_mask to userspace
drm/amdgpu: don't increment vram lost if we are in hibernation
Revert "drm/amdgpu: disable stutter mode for renoir"
drm/amd/powerplay: add sensor lock support for smu
drm/amd/powerplay: change metrics update period from 1ms to 100ms
drm/amdgpu: revert "disable bulk moves for now"
drm/tilcdc: include linux/pinctrl/consumer.h again
drm/komeda: prevent memory leak in komeda_wb_connector_add
drm: Clear the fence pointer when writeback job signaled
...
Linus Torvalds [Fri, 4 Oct 2019 16:56:51 +0000 (09:56 -0700)]
Merge tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Mandate timespec64 for the io_uring timeout ABI (Arnd)
- Set of NVMe changes via Sagi:
- controller removal race fix from Balbir
- quirk additions from Gabriel and Jian-Hong
- nvme-pci power state save fix from Mario
- Add 64bit user commands (for 64bit registers) from Marta
- nvme-rdma/nvme-tcp fixes from Max, Mark and Me
- Minor cleanups and nits from James, Dan and John
- Two s390 dasd fixes (Jan, Stefan)
- Have loop change block size in DIO mode (Martijn)
- paride pg header ifdef guard (Masahiro)
- Two blk-mq queue scheduler tweaks, fixing an ordering issue on zoned
devices and suboptimal performance on others (Ming)
* tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block: (22 commits)
block: sed-opal: fix sparse warning: convert __be64 data
block: sed-opal: fix sparse warning: obsolete array init.
block: pg: add header include guard
Revert "s390/dasd: Add discard support for ESE volumes"
s390/dasd: Fix error handling during online processing
io_uring: use __kernel_timespec in timeout ABI
loop: change queue block size to match when using DIO
blk-mq: apply normal plugging for HDD
blk-mq: honor IO scheduler for multiqueue devices
nvme-rdma: fix possible use-after-free in connect timeout
nvme: Move ctrl sqsize to generic space
nvme: Add ctrl attributes for queue_count and sqsize
nvme: allow 64-bit results in passthru commands
nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T
nvmet-tcp: remove superflous check on request sgl
Added QUIRKs for ADATA XPG SX8200 Pro 512GB
nvme-rdma: Fix max_hw_sectors calculation
nvme: fix an error code in nvme_init_subsystem()
nvme-pci: Save PCI state before putting drive into deepest state
nvme-tcp: fix wrong stop condition in io_work
...
mei: avoid FW version request on Ibex Peak and earlier
The fixed MKHI client on PCH 6 gen platforms
does not support fw version retrieval.
The error is not fatal, but it fills up the kernel logs and
slows down the driver start.
This patch disables requesting FW version on GEN6 and earlier platforms.
Fixes warning:
[ 15.964298] mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: Could not read FW version
[ 15.964301] mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: version command failed -5
Qian Cai [Thu, 3 Oct 2019 21:36:36 +0000 (17:36 -0400)]
s390/mm: fix -Wunused-but-set-variable warnings
Convert two functions to static inline to get rid of W=1 GCC warnings
like,
mm/gup.c: In function 'gup_pte_range':
mm/gup.c:1816:16: warning: variable 'ptem' set but not used
[-Wunused-but-set-variable]
pte_t *ptep, *ptem;
^~~~
mm/mmap.c: In function 'acct_stack_growth':
mm/mmap.c:2322:16: warning: variable 'new_start' set but not used
[-Wunused-but-set-variable]
unsigned long new_start;
^~~~~~~~~
Jiri Kosina [Tue, 1 Oct 2019 20:08:01 +0000 (22:08 +0200)]
s390: mark __cpacf_query() as __always_inline
arch/s390/kvm/kvm-s390.c calls __cpacf_query() directly in several places,
which makes it impossible to meet the "i" constraint for the asm operands
(opcode in this case).
As we are now force-enabling CONFIG_OPTIMIZE_INLINING on all
architectures, this causes a build failure on s390:
In file included from arch/s390/kvm/kvm-s390.c:44:
./arch/s390/include/asm/cpacf.h: In function '__cpacf_query':
./arch/s390/include/asm/cpacf.h:179:2: warning: asm operand 3 probably doesn't match constraints
179 | asm volatile(
| ^~~
./arch/s390/include/asm/cpacf.h:179:2: error: impossible constraint in 'asm'
Mark __cpacf_query() as __always_inline in order to fix that, analogous to
how we already fixed __cpacf_check_opcode(), cpacf_query_func() and
scpacf_query().
Reported-and-tested-by: Michal Kubecek <[email protected]>
Fixes: d83623c5eab2 ("s390: mark __cpacf_check_opcode() and cpacf_query_func() as __always_inline")
Fixes: e60fb8bf68d4 ("s390/cpacf: mark scpacf_query() as __always_inline")
Fixes: ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly")
Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Signed-off-by: Jiri Kosina <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Merge tag 'usb-serial-5.4-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fixes for 5.4-rc2
Here's a fix for a long-standing issue in the keyspan driver which could
lead to NULL-pointer dereferences when a device had unexpected endpoint
descriptors.
Included are also some new device IDs.
All but the last two commits have been in linux-next with no reported
issues.
Signed-off-by: Johan Hovold <[email protected]>
* tag 'usb-serial-5.4-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: keyspan: fix NULL-derefs on open() and write()
USB: serial: option: add support for Cinterion CLS8 devices
USB: serial: option: add Telit FN980 compositions
USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20
Randy Dunlap [Tue, 1 Oct 2019 02:15:12 +0000 (19:15 -0700)]
tty: n_hdlc: fix build on SPARC
Fix tty driver build on SPARC by not using __exitdata.
It appears that SPARC does not support section .exit.data.
Fixes these build errors:
`.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
`.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
`.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
`.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
Michal Simek [Fri, 4 Oct 2019 13:04:11 +0000 (15:04 +0200)]
serial: uartps: Fix uartps_major handling
There are two parts which should be fixed. The first one is to assign
uartps_major at the end of probe() to avoid complicated logic when
something fails.
The second part is to reset uartps_major to 0 when the last device is
removed. This ensures that on the next probe the driver will ask for a new
dynamic major number.
Following an incorrect indentation reported to me by Dan Carpenter, I
noticed that the SysRq lines were inherited from the lpuart driver[1] (note
how the 'continue' is aligned to 'sport->port.sysrq = 0') and we have never
actually tested the SysRq support.
'sport->sysrq = 0' is necessary neither before nor after 'continue',
because sysrq will already be 0 after uart_handle_sysrq_char() finishes.
Also, since the LINFlexD driver never called uart_handle_break(), sysrq
would have never been set to a nonzero value, so uart_handle_sysrq_char()
was not going to do anything.
Break conditions are detected based on a null data byte along with a
framing error (stop bit sampled to 0).
serial: sh-sci: Use platform_get_irq_optional() for optional interrupts
As platform_get_irq() now prints an error when the interrupt does not
exist, scary warnings may be printed for optional interrupts:
sh-sci e6550000.serial: IRQ index 1 not found
sh-sci e6550000.serial: IRQ index 2 not found
sh-sci e6550000.serial: IRQ index 3 not found
sh-sci e6550000.serial: IRQ index 4 not found
sh-sci e6550000.serial: IRQ index 5 not found
Fix this by calling platform_get_irq_optional() instead for all but the
first interrupts, which are optional.
The sifive serial driver implements earlycon support, but unless
another driver is built in that supports earlycon support it won't
be usable. Explicitly select SERIAL_EARLYCON instead.
Johan Hovold [Tue, 1 Oct 2019 08:49:08 +0000 (10:49 +0200)]
media: stkwebcam: fix runtime PM after driver unbind
Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.
Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.
Note that runtime PM has never actually been enabled for this driver
since the support_autosuspend flag in its usb_driver struct is not set.
Johan Hovold [Tue, 1 Oct 2019 08:49:07 +0000 (10:49 +0200)]
USB: serial: fix runtime PM after driver unbind
Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.
Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.
Johan Hovold [Tue, 1 Oct 2019 08:49:06 +0000 (10:49 +0200)]
USB: usblp: fix runtime PM after driver unbind
Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.
Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.
Johan Hovold [Tue, 1 Oct 2019 08:49:05 +0000 (10:49 +0200)]
USB: usb-skeleton: fix runtime PM after driver unbind
Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.
Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.
Maxime Ripard [Wed, 2 Oct 2019 11:26:51 +0000 (13:26 +0200)]
dt-bindings: usb: Bring back phy-names
While the original bindings that were superseded by the YAML schemas
didn't mention that phy-names was needed, it turns out that phy-names is
required if phys is set according to phy/phy-bindings.txt.
Let's add back those properties.
Fixes: 14ec072a19ad ("dt-bindings: usb: Convert USB HCD generic binding to YAML")
Fixes: c93bcace1098 ("dt-bindings: usb: Convert the generic OHCI binding to YAML")
Fixes: c3e2485d5f4f ("dt-bindings: usb: Convert the generic EHCI binding to YAML")
Reported-by: Emmanuel Vadot <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Maxime Ripard [Wed, 2 Oct 2019 11:26:50 +0000 (13:26 +0200)]
ARM: dts: sunxi: Revert phy-names removal for ECHI and OHCI
This reverts commits 3d109bdca981 ("ARM: dts: sunxi: Remove useless
phy-names from EHCI and OHCI"), 0a3df8bb6dad ("ARM: dts: sunxi: h3/h5:
Remove useless phy-names from EHCI and OHCI") and 3c7ab90aaa28 ("arm64:
dts: allwinner: Remove useless phy-names from EHCI and OHCI").
It turns out that while the USB bindings were not mentioning it, the PHY
client bindings were mandating that phy-names is set when phys is. Let's
add it back.
Fixes: 3d109bdca981 ("ARM: dts: sunxi: Remove useless phy-names from EHCI and OHCI")
Fixes: 0a3df8bb6dad ("ARM: dts: sunxi: h3/h5: Remove useless phy-names from EHCI and OHCI")
Fixes: 3c7ab90aaa28 ("arm64: dts: allwinner: Remove useless phy-names from EHCI and OHCI")
Reported-by: Emmanuel Vadot <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
According to usb_ep_set_halt()'s description,
__usbhsg_ep_set_halt_wedge() should return -EAGAIN if the IN endpoint
has any queue or data. Otherwise, this driver can cause just a STALL
without sending short packet data on the g_mass_storage driver, and
then the host side resets the device a few times during USB
enumeration.
usb: renesas_usbhs: gadget: Do not discard queues in usb_ep_set_{halt,wedge}()
Commit 97664a207bc2 ("usb: renesas_usbhs: shrink spin lock area")
accidentally added a usbhsg_pipe_disable() call to
__usbhsg_ep_set_halt_wedge(). However, this driver should not call
usbhsg_pipe_disable() there because the function discards all queues.
So, this patch removes it.
gcc points out a suspicious cast from a pointer to an 'int' when
compile-testing on 64-bit architectures.
drivers/usb/gadget/udc/lpc32xx_udc.c: In function ‘udc_pop_fifo’:
drivers/usb/gadget/udc/lpc32xx_udc.c:1156:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
drivers/usb/gadget/udc/lpc32xx_udc.c: In function ‘udc_stuff_fifo’:
drivers/usb/gadget/udc/lpc32xx_udc.c:1257:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
The code works fine, but it's easy enough to change the cast to
a uintptr_t to shut up that warning.
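A small sketch of the cast in question. It assumes, for illustration only,
that the pointer is converted to an integer in order to inspect its low
address bits; demo_addr_low_bits() is a hypothetical helper, not a function
from lpc32xx_udc.c.

```c
#include <assert.h>
#include <stdint.h>

/* uintptr_t must be able to hold any object pointer on this target. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
	      "uintptr_t must hold a pointer");

/* Return the low two address bits of a buffer pointer. */
static unsigned int demo_addr_low_bits(const void *data)
{
	/*
	 * Casting through uintptr_t keeps the full pointer width on both
	 * 32-bit and 64-bit builds, so -Wpointer-to-int-cast stays quiet.
	 * A cast to a 32-bit integer type would truncate on 64-bit.
	 */
	return (unsigned int)((uintptr_t)data & 0x3);
}
```

Only the low bits are used afterwards, which is why the original cast
happened to work, but the uintptr_t form is correct for any pointer width.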
Mathias Nyman [Fri, 4 Oct 2019 11:59:33 +0000 (14:59 +0300)]
xhci: Fix NULL pointer dereference in xhci_clear_tt_buffer_complete()
udev stored in ep->hcpriv might be NULL if the tt buffer is cleared
due to a halted control endpoint during device enumeration.
xhci_clear_tt_buffer_complete is called by hub_tt_work() once it's
scheduled, and by then usb core might have freed and allocated a
new udev for the next enumeration attempt.
Kai-Heng Feng [Fri, 4 Oct 2019 11:59:32 +0000 (14:59 +0300)]
xhci: Increase STS_SAVE timeout in xhci_suspend()
After commit f7fac17ca925 ("xhci: Convert xhci_handshake() to use
readl_poll_timeout_atomic()"), ASMedia xHCI may fail to suspend.
Although the algorithms are essentially the same, the old max timeout is
(usec + usec * time of doing readl()), and the new max timeout is just
usec, which is much less than the old one.
Increase the timeout to make ASMedia xHCI able to suspend again.
Bill Kuzeja [Fri, 4 Oct 2019 11:59:31 +0000 (14:59 +0300)]
xhci: Prevent deadlock when xhci adapter breaks during init
The system can hit a deadlock if an xhci adapter breaks while initializing.
The deadlock is between two threads: thread 1 is tearing down the
adapter and is stuck in usb_unlocked_disable_lpm waiting to lock the
hcd->bandwidth_mutex. Thread 2 is holding this mutex (while still trying
to add a usb device), but is stuck in xhci_endpoint_reset waiting for a
stop or config command to complete. A reboot is required to resolve.
It turns out when calling xhci_queue_stop_endpoint and
xhci_queue_configure_endpoint in xhci_endpoint_reset, the return code is
not checked for errors. If the timing is right and the adapter dies just
before either of these commands get issued, we hang indefinitely waiting
for a completion on a command that didn't get issued.
This wasn't a problem before the following fix because we didn't send
commands in xhci_endpoint_reset:
commit f5249461b504 ("xhci: Clear the host side toggle manually when
endpoint is soft reset")
With the patch I am submitting, a duration test which breaks adapters
during initialization (and which deadlocks with the standard kernel) runs
without issue.
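The shape of the fix, checking the queue call's return code before waiting,
can be sketched as follows. Every demo_* name is hypothetical and the
"wait" is only a counter; the real code waits on a completion that would
never fire if no command was queued.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host controller model. */
struct demo_hc {
	bool dead;	/* adapter broke during init */
	int queued;	/* commands actually issued */
	int waits;	/* times we blocked waiting for a completion */
};

/* Fails without queuing anything when the controller is dead. */
static int demo_queue_command(struct demo_hc *hc)
{
	if (hc->dead)
		return -ENODEV;
	hc->queued++;
	return 0;
}

/* Would block forever in real code if nothing had been queued. */
static void demo_wait_for_completion(struct demo_hc *hc)
{
	hc->waits++;
}

static int demo_endpoint_reset(struct demo_hc *hc)
{
	int ret;

	assert(hc != NULL);
	ret = demo_queue_command(hc);
	if (ret < 0)
		return ret;	/* nothing was issued, so do not wait */
	demo_wait_for_completion(hc);
	return 0;
}
```

On a dead controller the function returns the error immediately instead of
waiting on a command that was never issued.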
Rick Tseng [Fri, 4 Oct 2019 11:59:30 +0000 (14:59 +0300)]
usb: xhci: wait for CNR controller not ready bit in xhci resume
NVIDIA 3.1 xHCI card would lose power when moving power state into D3Cold.
Thus we need to wait for CNR bit to clear in xhci resume, just as in
xhci init.
Mathias Nyman [Fri, 4 Oct 2019 11:59:29 +0000 (14:59 +0300)]
xhci: Fix USB 3.1 capability detection on early xHCI 1.1 spec based hosts
Early xHCI 1.1 spec did not mention USB 3.1 capable hosts should set
sbrn to 0x31, or that the minor revision is a two digit BCD
containing minor and sub-minor numbers.
This was later clarified in xHCI 1.2.
Some USB 3.1 capable hosts therefore have sbrn set to 0x30, or minor
revision set to 0x1 instead of 0x10.
Detect the USB 3.1 capability correctly for these hosts as well.
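One way to read the rule above is as a simple predicate. This is a
hypothetical simplification for illustration, not the xhci driver's actual
check: demo_is_usb31() treats a host as USB 3.1 capable if sbrn is 0x31, or
if the minor revision is either the BCD form 0x10 or the early-spec form
0x01.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate; parameter names follow the commit text. */
static bool demo_is_usb31(uint8_t sbrn, uint8_t minor_rev)
{
	/* Hosts following the clarified spec report sbrn 0x31. */
	if (sbrn == 0x31)
		return true;

	/*
	 * Early xHCI 1.1 hosts may keep sbrn at 0x30 and report the minor
	 * revision as 0x01 instead of the two-digit BCD value 0x10.
	 */
	return minor_rev == 0x10 || minor_rev == 0x01;
}
```

A host reporting sbrn 0x30 with minor revision 0x00 is still treated as
plain USB 3.0 under this reading.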
Jan Schmidt [Fri, 4 Oct 2019 11:59:28 +0000 (14:59 +0300)]
xhci: Check all endpoints for LPM timeout
If an endpoint is encountered that returns USB3_LPM_DEVICE_INITIATED, keep
checking further endpoints, as there might be periodic endpoints later
that return USB3_LPM_DISABLED due to shorter service intervals.
Without this, the code can set too high a maximum-exit-latency and
prevent the use of multiple USB3 cameras that should be able to work.
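The scan over endpoints can be sketched like this. The enum and function
are hypothetical and reduce each endpoint to one of three outcomes; the
driver works with numeric timeout values, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-endpoint outcomes, named after the commit text. */
enum demo_lpm {
	DEMO_LPM_HOST_TIMEOUT,		/* host-initiated LPM with a timeout */
	DEMO_LPM_DEVICE_INITIATED,	/* models USB3_LPM_DEVICE_INITIATED */
	DEMO_LPM_DISABLED		/* models USB3_LPM_DISABLED */
};

static enum demo_lpm demo_lpm_for_endpoints(const enum demo_lpm *ep,
					    size_t n)
{
	enum demo_lpm result = DEMO_LPM_HOST_TIMEOUT;
	size_t i;

	assert(ep != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Any endpoint that requires it disables LPM outright. */
		if (ep[i] == DEMO_LPM_DISABLED)
			return DEMO_LPM_DISABLED;
		/* Remember device-initiated, but keep checking the rest. */
		if (ep[i] == DEMO_LPM_DEVICE_INITIATED)
			result = DEMO_LPM_DEVICE_INITIATED;
	}
	return result;
}
```

Returning as soon as the first device-initiated endpoint is seen would miss
a later periodic endpoint that needs LPM disabled, which is the bug the
commit describes.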
Mathias Nyman [Fri, 4 Oct 2019 11:59:27 +0000 (14:59 +0300)]
xhci: Prevent device initiated U1/U2 link pm if exit latency is too long
If host/hub initiated link pm is prevented by a driver flag we still must
ensure that periodic endpoints have longer service intervals than link pm
exit latency before allowing device initiated link pm.
Fix this by continuing to walk and check the endpoint service intervals if
xhci_get_timeout_no_hub_lpm() returns anything other than USB3_LPM_DISABLED.
The check printing out the "WARN Wrong bounce buffer write length:"
uses incorrect values when comparing bytes written from scatterlist
to bounce buffer. Actual copied lengths are fine.
The used seg->bounce_len will be set to equal new_buf_len a few lines later
in the code, but is incorrect when doing the comparison.
The patch which added this false warning was backported to 4.8+ kernels
so this should be backported as far as well.
Johan Hovold [Thu, 19 Sep 2019 08:30:39 +0000 (10:30 +0200)]
USB: legousbtower: fix open after failed reset request
The driver would return with a nonzero open count in case the reset
control request failed. This would prevent any further attempts to open
the char dev until the device was disconnected.
Fix this by incrementing the open count only on successful open.
Johan Hovold [Thu, 19 Sep 2019 08:30:38 +0000 (10:30 +0200)]
USB: legousbtower: fix potential NULL-deref on disconnect
The driver is using its struct usb_device pointer as an inverted
disconnected flag, but was setting it to NULL before making sure all
completion handlers had run. This could lead to a NULL-pointer
dereference in a number of dev_dbg and dev_err statements in the
completion handlers which rely on said pointer.
Fix this by unconditionally stopping all I/O and preventing
resubmissions by poisoning the interrupt URBs at disconnect and using a
dedicated disconnected flag.
This also makes sure that all I/O has completed by the time the
disconnect callback returns.
Johan Hovold [Thu, 19 Sep 2019 08:30:37 +0000 (10:30 +0200)]
USB: legousbtower: fix deadlock on disconnect
Fix a potential deadlock if disconnect races with open.
Since commit d4ead16f50f9 ("USB: prevent char device open/deregister
race") core holds an rw-semaphore while open is called and when
releasing the minor number during deregistration. This can lead to an
ABBA deadlock if a driver takes a lock in open which it also holds
during deregistration.
This effectively reverts commit 78663ecc344b ("USB: disconnect open race
in legousbtower") which needlessly introduced this issue after a generic
fix for this race had been added to core by commit d4ead16f50f9 ("USB:
prevent char device open/deregister race").
Paolo Bonzini [Tue, 1 Oct 2019 13:33:07 +0000 (15:33 +0200)]
KVM: x86: omit "impossible" pmu MSRs from MSR list
INTEL_PMC_MAX_GENERIC is currently 32, which exceeds the 18
contiguous MSR indices reserved by Intel for event selectors.
Since some machines actually have MSRs past the reserved range,
filtering them against x86_pmu.num_counters_gp may have false
positives. Cut the list to 18 entries to avoid this.
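The combined effect of the 18-entry limit and the num_counters_gp filter can be
sketched as a small predicate. This is an illustration rather than the KVM
code: pmc_msr_reportable() and PMC_MSR_RANGE are hypothetical names, and the
base is passed in by the caller (for example 0x186 for the first event
selector MSR).

```c
#include <stdbool.h>
#include <stdint.h>

/* Contiguous MSR indices available per the commit text above. */
#define PMC_MSR_RANGE 18u

/*
 * Return true if msr is one of the first min(18, num_counters_gp)
 * MSRs starting at base; anything past that is treated as not a
 * general-purpose counter MSR.
 */
static bool pmc_msr_reportable(uint32_t msr, uint32_t base,
			       uint32_t num_counters_gp)
{
	uint32_t limit = num_counters_gp < PMC_MSR_RANGE ?
			 num_counters_gp : PMC_MSR_RANGE;

	return msr >= base && msr - base < limit;
}
```

With a claimed counter count of 32, index 17 is still accepted but index 18 is
rejected, which is the false positive the commit sets out to avoid.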
Heikki Krogerus [Fri, 4 Oct 2019 10:02:19 +0000 (13:02 +0300)]
usb: typec: ucsi: displayport: Fix for the mode entering routine
Make sure that the ucsi_displayport_enter() function does not
return an error if the displayport alternate mode has
already been entered. It's normal that the firmware (or
controller) has already entered the alternate mode by the
time the operating system is notified about the device.
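The intended behaviour can be sketched as an idempotent enter routine. All
names here (struct connector_sketch, dp_enter_sketch(), NO_ALTMODE) are
hypothetical, and the -EBUSY result for a different active mode is an
assumption, not something stated in the commit:

```c
#include <errno.h>

#define NO_ALTMODE (-1)  /* hypothetical marker: no alternate mode active */

/* Hypothetical connector state; not the UCSI driver structure. */
struct connector_sketch {
	int current_altmode;
};

static int dp_enter_sketch(struct connector_sketch *con, int dp_altmode)
{
	/* Firmware may already have entered the mode: not an error. */
	if (con->current_altmode == dp_altmode)
		return 0;

	/* Assumption: a different active mode blocks the request. */
	if (con->current_altmode != NO_ALTMODE)
		return -EBUSY;

	con->current_altmode = dp_altmode;
	return 0;
}
```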
Heikki Krogerus [Fri, 4 Oct 2019 10:02:18 +0000 (13:02 +0300)]
usb: typec: ucsi: ccg: Remove run_isr flag
The "run_isr" flag is used for preventing the driver from
calling the interrupt service routine in its runtime resume
callback when the driver is expecting completion to a
command, but what that basically does is that it hides the
real problem. The real problem is that the controller is
allowed to suspend in the middle of command execution.
As a more appropriate fix for the problem, use an autosuspend
delay that matches UCSI_TIMEOUT_MS (5s). That prevents
the controller from suspending while still in the middle of
executing a command.
This fixes a potential deadlock. Both ccg_read() and
ccg_write() are called with the mutex already taken at least
from ccg_send_command(). In ccg_read() and ccg_write(), the
mutex is only acquired so that run_isr flag can be set.
James Morse [Wed, 2 Oct 2019 09:49:35 +0000 (10:49 +0100)]
arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
CPUs affected by Neoverse-N1 #1542419 may execute a stale instruction if
it was recently modified. The affected sequence requires freshly written
instructions to be executable before a branch to them is updated.
There are very few places in the kernel that modify executable text,
all but one come with sufficient synchronisation:
* The module loader's flush_module_icache() calls flush_icache_range(),
which does a kick_all_cpus_sync()
* bpf_int_jit_compile() calls flush_icache_range().
* Kprobes calls aarch64_insn_patch_text(), which does its work in
stop_machine().
* static keys and ftrace both patch between nops and branches to
existing kernel code (not generated code).
The affected sequence is the interaction between ftrace and modules.
The module PLT is cleaned using __flush_icache_range() as the trampoline
shouldn't be executable until we update the branch to it.
Drop the double-underscore so that this path runs kick_all_cpus_sync()
too.
James Morse [Thu, 3 Oct 2019 17:01:27 +0000 (18:01 +0100)]
arm64: Fix incorrect irqflag restore for priority masking for compat
Commit bd82d4bd2188 ("arm64: Fix incorrect irqflag restore for priority
masking") added a macro to the entry.S call paths that leave the
PSTATE.I bit set. This tells the pPNMI masking logic that interrupts
are masked by the CPU, not by the PMR. This value is read back by
local_daif_save().
Commit bd82d4bd2188 added this call to el0_svc, as el0_svc_handler
is called with interrupts masked. el0_svc_compat was missed, but should
be covered in the same way as both of these paths end up in
el0_svc_common(), which expects to unmask interrupts.
Fixes: bd82d4bd2188 ("arm64: Fix incorrect irqflag restore for priority masking")
Signed-off-by: James Morse
Cc: Julien Thierry
Signed-off-by: Will Deacon
Mark Rutland [Thu, 3 Oct 2019 09:49:32 +0000 (10:49 +0100)]
arm64: mm: avoid virt_to_phys(init_mm.pgd)
If we take an unhandled fault in the kernel, we call show_pte() to dump
the {PGDP,PGD,PUD,PMD,PTE} values for the corresponding page table walk,
where the PGDP value is virt_to_phys(mm->pgd).
The boot-time and runtime kernel page tables, init_pg_dir and
swapper_pg_dir respectively, are kernel symbols. Thus, it is not valid
to call virt_to_phys() on either of these, though we'll do so if we take
a fault on a TTBR1 address.
When CONFIG_DEBUG_VIRTUAL is not selected, virt_to_phys() will silently
fix this up. However, when CONFIG_DEBUG_VIRTUAL is selected, this
results in splats as below. Depending on when these occur, they can
happen to suppress information needed to debug the original unhandled
fault, such as the backtrace:
| Unable to handle kernel paging request at virtual address ffff7fffec73cf0f
| Mem abort info:
| ESR = 0x96000004
| EC = 0x25: DABT (current EL), IL = 32 bits
| SET = 0, FnV = 0
| EA = 0, S1PTW = 0
| Data abort info:
| ISV = 0, ISS = 0x00000004
| CM = 0, WnR = 0
| ------------[ cut here ]------------
| virt_to_phys used for non-linear address: 00000000102c9dbe (swapper_pg_dir+0x0/0x1000)
| WARNING: CPU: 1 PID: 7558 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0xe0/0x170 arch/arm64/mm/physaddr.c:12
| Kernel panic - not syncing: panic_on_warn set ...
| SMP: stopping secondary CPUs
| Dumping ftrace buffer:
| (ftrace buffer empty)
| Kernel Offset: disabled
| CPU features: 0x0002,23000438
| Memory Limit: none
| Rebooting in 1 seconds..
We can avoid this by ensuring that we call __pa_symbol() for
init_mm.pgd, as this will always be a kernel symbol. As the dumped
{PGD,PUD,PMD,PTE} values are the raw values from the relevant entries we
don't need to handle these specially.
Julien Grall [Thu, 3 Oct 2019 11:12:08 +0000 (12:12 +0100)]
arm64: cpufeature: Effectively expose FRINT capability to userspace
The HWCAP framework will detect a new capability based on the sanitized
version of the ID registers.
Sanitization is based on a whitelist, so any field not described will end
up being zeroed.
At the moment, ID_AA64ISAR1_EL1.FRINTTS is not described in
ftr_id_aa64isar1. This means the field will be zeroed and therefore the
userspace will not be able to see the HWCAP even if the hardware
supports the feature.
This can be fixed by describing the field in ftr_id_aa64isar1.
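A toy model of whitelist-based sanitization shows why an undescribed field
disappears. This is a sketch under stated assumptions: struct ftr_field and
sanitize_reg() are hypothetical, the real sanitization also reconciles values
across CPUs, and the bit position used for FRINTTS in the usage note (shift 32,
width 4) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one register field. */
struct ftr_field {
	unsigned int shift;
	unsigned int width;
};

/* Keep only the described fields of raw; everything else reads as zero. */
static uint64_t sanitize_reg(uint64_t raw, const struct ftr_field *fields,
			     size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t field_mask = fields[i].width >= 64 ?
				      ~0ULL : (1ULL << fields[i].width) - 1;

		mask |= field_mask << fields[i].shift;
	}
	return raw & mask;
}
```

Passing a table that lacks the { 32, 4 } entry drops a non-zero value in those
bits, while adding the entry lets it through unchanged.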
Will Deacon [Tue, 1 Oct 2019 10:43:13 +0000 (11:43 +0100)]
arm64: Mark functions using explicit register variables as '__always_inline'
As of ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"),
inline functions are no longer annotated with '__always_inline', which
allows the compiler to decide whether inlining is really a good idea or
not. Although this is a great idea on paper, the reality is that AArch64
GCC prior to 9.1 has been shown to get confused when creating an
out-of-line copy of a function passing explicit 'register' variables
into an inline assembly block:
It's not clear whether this is specific to arm64 or not but, for now,
ensure that all of our functions using 'register' variables are marked
as '__always_inline' so that the old behaviour is effectively preserved.
Hopefully other architectures are luckier with their compilers.
Johan Hovold [Thu, 26 Sep 2019 09:12:27 +0000 (11:12 +0200)]
USB: usblcd: drop redundant lcd mutex
Drop the redundant lcd mutex introduced by commit 925ce689bb31 ("USB:
autoconvert trivial BKL users to private mutex") which replaced an
earlier BKL use.
The lock serialised calls to open() against other open() and a custom
ioctl() returning the bcdDevice (sic!), but neither is needed.
Johan Hovold [Thu, 26 Sep 2019 09:12:26 +0000 (11:12 +0200)]
USB: usblcd: drop redundant disconnect mutex
Drop the redundant disconnect mutex which was introduced after the
open-disconnect race had been addressed generally in USB core by commit
d4ead16f50f9 ("USB: prevent char device open/deregister race").
Specifically, the rw-semaphore in core guarantees that all calls to
open() will have completed and that no new calls to open() will occur
after usb_deregister_dev() returns. Hence there is no need to use the
driver data as an inverted disconnected flag.
Johan Hovold [Thu, 26 Sep 2019 09:12:25 +0000 (11:12 +0200)]
USB: usblcd: fix I/O after disconnect
Make sure to stop all I/O on disconnect by adding a disconnected flag
which is used to prevent new I/O from being started and by stopping all
ongoing I/O before returning.
This also fixes a potential use-after-free on driver unbind in case the
driver data is freed before the completion handler has run.
Dan Carpenter [Tue, 1 Oct 2019 12:01:17 +0000 (15:01 +0300)]
usb: typec: tcpm: Fix a signedness bug in tcpm_fw_get_caps()
The "port->typec_caps.data" and "port->typec_caps.type" variables are
enums and in this context GCC will treat them as an unsigned int so they
can never be less than zero.
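The usual way around this is to keep the return value in a signed int until it
has been checked. The sketch below uses hypothetical names (enum
port_type_sketch, lookup_port_type(), get_caps_sketch()) and is not the tcpm
code:

```c
#include <errno.h>

enum port_type_sketch { PORT_SRC, PORT_SNK, PORT_DRP };

struct caps_sketch {
	enum port_type_sketch type;
};

/* Hypothetical lookup: returns an enum value or a negative errno. */
static int lookup_port_type(int raw)
{
	return (raw >= 0 && raw <= PORT_DRP) ? raw : -EINVAL;
}

static int get_caps_sketch(struct caps_sketch *caps, int raw)
{
	int ret;

	/*
	 * Check the signed result first. Assigning straight into
	 * caps->type and then testing "caps->type < 0" can never be
	 * true when the compiler picks an unsigned type for the enum.
	 */
	ret = lookup_port_type(raw);
	if (ret < 0)
		return ret;

	caps->type = ret;
	return 0;
}
```

On failure the enum member is left untouched, so the caller never sees an
error code disguised as a large positive enum value.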
USB: dummy-hcd: fix power budget for SuperSpeed mode
The power budget for SuperSpeed mode should be 900 mA
according to USB specification, so set the power budget
to 900mA for dummy_start_ss which is only used for
SuperSpeed mode.
If the max power consumption of SuperSpeed device is
larger than 500 mA, insufficient available bus power
error happens in usb_choose_configuration function
when the device connects to dummy hcd.
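The arithmetic behind the failure can be sketched as below. The macro and
function names are hypothetical; only the 500 mA and 900 mA figures come from
the text above.

```c
#include <stdbool.h>

#define BUDGET_NON_SS_MA 500u  /* budget mentioned for the failing case */
#define BUDGET_SS_MA     900u  /* SuperSpeed budget per the commit text */

static unsigned int power_budget_ma(bool superspeed)
{
	return superspeed ? BUDGET_SS_MA : BUDGET_NON_SS_MA;
}

/* A configuration is usable only if its max power fits the budget. */
static bool config_fits_budget(unsigned int max_power_ma, bool superspeed)
{
	return max_power_ma <= power_budget_ma(superspeed);
}
```

A SuperSpeed configuration drawing 800 mA is rejected against a 500 mA budget
but accepted once the budget is 900 mA.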
Mao Wenan [Mon, 16 Sep 2019 15:09:21 +0000 (23:09 +0800)]
usbip: vhci_hcd indicate failed message
If vhci_init_attr_group or sysfs_create_group returns a non-zero
value, it failed to initialise the attribute group or to create
the sysfs group, so add a 'failed' message to indicate that.
This patch also changes pr_err to dev_err to show which
device failed.
Alan Stern [Tue, 17 Sep 2019 16:47:23 +0000 (12:47 -0400)]
USB: yurex: Don't retry on unexpected errors
According to Greg KH, it has been generally agreed that when a USB
driver encounters an unknown error (or one it can't handle directly),
it should just give up instead of going into a potentially infinite
retry loop.
The three codes -EPROTO, -EILSEQ, and -ETIME fall into this category.
They can be caused by bus errors such as packet loss or corruption,
attempting to communicate with a disconnected device, or by malicious
firmware. Nowadays the extent of packet loss or corruption is
negligible, so it should be safe for a driver to give up whenever one
of these errors occurs.
Although the yurex driver handles -EILSEQ errors in this way, it
doesn't do the same for -EPROTO (as discovered by the syzbot fuzzer)
or other unrecognized errors. This patch adjusts the driver so that
it doesn't log an error message for -EPROTO or -ETIME, and it doesn't
retry after any errors.
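The resulting policy can be sketched as a status classifier. This is a
simplified illustration, not the yurex completion handler: enum urb_action and
classify_urb_status() are hypothetical, and real drivers distinguish more
status codes than shown here.

```c
#include <errno.h>

enum urb_action {
	URB_RESUBMIT,    /* transfer succeeded, keep polling */
	URB_STOP_QUIET,  /* give up without logging */
	URB_STOP_LOG     /* give up and log the status */
};

static enum urb_action classify_urb_status(int status)
{
	switch (status) {
	case 0:
		return URB_RESUBMIT;
	case -EPROTO:
	case -EILSEQ:
	case -ETIME:
		/* Bus errors or a disconnected device: stop silently. */
		return URB_STOP_QUIET;
	default:
		/* Unrecognized error: log it, but never retry. */
		return URB_STOP_LOG;
	}
}
```

The key property is that only a successful completion leads to a resubmission,
so no error path can loop indefinitely.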
Johan Hovold [Wed, 25 Sep 2019 09:29:13 +0000 (11:29 +0200)]
USB: adutux: fix NULL-derefs on disconnect
The driver was using its struct usb_device pointer as an inverted
disconnected flag, but was setting it to NULL before making sure all
completion handlers had run. This could lead to a NULL-pointer
dereference in a number of dev_dbg statements in the completion handlers
which rely on said pointer.
The pointer was also dereferenced unconditionally in a dev_dbg statement
in release(), something which would lead to a NULL-deref whenever a device
was disconnected before the final character-device close if debugging
was enabled.
Fix this by unconditionally stopping all I/O and preventing
resubmissions by poisoning the interrupt URBs at disconnect and using a
dedicated disconnected flag.
This also makes sure that all I/O has completed by the time the
disconnect callback returns.
Johan Hovold [Wed, 25 Sep 2019 09:29:12 +0000 (11:29 +0200)]
USB: adutux: fix use-after-free on disconnect
The driver was clearing its struct usb_device pointer, which it used as
an inverted disconnected flag, before deregistering the character device
and without serialising against racing release().
This could lead to a use-after-free if a racing release() callback
observes the cleared pointer and frees the driver data before
disconnect() is finished with it.
This could also lead to NULL-pointer dereferences in a racing open().
Johan Hovold [Thu, 3 Oct 2019 13:49:58 +0000 (15:49 +0200)]
USB: serial: keyspan: fix NULL-derefs on open() and write()
Fix NULL-pointer dereferences on open() and write() which can be
triggered by a malicious USB device.
The current URB allocation helper would fail to initialise the newly
allocated URB if the device has unexpected endpoint descriptors,
something which could lead to NULL-pointer dereferences in a number of
open() and write() paths when accessing the URB. For example:
The Rio500 kernel driver has not been used by Rio500 owners since 2001
not long after the rio500 project added support for a user-space USB stack
through the very first versions of usbdevfs and then libusb.
Support for the kernel driver was removed from the upstream utilities
in 2008:
https://gitlab.freedesktop.org/hadess/rio500/commit/943f624ab721eb8281c287650fcc9e2026f6f5db
Jia-Ye Li [Wed, 25 Sep 2019 08:37:29 +0000 (16:37 +0800)]
staging: exfat: Use kvzalloc() instead of kzalloc() for exfat_sb_info
Fix mount failure with "Cannot allocate memory".
When the memory gets fragmented, kzalloc() might fail to allocate
physically contiguous pages for the struct exfat_sb_info (its size is
about 34KiB) even if the total free memory is sufficient.
Use kvzalloc() to solve this problem.
Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc
In fbtft_framebuffer_alloc the error handling path should take care of
releasing frame buffer after it is allocated via framebuffer_alloc, too.
Therefore, in two failure cases the goto destination is changed to
address this issue.
Okash Khawaja [Tue, 1 Oct 2019 21:47:29 +0000 (22:47 +0100)]
staging: speakup: document sysfs attributes
Speakup exposes a set of sysfs attributes under
/sys/accessibility/speakup/ for user-space to interact with and
configure speakup's kernel modules. This patch describes those
attributes. Some attributes either lack a description or contain
an incomplete description. They are marked with TODO.
Paul Burton [Thu, 3 Oct 2019 22:46:36 +0000 (22:46 +0000)]
MIPS: pmcs-msp71xx: Remove unused addr variable
The addr variable in prom_free_prom_memory() has been unused since
commit b3c948e2c00f ("MIPS: msp: Record prom memory"), causing a warning
& build failure due to -Werror. Remove the unused variable.
Commit b3c948e2c00f ("MIPS: msp: Record prom memory") introduced use of
a MAX_PROM_MEM value but didn't define it. A bounds check in
prom_meminit() suggests its value was supposed to be 5, so define it as
such & adjust the bounds check to use the macro rather than a magic
number.
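Replacing the magic number with the macro might look like the sketch below.
Only MAX_PROM_MEM and its value of 5 come from the text above; struct
prom_mem_table and prom_mem_record() are hypothetical and do not mirror the
MIPS platform code.

```c
#include <stddef.h>

#define MAX_PROM_MEM 5

/* Hypothetical table of recorded prom memory regions. */
struct prom_mem_table {
	size_t count;
	unsigned long base[MAX_PROM_MEM];
	unsigned long size[MAX_PROM_MEM];
};

/* Record one region; return -1 once the table is full. */
static int prom_mem_record(struct prom_mem_table *t, unsigned long base,
			   unsigned long size)
{
	if (t->count >= MAX_PROM_MEM)  /* bounds check uses the macro */
		return -1;

	t->base[t->count] = base;
	t->size[t->count] = size;
	t->count++;
	return 0;
}
```

Sizing the arrays and the bounds check from the same macro keeps the two from
drifting apart if the limit ever changes.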