pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors
IF the server rejected our layout return with a state error such as
NFS4ERR_BAD_STATEID, or even a stale inode error, then we do want
to clear out all the remaining layout segments and mark that stateid
as invalid.
NFS: Refactor nfs_instantiate() for dentry referencing callers
Since commit b0c6108ecf64 ("nfs_instantiate(): prevent multiple aliases for
directory inode"), nfs_instantiate() may succeed without actually
instantiating the dentry that was passed in. That can be problematic for
some callers in NFSv3, so this patch breaks things up so we can get the
actual dentry obtained.
Chuck Lever [Fri, 13 Sep 2019 20:01:07 +0000 (16:01 -0400)]
SUNRPC: Fix congestion window race with disconnect
If the congestion window closes just as the transport disconnects,
a reconnect is never driven because:
1. The XPRT_CONG_WAIT flag prevents tasks from taking the write lock
2. There's no wake-up of the first task on the xprt->sending queue
To address this, clear the congestion wait flag as part of
completing a disconnect.
Fixes: 75891f502f5f ("SUNRPC: Support for congestion control ... ") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
SUNRPC: Fix buffer handling of GSS MIC without slack
The GSS Message Integrity Check data for krb5i may lie partially in the XDR
reply buffer's pages and tail. If so, we try to copy the entire MIC into
free space in the tail. But as the estimations of the slack space required
for authentication and verification have improved there may be less free
space in the tail to complete this copy -- see commit 2c94b8eca1a2
("SUNRPC: Use au_rslack when computing reply buffer size"). In fact, there
may only be room in the tail for a single copy of the MIC, and not part of
the MIC and then another complete copy.
The real world failure reported is that `ls` of a directory on NFS may
sometimes return -EIO, which can be traced back to xdr_buf_read_netobj()
failing to find available free space in the tail to copy the MIC.
Fix this by checking for the case of the MIC crossing the boundaries of
head, pages, and tail. If so, shift the buffer until the MIC is contained
completely within the pages or tail. This allows the remainder of the
function to create a sub buffer that directly address the complete MIC.
SUNRPC: Don't receive TCP data into a request buffer that has been reset
If we've removed the request from the receive list, and have added
it back after resetting the request receive buffer, then we should
only receive message data if it is a new reply (i.e. if
transport->recv.copied is zero).
Fixes: 277e4ab7d530b ("SUNRPC: Simplify TCP receive code by switching...") Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
SUNRPC: Dequeue the request from the receive queue while we're re-encoding
Ensure that we dequeue the request from the transport receive queue
while we're re-encoding to prevent issues like use-after-free when
we release the bvec.
Fixes: 7536908982047 ("SUNRPC: Ensure the bvecs are reset when we re-encode...") Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] # v4.20+ Signed-off-by: Anna Schumaker <[email protected]>
Chuck Lever [Mon, 26 Aug 2019 17:12:57 +0000 (13:12 -0400)]
xprtrdma: Send Queue size grows after a reconnect
Eli Dorfman reports that after a series of idle disconnects, an
RPC/RDMA transport becomes unusable (rdma_create_qp returns
-ENOMEM). Problem was tracked down to increasing Send Queue size
after each reconnect.
The rdma_create_qp() API does not promise to leave its @qp_init_attr
parameter unaltered. In fact, some drivers do modify one or more of
its fields. Thus our calls to rdma_create_qp must use a fresh copy
of ib_qp_init_attr each time.
This fix is appropriate for kernels dating back to late 2007, though
it will have to be adapted, as the connect code has changed over the
years.
Chuck Lever [Mon, 26 Aug 2019 17:12:51 +0000 (13:12 -0400)]
xprtrdma: Clear xprt->reestablish_timeout on close
Ensure that the re-establishment delay does not grow exponentially
on each good reconnect. This probably should have been part of
commit 675dd90ad093 ("xprtrdma: Modernize ops->connect").
Chuck Lever [Mon, 26 Aug 2019 17:12:46 +0000 (13:12 -0400)]
xprtrdma: Recycle MRs after disconnect
The optimization done in "xprtrdma: Simplify rpcrdma_mr_pop" was a
bit too optimistic. MRs left over after a reconnect still need to
be recycled, not added back to the free list, since they could be
in flight or actually fully registered.
Anna Schumaker [Wed, 14 Aug 2019 19:46:48 +0000 (15:46 -0400)]
NFS: Have nfs41_proc_reclaim_complete() call nfs4_call_sync_custom()
An async call followed by an rpc_wait_for_completion() is basically the
same as a synchronous call, so we can use nfs4_call_sync_custom() to
keep our custom callback ops and the RPC_TASK_NO_ROUND_ROBIN flag.
Wenwen Wang [Wed, 21 Aug 2019 03:21:21 +0000 (22:21 -0500)]
NFSv4: Fix a memory leak bug
In nfs4_try_migration(), if nfs4_begin_drain_session() fails, the
previously allocated 'page' and 'locations' are not deallocated, leading to
memory leaks. To fix this issue, go to the 'out' label to free 'page' and
'locations' before returning the error.
Chuck Lever [Mon, 19 Aug 2019 22:51:49 +0000 (18:51 -0400)]
xprtrdma: Optimize rpcrdma_post_recvs()
Micro-optimization: In rpcrdma_post_recvs, since commit e340c2d6ef2a
("xprtrdma: Reduce the doorbell rate (Receive)"), the common case is
to return without doing anything. Found with perf.
Chuck Lever [Mon, 19 Aug 2019 22:50:16 +0000 (18:50 -0400)]
xprtrdma: Fix bc_max_slots return value
For the moment the returned value just happens to be correct because
the current backchannel server implementation does not vary the
number of credits it offers. The spec does permit this value to
change during the lifetime of a connection, however.
The actual maximum is fixed for all RPC/RDMA transports, because
each transport instance has to pre-allocate the resources for
processing BC requests. That's the value that should be returned.
Fixes: 7402a4fedc2b ("SUNRPC: Fix up backchannel slot table ... ") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Chuck Lever [Mon, 19 Aug 2019 22:48:43 +0000 (18:48 -0400)]
xprtrdma: Use an llist to manage free rpcrdma_reps
rpcrdma_rep objects are removed from their free list by only a
single thread: the Receive completion handler. Thus that free list
can be converted to an llist, where a single-threaded consumer and
a multi-threaded producer (rpcrdma_buffer_put) can both access the
llist without the need for any serialization.
This eliminates spin lock contention between the Receive completion
handler and rpcrdma_buffer_get, and makes the rep consumer wait-
free.
Chuck Lever [Mon, 19 Aug 2019 22:47:10 +0000 (18:47 -0400)]
xprtrdma: Cache free MRs in each rpcrdma_req
Instead of a globally-contended MR free list, cache MRs in each
rpcrdma_req as they are released. This means acquiring and releasing
an MR will be lock-free in the common case, even outside the
transport send lock.
The original idea of per-rpcrdma_req MR free lists was suggested by
Shirley Ma <[email protected]> several years ago. I just now
figured out how to make that idea work with on-demand MR allocation.
Chuck Lever [Mon, 19 Aug 2019 22:45:37 +0000 (18:45 -0400)]
xprtrdma: Move rpcrdma_mr_get out of frwr_map
Refactor: Retrieve an MR and handle error recovery entirely in
rpc_rdma.c, as this is not a device-specific function.
Note that since commit 89f90fe1ad8b ("SUNRPC: Allow calls to
xprt_transmit() to drain the entire transmit queue"), the
xprt_transmit function handles the cond_resched. The transport no
longer has to do this itself.
Chuck Lever [Mon, 19 Aug 2019 22:44:50 +0000 (18:44 -0400)]
xprtrdma: Combine rpcrdma_mr_put and rpcrdma_mr_unmap_and_put
Clean up. There is only one remaining rpcrdma_mr_put call site, and
it can be directly replaced with unmap_and_put because mr->mr_dir is
set to DMA_NONE just before the call.
Now all the call sites do a DMA unmap, and we can just rename
mr_unmap_and_put to mr_put, which nicely matches mr_get.
Chuck Lever [Mon, 19 Aug 2019 22:43:17 +0000 (18:43 -0400)]
xprtrdma: Toggle XPRT_CONGESTED in xprtrdma's slot methods
Commit 48be539dd44a ("xprtrdma: Introduce ->alloc_slot call-out for
xprtrdma") added a separate alloc_slot and free_slot to the RPC/RDMA
transport. Later, commit 75891f502f5f ("SUNRPC: Support for
congestion control when queuing is enabled") modified the generic
alloc/free_slot methods, but neglected the methods in xprtrdma.
Found via code review.
Fixes: 75891f502f5f ("SUNRPC: Support for congestion control ... ") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Chuck Lever [Mon, 19 Aug 2019 22:41:44 +0000 (18:41 -0400)]
xprtrdma: Rename CQE field in Receive trace points
Make the field name the same for all trace points that handle
pointers to struct rpcrdma_rep. That makes it easy to grep for
matching rep points in trace output.
Chuck Lever [Mon, 19 Aug 2019 22:40:11 +0000 (18:40 -0400)]
xprtrdma: Boost maximum transport header size
Although I haven't seen any performance results that justify it,
I've received several complaints that NFS/RDMA no longer supports
a maximum rsize and wsize of 1MB. These days it is somewhat smaller.
To simplify the logic that determines whether a chunk list is
necessary, the implementation uses a fixed maximum size of the
transport header. Currently that maximum size is 256 bytes, one
quarter of the default inline threshold size for RPC/RDMA v1.
Since commit a78868497c2e ("xprtrdma: Reduce max_frwr_depth"), the
size of chunks is also smaller to take advantage of inline page
lists in device internal MR data structures.
The combination of these two design choices has reduced the maximum
NFS rsize and wsize that can be used for most RNIC/HCAs. Increasing
the maximum transport header size and the maximum number of RDMA
segments it can contain increases the negotiated maximum rsize/wsize
on common RNIC/HCAs.
Chuck Lever [Mon, 19 Aug 2019 22:39:25 +0000 (18:39 -0400)]
xprtrdma: Fix calculation of ri_max_segs again
Commit 302d3deb206 ("xprtrdma: Prevent inline overflow") added this
calculation back in 2016, but got it wrong. I tested only the lower
bound, which is why there is a max_t there. The upper bound should be
rounded up too.
Now, when using DIV_ROUND_UP, that takes care of the lower bound as
well.
Chuck Lever [Mon, 19 Aug 2019 22:37:52 +0000 (18:37 -0400)]
xprtrdma: Refresh the documenting comment in frwr_ops.c
Things have changed since this comment was written. In particular,
the reworking of connection closing, on-demand creation of MRs, and
the removal of fr_state all mean that deferring MR recovery to
frwr_map is no longer needed. The description is obsolete.
Chuck Lever [Mon, 19 Aug 2019 22:37:05 +0000 (18:37 -0400)]
SUNRPC: Inline xdr_commit_encode
Micro-optimization: For xdr_commit_encode call sites in
net/sunrpc/xdr.c, eliminate the extra calling sequence. On my
client, this change saves about a microsecond for every 30 calls
to xdr_reserve_space().
Chuck Lever [Mon, 19 Aug 2019 22:36:19 +0000 (18:36 -0400)]
SUNRPC: Remove rpc_wake_up_queued_task_on_wq()
Clean up: commit c544577daddb ("SUNRPC: Clean up transport write
space handling") appears to have removed the last caller of
rpc_wake_up_queued_task_on_wq().
Jia-Ju Bai [Fri, 26 Jul 2019 07:48:53 +0000 (15:48 +0800)]
fs: nfs: Fix possible null-pointer dereferences in encode_attrs()
In encode_attrs(), there is an if statement on line 1145 to check
whether label is NULL:
if (label && (attrmask[2] & FATTR4_WORD2_SECURITY_LABEL))
When label is NULL, it is used on lines 1178-1181:
*p++ = cpu_to_be32(label->lfs);
*p++ = cpu_to_be32(label->pi);
*p++ = cpu_to_be32(label->len);
p = xdr_encode_opaque_fixed(p, label->label, label->len);
To fix these bugs, label is checked before being used.
These bugs are found by a static analysis tool STCheck written by us.
Linus Torvalds [Sun, 18 Aug 2019 16:51:48 +0000 (09:51 -0700)]
Merge tag 'for-5.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Two fixes that popped up during testing:
- fix for sysfs-related code that adds/removes block groups, warnings
appear during several fstests in connection with sysfs updates in
5.3, the fix essentially replaces a workaround with scope NOFS and
applies to 5.2-based branch too
- add sanity check of trim range"
* tag 'for-5.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: trim: Check the range passed into to prevent overflow
Btrfs: fix sysfs warning and missing raid sysfs directories
Linus Torvalds [Sun, 18 Aug 2019 16:45:42 +0000 (09:45 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Fix the inconsistent error handling in the umwait init code
- Rework the boot param zeroing so gcc9 stops complaining about out
of bound memset. The resulting source code is actually more sane to
read than the smart solution we had
- Maintainers update so Tony gets involved when Intel models are
added
- Some more fallthrough fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Save fields explicitly, zero out everything else
MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h
x86/fpu/math-emu: Address fallthrough warnings
x86/apic/32: Fix yet another implicit fallthrough warning
x86/umwait: Fix error handling in umwait_init()
Linus Torvalds [Sun, 18 Aug 2019 16:36:51 +0000 (09:36 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
"A single fix for a EFI mixed mode regression caused by recent rework
which did not take the firmware bitwidth into account"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi-stub: Fix get_efi_config_table on mixed-mode setups
Linus Torvalds [Sun, 18 Aug 2019 16:26:16 +0000 (09:26 -0700)]
Merge tag 'spdx-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX fixes from Greg KH:
"Here are four small SPDX fixes for 5.3-rc5.
A few style fixes for some SPDX comments, added an SPDX tag for one
file, and fix up some GPL boilerplate for another file.
All of these have been in linux-next for a few weeks with no reported
issues (they are comment changes only, so that's to be expected...)"
* tag 'spdx-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
i2c: stm32: Use the correct style for SPDX License Identifier
intel_th: Use the correct style for SPDX License Identifier
coccinelle: api/atomic_as_refcounter: add SPDX License Identifier
kernel/configs: Replace GPL boilerplate code with SPDX identifier
Linus Torvalds [Sun, 18 Aug 2019 16:17:41 +0000 (09:17 -0700)]
Merge tag 'char-misc-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes for 5.3-rc5.
These are two different subsystems needing some fixes, the habanalabs
driver which is has some more big endian fixes for problems found. The
other are some small soundwire fixes, including some Kconfig
dependencies needed to resolve reported build errors.
All of these have been in linux-next this week with no reported
issues"
* tag 'char-misc-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: xilinx-sdfec: fix dependency and build error
habanalabs: fix device IRQ unmasking for BE host
habanalabs: fix endianness handling for internal QMAN submission
habanalabs: fix completion queue handling when host is BE
habanalabs: fix endianness handling for packets from user
habanalabs: fix DRAM usage accounting on context tear down
habanalabs: Avoid double free in error flow
soundwire: fix regmap dependencies and align with other serial links
soundwire: cadence_master: fix definitions for INTSTAT0/1
soundwire: cadence_master: fix register definition for SLAVE_STATE
Linus Torvalds [Sun, 18 Aug 2019 16:14:56 +0000 (09:14 -0700)]
Merge tag 'staging-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are four small staging and iio driver fixes for 5.3-rc5
Two are for the dt3000 comedi driver for some reported problems found
in that codebase, and two are some small iio fixes.
All of these have been in linux-next this week with no reported
issues"
* tag 'staging-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: dt3000: Fix rounding up of timer divisor
staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
iio: adc: max9611: Fix temperature reading in probe
iio: frequency: adf4371: Fix output frequency setting
Linus Torvalds [Sun, 18 Aug 2019 16:11:29 +0000 (09:11 -0700)]
Merge tag 'usb-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are number of small USB fixes for 5.3-rc5.
Syzbot has been on a tear recently now that it has some good USB
debugging hooks integrated, so there's a number of fixes in here found
by those tools for some _very_ old bugs. Also a handful of gadget
driver fixes for reported issues, some hopefully-final dma fixes for
host controller drivers, and some new USB serial gadget driver ids.
All of these have been in linux-next this week with no reported issues
(the usb-serial ones were in linux-next in its own branch, but merged
into mine on Friday)"
* tag 'usb-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: add a hcd_uses_dma helper
usb: don't create dma pools for HCDs with a localmem_pool
usb: chipidea: imx: fix EPROBE_DEFER support during driver probe
usb: host: fotg2: restart hcd after port reset
USB: CDC: fix sanity checks in CDC union parser
usb: cdc-acm: make sure a refcount is taken early enough
USB: serial: option: add the BroadMobi BM818 card
USB: serial: option: Add Motorola modem UARTs
USB: core: Fix races in character device registration and deregistraion
usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt
usb: gadget: composite: Clear "suspended" on reset/disconnect
usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role"
USB: serial: option: add D-Link DWM-222 device ID
USB: serial: option: Add support for ZTE MF871A
Linus Torvalds [Sun, 18 Aug 2019 02:39:54 +0000 (19:39 -0700)]
Merge tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes that should go into this series. This contains:
- Revert of the REQ_NOWAIT_INLINE and associated dio changes. There
were still corner cases there, and even though I had a solution for
it, it's too involved for this stage. (me)
- Set of NVMe fixes (via Sagi)
- io_uring fix for fixed buffers (Anthony)
- io_uring defer issue fix (Jackie)
- Regression fix for queue sync at exit time (zhengbin)
- xen blk-back memory leak fix (Wenwen)"
* tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block:
io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list
block: remove REQ_NOWAIT_INLINE
io_uring: fix manual setup of iov_iter for fixed buffers
xen/blkback: fix memory leaks
blk-mq: move cancel of requeue_work to the front of blk_exit_queue
nvme-pci: Fix async probe remove race
nvme: fix controller removal race with scan work
nvme-rdma: fix possible use-after-free in connect error flow
nvme: fix a possible deadlock when passthru commands sent to a multipath device
nvme-core: Fix extra device_put() call on error path
nvmet-file: fix nvmet_file_flush() always returning an error
nvmet-loop: Flush nvme_delete_wq when removing the port
nvmet: Fix use-after-free bug when a port is removed
nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
Linus Torvalds [Sun, 18 Aug 2019 02:31:30 +0000 (19:31 -0700)]
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fixes from Sasha Levin:
- A few fixes for the userspace hyper-v tools from Adrian Vladu.
- A fix for the hyper-v MAINTAINERs entry from Lan Tianyu.
- Fix for SPDX license identifier in the userspace tools from Nishad
Kamdar.
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
MAINTAINERS: Fix Hyperv vIOMMU driver file name
tools: hv: Use the correct style for SPDX License Identifier
tools: hv: fix typos in toolchain
tools: hv: fix KVP and VSS daemons exit code
tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
tools: hv: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in the trace header file related to Microsoft Hyper-V
client drivers.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used)
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46
Adrian Vladu [Mon, 6 May 2019 17:27:37 +0000 (17:27 +0000)]
tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
Fixed pep8/flake8 python style code for lsvmbus tool.
The TAB indentation was on purpose ignored (pep8 rule W191) to make
sure the code is complying with the Linux code guideline.
The following command doe not show any warnings now:
pep8 --ignore=W191 lsvmbus
flake8 --ignore=W191 lsvmbus
Linus Torvalds [Sat, 17 Aug 2019 17:44:50 +0000 (10:44 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has one revert because of a regression, two fixes for tiny race
windows (which we were not able to trigger), a MAINTAINERS addition,
and a SPDX fix"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: stm32: Use the correct style for SPDX License Identifier
i2c: emev2: avoid race when unregistering slave client
i2c: rcar: avoid race when unregistering slave client
MAINTAINERS: i2c-imx: take over maintainership
Revert "i2c: imx: improve the error handling in i2c_imx_dma_request()"
Linus Torvalds [Sat, 17 Aug 2019 17:36:47 +0000 (10:36 -0700)]
Merge tag 'riscv/for-v5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Two patches to fix significant bugs in floating point register
context handling
- A minor fix in RISC-V flush_tlb_page(), to supply a valid end address
to flush_tlb_range()
- Two minor defconfig additions: to build the virtio hwrng driver by
default (for QEMU targets), and to partially synchronize the 32-bit
defconfig with the 64-bit defconfig
* tag 'riscv/for-v5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Make __fstate_clean() work correctly.
riscv: Correct the initialized flow of FP register
riscv: defconfig: Update the defconfig
riscv: rv32_defconfig: Update the defconfig
riscv: fix flush_tlb_range() end address for flush_tlb_page()
Linus Torvalds [Fri, 16 Aug 2019 17:51:47 +0000 (10:51 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Don't taint the kernel if CPUs have different sets of page sizes
supported (other than the one in use).
- Issue I-cache maintenance for module ftrace trampoline.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ftrace: Ensure module ftrace trampoline is coherent with I-side
arm64: cpufeature: Don't treat granule sizes as strict
Will Deacon [Fri, 16 Aug 2019 13:57:43 +0000 (14:57 +0100)]
arm64: ftrace: Ensure module ftrace trampoline is coherent with I-side
The initial support for dynamic ftrace trampolines in modules made use
of an indirect branch which loaded its target from the beginning of
a special section (e71a4e1bebaf7 ("arm64: ftrace: add support for far
branches to dynamic ftrace")). Since no instructions were being patched,
no cache maintenance was needed. However, later in be0f272bfc83 ("arm64:
ftrace: emit ftrace-mod.o contents through code") this code was reworked
to output the trampoline instructions directly into the PLT entry but,
unfortunately, the necessary cache maintenance was overlooked.
Add a call to __flush_icache_range() after writing the new trampoline
instructions but before patching in the branch to the trampoline.
Linus Torvalds [Fri, 16 Aug 2019 16:13:16 +0000 (09:13 -0700)]
Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These add a check to avoid recent suspend-to-idle power regression on
systems with NVMe drives where the PCIe ASPM policy is "performance"
(or when the kernel is built without ASPM support), fix an issue
related to frequency limits in the schedutil cpufreq governor and fix
a mistake related to the PM QoS usage in the cpufreq core introduced
recently.
Specifics:
- Disable NVMe power optimization related to suspend-to-idle added
recently on systems where PCIe ASPM is not able to put PCIe links
into low-power states to prevent excess power from being drawn by
the system while suspended (Rafael Wysocki).
- Make the schedutil governor handle frequency limits changes
properly in all cases (Viresh Kumar).
- Prevent the cpufreq core from treating positive values returned by
dev_pm_qos_update_request() as errors (Viresh Kumar)"
* tag 'pm-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled
PCI/ASPM: Add pcie_aspm_enabled()
cpufreq: schedutil: Don't skip freq update when limits change
cpufreq: dev_pm_qos_update_request() can return 1 on success
Linus Torvalds [Fri, 16 Aug 2019 15:49:45 +0000 (08:49 -0700)]
Merge tag 'sound-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All small fixes targeted for stable:
- Two fixes for USB-audio with malformed descriptor, spotted by
fuzzers
- Two fixes Conexant HD-audio codec wrt power management
- Quirks for HD-audio AMD platform and HP laptop
- HD-audio memory leak fix"
* tag 'sound-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term
ALSA: usb-audio: Fix an OOB bug in parse_audio_mixer_unit
ALSA: hda - Add a generic reboot_notify
ALSA: hda - Let all conexant codec enter D3 when rebooting
ALSA: hda/realtek - Add quirk for HP Envy x360
ALSA: hda - Fix a memory leak bug
ALSA: hda - Apply workaround for another AMD chip 1022:1487
Linus Torvalds [Fri, 16 Aug 2019 15:41:15 +0000 (08:41 -0700)]
Merge tag 'drm-fixes-2019-08-16' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Nothing too crazy this week, one amdgpu fix to use vmalloc for a
struct that grew in size, and another MST fix for nouveau, and some
other misc fixes:
* tag 'drm-fixes-2019-08-16' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau: Only recalculate PBN/VCPI on mode/connector changes
drm/ast: Fixed reboot test may cause system hanged
drm/scheduler: use job count instead of peek
drm/amd/display: use kvmalloc for dc_state (v2)
drm/amdgpu: fix gfx9 soft recovery
drm/i915: Use after free in error path in intel_vgpu_create_workload()
John Hubbard [Wed, 31 Jul 2019 05:46:27 +0000 (22:46 -0700)]
x86/boot: Save fields explicitly, zero out everything else
Recent gcc compilers (gcc 9.1) generate warnings about an out of bounds
memset, if the memset goes accross several fields of a struct. This
generated a couple of warnings on x86_64 builds in sanitize_boot_params().
Fix this by explicitly saving the fields in struct boot_params
that are intended to be preserved, and zeroing all the rest.
[ tglx: Tagged for stable as it breaks the warning free build there as well ]
Merge tag 'soundwire-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-linus
Vinod writes:
soundwire fixes for v5.3-rc5
Pierre sent fixes which are queued now for v5.3-rc5 are:
- regmap dependecy
- cadence register definitions
* tag 'soundwire-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: fix regmap dependencies and align with other serial links
soundwire: cadence_master: fix definitions for INTSTAT0/1
soundwire: cadence_master: fix register definition for SLAVE_STATE
Hui Peng [Thu, 15 Aug 2019 04:31:34 +0000 (00:31 -0400)]
ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term
`check_input_term` recursively calls itself with input from
device side (e.g., uac_input_terminal_descriptor.bCSourceID)
as argument (id). In `check_input_term`, if `check_input_term`
is called with the same `id` argument as the caller, it triggers
endless recursive call, resulting kernel space stack overflow.
This patch fixes the bug by adding a bitmap to `struct mixer_build`
to keep track of the checked ids and stop the execution if some id
has been checked (similar to how parse_audio_unit handles unitid
argument).
Linus Torvalds [Thu, 15 Aug 2019 19:29:36 +0000 (12:29 -0700)]
Merge tag 'xfs-5.3-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
- Fix crashes when the attr fork isn't present due to errors but inode
inactivation tries to zap the attr data anyway.
- Convert more directory corruption debugging asserts to actual
EFSCORRUPTED returns instead of blowing up later on.
- Don't fail writeback just because we ran out of memory allocating
metadata log data.
* tag 'xfs-5.3-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't crash on null attr fork xfs_bmapi_read
xfs: remove more ondisk directory corruption asserts
fs: xfs: xfs_log: Don't use KM_MAYFAIL at xfs_log_reserve().
Jackie Liu [Wed, 14 Aug 2019 09:35:22 +0000 (17:35 +0800)]
io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list
This patch may fix two issues:
First, when IOSQE_IO_DRAIN set, the next IOs need to be inserted into
defer list to delay execution, but link io will be actively scheduled to
run by calling io_queue_sqe.
Second, when multiple LINK_IOs are inserted together with defer_list,
the LINK_IO is no longer keep order.
|-------------|
| LINK_IO | ----> insert to defer_list -----------
|-------------| |
| LINK_IO | ----> insert to defer_list ----------|
|-------------| |
| LINK_IO | ----> insert to defer_list ----------|
|-------------| |
| NORMAL_IO | ----> insert to defer_list ----------|
|-------------| |
|
queue_work at same time <-----|
Fixes: 9e645e1105c ("io_uring: add support for sqe links") Signed-off-by: Jackie Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
Jens Axboe [Thu, 15 Aug 2019 17:09:16 +0000 (11:09 -0600)]
block: remove REQ_NOWAIT_INLINE
We had a few issues with this code, and there's still a problem around
how we deal with error handling for chained/split bios. For now, just
revert the code and we'll try again with a thoroug solution. This
reverts commits:
e15c2ffa1091 ("block: fix O_DIRECT error handling for bio fragments") 0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments") 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO") 893a1c97205a ("blk-mq: allow REQ_NOWAIT to return an error inline")
io_uring: fix manual setup of iov_iter for fixed buffers
Commit bd11b3a391e3 ("io_uring: don't use iov_iter_advance() for fixed
buffers") introduced an optimization to avoid using the slow
iov_iter_advance by manually populating the iov_iter iterator in some
cases.
However, the computation of the iterator count field was erroneous: The
first bvec was always accounted for an extent of page size even if the
bvec length was smaller.
In consequence, some I/O operations on fixed buffers were unable to
operate on the full extent of the buffer, consistently skipping some
bytes at the end of it.
Linus Torvalds [Thu, 15 Aug 2019 16:20:17 +0000 (09:20 -0700)]
Merge tag 'auxdisplay-for-linus-v5.3-rc5' of git://github.com/ojeda/linux
Pull auxdisplay fixes from Miguel Ojeda:
"A few minor auxdisplay improvements:
- A couple of small header cleanups for charlcd (Masahiro Yamada)
- A trivial typo fix for the examples of cfag12864b (Masahiro Yamada)
- An Kconfig help text improvement for charlcd (Mans Rullgard)
- An error path fix for panel (zhengbin)"
* tag 'auxdisplay-for-linus-v5.3-rc5' of git://github.com/ojeda/linux:
auxdisplay: Fix a typo in cfag12864b-example.c
auxdisplay: charlcd: add include guard to charlcd.h
auxdisplay: charlcd: move charlcd.h to drivers/auxdisplay
auxdisplay: charlcd: add help text for backlight initial state
auxdisplay: panel: need to delete scan_timer when misc_register fails in panel_attach
Linus Torvalds [Thu, 15 Aug 2019 16:18:56 +0000 (09:18 -0700)]
Merge tag 'devicetree-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix building DT binding examples for in tree builds
- Correct some refcounting in adjust_local_phandle_references()
- Update FSL FEC binding with deprecated properties
- Schema fix in stm32 pinctrl
- Fix typo in of_irq_parse_one docbook comment
* tag 'devicetree-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: irq: fix a trivial typo in a doc comment
dt-bindings: pinctrl: stm32: Fix 'st,syscfg' schema
dt-bindings: fec: explicitly mark deprecated properties
of: resolver: Add of_node_put() before return and break
dt-bindings: Fix generated example files getting added to schemas
Randy Dunlap [Tue, 13 Aug 2019 23:01:20 +0000 (16:01 -0700)]
misc: xilinx-sdfec: fix dependency and build error
lib/devres.c, which implements devm_ioremap_resource(), is only built
when CONFIG_HAS_IOMEM is set/enabled, so XILINX_SDFEC should depend
on HAS_IOMEM. Fixes this build error (as seen on UML builds):
The USB buffer allocation code is the only place in the usb core (and in
fact the whole kernel) that uses is_device_dma_capable, while the URB
mapping code uses the uses_dma flag in struct usb_bus. Switch the buffer
allocation to use the uses_dma flag used by the rest of the USB code,
and create a helper in hcd.h that checks this flag as well as the
CONFIG_HAS_DMA to simplify the caller a bit.
André Draszik [Sat, 10 Aug 2019 15:07:58 +0000 (16:07 +0100)]
usb: chipidea: imx: fix EPROBE_DEFER support during driver probe
If driver probe needs to be deferred, e.g. because ci_hdrc_add_device()
isn't ready yet, this driver currently misbehaves badly:
a) success is still reported to the driver core (meaning a 2nd
probe attempt will never be done), leaving the driver in
a dysfunctional state and the hardware unusable
c) the error path in combination with driver removal causes
imbalanced calls to the clk_*() and pm_()* APIs
a) happens because the original intended return value is
overwritten (with 0) by the return code of
regulator_disable() in ci_hdrc_imx_probe()'s error path
b) happens because ci_pdev is -EPROBE_DEFER, which causes
ci_hdrc_remove_device() to OOPS
Fix a) by being more careful in ci_hdrc_imx_probe()'s error
path and not overwriting the real error code
Fix b) by calling the respective cleanup functions during
remove only when needed (when ci_pdev != NULL, i.e. when
everything was initialised correctly). This also has the
side effect of not causing imbalanced clk_*() and pm_*()
API calls as part of the error code path.
Tony Lindgren [Thu, 15 Aug 2019 08:26:02 +0000 (01:26 -0700)]
USB: serial: option: Add Motorola modem UARTs
On Motorola Mapphone devices such as Droid 4 there are five USB ports
that do not use the same layout as Gobi 1K/2K/etc devices listed in
qcserial.c. So we should use qcaux.c or option.c as noted by
Dan Williams <[email protected]>.
As the Motorola USB serial ports have an interrupt endpoint as shown
with lsusb -v, we should use option.c instead of qcaux.c as pointed out
by Johan Hovold <[email protected]>.
The ff/ff/ff interfaces seem to always be UARTs on Motorola devices.
For the other interfaces, class 0x0a (CDC Data) should not in general
be added as they are typically part of a multi-interface function as
noted earlier by Bjørn Mork <[email protected]>.
However, looking at the Motorola mapphone kernel code, the mdm6600 0x0a
class is only used for flashing the modem firmware, and there are no
other interfaces. So I've added that too with more details below as it
works just fine.
The ttyUSB ports on Droid 4 are:
ttyUSB0 DIAG, CQDM-capable
ttyUSB1 MUX or NMEA, no response
ttyUSB2 MUX or NMEA, no response
ttyUSB3 TCMD
ttyUSB4 AT-capable
The ttyUSB0 is detected as QCDM capable by ModemManager. I think
it's only used for debugging with ModemManager --debug for sending
custom AT commands though. ModemManager already can manage data
connection using the USB QMI ports that are already handled by the
qmi_wwan.c driver.
To enable the MUX or NMEA ports, it seems that something needs to be
done additionally to enable them, maybe via the DIAG or TCMD port.
It might be just a NVRAM setting somewhere, but I have no idea what
NVRAM settings may need changing for that.
The TCMD port seems to be a Motorola custom protocol for testing
the modem and to configure it's NVRAM and seems to work just fine
based on a quick test with a minimal tcmdrw tool I wrote.
The voice modem AT-capable port seems to provide only partial
support, and no PM support compared to the TS 27.010 based UART
wired directly to the modem.
The UARTs added with this change are the same product IDs as the
Motorola Mapphone Android Linux kernel mdm6600_id_table. I don't
have any mdm9600 based devices, so I have only tested these on
mdm6600 based droid 4.
Then for the class 0x0a (CDC Data) mode, the Motorola Mapphone Android
Linux kernel driver moto_flashqsc.c just seems to change the
port->bulk_out_size to 8K from the default. And is only used for
flashing the modem firmware it seems.
I've verified that flashing the modem with signed firmware works just
fine with the option driver after manually toggling the GPIO pins, so
I've added droid 4 modem flashing mode to the option driver. I've not
added the other devices listed in moto_flashqsc.c in case they really
need different port->bulk_out_size. Those can be added as they get
tested to work for flashing the modem.
After this patch the output of /sys/kernel/debug/usb/devices has
the following for normal 22b8:2a70 mode including the related qmi_wwan
interfaces:
Tony Luck [Wed, 14 Aug 2019 23:40:30 +0000 (16:40 -0700)]
MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h
There are a few different subsystems in the kernel that depend on model
specific behaviour (perf, EDAC, power, ...). Easier for just one person
to have the task to get new model numbers included instead of having
these groups trip over each other to do it.
[ bp: s/Cpu/CPU/ and add [email protected] so that it gets CCed too as
FYI. ]
Lyude Paul [Fri, 9 Aug 2019 00:53:05 +0000 (20:53 -0400)]
drm/nouveau: Only recalculate PBN/VCPI on mode/connector changes
I -thought- I had fixed this entirely, but it looks like that I didn't
test this thoroughly enough as we apparently still make one big mistake
with nv50_msto_atomic_check() - we don't handle the following scenario:
* CRTC #1 has n VCPI allocated to it, is attached to connector DP-4
which is attached to encoder #1. enabled=y active=n
* CRTC #1 is changed from DP-4 to DP-5, causing:
* DP-4 crtc=#1→NULL (VCPI n→0)
* DP-5 crtc=NULL→#1
* CRTC #1 steals encoder #1 back from DP-4 and gives it to DP-5
* CRTC #1 maintains the same mode as before, just with a different
connector
* mode_changed=n connectors_changed=y
(we _SHOULD_ do VCPI 0→n here, but don't)
Once the above scenario is repeated once, we'll attempt freeing VCPI
from the connector that we didn't allocate due to the connectors
changing, but the mode staying the same. Sigh.
Since nv50_msto_atomic_check() has broken a few times now, let's rethink
things a bit to be more careful: limit both VCPI/PBN allocations to
mode_changed || connectors_changed, since neither VCPI or PBN should
ever need to change outside of routing and mode changes.
Changes since v1:
* Fix accidental reversal of clock and bpp arguments in
drm_dp_calc_pbn_mode() - William Lewis