x86/spectre_v2: Don't check microcode versions when running under hypervisors
As:
1) It's known that hypervisors lie about the environment anyhow (host
mismatch)
2) Even if the hypervisor (Xen, KVM, VMWare, etc) provided a valid
"correct" value, it all gets to be very murky when migration happens
(do you provide the "new" microcode of the machine?).
And in reality the cloud vendors are the ones that should make sure that
the microcode that is running is correct and we should just sing lalalala
and trust them.
Takashi Iwai [Mon, 5 Mar 2018 21:00:55 +0000 (22:00 +0100)]
ALSA: seq: Don't allow resizing pool in use
This is a fix for a (sort of) fallout in the recent commit d15d662e89fc ("ALSA: seq: Fix racy pool initializations") for
CVE-2018-1000004.
As the pool resize deletes the existing cells, it may lead to a race
when another thread is writing concurrently, eventually resulting a
UAF.
A simple workaround is not to allow the pool resizing when the pool is
in use. It's an invalid behavior in anyway.
Andy Lutomirski [Wed, 7 Mar 2018 19:12:27 +0000 (11:12 -0800)]
x86/vsyscall/64: Drop "native" vsyscalls
Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2
on, Linux had three vsyscall modes: "native", "emulate", and "none".
"emulate" is the default. All known user programs work correctly in
emulate mode, but vsyscalls turn into page faults and are emulated.
This is very slow. In "native" mode, the vsyscall page is easily
usable as an exploit gadget, but vsyscalls are a bit faster -- they
turn into normal syscalls. (This is in contrast to vDSO functions,
which can be much faster than syscalls.) In "none" mode, there are
no vsyscalls.
For all practical purposes, "native" was really just a chicken bit
in case something went wrong with the emulation. It's been over six
years, and nothing has gone wrong. Delete it.
Dmitry Torokhov [Tue, 6 Mar 2018 18:59:15 +0000 (10:59 -0800)]
Revert "platform/chrome: chromeos_laptop: make chromeos_laptop const"
This reverts commit a376cd91606365609d8fbd57247618bd51da1fc6 because
chromeos_laptop instances should not be marked as "const" (at this
time), since i2c_peripheral is being modified (we change "state" and
"tries") when we instantiate devices.
Linus Torvalds [Thu, 8 Mar 2018 01:37:32 +0000 (17:37 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- we are reverting patch that was switched touchpad on Lenovo T460P
over to native RMI because on these boxes BIOS messes up with SMBus
controller state. We might re-enable it later once SMBus issue is
resolved
- disabling interrupts in matrix_keypad driver was racy
- mms114 now has SPDX header and matching MODULE_LICENSE
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Revert "Input: synaptics - Lenovo Thinkpad T460p devices should use RMI"
Input: matrix_keypad - fix race when disabling interrupts
Input: mms114 - add SPDX identifier
Input: mms114 - fix license module information
1. On T460p with BIOS version 2.22 touchpad and trackpoint stop working
after suspend-resume cycle. Due to strange state of the device another
suspend is impossible.
The following dmesg errors can be observed:
thinkpad_acpi: EC reports that Thermal Table has changed
rmi4_smbus 7-002c: failed to get SMBus version number!
rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed to read current IRQ mask.
rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -16.
rmi4_f01 rmi4-00.fn01: Resume failed with code -16.
rmi4_physical rmi4-00: Failed to suspend functions: -16
rmi4_smbus 7-002c: Failed to resume device: -16
PM: resume devices took 0.640 seconds
rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-16).
rmi4_physical rmi4-00: rmi_driver_clear_irq_bits: Failed to change enabled interrupts!
rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to change enabled interrupts!
psmouse: probe of serio3 failed with error -1
2. On another T460p with BIOS version 2.15 two finger scrolling gesture
on the touchpad stops working after suspend-resume cycle (about 75%
reproducibility, when it still works, the scrolling gesture becomes
laggy). Nothing suspicious appears in the dmesg.
Analysis form Richard Schütz:
"RMI is unreliable on the ThinkPad T460p because the device is affected
by the firmware behavior addressed in a7ae81952cda ("i2c: i801: Allow
ACPI SystemIO OpRegion to conflict with PCI BAR")."
James Zhu [Tue, 6 Mar 2018 19:52:35 +0000 (14:52 -0500)]
drm/amdgpu:Always save uvd vcpu_bo in VM Mode
When UVD is in VM mode, there is not uvd handle exchanged,
uvd.handles are always 0. So vcpu_bo always need save,
Otherwise amdgpu driver will fail during suspend/resume.
HW Engineer's Notes:
During switch from vga->extended, if we set the VGA_TEST_ENABLE and then
hit the VGA_TEST_RENDER_START, then the DCHUBP timing gets updated correctly.
Then vBIOS will have it poll for the VGA_TEST_RENDER_DONE and unset
VGA_TEST_ENABLE, to leave it in the same state as before.
Leo (Sunpeng) Li [Tue, 20 Feb 2018 20:46:09 +0000 (15:46 -0500)]
drm/amd/display: Fix memleaks when atomic check fails.
While checking plane states for updates during atomic check, we create
dc_plane_states in preparation. These dc states should be freed if
something errors.
Although the input transfer function is also freed by
dc_plane_state_release(), we should free it (on error) under the same
scope as where it is created.
Eric Yang [Fri, 19 Jan 2018 00:07:54 +0000 (19:07 -0500)]
drm/amd/display: fix cursor related Pstate hang
Move cursor programming to inside the OTG_MASTER_UPDATE_LOCK
If graphics plane go from 1 pipe to hsplit, the cursor updates
after mpc programming and unlock. Which means there is a window
of time where cursor is enabled on the wrong pipe if it's on
the right side of the screen (i.e. case where cursor need to
move from pipe 0 to pipe 3 post split). This will cause pstate hang.
Solution is to program the cursor while still locked.
Mikita Lipski [Thu, 18 Jan 2018 19:53:57 +0000 (14:53 -0500)]
drm/amd/display: Set irq state only on existing crtcs
Because AMDGPU_CRTC_IRQ_VLINE1 = 6, it expected 6 more crtcs to be
programed with disabled irq state in amdgpu_irq_disable_all. That caused errors and accessed
the wrong memory location.
Harry Wentland [Mon, 18 Dec 2017 18:48:12 +0000 (13:48 -0500)]
drm/amd/display: Make create_stream_for_sink more consistent
We've got a helper function to call dc_create_stream_for_sink and one
other place that calls it directly. Make sure we call the helper
functions always since we need to update a bunch of things in stream and
don't want to miss that.
Shirish S [Wed, 28 Feb 2018 06:44:58 +0000 (12:14 +0530)]
drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)
The below commit
"drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"
introduces a slight behavioral change to rmfb. Instead of disabling a crtc
when the primary plane is disabled, it now preserves it.
This change leads to BUG hit while performing atomic commit on amd driver.
As a fix this patch ensures that we disable the CRTC's with NULL FB by returning
-EINVAL and hence triggering fall back to the old behavior and turning off the
crtc in atomic_remove_fb().
V2: Added error check for plane_state and removed sanity check for crtc.
Harry Wentland [Tue, 20 Feb 2018 18:36:23 +0000 (13:36 -0500)]
drm/amd/display: Default HDMI6G support to true. Log VBIOS table error.
There have been many reports of Ellesmere and Baffin systems not being
able to drive HDMI 4k60 due to the fact that we check the HDMI_6GB_EN
bit from VBIOS table. Windows seems to not have this issue.
On some systems we fail to the encoder cap info from VBIOS. In that case
we should default to enabling HDMI6G support.
This was tested by dwagner on
https://bugs.freedesktop.org/show_bug.cgi?id=102820
Shirish S [Tue, 13 Feb 2018 08:45:17 +0000 (14:15 +0530)]
drm/amd/display: update plane params before validation
This patch updates the dc's plane state with the parameters set by the
user side.
This is needed to validate the plane capabilities with the parameters
user space wants to set.
Shirish S [Tue, 13 Feb 2018 08:41:37 +0000 (14:11 +0530)]
drm/amd/display: validate plane in dce110 for scaling
CZ & ST support uptil a limit 2:1 downscaling, this patch
adds validate_plane hook, that shall be used to validate
the plane attributes sent by the user space based
on dce110 capabilities.
Shirish S [Fri, 16 Feb 2018 06:14:22 +0000 (11:44 +0530)]
drm/amd/display: defer modeset check in dm_update_planes_state
amdgpu_dm_atomic_check() is used to validate the entire configuration of
planes and crtc's that the user space wants to commit.
However amdgpu_dm_atomic_check() depends upon DRM_MODE_ATOMIC_ALLOW_MODESET
flag else its mostly dummy.
Its not mandatory for the user space to set DRM_MODE_ATOMIC_ALLOW_MODESET,
and in general its not set either along with DRM_MODE_ATOMIC_TEST_ONLY.
Considering its importantance, this patch defers the allow_modeset check
in dm_update_planes_state(), so that there shall be scope to validate
the configuration sent from user space, without impacting the population
of dc/dm related data structures.
Rex Zhu [Tue, 27 Feb 2018 10:20:53 +0000 (18:20 +0800)]
drm/amdgpu: Notify sbios device ready before send request
it is required if a platform supports PCIe root complex
core voltage reduction. After receiving this notification,
SBIOS can apply default PCIe root complex power policy.
Bhavesh Davda [Fri, 22 Dec 2017 22:17:13 +0000 (14:17 -0800)]
xen-blkfront: move negotiate_mq to cover all cases of new VBDs
negotiate_mq should happen in all cases of a new VBD being discovered by
xen-blkfront, whether called through _probe() or a hot-attached new VBD
from dom-0 via xenstore. Otherwise, hot-attached new VBDs are left
configured without multi-queue.
Leon Romanovsky [Wed, 7 Mar 2018 12:49:09 +0000 (14:49 +0200)]
RDMA/ucma: Limit possible option size
Users of ucma are supposed to provide size of option level,
in most paths it is supposed to be equal to u8 or u16, but
it is not the case for the IB path record, where it can be
multiple of struct ib_path_rec_data.
This patch takes simplest possible approach and prevents providing
values more than possible to allocate.
Parav Pandit [Wed, 7 Mar 2018 06:07:41 +0000 (08:07 +0200)]
IB/core: Fix possible crash to access NULL netdev
resolved_dev returned might be NULL as ifindex is transient number.
Ignoring NULL check of resolved_dev might crash the kernel.
Therefore perform NULL check before accessing resolved_dev.
Additionally rdma_resolve_ip_route() invokes addr_resolve() which
performs check and address translation for loopback ifindex.
Therefore, checking it again in rdma_resolve_ip_route() is not helpful.
Therefore, the code is simplified to avoid IFF_LOOPBACK check.
Linus Torvalds [Wed, 7 Mar 2018 18:59:23 +0000 (10:59 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Nine bug fixes for s390:
- Three fixes for the expoline code, one of them is strictly speaking
a cleanup but as it relates to code added with 4.16 I would like to
include the patch.
- Three timer related fixes in the common I/O layer
- A fix for the handling of internal DASD request which could cause
panics.
- One correction in regard to the accounting of pud page tables vs.
compat tasks.
- The register scrubbing in entry.S caused spurious crashes, this is
fixed now as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/entry.S: fix spurious zeroing of r0
s390: Fix runtime warning about negative pgtables_bytes
s390: do not bypass BPENTER for interrupt system calls
s390/cio: clear timer when terminating driver I/O
s390/cio: fix return code after missing interrupt
s390/cio: fix ccw_device_start_timeout API
s390/clean-up: use CFI_* macros in entry.S
s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)
s390/dasd: fix handling of internal requests
Linus Torvalds [Wed, 7 Mar 2018 18:54:11 +0000 (10:54 -0800)]
Merge tag 'regulator-fix-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of fixes here:
- another half of the supend to idle fix from Geert that went in
earlier, both he and I are confused as to why he didn't notice that
this was missing when his earlier fix was merged.
- a simple fix for a test done the wrong way round in the
stm32-vrefbuf driver"
* tag 'regulator-fix-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Fix resume from suspend to idle
regulator: stm32-vrefbuf: fix check on ready flag
Linus Torvalds [Wed, 7 Mar 2018 18:50:15 +0000 (10:50 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is mostly fixes for driver specific issues (nine of them) and the
storvsc performance improvement with interrupt handling which was
dropped from the previous fixes pull request.
We also have two regressions: one is a double call_rcu() in ATA error
handling and the other is a missed conversion to BLK_STS_OK in
__scsi_error_from_host_byte()"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Fix kernel crash during port toggle
scsi: qla2xxx: Fix FC-NVMe LUN discovery
scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()
scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops
scsi: qla2xxx: ensure async flags are reset correctly
scsi: qla2xxx: do not check login_state if no loop id is assigned
scsi: qla2xxx: Fixup locking for session deletion
scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
scsi: mpt3sas: wait for and flush running commands on shutdown/unload
scsi: mpt3sas: fix oops in error handlers after shutdown/unload
scsi: storvsc: Spread interrupts when picking a channel for I/O requests
scsi: megaraid_sas: Do not use 32-bit atomic request descriptor for Ventura controllers
gfs2: Fixes to "Implement iomap for block_map" (2)
It turns out that commit 3229c18c0d6b2 'Fixes to "Implement iomap for
block_map"' introduced another bug in gfs2_iomap_begin that can cause
gfs2_block_map to set bh->b_size of an actual buffer to 0. This can
lead to arbitrary incorrect behavior including crashes or disk
corruption. Revert the incorrect part of that commit.
Koen Vandeputte [Wed, 7 Mar 2018 16:46:39 +0000 (10:46 -0600)]
PCI: dwc: Fix enumeration end when reaching root subordinate
The subordinate value indicates the highest bus number which can be
reached downstream though a certain device.
Commit a20c7f36bd3d ("PCI: Do not allocate more buses than available in
parent") ensures that downstream devices cannot assign busnumbers higher
than the upstream device subordinate number, which was indeed illogical.
By default, dw_pcie_setup_rc() inits the Root Complex subordinate to a
value of 0x01.
Due to this combined with above commit, enumeration stops digging deeper
downstream as soon as bus num 0x01 has been assigned, which is always the
case for a bridge device.
This results in all devices behind a bridge bus remaining undetected, as
these would be connected to bus 0x02 or higher.
Fix this by initializing the RC to a subordinate value of 0xff, which is
not altering hardware behaviour in any way, but informs probing function
pci_scan_bridge() later on which reads this value back from register.
The following nasty errors during boot are also fixed by this:
pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01] (conflicts with (null) [bus 01])
...
pci_bus 0000:03: [bus 03] partially hidden behind bridge 0000:01 [bus 01]
...
pci_bus 0000:04: [bus 04] partially hidden behind bridge 0000:01 [bus 01]
...
pci_bus 0000:05: [bus 05] partially hidden behind bridge 0000:01 [bus 01]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 05
pci_bus 0000:02: busn_res: can not insert [bus 02-05] under [bus 01] (conflicts with (null) [bus 01])
pci_bus 0000:02: [bus 02-05] partially hidden behind bridge 0000:01 [bus 01]
Peter Malone [Wed, 7 Mar 2018 13:00:34 +0000 (14:00 +0100)]
fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in
sbusfb_ioctl_helper().
'index' is defined as an int in sbusfb_ioctl_helper().
We retrieve this from the user:
if (get_user(index, &c->index) ||
__get_user(count, &c->count) ||
__get_user(ured, &c->red) ||
__get_user(ugreen, &c->green) ||
__get_user(ublue, &c->blue))
return -EFAULT;
and then we use 'index' in the following way:
red = cmap->red[index + i] >> 8;
green = cmap->green[index + i] >> 8;
blue = cmap->blue[index + i] >> 8;
This is a classic information leak vulnerability. 'index' should be
an unsigned int, given its usage above.
This patch is straight-forward; it changes 'index' to unsigned int
in two switch-cases: FBIOGETCMAP_SPARC && FBIOPUTCMAP_SPARC.
The slaves and holders link for the hidden gendisks confuse lsblk so that
it errors out on, or doesn't report the nvme multipath devices. Given
that we don't need holder relationships for something that can't even be
directly accessed we should just stop creating those links.
Ingo Molnar [Wed, 7 Mar 2018 08:18:32 +0000 (09:18 +0100)]
Merge tag 'perf-urgent-for-mingo-4.16-20180306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Prevent auxtrace_queues__process_index() from queuing AUX area data for
decoding when the --no-itrace option has been used (Adrian Hunter)
- Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo)
- Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin)
Josh Poimboeuf [Tue, 6 Mar 2018 23:58:15 +0000 (17:58 -0600)]
objtool: Fix 32-bit build
Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit
host. This also simplifies read_retpoline_hints() a bit and makes its
implementation similar to most of the other annotation reading
functions.
The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to
put the error QP in flush list. Since SQ and RQ of each error qp are added
to two different flush list, we need to protect it using locks of
corresponding CQs. Difference in order of acquiring the lock in
SQ poll_cq and RQ poll_cq can cause a hard lockup.
Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock.
Instead of this lock, introduces qplib_cq.flush_lock to handle
addition/deletion of QPs in flush list. Also, always invoke the flush_lock
in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential
deadlock.
Other than the poll_cq context, the movement of QP to/from flush list can
be done in modify_qp context or from an async error event from HW.
Synchronize these operations using the bnxt_re verbs layer CQ locks.
To achieve this, adds a call back to the HW abstraction layer(qplib) to
bnxt_re ib_verbs layer in case of async error event. Also, removes the
buddy cq functions as it is no longer required.
Max Gurtovoy [Mon, 5 Mar 2018 18:09:48 +0000 (20:09 +0200)]
RDMA/core: Reduce poll batch for direct cq polling
Fix warning limit for kernel stack consumption:
drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct':
drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes
is larger than 1024 bytes [-Werror=frame-larger-than=]
Using smaller ib_wc array on the stack brings us comfortably below that
limit again.
Mark Bloch [Mon, 5 Mar 2018 18:09:47 +0000 (20:09 +0200)]
IB/mlx5: When not in dual port RoCE mode, use provided port as native
The series that introduced dual port RoCE mode assumed that we don't have
a dual port HCA that use the mlx5 driver, this is not the case for
Connect-IB HCAs. This reasoning led to assigning 1 as the native port
index which causes issue when the second port is used.
For example query_pkey() when called on the second port will return values
of the first port. Make sure that we assign the right port index as the
native port index.
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Daniel Jurgens <[email protected]> Signed-off-by: Mark Bloch <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
Jack M [Mon, 5 Mar 2018 18:09:46 +0000 (20:09 +0200)]
IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
The commit cited below added a gid_type field (RoCEv1 or RoCEv2)
to GID properties.
When adding GIDs, this gid_type field was copied over to the
hardware gid table. However, when deleting GIDs, the gid_type field
was not copied over to the hardware gid table.
As a result, when running RoCEv2, all RoCEv2 gids in the
hardware gid table were set to type RoCEv1 when any gid was deleted.
This problem would persist until the next gid was added (which would again
restore the gid_type field for all the gids in the hardware gid table).
Fix this by copying over the gid_type field to the hardware gid table
when deleting gids, so that the gid_type of all remaining gids is
preserved when a gid is deleted.
When using IPv4 addresses in RoCEv2, the GID format for the mapped
IPv4 address should be: ::ffff:<4-byte IPv4 address>.
In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords
zeroed out by memset, which resulted in deleting the ffff field.
However, since procedure ipv6_addr_v4mapped() already verifies that the
gid has format ::ffff:<ipv4 address>, no change is needed for the gid,
and the memset can simply be removed.
Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware") Reviewed-by: Moni Shoua <[email protected]> Signed-off-by: Jack Morgenstein <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
Bill Kuzeja [Mon, 5 Mar 2018 05:02:55 +0000 (00:02 -0500)]
scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure
Because of the shifting around of code in qla2x00_probe_one recently,
failures during adapter initialization can lead to problems, i.e. NULL
pointer crashes and doubly freed data structures which cause eventual
panics.
This V2 version makes the relevant memory free routines idempotent, so
repeat calls won't cause any harm. I also removed the problematic
probe_init_failed exit point as it is not needed.
Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure") Signed-off-by: Bill Kuzeja <[email protected]> Acked-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
Hannes Reinecke [Mon, 26 Feb 2018 14:26:01 +0000 (15:26 +0100)]
scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
The firmware event workqueue should not be marked as WQ_MEM_RECLAIM
as it's doesn't need to make forward progress under memory pressure.
In the current state it will result in a deadlock if the device had been
forcefully removed.
RDMA/qedr: Fix kernel panic when running fio over NFSoRDMA
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock,
leading to a case where two context's accessing poll_cq at
the same time lead to one of them having a pointer to an old
latest_cqe and reading an invalid cqe element
Fix iWARP connect and listen to use the mapped port for
ipv4 and ipv6. Without this fixed, running on a server
that has iwpmd enabled will not use the correct port
Mike Snitzer [Mon, 5 Mar 2018 20:26:06 +0000 (15:26 -0500)]
dm table: allow upgrade from bio-based to specialized bio-based variant
In practice this is really only meaningful in the context of the DM
multipath target (which uses dm_table_set_type() to set the type of
device DM should create via its "queue_mode" option).
So this change allows a DM multipath device with "queue_mode bio" to be
upgraded from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED -- iff the
underlying device(s) are NVMe.
DM_TYPE_NVME_BIO_BASED is just a DM core implementation detail that
allows for NVMe-specific optimizations (e.g. use direct_make_request
instead of generic_make_request). If in the future there is no benefit
or need to distinguish NVMe vs not: then it will be removed.
Mike Snitzer [Mon, 5 Mar 2018 19:10:11 +0000 (14:10 -0500)]
dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checks
This eliminates the "queue_mode" configuration's "nvme" mode. There
wasn't anything NVMe-specific about that mode. It was named "nvme"
because it was a short name for the mode. But the entire point of the
mode was to optimize the multipath target for underlying devices that
are _not_ SCSI-based. Devices that aren't SCSI have no need for the
various SCSI device handler (scsi_dh) specific code in DM multipath.
But rather than narrowly define this scsi_dh vs not branching in terms
of "nvme": invert the logic so that we're just checking whether a
multipath device is layered on SCSI devices with scsi_dh attached.
This allows any future storage technology to avoid scsi_dh specific code
in the multipath target too.
Jonathan Brassow [Tue, 27 Feb 2018 20:58:59 +0000 (21:58 +0100)]
dm raid: fix incorrect sync_ratio when degraded
Upstream commit 4102d9de6d375 ("dm raid: fix rs_get_progress()
synchronization state/ratio") in combination with commit 7c29744ecce
("dm raid: simplify rs_get_progress()") introduced a regression by
incorrectly reporting a sync_ratio of 0 for degraded raid sets. This
caused lvm2 to fail to repair raid legs automatically.
Fix by identifying the degraded state by checking the MD_RECOVERY_INTR
flag and returning mddev->recovery_cp in case it is set.
MD sets recovery = [ MD_RECOVERY_RECOVER MD_RECOVERY_INTR
MD_RECOVERY_NEEDED ] when a RAID member fails. It then shuts down any
sync thread that is running and leaves us with all MD_RECOVERY_* flags
cleared. The bug occurs if a status is requested in the short time it
takes to shut down any sync thread and clear the flags, because we were
keying in on the MD_RECOVERY_NEEDED - understanding it to be the initial
phase of a “recover” sync thread. However, this is an incorrect
interpretation if MD_RECOVERY_INTR is also set.
This also explains why the bug only happened when automatic repair was
enabled and not a normal ‘manual’ method. It is impossible to react
quick enough to hit the problematic window without it being automated.
Mike Snitzer [Thu, 22 Feb 2018 18:31:20 +0000 (13:31 -0500)]
dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl
Otherwise an underlying device's teardown (e.g. SCSI) may race with the
DM ioctl or persistent reservation and result in dereferencing driver
memory that gets freed when the underlying device's final blkdev_put()
occurs.
bdgrab() only increases the refcount for the block_device's inode to
ensure the block_device struct itself will not be freed, but does not
guarantee the block_device will remain associated with the gendisk or
its storage.
gcc-6.3 and earlier show a new warning after a seemingly unrelated
change to the arm64 PAGE_KERNEL definition:
In file included from drivers/md/dm-bufio.c:14:0:
drivers/md/dm-bufio.c: In function 'alloc_buffer':
include/linux/sched/mm.h:182:56: warning: 'noio_flag' may be used uninitialized in this function [-Wmaybe-uninitialized]
current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
^
The same warning happened earlier on linux-3.18 for MIPS and I did a
workaround for that, but now it's come back.
gcc-7 and newer are apparently smart enough to figure this out, and
other architectures don't show it, so the best I could come up with is
to rework the caller slightly in a way that makes it obvious enough to
all arm64 compilers what is happening here.
Linus Torvalds [Tue, 6 Mar 2018 20:41:30 +0000 (12:41 -0800)]
Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull sigingo fix from Eric Biederman:
"The kbuild test robot found that I accidentally moved si_pkey when I
was cleaning up siginfo_t. A short followed by an int with the int
having 8 byte alignment. Sheesh siginfo_t is a weird structure.
I have now corrected it and added build time checks that with a little
luck will catch any similar future mistakes. The build time checks
were sufficient for me to verify the bug and to verify my fix. So they
are at least useful this once."
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal/x86: Include the field offsets in the build time checks
signal: Correct the offset of si_pkey in struct siginfo
Hans de Goede [Tue, 6 Mar 2018 09:50:05 +0000 (10:50 +0100)]
Revert "typec: tcpm: Only request matching pdos"
Commit 57e6f0d7b804 ("typec: tcpm: Only request matching pdos") is causing
a regression, before this commit e.g. the GPD win and GPD pocket devices
were charging at 9V 3A with a PD charger, now they are instead slowly
discharging at 5V 0.4A, as this commit causes the ports max_snk_mv/ma/mw
settings to be completely ignored.
Arguably the way to fix this would be to add a PDO_VAR() describing the
voltage range to the snk_caps of boards which can handle any voltage in
their range, but the "typec: tcpm: Only request matching pdos" commit
looks at the type of PDO advertised by the source/charger and if that
is fixed (as it typically is) only compairs against PDO_FIXED entries
in the snk_caps so supporting a range of voltage would require adding a
PDO_FIXED entry for *every possible* voltage to snk_caps.
AFAICT there is no reason why a fixed source_cap cannot be matched against
a variable snk_cap, so at a minimum the commit should be rewritten to
support that.
For now lets revert the "typec: tcpm: Only request matching pdos" commit,
fixing the regression.
Merlijn Wajer [Mon, 5 Mar 2018 17:35:10 +0000 (11:35 -0600)]
usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
Without pm_runtime_{get,put}_sync calls in place, reading
vbus status via /sys causes the following error:
Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
pgd = b333e822
[fa0ab060] *pgd=48011452(bad)
[<c05261b0>] (musb_default_readb) from [<c0525bd0>] (musb_vbus_show+0x58/0xe4)
[<c0525bd0>] (musb_vbus_show) from [<c04c0148>] (dev_attr_show+0x20/0x44)
[<c04c0148>] (dev_attr_show) from [<c0259f74>] (sysfs_kf_seq_show+0x80/0xdc)
[<c0259f74>] (sysfs_kf_seq_show) from [<c0210bac>] (seq_read+0x250/0x448)
[<c0210bac>] (seq_read) from [<c01edb40>] (__vfs_read+0x1c/0x118)
[<c01edb40>] (__vfs_read) from [<c01edccc>] (vfs_read+0x90/0x144)
[<c01edccc>] (vfs_read) from [<c01ee1d0>] (SyS_read+0x3c/0x74)
[<c01ee1d0>] (SyS_read) from [<c0106fe0>] (ret_fast_syscall+0x0/0x54)
Radim Krčmář [Tue, 6 Mar 2018 16:42:28 +0000 (17:42 +0100)]
Merge tag 'kvm-s390-master-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: Fixes
- Fix random memory corruption when running as guest2 (e.g. KVM in
LPAR) and starting guest3 (nested KVM) with many CPUs (e.g. a nested
guest with 200 vcpu)
- io interrupt delivery counter was not exported
Radim Krčmář [Tue, 6 Mar 2018 16:24:09 +0000 (17:24 +0100)]
Merge tag 'kvm-ppc-fixes-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Fixes for PPC KVM:
- Fix guest time accounting in the host
- Fix large-page backing for radix guests on POWER9
- Fix HPT guests on POWER9 backed by 2M or 1G pages
- Compile fixes for some configs and gcc versions
Maxime Ripard [Wed, 21 Feb 2018 12:57:02 +0000 (13:57 +0100)]
drm/sun4i: rgb: Fix potential division by zero
In the case where mode_valid callback of our RGB connector was called
before mode_set was being called, the range of dividers would not be set,
resulting in a division by zero later on in the clk_round_rate logic.
Set the range of dividers before calling clk_round_rate to fix this.
Maxime Ripard [Wed, 21 Feb 2018 12:57:01 +0000 (13:57 +0100)]
drm/sun4i: tcon: Reduce the scope of the LVDS error a bit
The current logic to deal with old DT missing the LVDS properties doesn't
take into account whether the LVDS output is supported in the first place,
resulting in spurious error messages on SoCs where it doesn't even matter.
Introduce a new TCON flag to list if LVDS is supported at all to prevent
this from happening.
There is a lock odering problem between mmap_sem and ashmem_mutex causing
a lockdep splat[1] during a syzcaller test. This patch fixes the problem
by move copy_from_user out of ashmem_mutex.
Adrian Hunter [Wed, 28 Feb 2018 08:39:04 +0000 (10:39 +0200)]
perf tools: Fix trigger class trigger_on()
trigger_on() means that the trigger is available but not ready, however
trigger_on() was making it ready. That can segfault if the signal comes
before trigger_ready(). e.g. (USR2 signal delivery not shown)
Takashi Iwai [Tue, 6 Mar 2018 11:14:17 +0000 (12:14 +0100)]
ALSA: hda/realtek - Fix dock line-out volume on Dell Precision 7520
The dock line-out pin (NID 0x17 of ALC3254 codec) on Dell Precision
7520 may route to three different DACs, 0x02, 0x03 and 0x06. The
first two DACS have the volume amp controls while the last one
doesn't. And unfortunately, the auto-parser assigns this pin to DAC3,
resulting in the non-working volume control for the line out.
Fix it by disabling the routing to DAC3 on the corresponding pin.