Git Repo - linux.git/log

powerpc/mm: Fix crash in page table dump with huge pages

The page table dump code doesn't know about huge pages, so currently
it crashes (or walks random memory, usually leading to a crash), if it
finds a huge page. On Book3S we only see huge pages in the Linux page
tables when we're using the P9 Radix MMU.

Teaching the code to properly handle huge pages is a bit more involved,
so for now just prevent the crash.

Cc: [email protected] # v4.10+
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <[email protected]>

drm/nouveau/fifo/gk104-: Silence a locking warning

Presumably we can never actually hit this return, but static checkers
complain that we should unlock before we return.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>

drm/nouveau/secboot: plug memory leak in ls_ucode_img_load_gr() error path

The last goto looks spurious because it releases fewer resources than the
previous one.
Also free 'img->sig' if 'ls_ucode_img_build()' fails.

Fixes: 9d896f3e41a6 ("drm/nouveau/secboot: abstract LS firmware loading functions")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>

drm/nouveau: Fix drm poll_helper handling

Commit cae9ff036eea effectively disabled the drm poll_helper by checking
the wrong flag to decide whether the driver should enable the poll:
mode_config.poll_enabled is only set to true by poll_init and does not
indicate whether the poll is currently enabled.
nouveau_display_create() initializes the poll and then disables it
right away. After poll_init() the mode_config.poll_enabled will be true,
but the poll itself is disabled.

To avoid the race caused by calling the poll_enable() from different paths,
this patch will enable the poll from one place, in the
nouveau_display_hpd_work().

In case the pm_runtime is disabled we will enable the poll in
nouveau_drm_load() once.

Fixes: cae9ff036eea ("drm/nouveau: Don't enabling polling twice on runtime resume")
Signed-off-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Lyude <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>

i2c: mv64xxx: don't override deferred probing when getting irq

There is no reason to use platform_get_irq() for non-DT probing and
irq_of_parse_and_map() for DT probing. Indeed, platform_get_irq()
works fine for both.

In addition, using platform_get_irq() properly returns -EPROBE_DEFER
when the interrupt controller is not yet available, so instead of
inventing our own error code (-ENXIO), return the one provided by
platform_get_irq().

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
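
As an illustration of the mv64xxx change above, here is a minimal userspace
sketch of the difference between collapsing every failure into -ENXIO and
passing the original error code up. fake_get_irq(), probe_old() and
probe_new() are hypothetical stand-ins, not the driver's functions, and
EPROBE_DEFER is defined locally because it is a kernel-internal value.

```c
#include <errno.h>

/* EPROBE_DEFER is kernel-internal and absent from userspace <errno.h>. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for platform_get_irq(): returns an IRQ number
 * on success or a negative errno on failure. */
int fake_get_irq(int controller_ready)
{
	return controller_ready ? 42 : -EPROBE_DEFER;
}

/* Old pattern: every failure becomes -ENXIO, so the deferral is lost. */
int probe_old(int controller_ready)
{
	int irq = fake_get_irq(controller_ready);

	if (irq < 0)
		return -ENXIO;
	return 0;
}

/* New pattern: return whatever the lookup reported. */
int probe_new(int controller_ready)
{
	int irq = fake_get_irq(controller_ready);

	if (irq < 0)
		return irq;
	return 0;
}
```

With the second form, a caller that understands -EPROBE_DEFER can retry the
probe later instead of treating the device as absent.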

uio: fix incorrect memory leak cleanup

Commit 75f0aef6220d ("uio: fix memory leak") has fixed up some
memory leaks during the failure paths of the addition of uio
attributes, but still is not correct entirely. A kobject_uevent()
failure still needs a kobject_put() and the kobject container
structure allocation failure before the kobject_init() doesn't
need a kobject_put(). Fix this properly.

Fixes: 75f0aef6220d ("uio: fix memory leak")
Signed-off-by: Suman Anna <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

misc: pci_endpoint_test: select CRC32

There is the following link error with CONFIG_PCI_ENDPOINT_TEST=y and
CONFIG_CRC32=m:

drivers/built-in.o: In function 'pci_endpoint_test_ioctl':
pci_endpoint_test.c:(.text+0xf1251): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf1322): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf13b2): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf141e): undefined reference to 'crc32_le'

Fix this by selecting CRC32 in the PCI_ENDPOINT_TEST kconfig entry.

Fixes: 2c156ac71c6b ("misc: Add host side PCI driver for PCI test function device")
Signed-off-by: Tobias Regnery <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

char: lp: fix possible integer overflow in lp_setup()

The lp_setup() code doesn't apply any bounds checking when passing
"lp=none", and only in this case, resulting in an overflow of the
parport_nr[] array. All versions in Git history are affected.

Reported-By: Roee Hay <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: [email protected]
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
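
The kind of bounds check the lp_setup() fix describes can be sketched as
follows. This is an illustrative userspace model only: struct slots,
SLOT_COUNT, SLOT_NONE and slots_add_none() are hypothetical names, and the
array size and marker value are assumptions rather than the driver's
definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOT_COUNT 8	/* hypothetical array size */
#define SLOT_NONE  (-2)	/* hypothetical "none" marker */

struct slots {
	int nr[SLOT_COUNT];
	size_t next;	/* index of the next free entry */
};

/* Record one "none" entry; refuse instead of writing past the array. */
bool slots_add_none(struct slots *s)
{
	if (s->next >= SLOT_COUNT)
		return false;
	s->nr[s->next++] = SLOT_NONE;
	return true;
}
```

Without the comparison against SLOT_COUNT, repeating the option often enough
would advance the index past the end of the array.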

Merge tag 'pstore-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore fix from Kees Cook:
"Fix bad EFI vars iterator usage"

* tag 'pstore-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
efi-pstore: Fix read iter after pstore API refactor

mlx5e: add CONFIG_INET dependency

We now reference the arp_tbl, which requires IPv4 support to be
enabled in the kernel, otherwise we get a link error:

drivers/net/built-in.o: In function `mlx5e_tc_update_neigh_used_value':
(.text+0x16afec): undefined reference to `arp_tbl'
drivers/net/built-in.o: In function `mlx5e_rep_neigh_init':
en_rep.c:(.text+0x16c16d): undefined reference to `arp_tbl'
drivers/net/built-in.o: In function `mlx5e_rep_netevent_event':
en_rep.c:(.text+0x16cbb5): undefined reference to `arp_tbl'

This adds a Kconfig dependency for it.

Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

KVM: x86: lower default for halt_poll_ns

In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
increase heavily even in cases where the performance improvement was
small. In particular, bandwidth divided by CPU usage was as much as
60% lower.

To some extent this is the expected effect of the patch, and the
additional CPU utilization is only visible when running the
benchmarks. However, halving the threshold also halves the extra
CPU utilization (from +30-130% to +20-70%) and has no negative
effect on performance.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

dm bufio: make the parameter "retain_bytes" unsigned long

Change the type of the parameter "retain_bytes" from unsigned to
unsigned long, so that on 64-bit machines the user can set more than
4GiB of data to be retained.

Also, change the type of the variable "count" in the function
"__evict_old_buffers" to unsigned long. The assignment
"count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];"
could result in unsigned long to unsigned overflow and that could result
in buffers not being freed when they should.

While at it, avoid division in get_retain_buffers(). Division is slow,
we can change it to shift because we have precalculated the log2 of
block size.

Cc: [email protected]
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
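
Two points from the dm bufio message above lend themselves to a small
worked example: a 32-bit parameter cannot hold values of 4GiB or more, and
dividing by a power-of-two block size equals shifting by its log2. The
function names below are hypothetical and fixed-width types are used so
the sketch behaves the same on any platform.

```c
#include <stdint.h>

/* A 32-bit parameter silently wraps any request of 4GiB or more. */
uint32_t retain_bytes_as_u32(uint64_t requested)
{
	return (uint32_t)requested;
}

/* Number of buffers to retain, using the precomputed log2 of the
 * block size (block_bits) instead of a division. */
uint64_t retain_buffers(uint64_t retain_bytes, unsigned block_bits)
{
	return retain_bytes >> block_bits;
}
```

For example, a 5GiB request truncates to 1GiB in 32 bits, while with a
64-bit type and 4KiB blocks (block_bits = 12) it maps to 1310720 buffers.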

net: Improve handling of failures on link and route dumps

In general, rtnetlink dumps do not anticipate failure to dump a single
object (e.g., link or route) on a single pass. As both route and link
objects have grown via more attributes, that is no longer a given.

netlink dumps can handle a failure if the dump function returns an
error; specifically, netlink_dump adds the return code to the response
if it is <= 0 so userspace is notified of the failure. The missing
piece is the rtnetlink dump functions returning the error.

Fix route and link dump functions to return the error if no object was
added to the skb (whether anything was added is detected by testing
skb->len != 0). IPv6 route dumps (rt6_dump_route) already return the
error; this patch updates IPv4 and link dumps. Other dump functions may
need to be adjusted as well.

Reported-by: Jan Moskyto Matejka <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
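
The rule described above (report the error only when nothing at all fit
into the buffer, otherwise return what was written and resume later) can be
sketched in userspace like this. struct fake_skb, fill_one() and
dump_objects() are hypothetical simplifications, not the rtnetlink code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of an skb: only what the sketch needs. */
struct fake_skb {
	size_t len;	/* bytes already written */
	size_t cap;	/* total room in the buffer */
};

/* Append one object; fails with -EMSGSIZE if it does not fit. */
int fill_one(struct fake_skb *skb, size_t obj_size)
{
	if (skb->len + obj_size > skb->cap)
		return -EMSGSIZE;
	skb->len += obj_size;
	return 0;
}

/* Dump objects starting at *idx. Returns the number of bytes written,
 * or the error if not even one object could be added on this pass. */
int dump_objects(struct fake_skb *skb, const size_t *sizes, size_t n,
		 size_t *idx)
{
	while (*idx < n) {
		int err = fill_one(skb, sizes[*idx]);

		if (err < 0) {
			if (skb->len)
				break;	/* partial progress: resume later */
			return err;	/* nothing fit: report failure */
		}
		(*idx)++;
	}
	return (int)skb->len;
}
```

The second case is the one that matters: an object too large for an empty
buffer would otherwise look like a successful, empty dump.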

net/smc: Add warning about remote memory exposure

The driver explicitly bypasses APIs to register all memory once a
connection is made, and thus allows remote access to memory.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Acked-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

smc: switch to usage of IB_PD_UNSAFE_GLOBAL_RKEY

Currently, SMC enables remote access to physical memory when a user
has successfully configured and established an SMC-connection until ten
minutes after the last SMC connection is closed. Because this is considered
a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in
such a case.

This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY.
This improves user awareness, but does not remove the security risk itself.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

efi-pstore: Fix read iter after pstore API refactor

During the internal pstore API refactoring, the EFI vars read entry was
accidentally made to update a stack variable instead of the pstore
private data pointer. This corrects the problem (and removes the now
needless argument).

Fixes: 125cc42baf8a ("pstore: Replace arguments for read() API")
Signed-off-by: Kees Cook <[email protected]>

Merge branch 'i2c-mux/for-current' of https://github.com/peda-r/i2c-mux into i2c/for-current

Pull bugfixes from the i2c mux subsubsystem:

This fixes an old bug in resource cleanup on failure in i2c-mux-reg and
a new log spamming bug from this merge window in the i2c-mux core.

ipmr: vrf: Find VIFs using the actual device

The skb->dev that is passed into ip_mr_input is
the loX device for VRFs. When we look up a vif
for this dev, none is found as we do not create
vifs for loopbacks. Instead, look up a vif for the
actual device that the packet was received on,
e.g. the vlan.

Signed-off-by: Thomas Winter <[email protected]>
cc: David Ahern <[email protected]>
cc: Nikolay Aleksandrov <[email protected]>
cc: roopa <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: eliminate negative reordering in tcp_clean_rtx_queue

tcp_ack() can call tcp_fragment(), which may deduct from the
value of tp->fackets_out when the MSS changes. When prior_fackets
is larger than tp->fackets_out, tcp_clean_rtx_queue() can
invoke tcp_update_reordering() with negative values. This
results in absurd tp->reordering values higher than
sysctl_tcp_max_reordering.

Note that tcp_update_reordering() indeed sets tp->reordering
to min(sysctl_tcp_max_reordering, metric), but because
the comparison is signed, a negative metric always wins.

Fixes: c7caf8d3ed7a ("[TCP]: Fix reord detection due to snd_una covered holes")
Reported-by: Rebecca Isaacs <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
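
The signed-min problem described above can be reproduced in a few lines of
userspace C. This is a sketch under stated assumptions: MAX_REORDERING is
an assumed cap, and reordering_guarded() shows one possible guard for
illustration, not necessarily the check the patch adds.

```c
#include <stdint.h>

#define MAX_REORDERING 300	/* assumed cap for this sketch */

int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Problematic shape: the signed min lets a negative metric through,
 * and storing it in an unsigned field wraps to a huge value. */
uint32_t reordering_unchecked(int metric)
{
	return (uint32_t)min_int(MAX_REORDERING, metric);
}

/* Hypothetical guard: ignore non-positive metrics and keep the
 * current value instead. */
uint32_t reordering_guarded(uint32_t current, int metric)
{
	if (metric <= 0)
		return current;
	return (uint32_t)min_int(MAX_REORDERING, metric);
}
```

A metric of -1 passes the min() untouched and becomes 4294967295 once stored
in the unsigned field, far above the cap it was supposed to respect.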

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

- convert the debug feature to refcount_t

- reduce the copy size for strncpy_from_user

- 8 bug fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/virtio: change virtio_feature_desc:features type to __le32
  s390: convert debug_info.ref_count from atomic_t to refcount_t
  s390: move _text symbol to address higher than zero
  s390/qdio: increase string buffer size
  s390/ccwgroup: increase string buffer size
  s390/topology: let topology_mnest_limit() return unsigned char
  s390/uaccess: use sane length for __strncpy_from_user()
  s390/uprobes: fix compile for !KPROBES
  s390/ftrace: fix compile for !MODULES
  s390/cputime: fix incorrect system time

Merge tag 'edac_fix_for_4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
"A single amd64_edac fix correcting chip select sizes reporting on
F17h"

* tag 'edac_fix_for_4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h

memory: omap-gpmc: Fix debug output for access width

The width needs to be configured in bytes with 1 meaning 8-bit
access and 2 meaning 16-bit access.

Cc: Peter Ujfalusi <[email protected]>
Acked-by: Roger Quadros <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>

ARM: dts: LogicPD Torpedo: Fix camera pin mux

Fix commit 05c4ffc3a266 ("ARM: dts: LogicPD Torpedo: Add MT9P031 Support").
In the previous commit, I indicated that the only testing done was
confirming that the camera showed up when probing. This patch fixes an
incorrect pin muxing on cam_d0, cam_d1 and cam_d2.

Signed-off-by: Adam Ford <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>

ARM: dts: omap4: enable CEC pin for Pandaboard A4 and ES

The CEC pin was always pulled up, making it impossible to use it.

Change to PIN_INPUT so it can be used by the new CEC support.

Signed-off-by: Hans Verkuil <[email protected]>
Reviewed-by: Tomi Valkeinen <[email protected]>

ARM: dts: gta04: fix polarity of clocks for mcbsp4

The clock polarity setting of the mcbsp connected to
the modem was wrong, so almost only noise
was received.
With this patch it is also the same as it was on
earlier non-dt kernels, where it was working properly.

Signed-off-by: Andreas Kemnade <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>

ARM: dts: dra7: Add power hold and power controller properties to palmas

Add power hold and power controller properties to palmas node.
This is needed to shutdown pmic correctly on boards with
powerhold set.

Signed-off-by: Keerthy <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>

genirq: Fix chained interrupt data ordering

irq_set_chained_handler_and_data() sets up the chained interrupt and then
stores the handler data.

That's racy against an immediate interrupt which gets handled before the
store of the handler data happened. The handler will dereference a NULL
pointer and crash.

Cure it by storing handler data before installing the chained handler.

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
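
The ordering problem above can be modelled deterministically in userspace
by letting a pending interrupt fire at the moment the handler is installed.
Everything here (struct fake_irq, install_handler(), setup_racy(),
setup_fixed()) is a hypothetical model, not the genirq implementation; the
crashed flag marks where a real handler would dereference NULL.

```c
#include <stddef.h>

struct fake_irq {
	void (*handler)(struct fake_irq *irq);
	void *data;	/* handler data, NULL until stored */
	int pending;	/* an interrupt is already waiting */
	int crashed;	/* set where a real handler would oops */
	int handled;	/* interrupts handled successfully */
};

void chained_handler(struct fake_irq *irq)
{
	if (!irq->data) {
		irq->crashed = 1;
		return;
	}
	irq->handled++;
}

/* Installing the handler lets a pending interrupt fire at once. */
void install_handler(struct fake_irq *irq, void (*h)(struct fake_irq *))
{
	irq->handler = h;
	if (irq->pending) {
		irq->pending = 0;
		irq->handler(irq);
	}
}

/* Old order: handler first, data second. */
void setup_racy(struct fake_irq *irq, void *data)
{
	install_handler(irq, chained_handler);
	irq->data = data;
}

/* Fixed order: data first, handler second. */
void setup_fixed(struct fake_irq *irq, void *data)
{
	irq->data = data;
	install_handler(irq, chained_handler);
}
```

With an interrupt already pending, setup_racy() runs the handler before the
data exists, while setup_fixed() handles the same interrupt cleanly.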

staging: fsl-dpaa2/eth: add ETHERNET dependency

The new driver cannot link correctly when the netdevice infrastructure
is disabled:

ERROR: "netdev_info" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!
ERROR: "skb_to_sgvec" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!
ERROR: "napi_disable" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!
ERROR: "napi_schedule_prep" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!
ERROR: "__napi_schedule_irqoff" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!
ERROR: "netif_carrier_on" [drivers/staging/fsl-dpaa2/ethernet/fsl-dpaa2-eth.ko] undefined!

This adds a dependency on NETDEVICES and ETHERNET.

Fixes: 0352d1d85201 ("staging: fsl-dpaa2/eth: Add APIs for DPNI objects")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: typec: fusb302: refactor resume retry mechanism

The i2c functions need to test the pm_suspend state and, if needed, retry
before i2c operations. This code was repeated 4x.

To isolate this, create a new function to check the suspend state and call it
everywhere it is needed.

While at it, move the error message from pr_err to dev_err.

Signed-off-by: Rui Miguel Silva <[email protected]>
Acked-by: Yueyao Zhu <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: typec: fusb302: reset i2c_busy state in error

Fix reset of i2c_busy flag if an error occurs during the i2c block read.

Signed-off-by: Rui Miguel Silva <[email protected]>
Acked-by: Yueyao Zhu <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: dwc3: keystone: check return value

Function devm_clk_get() returns an ERR_PTR when it fails. However, in
function kdwc3_probe(), its return value is not checked, which may
result in a bad memory access bug. This patch fixes the bug.

Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
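
For readers unfamiliar with the ERR_PTR convention mentioned above, the
following userspace sketch shows why the return value has to be checked
before use. err_ptr(), ptr_err() and is_err() are simplified re-creations
of the kernel helpers, and fake_clk_get() and fake_probe() are hypothetical
stand-ins for devm_clk_get() and the probe function.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of ERR_PTR(), PTR_ERR(), IS_ERR(). */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_clk {
	int rate;
};

/* Hypothetical lookup: hands back the clock or an encoded errno. */
struct fake_clk *fake_clk_get(struct fake_clk *available)
{
	return available ? available : err_ptr(-ENOENT);
}

int fake_probe(struct fake_clk *available)
{
	struct fake_clk *clk = fake_clk_get(available);

	/* Without this check, clk->rate would read a bogus address. */
	if (is_err(clk))
		return (int)ptr_err(clk);
	return clk->rate > 0 ? 0 : -EINVAL;
}
```

An error pointer is not NULL, so a plain NULL test would not catch it; the
range check in is_err() is what distinguishes it from a valid pointer.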

usb: gadget: f_fs: avoid out of bounds access on comp_desc

The Companion descriptor is only used for SuperSpeed endpoints.
If the endpoints are HighSpeed or FullSpeed, the Companion
descriptor will not be allocated, so we can only access it if
the gadget is SuperSpeed.

I can reproduce this issue on the Rockchip rk3368 SoC platform,
which supports USB 2.0 and uses functionfs for ADB. A kernel
built with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y reports
the following BUG:

==================================================================
BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr ffffffc0601f6509
Read of size 1 by task swapper/0/0
============================================================================
BUG kmalloc-256 (Not tainted): kasan: bad access detected
----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1
alloc_debug_processing+0x128/0x17c
___slab_alloc.constprop.58+0x50c/0x610
__slab_alloc.isra.55.constprop.57+0x24/0x34
__kmalloc+0xe0/0x250
ffs_func_bind+0x52c/0x99c
usb_add_function+0xd8/0x1d4
configfs_composite_bind+0x48c/0x570
udc_bind_to_driver+0x6c/0x170
usb_udc_attach_driver+0xa4/0xd0
gadget_dev_desc_UDC_store+0xcc/0x118
configfs_write_file+0x1a0/0x1f8
__vfs_write+0x64/0x174
vfs_write+0xe4/0x200
SyS_write+0x68/0xc8
el0_svc_naked+0x24/0x28
INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247
...
Call trace:
[<ffffff900808aab4>] dump_backtrace+0x0/0x230
[<ffffff900808acf8>] show_stack+0x14/0x1c
[<ffffff90084ad420>] dump_stack+0xa0/0xc8
[<ffffff90082157cc>] print_trailer+0x188/0x198
[<ffffff9008215948>] object_err+0x3c/0x4c
[<ffffff900821b5ac>] kasan_report+0x324/0x4dc
[<ffffff900821aa38>] __asan_load1+0x24/0x50
[<ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0
[<ffffff90089d3760>] composite_setup+0xdcc/0x1ac8
[<ffffff90089d7394>] android_setup+0x124/0x1a0
[<ffffff90089acd18>] _setup+0x54/0x74
[<ffffff90089b6b98>] handle_ep0+0x3288/0x4390
[<ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4
[<ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298
[<ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20
[<ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac
[<ffffff9008116610>] handle_irq_event+0x60/0xa0
[<ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4
[<ffffff9008115568>] generic_handle_irq+0x30/0x40
[<ffffff90081159b4>] __handle_domain_irq+0xac/0xdc
[<ffffff9008080e9c>] gic_handle_irq+0x64/0xa4
...
Memory state around the buggy address:
  ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc
>ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                       ^
  ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
==================================================================

Signed-off-by: William Wu <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>

usb: gadget: gserial: check if console kthread exists

Check for a bad pointer that may result from a kthread_create failure.
This check is needed because the gserial setup callback function
(gs_console_setup()) only frees info->con_buf when kthread_create
fails, which leaves a bad info->console_thread pointer.
Without checking the validity of the info->console_thread pointer in
gserial_console_exit() before calling kthread_stop(), rmmod will
generate a kernel Oops.

Signed-off-by: Bogdan Mirea <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>

usb: dwc3: gadget: Prevent losing events in event cache

The dwc3 driver can overwrite its previous events if its top-half IRQ
handler (TH) gets invoked again before processing the events in the
cache. We see this as a hang in the file transfer and the host will
attempt to reset the device. TH gets the event count and deasserts the
interrupt line by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. If
there's a new event coming between reading the event count and interrupt
deassertion, dwc3 will lose previous pending events. More generally, we
will see 0 event count, which should not affect anything.

This shouldn't be possible in the current dwc3 implementation. However,
through testing and reading the PCIe trace, the TH occasionally still
gets invoked one more time after HW interrupt deassertion. (With PCIe
legacy interrupts, TH is called repeatedly as long as the interrupt line
is asserted). We suspect that there is a small detection delay in the
SW.

To avoid this issue, check the DWC3_EVENT_PENDING flag to determine if the
events are processed in the bottom-half IRQ handler. If not, return
IRQ_HANDLED and don't process new events.

Cc: [email protected]
Signed-off-by: Thinh Nguyen <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
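
The pending-flag guard described above can be sketched as a small state
machine. This is a hedged userspace model: EVT_PENDING, struct evt_buf,
top_half() and bottom_half() are hypothetical names modelled on the
description, and hw_count stands in for the value read from the event
count register.

```c
#include <stdint.h>

#define EVT_PENDING 0x1u	/* hypothetical flag bit */

enum irq_ret { IRQ_RET_NONE, IRQ_RET_HANDLED, IRQ_RET_WAKE_THREAD };

struct evt_buf {
	uint32_t flags;
	uint32_t count;	/* bytes cached for the bottom half */
};

/* Top half: refuse to touch the cache while it is still pending. */
enum irq_ret top_half(struct evt_buf *evt, uint32_t hw_count)
{
	if (evt->flags & EVT_PENDING)
		return IRQ_RET_HANDLED;	/* bottom half owns the cache */
	if (!hw_count)
		return IRQ_RET_NONE;
	evt->count = hw_count;
	evt->flags |= EVT_PENDING;
	return IRQ_RET_WAKE_THREAD;
}

/* Bottom half: consume the cached events, then release the cache. */
void bottom_half(struct evt_buf *evt)
{
	evt->count = 0;
	evt->flags &= ~EVT_PENDING;
}
```

A second top-half invocation before the bottom half has run leaves the
cached count untouched, so earlier events are not overwritten.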

usb: dwc3: gadget: Fix ISO transfer performance

Commit 08a36b543803 ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()")
caused a small change in the way ISO transfer is handled in the case
when XferInProgress event happens on Isoc EP with an active transfer.
This caused a performance degradation of 50%. e.g. using g_webcam on DUT
and luvcview on host the video frame rate dropped from 16fps to 8fps
@high-speed.

Make the ISO transfer handling equivalent to that prior to that commit
to get back the original ISO performance numbers.

Fixes: 08a36b543803 ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()")
Signed-off-by: Roger Quadros <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>

usb: dwc3: pci: add Intel Cannonlake PCI IDs

Intel Cannonlake PCH has the same DWC3 as Intel
Sunrisepoint. Add the new IDs to the supported devices.

Signed-off-by: Heikki Krogerus <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>

kvm: arm/arm64: Fix use after free of stage2 page table

We yield the kvm->mmu_lock occasionally while performing an operation
(e.g., unmap or permission changes) on a large area of stage2 mappings.
However, this could allow another thread to clear and free
the stage2 page tables while we are waiting to regain the lock, so
the original thread could end up accessing memory that was
freed. This patch fixes the problem by making sure that the stage2
pagetable is still valid after we regain the lock. The fact that
mmu_notifier->release() could be called twice (via __mmu_notifier_release
and mmu_notifier_unregister) increases the likelihood of hitting
this race, where two threads try to unmap the entire guest
shadow pages.

While at it, clean up the redundant checks around cond_resched_lock in
stage2_wp_range(), as cond_resched_lock already does the same checks.

Cc: Mark Rutland <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Cc: [email protected]
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

kvm: arm/arm64: Force reading uncached stage2 PGD

Make sure we don't use a cached value of the KVM stage2 PGD while
resetting the PGD.

Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Signed-off-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

ebtables: arpreply: Add the standard target sanity check

The info->target comes from userspace and is used directly,
so we need to add a sanity check to make sure it is a valid standard
target, even though the ebtables tool has already checked it. The kernel
needs to validate anything coming from userspace.

If the target is set to a malicious value, it would break ebtables
and cause a panic, because a non-standard target is treated as an
offset.

Now add a helper function, ebt_invalid_target; the macro INVALID_TARGET
will be replaced later.

Signed-off-by: Gao Feng <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
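
A range check of the kind the ebtables message describes might look like
the sketch below. It assumes that standard targets are encoded as the
negative values -1 through -NUM_STD_TARGETS and that anything else would be
interpreted as an offset; both the constant and the function name are
assumptions for illustration, not the kernel's helper.

```c
#include <stdbool.h>

#define NUM_STD_TARGETS 4	/* assumed number of standard targets */

/* True if the value is not one of the assumed standard targets,
 * i.e. it lies outside -NUM_STD_TARGETS .. -1. */
bool invalid_target(int target)
{
	return target < -NUM_STD_TARGETS || target >= 0;
}
```

Rejecting such values at check time means a value supplied by userspace is
never followed as an offset into the table.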

powerpc/kprobes: Fix handling of instruction emulation on probe re-entry

Commit 22d8b3dec214c ("powerpc/kprobes: Emulate instructions on kprobe
handler re-entry") enabled emulating instructions on kprobe re-entry,
rather than single-stepping always. However, we didn't update the single
stepping code to only be run if the emulation fails. Also, we missed
re-enabling preemption if the instruction emulation was successful. Fix
those issues.

Fixes: 22d8b3dec214c ("powerpc/kprobes: Emulate instructions on kprobe handler re-entry")
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc/powernv: Set NAPSTATELOST after recovering paca on P9 DD1

Commit 17ed4c8f81da ("powerpc/powernv: Recover correct PACA on wakeup
from a stop on P9 DD1") promises to set the NAPSTATELOST bit in paca
after recovering the correct paca for the thread waking up from stop1
on DD1, so that the GPRs can be correctly restored on the stop exit
path. However, it loads the value 1 into r3, but stores the value in
r0 into NAPSTATELOST(r13).

Fix this by correctly setting the NAPSTATELOST bit in paca after
recovering the paca on POWER9 DD1.

Fixes: 17ed4c8f81da ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1")
Signed-off-by: Gautham R. Shenoy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

selftests/powerpc: Test TM and VMX register state

Test that the VMX checkpointed register state is maintained when a VMX
unavailable exception is taken during a transaction.

Thanks to Breno Leitao <[email protected]> and
Gustavo Bueno Romero <[email protected]> for the original test this
is based heavily on.

Signed-off-by: Michael Neuling <[email protected]>
Reviewed-by: Cyril Bur <[email protected]>
[mpe: Add to .gitignore, always build it 64-bit to fix build errors]
Signed-off-by: Michael Ellerman <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Track alignment in BPF verifier so that legitimate programs won't be
    rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.

2) Make tail calls work properly in arm64 BPF JIT, from Daniel
    Borkmann.

3) Make the configuration and semantics of Generic XDP make more sense
    and don't allow both generic XDP and a driver specific instance to be
    active at the same time. Also from Daniel.

4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.

5) Fix use-after-free in VRF driver, from Gao Feng.

6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers in
    qca_spi driver, from Stefan Wahren.

7) Always run cleanup routines in BPF samples when we get SIGTERM, from
    Andy Gospodarek.

8) The mdio phy code should bring PHYs out of reset using the shared
    GPIO lines before invoking bus->reset(). From Florian Fainelli.

9) Some USB descriptor access endian fixes in various drivers from
    Johan Hovold.

10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
    Pressman.

11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.

12) Cure netdev leak in AF_PACKET when using timestamping via control
    messages. From Douglas Caetano dos Santos.

13) netcp doesn't support HWTSTAMP_FILTER_ALL, reject it. From Miroslav
    Lichvar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  ldmvsw: stop the clean timer at beginning of remove
  ldmvsw: unregistering netdev before disable hardware
  net: netcp: fix check of requested timestamping filter
  ipv6: avoid dad-failures for addresses with NODAD
  qed: Fix uninitialized data in aRFS infrastructure
  mdio: mux: fix device_node_continue.cocci warnings
  net/packet: fix missing net_device reference release
  net/mlx4_core: Use min3 to select number of MSI-X vectors
  macvlan: Fix performance issues with vlan tagged packets
  net: stmmac: use correct pointer when printing normal descriptor ring
  net/mlx5: Use underlay QPN from the root name space
  net/mlx5e: IPoIB, Only support regular RQ for now
  net/mlx5e: Fix setup TC ndo
  net/mlx5e: Fix ethtool pause support and advertise reporting
  net/mlx5e: Use the correct pause values for ethtool advertising
  vmxnet3: ensure that adapter is in proper state during force_close
  sfc: revert changes to NIC revision numbers
  net: ch9200: add missing USB-descriptor endianness conversions
  net: irda: irda-usb: fix firmware name on big-endian hosts
  net: dsa: mv88e6xxx: add default case to switch
  ...

Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
"A set of minor cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  [CIFS] Minor cleanup of xattr query function
  fs: cifs: transport: Use time_after for time comparison
  SMB2: Fix share type handling
  cifs: cifsacl: Use a temporary ops variable to reduce code length
  Don't delay freeing mids when blocked on slow socket write of request
  CIFS: silence lockdep splat in cifs_relock_file()

Merge branch 'stable/for-jens-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus

Pull a single fix from Konrad.

block: xen-blkback: add null check to avoid null pointer dereference

Add null check before calling xen_blkif_put() to avoid potential
null pointer dereference.

Addresses-Coverity-ID: 1350942
Cc: Juergen Gross <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Merge branch 'ldmsw-fixes'

Shannon Nelson says:

====================
ldmvsw: port removal stability

Under heavy reboot stress testing we found a couple of timing issues
when removing the device that could cause the kernel great heartburn,
addressed by these two patches.
====================

Signed-off-by: David S. Miller <[email protected]>

ldmvsw: stop the clean timer at beginning of remove

Stop the clean timer earlier to be sure there's no asynchronous
interference while stopping the port.

Orabug: 25748241

Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ldmvsw: unregistering netdev before disable hardware

When running the LDom binding/unbinding test, the kernel may panic
in ldmvsw_open(). Most likely this is because we're removing
the ldc connection before unregistering the netdev in vsw_port_remove(),
which opens a window of time where one process could be removing the
device while another is trying to UP the device. This also sometimes causes
a vio handshake error due to opening a device without closing it completely.
We should unregister the netdev before we disable the "hardware".

Orabug: 25980913, 25925306

Signed-off-by: Thomas Tai <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: netcp: fix check of requested timestamping filter

The driver doesn't support timestamping of all received packets and
should return error when trying to enable the HWTSTAMP_FILTER_ALL
filter.

Cc: WingMan Kwok <[email protected]>
Cc: Richard Cochran <[email protected]>
Signed-off-by: Miroslav Lichvar <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dm mpath: multipath_clone_and_map must not return -EIO

Since 412445ac ("dm: introduce a new DM_MAPIO_KILL return value"), the
clone_and_map_rq methods must not return errno values, so fix it up
to properly return DM_MAPIO_KILL, instead of the -EIO value that snuck
in due to a conflict between two patches.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm mpath: don't return -EIO from dm_report_EIO

Instead just turn the macro into a helper for the warning message.
This removes an unnecessary assignment and will allow the next commit to
fix a place where -EIO is the wrong return value.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm rq: add a missing break to map_request

We don't want to bug when receiving a DM_MAPIO_KILL value.

Fixes: 412445ac ("dm: introduce a new DM_MAPIO_KILL return value")
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm space map disk: fix some book keeping in the disk space map

When decrementing the reference count for a block, the free count wasn't
being updated if the reference count went to zero.
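
A minimal sketch of the bookkeeping described above. It uses a hypothetical
in-memory `toy_space_map` (all `toy_*` names are assumptions for illustration),
not the real on-disk space map structures:

```c
#include <stdint.h>

#define TOY_NR_BLOCKS 8

/* Hypothetical in-memory space map; not the real dm space map layout. */
struct toy_space_map {
	uint32_t ref_count[TOY_NR_BLOCKS];
	uint32_t nr_free;
};

static void toy_sm_init(struct toy_space_map *sm)
{
	for (int i = 0; i < TOY_NR_BLOCKS; i++)
		sm->ref_count[i] = 0;
	sm->nr_free = TOY_NR_BLOCKS;
}

/* Take a reference; a block leaving refcount 0 is no longer free. */
static int toy_sm_inc(struct toy_space_map *sm, unsigned int b)
{
	if (b >= TOY_NR_BLOCKS)
		return -1;
	if (sm->ref_count[b]++ == 0)
		sm->nr_free--;
	return 0;
}

/* Drop a reference; a block reaching refcount 0 becomes free again. */
static int toy_sm_dec(struct toy_space_map *sm, unsigned int b)
{
	if (b >= TOY_NR_BLOCKS || sm->ref_count[b] == 0)
		return -1;
	if (--sm->ref_count[b] == 0)
		sm->nr_free++;	/* the update that was missing */
	return 0;
}
```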

Cc: [email protected]
Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm thin metadata: call precommit before saving the roots

These calls were the wrong way round in __write_initial_superblock.

Cc: [email protected]
Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

Merge tag 'mlx5-fixes-2017-05-12-V2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-05-12

This series contains some mlx5 fixes for net.
Please pull and let me know if there's any problem.

For -stable:
("net/mlx5e: Fix ethtool pause support and advertise reporting") kernels >= 4.8
("net/mlx5e: Use the correct pause values for ethtool advertising") kernels >= 4.8

v1->v2:
Dropped statistics spinlock patch, it needs some extra work.
====================

Signed-off-by: David S. Miller <[email protected]>

ipv6: avoid dad-failures for addresses with NODAD

Every address gets added with TENTATIVE flag even for the addresses with
IFA_F_NODAD flag and dad-work is scheduled for them. During this DAD process
we realize it's an address with NODAD and complete the process without
sending any probe. However, the TENTATIVE flag stays on the
address long enough to cause misinterpretation when we receive an NS.
While processing the NS, if the address has the TENTATIVE flag, we mark it
DADFAILED and end up with an address that was originally configured as
NODAD being marked DADFAILED.

We can't avoid scheduling dad_work for addresses with NODAD but we can
avoid adding TENTATIVE flag to avoid this racy situation.

Signed-off-by: Mahesh Bandewar <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed: Fix uninitialized data in aRFS infrastructure

Current memset is using incorrect type of variable, causing the
upper-half of the structure to be left uninitialized and causing:

ethernet/qlogic/qed/qed_init_fw_funcs.c: In function 'qed_set_rfs_mode_disable':
ethernet/qlogic/qed/qed_init_fw_funcs.c:993:3: error: '*((void *)&ramline+4)' is used uninitialized in this function [-Werror=uninitialized]
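
A minimal sketch of this class of bug, assuming a hypothetical 64-bit
`toy_ramline` structure in place of the driver's real type. Sizing the memset
by a 32-bit type clears only the lower half:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit register image; stands in for the driver's real type. */
struct toy_ramline {
	uint32_t lo;
	uint32_t hi;
};

/* Buggy pattern: the size comes from an unrelated 32-bit type. */
static void toy_clear_buggy(struct toy_ramline *r)
{
	memset(r, 0, sizeof(uint32_t));	/* clears only the first 4 bytes */
}

/* Fixed pattern: the size comes from the object being cleared. */
static void toy_clear_fixed(struct toy_ramline *r)
{
	memset(r, 0, sizeof(*r));
}
```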

Fixes: d51e4af5c209 ("qed: aRFS infrastructure support")
Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Yuval Mintz <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mdio: mux: fix device_node_continue.cocci warnings

Device node iterators put the previous value of the index variable, so an
explicit put causes a double put.

In particular, of_mdiobus_register can fail before doing anything
interesting, so one could view it as a no-op from the reference count
point of view.
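
A toy model of why the explicit put is wrong, assuming a hypothetical iterator
`toy_next_child()` that mimics the "put the previous node, get the next one"
behaviour described above (it is not the real device tree API):

```c
#include <stddef.h>

/* Hypothetical refcounted node; not the real struct device_node. */
struct toy_node {
	int refcount;
};

/* Advance: drop the reference on prev (if any), take one on the next node. */
static struct toy_node *toy_next_child(struct toy_node *nodes, size_t n,
				       struct toy_node *prev)
{
	size_t next = prev ? (size_t)(prev - nodes) + 1 : 0;

	if (prev)
		prev->refcount--;	/* the iterator's own put */
	if (next >= n)
		return NULL;
	nodes[next].refcount++;		/* reference held during the loop body */
	return &nodes[next];
}
```

With such an iterator, a plain `continue` in the loop body leaves every
reference count balanced; an extra put before `continue` would drop each
count one too far.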

Generated by: scripts/coccinelle/iterators/device_node_continue.cocci

CC: Jon Mason <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/packet: fix missing net_device reference release

When using a TX ring buffer, if an error occurs processing a control
message (e.g. invalid message), the net_device reference is not
released.
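
A minimal sketch of the error-path pattern, with a hypothetical `toy_dev`
reference counter standing in for the net_device hold/put calls. Every exit
after the reference is taken goes through the same release label:

```c
#include <errno.h>

/* Hypothetical device with a reference count; not the real net_device. */
struct toy_dev {
	int refcnt;
};

/* Returns 0 on success or a negative errno; never leaks the reference. */
static int toy_send(struct toy_dev *dev, int cmsg_valid)
{
	int err = 0;

	dev->refcnt++;			/* take the device reference */

	if (!cmsg_valid) {
		err = -EINVAL;
		goto out_put;		/* error path must also release */
	}

	/* ... the transmit work would happen here ... */

out_put:
	dev->refcnt--;			/* single release point */
	return err;
}
```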

Fixes: c14ac9451c348 ("sock: enable timestamping using control messages")
Signed-off-by: Douglas Caetano dos Santos <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_core: Use min3 to select number of MSI-X vectors

Signed-off-by: Yuval Shaia <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macvlan: Fix performance issues with vlan tagged packets

Macvlan always turns on offload features that have software
fallback (NETIF_GSO_SOFTWARE).  This allows much higher guest-guest
communications over macvtap.

However, macvtap does not turn on these features for vlan tagged traffic.
As a result, depending on the HW that macvtap is configured on, the
performance of guest-guest communication over a vlan is very
inconsistent.  If the HW supports TSO/UFO over vlans, then the
performance will be fine.  If not, the performance will suffer
greatly since the VM may continue using TSO/UFO, and will force the host
to segment the traffic and possibly overflow the macvtap queue.

This patch adds the always on offloads to vlan_features.  This
makes sure that any vlan tagged traffic between 2 guests will not
be segmented needlessly.

Signed-off-by: Vladislav Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

arm64: perf: Ignore exclude_hv when kernel is running in HYP

commit d98ecdaca296 ("arm64: perf: Count EL2 events if the kernel is
running in HYP") returns -EINVAL when perf system call perf_event_open is
called with exclude_hv != exclude_kernel. This change breaks applications
on VHE enabled ARMv8.1 platforms. The issue was observed with HHVM
application, which calls perf_event_open with exclude_hv = 1 and
exclude_kernel = 0.

There is no separate hypervisor privilege level when VHE is enabled, the
host kernel runs at EL2. So when VHE is enabled, we should ignore
exclude_hv from the application. This behaviour is consistent with PowerPC
where the exclude_hv is ignored when the hypervisor is not present and with
x86 where this flag is ignored.

Signed-off-by: Ganapatrao Kulkarni <[email protected]>
[will: added comment to justify the behaviour of exclude_hv]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

arm64: Remove redundant mov from LL/SC cmpxchg

The cmpxchg implementation introduced by commit c342f78217e8 ("arm64:
cmpxchg: patch in lse instructions when supported by the CPU") performs
an apparently redundant register move of [old] to [oldval] in the
success case - it always uses the same register width as [oldval] was
originally loaded with, and is only executed when [old] and [oldval] are
known to be equal anyway.

The only effect it seemingly does have is to take up a surprising amount
of space in the kernel text, as removing it reveals:

text data bss dec hex filename
12426658 1348614 4499749 18275021 116dacd vmlinux.o.new
12429238 1348614 4499749 18277601 116e4e1 vmlinux.o.old

Reviewed-by: Will Deacon <[email protected]>
Signed-off-by: Robin Murphy <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

i2c: mux: only print failure message on error

As is, a failure message is printed unconditionally, which is confusing.
And noisy.

Fixes: 8d4d159f25a7 ("i2c: mux: provide more info on failure in i2c_mux_add_adapter")
Signed-off-by: Peter Rosin <[email protected]>

i2c: mux: reg: rename label to indicate what it does

That maintains sanity if it is ever called from some other spot, and
also makes the label names coherent.

Signed-off-by: Peter Rosin <[email protected]>

i2c: mux: reg: put away the parent i2c adapter on probe failure

It is only prudent to let go of resources that are not used.

Fixes: b3fdd32799d8 ("i2c: mux: Add register-based mux i2c-mux-reg")
Signed-off-by: Peter Rosin <[email protected]>

KVM: nVMX: fix EPT permissions as reported in exit qualification

This fixes the new ept_access_test_read_only and ept_access_test_read_write
testcases from vmx.flat.

The problem is that gpte_access moves bits around to switch from EPT
bit order (XWR) to ACC_*_MASK bit order (RWX).  This results in an
incorrect exit qualification.  To fix this, make pt_access and
pte_access operate on raw PTE values (only with NX flipped to mean
"can execute") and call gpte_access at the end of the walk.  This
lets us use pte_access to compute the exit qualification with XWR
bit order.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

KVM: VMX: Don't enable EPT A/D feature if EPT feature is disabled

We can observe that the eptad kvm_intel module parameter is still Y
even if ept is disabled, which is weird. This patch does
not enable the EPT A/D feature if the EPT feature is disabled.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

KVM: x86: Fix load damaged SSEx MXCSR register

Reported by syzkaller:

   BUG: unable to handle kernel paging request at ffffffffc07f6a2e
   IP: report_bug+0x94/0x120
   PGD 348e12067
   P4D 348e12067
   PUD 348e14067
   PMD 3cbd84067
   PTE 80000003f7e87161

   Oops: 0003 [#1] SMP
   CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G           OE   4.11.0+ #8
   task: ffff92fdfb525400 task.stack: ffffbda6c3d04000
   RIP: 0010:report_bug+0x94/0x120
   RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202
    do_trap+0x156/0x170
    do_error_trap+0xa3/0x170
    ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
    ? mark_held_locks+0x79/0xa0
    ? retint_kernel+0x10/0x10
    ? trace_hardirqs_off_thunk+0x1a/0x1c
    do_invalid_op+0x20/0x30
    invalid_op+0x1e/0x30
   RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
    ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm]
    kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm]
    kvm_vcpu_ioctl+0x384/0x780 [kvm]
    ? kvm_vcpu_ioctl+0x384/0x780 [kvm]
    ? sched_clock+0x13/0x20
    ? __do_page_fault+0x2a0/0x550
    do_vfs_ioctl+0xa4/0x700
    ? up_read+0x1f/0x40
    ? __do_page_fault+0x2a0/0x550
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x23/0xc2

The SDM states that "The MXCSR has several reserved bits, and attempting to write
a 1 to any of these bits will cause a general-protection exception(#GP) to be
generated". The syzkaller forks' testcase overwrites the xsave area with random
values and steps on the reserved bits of the MXCSR register. The damaged guest
MXCSR value is then restored to the SSEx MXCSR register before vmentry. This
patch fixes it by catching a userspace attempt to set reserved MXCSR bits
and bailing out immediately.

Reported-by: Andrey Konovalov <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: [email protected]
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

kvm: nVMX: off by one in vmx_write_pml_buffer()

There are PML_ENTITY_NUM elements in the pml_address[] array so the >
should be >= or we write beyond the end of the array when we do:

pml_address[vmcs12->guest_pml_index--] = gpa;
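
A minimal sketch of the corrected bounds check, assuming an array size of 512
and hypothetical `toy_*` names (the real function operates on vmcs12 state):

```c
#include <stdint.h>

#define TOY_PML_ENTITY_NUM 512	/* assumed array size for this sketch */

/* Returns 0 on success, -1 if the index is outside the array. */
static int toy_write_pml(uint64_t *pml_address, uint32_t *index, uint64_t gpa)
{
	/* Using '>' here would let index == TOY_PML_ENTITY_NUM through. */
	if (*index >= TOY_PML_ENTITY_NUM)
		return -1;

	pml_address[(*index)--] = gpa;
	return 0;
}
```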

Fixes: c5f983f6e845 ("nVMX: Implement emulated Page Modification Logging")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

net: stmmac: use correct pointer when printing normal descriptor ring

There are two pointers in sysfs_display_ring,
one that increments if using normal dma descriptors,
another if using extended dma descriptors.

When printing the normal dma descriptors, the wrong pointer is used,
thus the printed descriptor addresses are incorrect.

Signed-off-by: Niklas Cassel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

- fix build failures with PR KVM configurations
- fix a host crash that can occur on POWER9 with radix guests

KVM: arm: rename pm_fake handler to trap_raz_wi

pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).

As we're about to use it (a lot) in a different context, rename it
with an (admittedly cryptic) name that makes sense for all users.

Signed-off-by: Zhichao Huang <[email protected]>
Reviewed-by: Alex Bennee <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

KVM: arm: plug potential guest hardware debug leakage

Hardware debugging in guests is not intercepted currently, which means
that a malicious guest can bring down the entire machine by writing
to the debug registers.

This patch enables trapping of all debug registers, preventing
guests from accessing the debug registers. This includes access to the
debug mode (DBGDSCR) in the guest world all the time which could
otherwise mess with the host state. Reads return 0 and writes are
ignored (RAZ_WI).
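
A minimal sketch of read-as-zero, write-ignored semantics. The
`toy_sys_reg_access` structure and the function signature are assumptions for
illustration, not the real KVM handler interface:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one trapped register access. */
struct toy_sys_reg_access {
	bool is_write;
	uint64_t value;	/* value written by the guest, or returned on a read */
};

/* Read-as-zero, write-ignored: the backing state is never touched. */
static bool toy_trap_raz_wi(struct toy_sys_reg_access *p, uint64_t *backing)
{
	(void)backing;		/* deliberately unused */

	if (!p->is_write)
		p->value = 0;	/* reads always return 0 */

	return true;		/* access handled */
}
```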

The result is the guest cannot detect any working hardware based debug
support. As debug exceptions are still routed to the guest, normal
debug using software based breakpoints still works.

To support debugging using hardware registers we need to implement a
debug register aware world switch as well as special trapping for
registers that may affect the host state.

Cc: [email protected]
Signed-off-by: Zhichao Huang <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

drm/i915: don't do allocate_va_range again on PIN_UPDATE

If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.

v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 1f23475c893a85c934143cd64865ebb9b6af383f)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Fix rawclk readout for g4x

Turns out our skills in decoding the CLKCFG register weren't good
enough. On this particular elk the answer we got was 400 MHz when
in reality the clock was running at 266 MHz, which then caused us
to program a bogus AUX clock divider that caused all AUX communication
to fail.

Sadly the docs are now in bit heaven, so the fix will have to be based
on empirical evidence. Using another elk machine I was able to frob
the FSB frequency from the BIOS and see how it affects the CLKCFG
register. The machine seems to use a frequency of 266 MHz by default,
and fortunately it still boots even with the 50% CPU overclock that
we get when we bump the FSB up to 400 MHz.

It turns out the actual FSB frequency and the register have no real
link whatsoever. The register value is based on some straps or something,
but fortunately those too can be configured from the BIOS on this board,
although it doesn't seem to respect the settings 100%. In the end I was
able to derive the following relationship:

BIOS FSB / strap | CLKCFG
-------------------------
200              | 0x2
266              | 0x0
333              | 0x4
400              | 0x4

So only the 200 and 400 MHz cases actually match how we're currently
decoding that register. But as the comment next to some of the defines
says, we have been just guessing anyway.

So let's fix things up so that at least the 266 MHz case will work
correctly as that is actually the setting used by both the buggy
machine and my test machine.

The fact that 333 and 400 MHz BIOS settings result in the same register
value is a little disappointing, as that means we can't tell them apart.
However, according to the gmch datasheet for both elk and ctg 400 MHz is
not even a supported FSB frequency, so I'm going to make the assumption
that we should decode it as 333 MHz instead.
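
The table and the 333 MHz assumption can be sketched as a lookup. The function
name is hypothetical, and the argument is assumed to be the FSB field already
extracted from CLKCFG (the mask is not shown here):

```c
#include <stdint.h>

/* Returns the assumed FSB frequency in MHz, or 0 for values not in the table. */
static unsigned int toy_g4x_fsb_mhz(uint32_t clkcfg_fsb)
{
	switch (clkcfg_fsb) {
	case 0x2:
		return 200;
	case 0x0:
		return 266;
	case 0x4:
		return 333;	/* 400 MHz reads back the same; treated as 333 */
	default:
		return 0;	/* not covered by the empirical table */
	}
}
```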

Cc: [email protected]
Cc: Tomi Sarvela <[email protected]>
Reported-by: Tomi Sarvela <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100926
Signed-off-by: Ville Syrjälä <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Jani Nikula <[email protected]>
Tested-by: Tomi Sarvela <[email protected]>
(cherry picked from commit 6f38123ecaac446312a63523b68df84ceb5a06ed)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Fix runtime PM for LPE audio

Not calling pm_runtime_enable() means that runtime PM can't be
enabled at all via sysfs. So we definitely need to call it
from somewhere.

Calling it from the driver seems like a bad idea because it
would have to be paired with a pm_runtime_disable() at driver
unload time, otherwise the core gets upset. Also if there's
no LPE audio driver loaded then we couldn't runtime suspend
i915 either.

So it looks like a better plan is to call it from i915 when
we register the platform device. That seems to match how
pci generally does things. I cargo culted the
pm_runtime_forbid() and pm_runtime_set_active() calls from
pci as well.

The exposed runtime PM API is massive and thoroughly misleading, so
I don't actually know if this is how you're supposed to use the API
or not. But it seems to work. I can now runtime suspend i915 again
with or without the LPE audio driver loaded, and reloading the
LPE audio driver also seems to work.

Note that powertop won't auto-tune runtime PM for platform devices,
which is a little annoying. So I'm not sure that leaving runtime
PM in "on" mode by default is the best choice here. But I've left
it like that for now at least.

Also remove the comment about there not being much benefit from
LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime
PM, which will also block s0ix, and that could have a measurable
impact on power consumption.

Cc: [email protected]
Cc: Takashi Iwai <[email protected]>
Cc: Pierre-Louis Bossart <[email protected]>
Fixes: 0b6b524f3915 ("ALSA: x86: Don't enable runtime PM as default")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Takashi Iwai <[email protected]>
(cherry picked from commit 183c00350ccda86781f6695840e6c5f5b22efbd1)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/glk: Fix DSI "*ERROR* ULPS is still active" messages

The sequence in glk_dsi_device_ready() enters ULPS then waits until it is
*not* active to then disable it. The correct sequence according to the
spec is to enter ULPS then wait until the GLK_ULPS_NOT_ACTIVE bit is
zero, i.e., ULPS is active, and then disable ULPS.

Fixing the condition gets rid of the following spurious error messages:

[drm:glk_dsi_device_ready [i915]] *ERROR* ULPS is still active

Fixes: 4644848369c0 ("drm/i915/glk: Add MIPIIO Enable/disable sequence")
Cc: Deepak M <[email protected]>
Cc: Madhav Chauhan <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: [email protected]
Cc: <[email protected]>
Signed-off-by: Ander Conselvan de Oliveira <[email protected]>
Reviewed-by: Madhav Chauhan <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 3acbec03b3c51559d01c879e9564d9c9610fe8ce)
Signed-off-by: Jani Nikula <[email protected]>

netfilter: nf_tables: revisit chain/object refcounting from elements

Andreas reports that the following incremental update using our commit
protocol doesn't work.

# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy

The existing code is not well-integrated into the commit phase protocol,
since element deletions do not result in refcount decrement from the
preparation phase. This results in bogus EBUSY errors like the one
above.

Two new functions come with this patch:

* nft_set_elem_activate() function is used from the abort path, to
  restore the set element refcounting on objects that occurred from
  the preparation phase.

* nft_set_elem_deactivate() that is called from nft_del_setelem() to
  decrement set element refcounting on objects from the preparation
  phase in the commit protocol.

The nft_data_uninit() has been renamed to nft_data_release() since this
function does not uninitialize any data store in the data register,
instead just releases the references to objects. Moreover, a new
function nft_data_hold() has been introduced to be used from
nft_set_elem_activate().

Reported-by: Andreas Schultz <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: missing sanitization in data from userspace

Do not assume userspace always sends us NFT_DATA_VALUE for bitwise and
cmp expressions. Although NFT_DATA_VERDICT does not make any sense, it
is still possible to handcraft a netlink message using this incorrect
data type.

Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: can't assume lock is acquired when dumping set elems

When dumping the elements related to a specified set, we may invoke the
nf_tables_dump_set with the NFNL_SUBSYS_NFTABLES lock not acquired. So
we should use the proper rcu operation to avoid race condition, just
like other nft dump operations.

Signed-off-by: Liping Zhang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: synproxy: fix conntrackd interaction

This patch fixes the creation of connection tracking entry from
netlink when synproxy is used. It was missing the addition of
the synproxy extension.

This was causing kernel crashes when a conntrack entry created by
conntrackd was used after the switch of traffic from active node
to the passive node.

Signed-off-by: Eric Leblond <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: xtables: zero padding in data_to_user

When looking up an iptables rule, the iptables binary compares the
aligned match and target data (XT_ALIGN). In some cases this can
exceed the actual data size to include padding bytes.

Before commit f77bc5b23fb1 ("iptables: use match, target and data
copy_to_user helpers") the malloc()ed bytes were overwritten by the
kernel with kzalloced contents, zeroing the padding and making the
comparison succeed. After this patch, the kernel copies and clears
only data, leaving the padding bytes undefined.

Extend the clear operation from data size to aligned data size to
include the padding bytes, if any.
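
A userspace sketch of the idea, assuming hypothetical `TOY_ALIGN` and
`toy_data_to_user()` helpers. Plain memcpy/memset stand in for the kernel's
user-copy primitives:

```c
#include <stddef.h>
#include <string.h>

/* Round len up to a multiple of align (align must be a power of two). */
#define TOY_ALIGN(len, align) \
	(((len) + (align) - 1) & ~((size_t)(align) - 1))

/*
 * Copy 'size' bytes of data into dst, then zero everything up to the
 * aligned size so the padding bytes are defined. dst must hold at least
 * TOY_ALIGN(size, align) bytes.
 */
static size_t toy_data_to_user(unsigned char *dst, const void *src,
			       size_t size, size_t align)
{
	size_t aligned = TOY_ALIGN(size, align);

	memcpy(dst, src, size);
	memset(dst + size, 0, aligned - size);	/* clear the padding too */
	return aligned;
}
```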

Padding bytes can be observed in both match and target, and the bug
triggered, by issuing a rule with match icmp and target ACCEPT:

iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT

Fixes: f77bc5b23fb1 ("iptables: use match, target and data copy_to_user helpers")
Reported-by: Paul Moore <[email protected]>
Reported-by: Richard Guy Briggs <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge tag 'ipvs-fixes-for-v4.12' of http://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs

Simon Horman says:

====================
IPVS Fixes for v4.12

Please consider this fix to IPVS for v4.12.

* It is a fix from Julian Anastasov to SNAT packet replies only for
  NATed connections

My understanding is that this fix is appropriate for 4.9.25, 4.10.13, 4.11
as well as the nf tree. Julian has separately posted backports for other
-stable kernels; please see:

* [PATCH 3.2.88,3.4.113 -stable 1/3] ipvs: SNAT packet replies only for
        NATed connections
* [PATCH 3.10.105,3.12.73,3.16.43,4.1.39 -stable 2/3] ipvs: SNAT packet
        replies only for NATed connections
* [PATCH 4.4.65 -stable 3/3] ipvs: SNAT packet replies only for NATed
        connections
====================

Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nfnl_cthelper: reject del request if helper obj is in use

We can still delete the ct helper even if it is in use, which causes
a use-after-free error. For example:
  # nfct helper add ssdp inet udp
  # iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
  # nfct helper delete ssdp //--> oops, succeed!
  BUG: unable to handle kernel paging request at 000026ca
  IP: 0x26ca
  [...]
  Call Trace:
   ? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
   nf_hook_slow+0x21/0xb0
   ip_output+0xe9/0x100
   ? ip_fragment.constprop.54+0xc0/0xc0
   ip_local_out+0x33/0x40
   ip_send_skb+0x16/0x80
   udp_send_skb+0x84/0x240
   udp_sendmsg+0x35d/0xa50

So add a reference count to fix this issue: if the ct helper is used by
others, reject the delete request.

After applying this patch:
  # nfct helper delete ssdp
  nfct v1.4.3: netlink error: Device or resource busy
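
A single-threaded sketch of the rejection logic, using a hypothetical
`toy_helper` object with a plain integer counter (a real implementation would
need an atomic reference count):

```c
#include <errno.h>

/* Hypothetical helper object with a user count; not the real ct helper. */
struct toy_helper {
	int refcnt;
	int registered;
};

static void toy_helper_get(struct toy_helper *h)
{
	h->refcnt++;
}

static void toy_helper_put(struct toy_helper *h)
{
	h->refcnt--;
}

/* Refuse to delete while any user still holds a reference. */
static int toy_helper_delete(struct toy_helper *h)
{
	if (h->refcnt > 0)
		return -EBUSY;

	h->registered = 0;
	return 0;
}
```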

Signed-off-by: Liping Zhang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: introduce nf_conntrack_helper_put helper function

And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.

Signed-off-by: Liping Zhang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: don't setup nat info for confirmed ct

We cannot setup nat info if the ct has been confirmed already; otherwise,
different CPUs may race to handle the same ct. In extreme situations,
we may hit the "BUG_ON(nf_nat_initialized(ct, maniptype))" in
nf_nat_setup_info.

Also running the following commands will easily hit NF_CT_ASSERT in
nf_conntrack_alter_reply:
  # nft flush ruleset
  # ping -c 2 -W 1 1.1.1.111 &
  # nft add table t
  # nft add chain t c {type nat hook postrouting priority 0 \;}
  # nft add rule t c snat to 4.5.6.7
  WARNING: CPU: 1 PID: 10065 at net/netfilter/nf_conntrack_core.c:1472
  nf_conntrack_alter_reply+0x9a/0x1a0 [nf_conntrack]
  [...]
  Call Trace:
   nf_nat_setup_info+0xad/0x840 [nf_nat]
   ? deactivate_slab+0x65d/0x6c0
   nft_nat_eval+0xcd/0x100 [nft_nat]
   nft_do_chain+0xff/0x5d0 [nf_tables]
   ? mark_held_locks+0x6f/0xa0
   ? __local_bh_enable_ip+0x70/0xa0
   ? trace_hardirqs_on_caller+0x11f/0x190
   ? ipt_do_table+0x310/0x610
   ? trace_hardirqs_on+0xd/0x10
   ? __local_bh_enable_ip+0x70/0xa0
   ? ipt_do_table+0x32b/0x610
   ? __lock_acquire+0x2ac/0x1580
   ? ipt_do_table+0x32b/0x610
   nft_nat_do_chain+0x65/0x80 [nft_chain_nat_ipv4]
   nf_nat_ipv4_fn+0x1ae/0x240 [nf_nat_ipv4]
   nf_nat_ipv4_out+0x4a/0xf0 [nf_nat_ipv4]
   nft_nat_ipv4_out+0x15/0x20 [nft_chain_nat_ipv4]
   nf_hook_slow+0x2c/0xf0
   ip_output+0x154/0x270

So for the confirmed ct, just ignore it and return NF_ACCEPT.

Fixes: 9a08ecfe74d7 ("netfilter: don't attach a nat extension by default")
Signed-off-by: Liping Zhang <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

staging: rtl8723bs: remove re-positioned call to kfree in os_dep/ioctl_cfg80211.c

A re-positioned call to kfree() in
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
causes a segmentation error. This patch removes the kfree() call.

Fixes: 6557ddfec348 ("staging: rtl8723bs: Fix various errors in os_dep/ioctl_cfg80211.c")
Signed-off-by: Ian W Morrison <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

s390/virtio: change virtio_feature_desc:features type to __le32

The features member of virtio_feature_desc contains little endian
values, given that its contents will be converted with
le32_to_cpu(). The "wrong" __u32 type leads to the sparse warnings
below.
In order to avoid them, use the correct __le32 type instead.

drivers/s390/virtio/virtio_ccw.c:749:14: warning: cast to restricted __le32
drivers/s390/virtio/virtio_ccw.c:762:28: warning: cast to restricted __le32
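
For illustration, a portable userspace stand-in for the conversion. The
`toy_le32_to_cpu()` helper is an assumption of this sketch; it decodes from
bytes so that it gives the same result on little- and big-endian hosts:

```c
#include <stdint.h>

/* Decode a little-endian 32-bit value from its byte representation. */
static uint32_t toy_le32_to_cpu(const unsigned char b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

Annotating the stored field as little endian lets sparse check that such a
conversion is applied before the value is used.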

Acked-by: Halil Pasic <[email protected]>
Acked-by: Cornelia Huck <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

netfilter: ctnetlink: Make some parameters integer to avoid enum mismatch

Not all parameters passed to ctnetlink_parse_tuple() and
ctnetlink_exp_dump_tuple() match the enum type in the signatures of these
functions. Since this is intended, change the argument type to be an
unsigned integer value.

Signed-off-by: Matthias Kaehlcke <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

kvm: arm/arm64: Fix race in resetting stage2 PGD

In kvm_free_stage2_pgd() we check the stage2 PGD before holding
the lock and proceed to take the lock if it is valid. And we unmap
the page tables, followed by releasing the lock. We reset the PGD
only after dropping this lock, which could cause a race condition
where another thread waiting on or even holding the lock, could
potentially see that the PGD is still valid and proceed to perform
a stage2 operation and later encounter a NULL PGD.

[223090.242280] Unable to handle kernel NULL pointer dereference at virtual address 00000040
[223090.262330] PC is at unmap_stage2_range+0x8c/0x428
[223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
[223090.262531] Call trace:
[223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
[223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
[223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
[223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
[223090.262543] [<ffff0000080a2478>] kvm_mmu_notifier_invalidate_page+0x50/0x8c
[223090.262547] [<ffff0000082274f8>] __mmu_notifier_invalidate_page+0x5c/0x84
[223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
[223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
[223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
[223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
[223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
[223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
[223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
[223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
[223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
[223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
[223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
[223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
[223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
[223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
[223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
[223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
[223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
[223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
[223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
[223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
[223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
[223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
[223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c

This patch moves the stage2 PGD manipulation under the lock.
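
As a rough illustration of the pattern (not the actual KVM code), the
sketch below clears the PGD pointer while holding the lock and has every
user check it under the same lock. All names here (toy_vm, toy_lock,
toy_unmap_range, toy_free_pgd) are hypothetical, and a userspace
atomic_flag spinlock stands in for the kernel lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the VM structure; not the kernel's layout. */
struct toy_vm {
    atomic_flag lock;   /* toy spinlock standing in for the MMU lock */
    void *pgd;          /* stage2 page table root, NULL once freed */
};

static void toy_lock(struct toy_vm *vm)
{
    while (atomic_flag_test_and_set_explicit(&vm->lock,
                                             memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_vm *vm)
{
    atomic_flag_clear_explicit(&vm->lock, memory_order_release);
}

/* Returns 1 if the range was unmapped, 0 if the PGD was already gone. */
static int toy_unmap_range(struct toy_vm *vm)
{
    int done = 0;

    toy_lock(vm);
    if (vm->pgd != NULL) {
        /* A real implementation would walk and clear entries here. */
        done = 1;
    }
    toy_unlock(vm);
    return done;
}

/* Detach the PGD under the lock, then free it outside the lock. */
static void toy_free_pgd(struct toy_vm *vm)
{
    void *pgd;

    toy_lock(vm);
    pgd = vm->pgd;
    vm->pgd = NULL;
    toy_unlock(vm);

    free(pgd);
}
```

Because the pointer is only read and cleared with the lock held, a
caller that takes the lock after toy_free_pgd() sees NULL and skips the
walk instead of dereferencing a stale root.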

Reported-by: Alexander Graf <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

Merge tag 'gvt-fixes-2017-05-11' of https://github.com/01org/gvt-linux into drm-intel-fixes

gvt-fixes-2017-05-11

- vGPU scheduler performance regression fix (Ping)
- bypass in-context mmio restore (Chuanxiao)
- one typo fix (Colin)

Signed-off-by: Jani Nikula <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]

USB: serial: io_ti: fix div-by-zero in set_termios

Fix a division-by-zero in set_termios when debugging is enabled and a
high-enough speed has been requested so that the divisor value becomes
zero.

Instead of just fixing the offending debug statement, cap the baud rate
at the base, as a zero divisor value also appears to crash the firmware.
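
The arithmetic can be sketched as follows. This is an illustration
only: the helper names are hypothetical, and the base rate constant is
an assumption chosen for the example rather than a value taken from
this commit. Both helpers assume a non-zero requested rate.

```c
/* Assumed base rate, for illustration only. */
#define TOY_BASE_BAUD 461550u

/* Rounded divisor with no guard: it becomes 0 once baud exceeds
 * twice the base rate. */
static unsigned int toy_raw_divisor(unsigned int baud)
{
    return (TOY_BASE_BAUD + baud / 2) / baud;
}

/* Cap the requested rate at the base first, so the divisor is
 * always at least 1. */
static unsigned int toy_capped_divisor(unsigned int baud)
{
    if (baud > TOY_BASE_BAUD)
        baud = TOY_BASE_BAUD;
    return toy_raw_divisor(baud);
}
```

With these assumptions, a request for 1000000 baud yields a raw divisor
of 0, while the capped helper returns 1, so neither a later division by
the divisor nor the device ever sees a zero.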

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable <[email protected]> # 2.6.12
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>

USB: serial: mct_u232: fix big-endian baud-rate handling

Drop erroneous cpu_to_le32 when setting the baud rate, something which
corrupted the divisor on big-endian hosts.

Found using sparse:

warning: incorrect type in argument 1 (different base types)
expected unsigned int [unsigned] [usertype] val
got restricted __le32 [usertype] <noident>
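
A minimal sketch of why the extra conversion hurts, assuming the value
is later serialized by a helper that already writes little-endian
bytes. The names toy_swab32 and toy_put_le32 are hypothetical;
toy_swab32 models what a CPU-to-little-endian conversion does on a
big-endian host.

```c
#include <stdint.h>

/* Byte swap: what a CPU-to-LE conversion amounts to on big-endian. */
static uint32_t toy_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Store v as four little-endian bytes, whatever the host byte order. */
static void toy_put_le32(uint32_t v, uint8_t *buf)
{
    buf[0] = (uint8_t)(v & 0xff);
    buf[1] = (uint8_t)((v >> 8) & 0xff);
    buf[2] = (uint8_t)((v >> 16) & 0xff);
    buf[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Passing toy_swab32(divisor) to toy_put_le32() converts twice, so the
low byte of the divisor ends up in the last byte of the buffer instead
of the first.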

Fixes: af2ac1a091bc ("USB: serial mct_usb232: move DMA buffers to heap")
Cc: stable <[email protected]> # 2.6.34
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Pete Zaitcev <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>

USB: serial: ir-usb: fix big-endian baud-rate debug printk

Add missing endianness conversion when printing the supported baud
rates.

Found using sparse:

warning: restricted __le16 degrades to integer
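
For illustration, a little-endian 16-bit field can be decoded in a
host-independent way before it is printed. The helper name below is
hypothetical and only models the effect of the missing conversion; it
is not the driver's code.

```c
#include <stdint.h>

/* Decode a 16-bit little-endian wire value from its two bytes.
 * The result is the same on little- and big-endian hosts. */
static uint16_t toy_le16_decode(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Printing the raw field without such a conversion shows the bytes
swapped on big-endian hosts.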

Fixes: e0d795e4f36c ("usb: irda: cleanup on ir-usb module")
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>

staging: rtl8192e: GetTs Fix invalid TID 7 warning.

TID 7 is a valid value for QoS IEEE 802.11e.

The switch statement that follows states 7 is valid.

Remove function IsACValid and use the default case to filter
invalid TIDs.
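
The idea of letting the default case reject bad values can be sketched
like this. The enum and function names are hypothetical, and the
mapping shown is the usual 802.11e user priority to access category
mapping rather than a copy of the driver's switch statement.

```c
/* Hypothetical access categories, for illustration only. */
enum toy_ac {
    TOY_AC_INVALID = -1,
    TOY_AC_BE,          /* best effort */
    TOY_AC_BK,          /* background */
    TOY_AC_VI,          /* video */
    TOY_AC_VO           /* voice */
};

/* Map a TID (0-7) to an access category; anything else is invalid. */
static int toy_tid_to_ac(unsigned int tid)
{
    switch (tid) {
    case 1:
    case 2:
        return TOY_AC_BK;
    case 0:
    case 3:
        return TOY_AC_BE;
    case 4:
    case 5:
        return TOY_AC_VI;
    case 6:
    case 7:
        return TOY_AC_VO;
    default:
        /* No separate validity check needed: 8 and above land here. */
        return TOY_AC_INVALID;
    }
}
```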

Signed-off-by: Malcolm Priestley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.

EPROM_CMD is 2-byte aligned in the PCI map, so reading it with
rtl92e_readl returns invalid data. Use rtl92e_readw instead.

Without this, the device is unable to select the right EEPROM type.

Signed-off-by: Malcolm Priestley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: rtl8192e: fix 2 byte alignment of register BSSIDR.

BSSIDR has two-byte alignment on the PCI ioremap, so correct the write
by swapping to 16 bits first.

This fixes a problem where the device fails to associate because
the filter is not set correctly.
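
A toy model of the access ordering, assuming a register that sits on a
2-byte but not 4-byte boundary: write 16 bits first, which brings the
remaining 32-bit write onto a naturally aligned offset. The register
array, the offset value and all toy_* names are assumptions made for
this sketch and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed offset: 2-byte aligned, not 4-byte aligned. */
#define TOY_BSSIDR 0x2Eu

static uint8_t toy_regs[0x40];   /* toy little-endian register space */

/* 16-bit write; only valid at 2-byte aligned offsets. */
static void toy_writew(unsigned int off, uint16_t v)
{
    assert(off % 2 == 0);
    toy_regs[off] = (uint8_t)(v & 0xff);
    toy_regs[off + 1] = (uint8_t)(v >> 8);
}

/* 32-bit write; only valid at 4-byte aligned offsets. */
static void toy_writel(unsigned int off, uint32_t v)
{
    assert(off % 4 == 0);
    toy_regs[off] = (uint8_t)(v & 0xff);
    toy_regs[off + 1] = (uint8_t)((v >> 8) & 0xff);
    toy_regs[off + 2] = (uint8_t)((v >> 16) & 0xff);
    toy_regs[off + 3] = (uint8_t)((v >> 24) & 0xff);
}

/* Write a 6-byte BSSID as a 16-bit chunk followed by a 32-bit chunk. */
static void toy_set_bssid(const uint8_t *b)
{
    uint16_t first = (uint16_t)(b[0] | (b[1] << 8));
    uint32_t rest = (uint32_t)b[2] | ((uint32_t)b[3] << 8) |
                    ((uint32_t)b[4] << 16) | ((uint32_t)b[5] << 24);

    toy_writew(TOY_BSSIDR, first);
    toy_writel(TOY_BSSIDR + 2, rest);
}
```

Doing the 32-bit write first at the register base would trip the
alignment assertion in this model.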

Signed-off-by: Malcolm Priestley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.

The driver attempts to alter memory that is mapped to the PCI device.

This is because tx_fwinfo_8190pci points to skb->data.

Move the pci_map_single call to the point where the completed buffer is
ready to be mapped, with pdesc left empty so the frame is dropped on a
mapping error.

Signed-off-by: Malcolm Priestley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>