Git Repo - linux.git/log

cdc-ether: usbnet_cdc_zte_status() can be static

Fixes the following sparse warning:

drivers/net/usb/cdc_ether.c:469:6: warning:
symbol 'usbnet_cdc_zte_status' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tools: psock_lib: harden socket filter used by psock tests

The filter added by sock_setfilter is intended to only permit
packets matching the pattern set up by create_payload(), but
we only check the ip_len, and a single test-character in
the IP packet to ensure this condition.

Harden the filter by adding additional constraints so that we only
permit UDP/IPv4 packets that meet the ip_len and test-character
requirements. Include the bpf_asm src as a comment, in case this
needs to be enhanced in the future

Signed-off-by: Sowmini Varadhan <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ravb: Remove Rx overflow log messages

Remove Rx overflow log messages as in an environment where logging results
in network traffic logging may cause further overflows.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi <[email protected]>
[simon: reworked changelog]
Signed-off-by: Simon Horman <[email protected]>
Acked-by: Sergei Shtylyov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

lwt_bpf: bpf_lwt_prog_cmp() can be static

Fixes the following sparse warning:

net/core/lwt_bpf.c:355:5: warning:
symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 's390-qeth-next'

Ursula Braun says:

====================
s390: qeth patches

yesterday I came up with 13 qeth patches. Since you have not been
happy with the 13th patch, I want to make sure that at least the
remaining 12 qeth patches can be applied to net-next. Here is the
resend of them.
====================

Signed-off-by: David S. Miller <[email protected]>

s390/qeth: fix retrieval of vipa and proxy-arp addresses

qeth devices in layer3 mode need a separate handling of vipa and proxy-arp
addresses. vipa and proxy-arp addresses processed by qeth can be read from
userspace. Introduced with commit 5f78e29ceebf ("qeth: optimize IP handling
in rx_mode callback") the retrieval of vipa and proxy-arp addresses is
broken, if more than one vipa or proxy-arp address are set.

The qeth code used local variable "int i" for 2 different purposes. This
patch now spends 2 separate local variables of type "int".
While touching these functions hash_for_each_safe() is converted to
hash_for_each(), since there is no removal of hash entries.

Signed-off-by: Ursula Braun <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Reference-ID: RQM 3524
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: issue STARTLAN as first IPA command

STARTLAN needs to be the first IPA command after MPC initialization
completes.
So move the qeth_send_startlan() call from the layer disciplines
into the core path, right after the MPC handshake.
While at it, replace the magic LAN OFFLINE return code
with the existing enum.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: shuffle MAC management functions around

Move all MAC utility functions in one place, and drop the
forward declarations.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: extract qeth_l2_remove_mac()

This matches qeth_l2_write_mac().

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: consolidate errno translation

Consolidate errno handling for MAC management: Instead of doing this in every
caller, do it in one place.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Suggested-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: don't convert return code twice

qeth_l2_send_groupmac() already translates the return code, so
calling qeth_setdel_makerc() a second time only produces garbage.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: drop qeth_l2_del_all_macs() parameter

The only caller passes del = 0, so remove both the parameter and
the code that handles != 0.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Acked-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: Remove QETH_IP_HEADER_SIZE

Remove unused define QETH_IP_HEADER_SIZE.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Acked-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: Allow reading hsuid in state DOWN

Accessing the current hsuid via card->options.hsuid is perfectly
fine, even when the card is DOWN.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Acked-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: display warning for OSA3 RX/TX checksum offloading

When RX/TX checksum offloading is turned on and the adapter is
an OSA 3 card in layer 3 mode, the checksum offloading is only
performed when both peers use different adapters. If both peers
share an OSA 3 card, communication is a memory copy and
checksum offloading is not performed.

This patch adds a warning to inform the administrator.

OSA 3 in layer 2 mode does not offer the RX/TX checksum
offload feature.

Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: test RX/TX checksum offload reply

Turning on receive and/or transmit checksum offload support
on the OSA card requires 2 commands:
1. start command which replies with available features
2. enable command to turn on selected features.

The current version does not check the reply of the start
command and simply uses the returned value to enable
offload features. When the start command returns zero, this
leads to a situation where no checksum offload
is turned on by the hardware. Even worse no error
indication is returned. The Linux kernel assumes
the OSA card performs RX/TX checksum offload, but the hardware
does not perform any checksum verification at all.

This patch checks the return of the start and enable
command responses from the hardware and turns off
checksum offloading if the commands fails or does not
respond with the correct bit setting.

Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: rework RX/TX checksum offload

Rework the RX/TX checksum offloading command sequence to use
the provided function call back mechanims to return card
data to the device driver.

Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Julian Wiedmann <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'bpf-cb-access'

Daniel Borkmann says:

====================
More flexible BPF cb access

This patch improves BPF's cb access by allowing b/h/w/dw
access variants on it. For details, please see individual
patches.
====================

Signed-off-by: David S. Miller <[email protected]>

bpf: allow b/h/w/dw access for bpf's cb in ctx

When structs are used to store temporary state in cb[] buffer that is
used with programs and among tail calls, then the generated code will
not always access the buffer in bpf_w chunks. We can ease programming
of it and let this act more natural by allowing for aligned b/h/w/dw
sized access for cb[] ctx member. Various test cases are attached as
well for the selftest suite. Potentially, this can also be reused for
other program types to pass data around.

Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bpf: pass original insn directly to convert_ctx_access

Currently, when calling convert_ctx_access() callback for the various
program types, we pass in insn->dst_reg, insn->src_reg, insn->off from
the original instruction. This information is needed to rewrite the
instruction that is based on the user ctx structure into a kernel
representation for the ctx. As we'd like to allow access size beyond
just BPF_W, we'd need also insn->code for that in order to decode the
original access size. Given that, lets just pass insn directly to the
convert_ctx_access() callback and work on that to not clutter the
callback with even more arguments we need to pass when everything is
already contained in insn. So lets go through that once, no functional
change.

Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

block: Rename blk_queue_zone_size and bdev_zone_size

All block device data fields and functions returning a number of 512B
sectors are by convention named xxx_sectors while names in the form
xxx_size are generally used for a number of bytes. The blk_queue_zone_size
and bdev_zone_size functions were not following this convention so rename
them.

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <[email protected]>
Collapsed the two patches, they were nonsensically split and broke
bisection.

Signed-off-by: Jens Axboe <[email protected]>

Merge branch 'smc-fixes'

Ursula Braun says:

====================
net/smc: fix typo and clc-bug

I received 2 bug reports for my new AF_SMC-code. Here are the fixes for them.
====================

Signed-off-by: David S. Miller <[email protected]>

smc: ETH_ALEN as memcpy length for mac addresses

When creating an SMC connection, there is a CLC (connection layer control)
handshake to prepare for RDMA traffic. The corresponding code is part of
commit 0cfdd8f92cac ("smc: connection and link group creation").
Mac addresses to be exchanged in the handshake are copied with a wrong
length of 12 instead of 6 bytes. Following code overwrites the wrongly
copied code, but nevertheless the correct length should already be used for
the preceding mac address copying. Use ETH_ALEN for the memcpy length with
mac addresses.

Signed-off-by: Ursula Braun <[email protected]>
Fixes: 0cfdd8f92cac ("smc: connection and link group creation")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fix AF_SMC related typo

When introducing the new socket family AF_SMC in
commit ac7138746e14 ("smc: establish new socket family"),
a typo in af_family_clock_key_strings has slipped in.
This patch repairs it.

Signed-off-by: Ursula Braun <[email protected]>
Fixes: ac7138746e14 ("smc: establish new socket family")
Reported-by: Andrew Morton <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlxsw-fixes'

Jiri Pirko says:

====================
mlxsw: Couple of fixes

Couple of simple fixes from Arkadi and Elad.

Please queue these up for stable. Thanks.
====================

Signed-off-by: David S. Miller <[email protected]>

mlxsw: pci: Fix EQE structure definition

The event_data starts from address 0x00-0x0C and not from 0x08-0x014. This
leads to duplication with other fields in the Event Queue Element such as
sub-type, cqn and owner.

Fixes: eda6500a987a0 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Elad Raz <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: switchx2: Fix memory leak at skb reallocation

During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.

Fix this by adding the original skb release to the main path.

Fixes: d003462a50de ("mlxsw: Simplify mlxsw_sx_port_xmit function")
Signed-off-by: Arkadi Sharshevsky <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Fix memory leak at skb reallocation

During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.

Fix this by adding the original skb release to the main path.

Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Arkadi Sharshevsky <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: Initialize mbox lock and list for mgmt dev

Initialize mbox lock and list for mgmt dev to avoid NULL pointer
dereference when cxgb_set_vf_mac is called.

And also allocate memory for private data while allocating mgmt
netdev.

Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: core: Make netif_wake_subqueue a wrapper

netif_wake_subqueue() is duplicating the same thing that netif_tx_wake_queue()
does, so make it call it directly after looking up the queue from the index.

Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

KVM: x86: fix emulation of "MOV SS, null selector"

This is CVE-2017-2583. On Intel this causes a failed vmentry because
SS's type is neither 3 nor 7 (even though the manual says this check is
only done for usable SS, and the dmesg splat says that SS is unusable!).
On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.

The fix fabricates a data segment descriptor when SS is set to a null
selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb.
Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3;
this in turn ensures CPL < 3 because RPL must be equal to CPL.

Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing
the bug and deciphering the manuals.

Reported-by: Xiaohan Zhang <[email protected]>
Fixes: 79d5b4c3cd809c770d4bf9812635647016c56011
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

capability: export has_capability

has_capability() is sometimes needed by modules to test capability
for specified task other than current, so export it.

Cc: Kirti Wankhede <[email protected]>
Signed-off-by: Jike Song <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
Acked-by: James Morris <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

KVM: x86: fix NULL deref in vcpu_scan_ioapic

Reported by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
    IP: _raw_spin_lock+0xc/0x30
    PGD 3e28eb067
    PUD 3f0ac6067
    PMD 0
    Oops: 0002 [#1] SMP
    CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
    Call Trace:
     ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
     kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
     ? pick_next_task_fair+0xe1/0x4e0
     ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
     kvm_vcpu_ioctl+0x33a/0x600 [kvm]
     ? hrtimer_try_to_cancel+0x29/0x130
     ? do_nanosleep+0x97/0xf0
     do_vfs_ioctl+0xa1/0x5d0
     ? __hrtimer_init+0x90/0x90
     ? do_nanosleep+0x5b/0xf0
     SyS_ioctl+0x79/0x90
     do_syscall_64+0x6e/0x180
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0

The syzkaller folks reported a NULL pointer dereference due to
ENABLE_CAP succeeding even without an irqchip.  The Hyper-V
synthetic interrupt controller is activated, resulting in a
wrong request to rescan the ioapic and a NULL pointer dereference.

    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <linux/kvm.h>
    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #ifndef KVM_CAP_HYPERV_SYNIC
    #define KVM_CAP_HYPERV_SYNIC 123
    #endif

    void* thr(void* arg)
    {
struct kvm_enable_cap cap;
cap.flags = 0;
cap.cap = KVM_CAP_HYPERV_SYNIC;
ioctl((long)arg, KVM_ENABLE_CAP, &cap);
return 0;
    }

    int main()
    {
void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
int kvmfd = open("/dev/kvm", 0);
int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
struct kvm_userspace_memory_region memreg;
memreg.slot = 0;
memreg.flags = 0;
memreg.guest_phys_addr = 0;
memreg.memory_size = 0x1000;
memreg.userspace_addr = (unsigned long)host_mem;
host_mem[0] = 0xf4;
ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
struct kvm_sregs sregs;
ioctl(cpufd, KVM_GET_SREGS, &sregs);
sregs.cr0 = 0;
sregs.cr4 = 0;
sregs.efer = 0;
sregs.cs.selector = 0;
sregs.cs.base = 0;
ioctl(cpufd, KVM_SET_SREGS, &sregs);
struct kvm_regs regs = { .rflags = 2 };
ioctl(cpufd, KVM_SET_REGS, &regs);
ioctl(vmfd, KVM_CREATE_IRQCHIP, 0);
pthread_t th;
pthread_create(&th, 0, thr, (void*)(long)cpufd);
usleep(rand() % 10000);
ioctl(cpufd, KVM_RUN, 0);
pthread_join(th, 0);
return 0;
    }

This patch fixes it by failing ENABLE_CAP if without an irqchip.

Reported-by: Dmitry Vyukov <[email protected]>
Fixes: 5c919412fe61 (kvm/x86: Hyper-V synthetic interrupt controller)
Cc: [email protected] # 4.5+
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: eventfd: fix NULL deref irqbypass consumer

Reported syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    PGD 0

    Oops: 0002 [#1] SMP
    CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
    Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
    task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
    RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    Call Trace:
     irqfd_shutdown+0x66/0xa0 [kvm]
     process_one_work+0x16b/0x480
     worker_thread+0x4b/0x500
     kthread+0x101/0x140
     ? process_one_work+0x480/0x480
     ? kthread_create_on_node+0x60/0x60
     ret_from_fork+0x25/0x30
    RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
    CR2: 0000000000000008

The syzkaller folks reported a NULL pointer dereference that due to
unregister an consumer which fails registration before. The syzkaller
creates two VMs w/ an equal eventfd occasionally. So the second VM
fails to register an irqbypass consumer. It will make irqfd as inactive
and queue an workqueue work to shutdown irqfd and unregister the irqbypass
consumer when eventfd is closed. However, the second consumer has been
initialized though it fails registration. So the token(same as the first
VM's) is taken to unregister the consumer through the workqueue, the
consumer of the first VM is found and unregistered, then NULL deref incurred
in the path of deleting consumer from the consumers list.

This patch fixes it by making irq_bypass_register/unregister_consumer()
looks for the consumer entry based on consumer pointer itself instead of
token matching.

Reported-by: Dmitry Vyukov <[email protected]>
Suggested-by: Alex Williamson <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Alex Williamson <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Introduce segmented_write_std

Introduces segemented_write_std.

Switches from emulated reads/writes to standard read/writes in fxsave,
fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding
kernel memory leak.

Since commit 283c95d0e389 ("KVM: x86: emulate FXSAVE and FXRSTOR",
2016-11-09), which is luckily not yet in any final release, this would
also be an exploitable kernel memory *write*!

Reported-by: Dmitry Vyukov <[email protected]>
Cc: [email protected]
Fixes: 96051572c819194c37a8367624b285be10297eca
Fixes: 283c95d0e3891b64087706b344a4b545d04a6e62
Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Steve Rutherford <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: flush pending lapic jump label updates on module unload

KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled).
These are implemented with delayed_work structs which can still be
pending when the KVM module is unloaded. We've seen this cause kernel
panics when the kvm_intel module is quickly reloaded.

Use the new static_key_deferred_flush() API to flush pending updates on
module unload.

Signed-off-by: David Matlack <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

jump_labels: API for flushing deferred jump label updates

Modules that use static_key_deferred need a way to synchronize with
any delayed work that is still pending when the module is unloaded.
Introduce static_key_deferred_flush() which flushes any pending
jump label updates.

Signed-off-by: David Matlack <[email protected]>
Cc: [email protected]
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

mmc: mxs-mmc: Fix additional cycles after transmission stop

According to the code the intention is to append 8 SCK cycles
instead of 4 at end of a MMC_STOP_TRANSMISSION command. But this
will never happened because it's an AC command not an ADTC command.
So fix this by moving the statement into the right function.

Signed-off-by: Stefan Wahren <[email protected]>
Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Cc: <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci-acpi: Only powered up enabled acpi child devices

Commit e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are
powered when probing") introduced code to powerup any acpi child
nodes listed in the dstd. But some dstd-s list all possible devices
used on some board variants, while reporting if the device is actually
present and enabled in the status field of the device.

So we end up calling the acpi _PS0 (power-on) method for devices which
are not actually present. This does not always end well, e.g. on my
cube iwork8 air tablet, this results in freezing the entire tablet as
soon as the r8723bs module is loaded.

This commit fixes this by checking the child device's status.present
and status.enabled bits and only call acpi_device_fix_up_power()
if both are set.

Fixes: e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing")
BugLink: https://github.com/hadess/rtl8723bs/issues/80
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mac80211: set wifi_acked[_valid] bits for transmitted SKBs

There may be situations in which the in-kernel originator of an
SKB cares about its wifi transmission status. To have that, set
the wifi_acked[_valid] bits before freeing/orphaning the SKB if
the destructor is set. The originator can then use it in there.

Signed-off-by: Johannes Berg <[email protected]>

mac80211: Add RX flag to indicate ICV stripped

Add a flag that indicates that the WEP ICV was stripped from an
RX packet, allowing the device to not transfer that if it's
already checked.

Signed-off-by: David Spinadel <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

wext: handle NULL extra data in iwe_stream_add_point better

gcc-7 complains that wl3501_cs passes NULL into a function that
then uses the argument as the input for memcpy:

drivers/net/wireless/wl3501_cs.c: In function 'wl3501_get_scan':
include/net/iw_handler.h:559:3: error: argument 2 null where non-null expected [-Werror=nonnull]
memcpy(stream + point_len, extra, iwe->u.data.length);

This works fine here because iwe->u.data.length is guaranteed to be 0
and the memcpy doesn't actually have an effect.

Making the length check explicit avoids the warning and should have
no other effect here.

Also check the pointer itself, since otherwise we get warnings
elsewhere in the code.

Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

x86/entry: Fix the end of the stack for newly forked tasks

When unwinding a task, the end of the stack is always at the same offset
right below the saved pt_regs, regardless of which syscall was used to
enter the kernel.  That convention allows the unwinder to verify that a
stack is sane.

However, newly forked tasks don't always follow that convention, as
reported by the following unwinder warning seen by Dave Jones:

  WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value           (null)

The warning was due to the following call chain:

  (ftrace handler)
  call_usermodehelper_exec_async+0x5/0x140
  ret_from_fork+0x22/0x30

The problem is that ret_from_fork() doesn't create a stack frame before
calling other functions.  Fix that by carefully using the frame pointer
macros.

In addition to conforming to the end of stack convention, this also
makes related stack traces more sensible by making it clear to the user
that ret_from_fork() was involved.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

x86/unwind: Include __schedule() in stack traces

In the following commit:

0100301bfdf5 ("sched/x86: Rewrite the switch_to() code")

... the layout of the 'inactive_task_frame' struct was designed to have
a frame pointer header embedded in it, so that the unwinder could use
the 'bp' and 'ret_addr' fields to report __schedule() on the stack (or
ret_from_fork() for newly forked tasks which haven't actually run yet).

Finish the job by changing get_frame_pointer() to return a pointer to
inactive_task_frame's 'bp' field rather than 'bp' itself. This allows
the unwinder to start one frame higher on the stack, so that it properly
reports __schedule().

Reported-by: Miroslav Benes <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/598e9f7505ed0aba86e8b9590aa528c6c7ae8dcd.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

x86/unwind: Disable KASAN checks for non-current tasks

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless. The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

In such cases, it's possible that the unwinder may read a KASAN-poisoned
region of the stack. Account for that by using READ_ONCE_NOCHECK() when
reading the stack of another task.

Use READ_ONCE() when reading the stack of the current task, since KASAN
warnings can still be useful for finding bugs in that case.

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

x86/unwind: Silence warnings for non-current tasks

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless. The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

Since stack "corruption" on another task's stack isn't necessarily a
bug, silence the warnings when unwinding tasks other than current.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

netvsc: add rcu_read locking to netvsc callback

The receive callback (in tasklet context) is using RCU to get reference
to associated VF network device but this is not safe. RCU read lock
needs to be held. Found by running with full lockdep debugging
enabled.

Fixes: f207c10d9823 ("hv_netvsc: use RCU to protect vf_netdev")
Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Make hfunc variable const type in nicvf_set_rxfh()

>From struct ethtool_ops:

int (*set_rxfh)(struct net_device *, const u32 *indir,
const u8 *key, const u8 hfunc);

Change function arg of hfunc to const type.

V2: Fixed indentation.

Signed-off-by: Robert Richter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: Fix error return code in nicvf_open()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 712c31853440 ("net: thunderx: Program LMAC credits based on MTU")
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: efx_get_phys_port_id() can be static

Fixes the following sparse warning:

drivers/net/ethernet/sfc/efx.c:2337:5: warning:
symbol 'efx_get_phys_port_id' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vxlan: Set ports in flow key when doing route lookups

Otherwise, a xfrm policy with sport/dport being set cannot be matched.

Signed-off-by: Martynas Pumputis <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

r8152: fix the sw rx checksum is unavailable

Fix the hw rx checksum is always enabled, and the user couldn't switch
it to sw rx checksum.

Note that the RTL_VER_01 only support sw rx checksum only. Besides,
the hw rx checksum for RTL_VER_02 is disabled after
commit b9a321b48af4 ("r8152: Fix broken RX checksums."). Re-enable it.

Signed-off-by: Hayes Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

HID: i2c-hid: Add sleep between POWER ON and RESET

Support for the Asus Touchpad was recently added. It turns out this
device can fail initialisation (and become unusable) when the RESET
command is sent too soon after the POWER ON command.

Unfortunately the i2c-hid specification does not specify the need for
a delay between these two commands. But it was discovered the Windows
driver has a 1ms delay.

As a result, this patch modifies the i2c-hid module to add a sleep
inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms.

See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further
details.

Signed-off-by: Brendan McGrath <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Two AF_* families adding entries to the lockdep tables
at the same time.

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
"27 fixes.

  There are three patches that aren't actually fixes. They're simple
  function renamings which are nice-to-have in mainline as ongoing net
  development depends on them."

* akpm: (27 commits)
  timerfd: export defines to userspace
  mm/hugetlb.c: fix reservation race when freeing surplus pages
  mm/slab.c: fix SLAB freelist randomization duplicate entries
  zram: support BDI_CAP_STABLE_WRITES
  zram: revalidate disk under init_lock
  mm: support anonymous stable page
  mm: add documentation for page fragment APIs
  mm: rename __page_frag functions to __page_frag_cache, drop order from drain
  mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
  mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
  mm: don't dereference struct page fields of invalid pages
  mailmap: add codeaurora.org names for nameless email commits
  signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
  mm: pmd dirty emulation in page fault handler
  ipc/sem.c: fix incorrect sem_lock pairing
  lib/Kconfig.debug: fix frv build failure
  mm: get rid of __GFP_OTHER_NODE
  mm: fix remote numa hits statistics
  mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
  ocfs2: fix crash caused by stale lvb with fsdlm plugin
  ...

vfio-mdev: remove some dead code

We set info.count to 1 in mtty_get_irq_info() so static checkers
complain that, "Why do we have impossible conditions?" The answer is
that it seems to be left over dead code that can be safely removed.

Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Kirti Wankhede <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio-mdev: buffer overflow in ioctl()

This is a sample driver for documentation so the impact is probably
pretty low. But we should check that bar_index is valid so we
don't write beyond the end of the mdev_state->region_info[] array.

Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Kirti Wankhede <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio-mdev: return -EFAULT if copy_to_user() fails

The copy_to_user() function returns the number of bytes which it wasn't
able to copy but we want to return a negative error code.

Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Kirti Wankhede <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

Merge tag 'asoc-fix-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.10

As well as the usual smattering of driver specific fixes collected since
the merge window this has one particularly important fix to the core for
handling of aux_devs which was broken during the merge window by some of
the componentization refactoring.

xfs: Timely free truncated dirty pages

Commit 99579ccec4e2 "xfs: skip dirty pages in ->releasepage()" started
to skip dirty pages in xfs_vm_releasepage() which also has the effect
that if a dirty page is truncated, it does not get freed by
block_invalidatepage() and is lingering in LRU list waiting for reclaim.
So a simple loop like:

while true; do
dd if=/dev/zero of=file bs=1M count=100
rm file
done

will keep using more and more memory until we hit low watermarks and
start pagecache reclaim which will eventually reclaim also the truncate
pages. Keeping these truncated (and thus never usable) pages in memory
is just a waste of memory, is unnecessarily stressing page cache
reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM
prematurely.

So instead of just skipping dirty pages in xfs_vm_releasepage(), return
to old behavior of skipping them only if they have delalloc or unwritten
buffers and fix the spurious warnings by warning only if the page is
clean.

CC: [email protected]
CC: Brian Foster <[email protected]>
CC: Vlastimil Babka <[email protected]>
Reported-by: Petr Tůma <[email protected]>
Fixes: 99579ccec4e271c3d4d4e7c946058766812afdab
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix rtlwifi crash, from Larry Finger.

2) Memory disclosure in appletalk ipddp routing code, from Vlad
    Tsyrklevich.

3) r8152 can erroneously split an RX packet into multiple URBs if the
    Rx FIFO is not empty when we suspend. Fix this by waiting for the
    FIFO to empty before suspending. From Hayes Wang.

4) Two GRO fixes (enter slow path when not enough SKB tail room exists,
    disable frag0 optimizations when there are IPV6 extension headers)
    from Eric Dumazet and Herbert Xu.

5) A series of mlx5e bug fixes (do source udp port offloading for
    tunnels properly, Ip fragment matching fixes, handling firmware
    errors properly when installing TC rules, etc.) from Saeed Mahameed,
    Or Gerlitz, Roi Dayan, Hadar Hen Zion, Gil Rockah, and Daniel
    Jurgens.

6) Two VRF fixes from David Ahern (don't skip multipath selection for
    VRF paths, disallow VRF to be configured with table ID 0).

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
  net: vrf: do not allow table id 0
  net: phy: marvell: fix Marvell 88E1512 used in SGMII mode
  sctp: Fix spelling mistake: "Atempt" -> "Attempt"
  net: ipv4: Fix multipath selection with vrf
  cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig
  gro: use min_t() in skb_gro_reset_offset()
  net/mlx5: Only cancel recovery work when cleaning up device
  net/mlx5e: Remove WARN_ONCE from adaptive moderation code
  net/mlx5e: Un-register uplink representor on nic_disable
  net/mlx5e: Properly handle FW errors while adding TC rules
  net/mlx5e: Fix kbuild warnings for uninitialized parameters
  net/mlx5e: Set inline mode requirements for matching on IP fragments
  net/mlx5e: Properly get address type of encapsulation IP headers
  net/mlx5e: TC ipv4 tunnel encap offload error flow fixes
  net/mlx5e: Warn when rejecting offload attempts of IP tunnels
  net/mlx5e: Properly handle offloading of source udp port for IP tunnels
  gro: Disable frag0 optimization on IPv6 ext headers
  gro: Enter slow-path if there is no tailroom
  mlx4: Return EOPNOTSUPP instead of ENOTSUPP
  net/af_iucv: don't use paged skbs for TX on HiperSockets
  ...

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"This fixes a regression in aesni that renders it useless if it's
built-in with a modular pcbc configuration"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aesni - Fix failure when built-in with modular pcbc

nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too

Commit 54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter
readiness") introduced a quirk to adapters that cannot read the bit
NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
need a delay or else the action of reading the bit NVME_CSTS_RDY could
somehow corrupt adapter's registers state and it never recovers.

When this quirk was added, we checked ctrl->tagset in order to avoid
quirking in probe time, supposing we would never require such delay
during probe. Well, it was too optimistic; we in fact need this quirk
at probe time in some cases, like after a kexec.

In some experiments, after abnormal shutdown of machine (aka power cord
unplug), we booted into our bootloader in Power, which is a Linux kernel,
and kexec'ed into another distro. If this kexec is too quick, we end up
reaching the probe of NVMe adapter in that distro when adapter is in
bad state (not fully initialized on our bootloader). What happens next
is that nvme_wait_ready() is unable to complete, except if the quirk is
enabled.

So, this patch removes the original ctrl->tagset verification in order
to enable the quirk even on probe time.

Fixes: 54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter readiness")
Reported-by: Andrew Byrne <[email protected]>
Reported-by: Jaime A. H. Gomez <[email protected]>
Reported-by: Zachary D. Myers <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Acked-by: Jeffrey Lien <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

nvme-rdma: fix nvme_rdma_queue_is_ready

Now that we don't abuse the cmd field in struct request for nvme command
passthrough this function needs to be converted to the proper accessor
as well.

Fixes: d49187e97e ("nvme: introduce struct nvme_request")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>

Merge branch 'cls_flower-ARP'

Simon Horman says:

====================
net/sched: cls_flower: Support matching ARP

Add support for support matching on ARP operation, and hardware and
protocol addresses for Ethernet hardware and IPv4 protocol addresses.

Changes since RFC:
* None other than dropping RFC designation after positive feedback from Jiri
====================

Signed-off-by: David S. Miller <[email protected]>

net/sched: cls_flower: Support matching on ARP

Support matching on ARP operation, and hardware and protocol addresses
for Ethernet hardware and IPv4 protocol addresses.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol arp parent ffff: flower indev eth0 \
arp_op request arp_sip 10.0.0.1 action drop
tc filter add dev eth0 protocol rarp parent ffff: flower indev eth0 \
arp_op reply arp_tha 52:54:3f:00:00:00/24 action drop

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

flow disector: ARP support

Allow dissection of (R)ARP operation hardware and protocol addresses
for Ethernet hardware and IPv4 protocol addresses.

There are currently no users of FLOW_DISSECTOR_KEY_ARP.
A follow-up patch will allow FLOW_DISSECTOR_KEY_ARP to be used by the
flower classifier.

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xhci: fix deadlock at host remove by running watchdog correctly

If a URB is killed while the host is removed we can end up in a situation
where the hub thread takes the roothub device lock, and waits for
the URB to be given back by xhci-hcd, blocking the host remove code.

xhci-hcd tries to stop the endpoint and give back the urb, but can't
as the host is removed from PCI bus at the same time, preventing the normal
way of giving back urb.

Instead we need to rely on the stop command timeout function to give back
the urb. This xhci_stop_endpoint_command_watchdog() timeout function
used a XHCI_STATE_DYING flag to indicate if the timeout function is already
running, but later this flag has been taking into use in other places to
mark that xhci is dying.

Remove checks for XHCI_STATE_DYING in xhci_urb_dequeue. We are still
checking that reading from pci state does not return 0xffffffff or that
host is not halted before trying to stop the endpoint.

This whole area of stopping endpoints, giving back URBs, and the wathdog
timeout need rework, this fix focuses on solving a specific deadlock
issue that we can then send to stable before any major rework.

Cc: <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: netcp: correct netcp_get_stats function signature

Commit: bc1f44709cf2 - net: make ndo_get_stats64 a void function
and
Commit: 6a8162e99ef3 - net: netcp: store network statistics in 64 bits.

The commit 6a8162e99ef3 adds ndo_get_stats64 function as per old
signature which causes compilation error:

drivers/net/ethernet/ti/netcp_core.c:1951:28: error:
initialization from incompatible pointer type
.ndo_get_stats64 = netcp_get_stats,

Hence correct netcp_get_stats function signature as per
the latest definition.

Signed-off-by: Keerthy <[email protected]>
Fixes: 6a8162e99ef344fc ("net: netcp: store network statistics in 64 bits")
Signed-off-by: David S. Miller <[email protected]>

perf/x86/intel: Use ULL constant to prevent undefined shift behaviour

When x86_pmu.num_counters is 32 the shift of the integer constant 1 is
exceeding 32bit and therefor undefined behaviour.

Fix this by shifting 1ULL instead of 1.

Reported-by: CoverityScan CID#1192105 ("Bad bit shift operation")
Signed-off-by: Colin Ian King <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

mac80211: recalculate min channel width on VHT opmode changes

When an associated station changes its VHT operating mode this
can/will affect the bandwidth it's using, and consequently we
must recalculate the minimum bandwidth we need to use. Failure
to do so can lead to one of two scenarios:
1) we use a too high bandwidth, this is benign
2) we use a too narrow bandwidth, causing rate control and
actual PHY configuration to be out of sync, which can in
turn cause problems/crashes

Signed-off-by: Johannes Berg <[email protected]>

mac80211: calculate min channel width correctly

In the current minimum chandef code there's an issue in that the
recalculation can happen after rate control is initialized for a
station that has a wider bandwidth than the current chanctx, and
then rate control can immediately start using those higher rates
which could cause problems.

Observe that first of all that this problem is because we don't
take non-associated and non-uploaded stations into account. The
restriction to non-associated is quite pointless and is one of
the causes for the problem described above, since the rate init
will happen before the station is set to associated; no frames
could actually be sent until associated, but the rate table can
already contain higher rates and that might cause problems.

Also, rejecting non-uploaded stations is wrong, since the rate
control can select higher rates for those as well.

Secondly, it's then necessary to recalculate the minimal config
before initializing rate control, so that when rate control is
initialized, the higher rates are already available. This can be
done easily by adding the necessary function call in rate init.

Change-Id: Ib9bc02d34797078db55459d196993f39dcd43070
Signed-off-by: Johannes Berg <[email protected]>

cfg80211: consider VHT opmode on station update

Currently, this attribute is only fetched on station addition, but
not on station change. Since this info is only present in the assoc
request, with full station state support in the driver it cannot be
present when the station is added.

Thus, add support for changing the VHT opmode on station update if
done before (or while) the station is marked as associated. After
this, ignore it, since it used to be ignored.

Signed-off-by: Beni Lev <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: fix the TID on NDPs sent as EOSP carrier

In the commit below, I forgot to translate the mac80211's
AC to QoS IE order. Moreover, the condition in the if was
wrong. Fix both issues.
This bug would hit only with clients that didn't set all
the ACs as delivery enabled.

Fixes: f438ceb81d4 ("mac80211: uapsd_queues is in QoS IE order")
Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: Fix headroom allocation when forwarding mesh pkt

This patch fix issue introduced by my previous commit that
tried to ensure enough headroom was present, and instead
broke it.

When forwarding mesh pkt, mac80211 may also add security header,
and it must therefore be taken into account in the needed headroom.

Fixes: d8da0b5d64d5 ("mac80211: Ensure enough headroom when forwarding mesh pkt")
Signed-off-by: Cedric Izoard <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

net: vrf: do not allow table id 0

Frank reported that vrf devices can be created with a table id of 0.
This breaks many of the run time table id checks and should not be
allowed. Detect this condition at create time and fail with EINVAL.

Fixes: 193125dbd8eb ("net: Introduce VRF device driver")
Reported-by: Frank Kellermann <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phy: marvell: fix Marvell 88E1512 used in SGMII mode

When an Marvell 88E1512 PHY is connected to a nic in SGMII mode, the
fiber page is used for the SGMII host-side connection. The PHY driver
notices that SUPPORTED_FIBRE is set, so it tries reading the fiber page
for the link status, and ends up reading the MAC-side status instead of
the outgoing (copper) link. This leads to incorrect results reported
via ethtool.

If the PHY is connected via SGMII to the host, ignore the fiber page.
However, continue to allow the existing power management code to
suspend and resume the fiber page.

Fixes: 6cfb3bcc0641 ("Marvell phy: check link status in case of fiber link.")
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sctp: Fix spelling mistake: "Atempt" -> "Attempt"

Trivial fix to spelling mistake in WARN_ONCE message

Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv4: Fix multipath selection with vrf

fib_select_path does not call fib_select_multipath if oif is set in the
flow struct. For VRF use cases oif is always set, so multipath route
selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
check similar to what is done in fib_table_lookup.

Add saddr and proto to the flow struct for the fib lookup done by the
VRF driver to better match hash computation for a flow.

Fixes: 613d09b30f8b ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'dsa-phys_port_name'

Florian Fainelli says:

====================
net: dsa: Implement ndo_get_phys_port_name()

This patch series implements ndo_get_phys_port_name() so we can revert
ndo_get_phys_id() which was (ab)used in the DSA layer.
====================

Signed-off-by: David S. Miller <[email protected]>

Revert "net: dsa: Implement ndo_get_phys_port_id"

This reverts commit 3a543ef479868e36c95935de320608a7e41466ca ("net: dsa:
Implement ndo_get_phys_port_id") since it misuses the purpose of
ndo_get_phys_port_id(). We have ndo_get_phys_port_name() to do the
correct thing for us now.

Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Vivien Didelot <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Implement ndo_get_phys_port_name()

Return the physical port number of a DSA created network device using
ndo_get_phys_port_name().

Signed-off-by: Florian Fainelli <[email protected]>
Tested-by: Vivien Didelot <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig

We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:

warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.

Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: make "label" property optional for dsa2

In the new DTS bindings for DSA (dsa2), the "ethernet" and "link"
phandles are respectively mandatory and exclusive to CPU port and DSA
link device tree nodes.

Simplify dsa2.c a bit by checking the presence of such phandle instead
of checking the redundant "label" property.

Then the Linux philosophy for Ethernet switch ports is to expose them to
userspace as standard NICs by default. Thus use the standard enumerated
"eth%d" device name if no "label" property is provided for a user port.
This allows to save DTS files from subjective net device names.

If one wants to rename an interface, udev rules can be used as usual.

Of course the current behavior is unchanged, and the optional "label"
property for user ports has precedence over the enumerated name.

Signed-off-by: Vivien Didelot <[email protected]>
Acked-by: Uwe Kleine-König <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'tracepoint-updates-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.10

gro: use min_t() in skb_gro_reset_offset()

On 32bit arches, (skb->end - skb->data) is not 'unsigned int',
so we shall use min_t() instead of min() to avoid a compiler error.

Fixes: 1272ce87fa01 ("gro: Enter slow-path if there is no tailroom")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cfg80211: wext does not need to set monitor channel in managed mode

There is not a valid reason to attempt setting the monitor channel
while in managed mode. Since this code path only deals with this mode,
remove the code block.

Johannes: I'll note that the comment indicated it was for backward
compatibility, but the code wasn't functional since switching the
monitor channel isn't supported (any more?) when in managed mode, as
that mode owns the channel configuration. Additionally, since monitor
can't be done on a managed mode interface, this would only have had
any effect to start with if a separate monitor interface is present,
in which case it's better to change the channel through that anyway,
if even possible.

Signed-off-by: Jorge Ramirez-Ortiz <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code

hswep_uncore_cpu_init() uses a hardcoded physical package id 0 for the boot
cpu. This works as long as the boot CPU is actually on the physical package
0, which is normaly the case after power on / reboot.

But it fails with a NULL pointer dereference when a kdump kernel is started
on a secondary socket which has a different physical package id because the
locigal package translation for physical package 0 does not exist.

Use the logical package id of the boot cpu instead of hard coded 0.

[ tglx: Rewrote changelog once more ]

Fixes: cf6d445f6897 ("perf/x86/uncore: Track packages, not per CPU data")
Signed-off-by: Prarit Bhargava <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Harish Chegondi <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

USB: serial: ch341: fix control-message error handling

A short control transfer would currently fail to be detected, something
which could lead to stale buffer data being used as valid input.

Check for short transfers, and make sure to log any transfer errors.

Note that this also avoids leaking heap data to user space (TIOCMGET)
and the remote device (break control).

Fixes: 6ce76104781a ("USB: Driver for CH341 USB-serial adaptor")
Cc: stable <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>

arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags

In current code, the @changed always returns the last one's status for
the huge page with the contiguous bit set. This is really not what we
want. Even one of the PTEs is changed, we should tell it to the caller.

This patch fixes this issue.

Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Cc: <[email protected]> # 4.5.x-
Signed-off-by: Huang Shijie <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

vme: Fix wrong pointer utilization in ca91cx42_slave_get

In ca91cx42_slave_get function, the value pointed by vme_base pointer is
set through:

*vme_base = ioread32(bridge->base + CA91CX42_VSI_BS[i]);

So it must be dereferenced to be used in calculation of pci_base:

*pci_base = (dma_addr_t)*vme_base + pci_offset;

This bug was caught thanks to the following gcc warning:

drivers/vme/bridges/vme_ca91cx42.c: In function ‘ca91cx42_slave_get’:
drivers/vme/bridges/vme_ca91cx42.c:467:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
*pci_base = (dma_addr_t)vme_base + pci_offset;

Signed-off-by: Augusto Mecking Caringi <[email protected]>
Acked-By: Martyn Welch <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

nohz: Fix collision between tick and other hrtimers

When the tick is stopped and an interrupt occurs afterward, we check on
that interrupt exit if the next tick needs to be rescheduled. If it
doesn't need any update, we don't want to do anything.

In order to check if the tick needs an update, we compare it against the
clockevent device deadline. Now that's a problem because the clockevent
device is at a lower level than the tick itself if it is implemented
on top of hrtimer.

Every hrtimer share this clockevent device. So comparing the next tick
deadline against the clockevent device deadline is wrong because the
device may be programmed for another hrtimer whose deadline collides
with the tick. As a result we may end up not reprogramming the tick
accidentally.

In a worst case scenario under full dynticks mode, the tick stops firing
as it is supposed to every 1hz, leaving /proc/stat stalled:

      Task in a full dynticks CPU
      ----------------------------

      * hrtimer A is queued 2 seconds ahead
      * the tick is stopped, scheduled 1 second ahead
      * tick fires 1 second later
      * on tick exit, nohz schedules the tick 1 second ahead but sees
        the clockevent device is already programmed to that deadline,
        fooled by hrtimer A, the tick isn't rescheduled.
      * hrtimer A is cancelled before its deadline
      * tick never fires again until an interrupt happens...

In order to fix this, store the next tick deadline to the tick_sched
local structure and reuse that value later to check whether we need to
reprogram the clock after an interrupt.

On the other hand, ts->sleep_length still wants to know about the next
clock event and not just the tick, so we want to improve the related
comment to avoid confusion.

Reported-by: James Hartsock <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Wanpeng Li <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

auxdisplay: fix new ht16k33 build errors

Fix build errors caused by selecting incorrect kconfig symbols.

drivers/built-in.o:(.data+0x19cec): undefined reference to `sys_fillrect'
drivers/built-in.o:(.data+0x19cf0): undefined reference to `sys_copyarea'
drivers/built-in.o:(.data+0x19cf4): undefined reference to `sys_imageblit'

Fixes: 31114fa95bdb (auxdisplay: ht16k33: select framebuffer helper modules)
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Miguel Ojeda Sandonis <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Acked-by: Robin van der Gracht <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

sysrq: attach sysrq handler correctly for 32-bit kernel

The sysrq input handler should be attached to the input device which has
a left alt key.

On 32-bit kernels, some input devices which has a left alt key cannot
attach sysrq handler. Because the keybit bitmap in struct input_device_id
for sysrq is not correctly initialized. KEY_LEFTALT is 56 which is
greater than BITS_PER_LONG on 32-bit kernels.

I found this problem when using a matrix keypad device which defines
a KEY_LEFTALT (56) but doesn't have a KEY_O (24 == 56%32).

Cc: Jiri Slaby <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
Acked-by: Dmitry Torokhov <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

ppdev: don't print a free'd string

A previous fix of a memory leak now prints the string 'name'
that was previously free'd. Fix this by free'ing the string
at the end of the function and adding an error exit path for
the error conditions.

CoverityScan CID#1384523 ("Use after free")

Fixes: 2bd362d5f45c1 ("ppdev: fix memory leak")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

extcon: return error code on failure

Function get_zeroed_page() returns a NULL pointer if there is no enough
memory. In function extcon_sync(), it returns 0 if the call to
get_zeroed_page() fails. The return value 0 indicates success in the
context, which is incosistent with the execution status. This patch
fixes the bug by returning -ENOMEM.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188611

Signed-off-by: Pan Bian <[email protected]>
Fixes: a580982f0836e
Cc: stable <[email protected]>
Acked-by: Chanwoo Choi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "tty: serial: 8250: add CON_CONSDEV to flags"

This commit needs to be reverted because it prevents people from
using the serial console as a secondary console with input being
directed to tty0.

IOW, if you boot with console=ttyS0 console=tty0 then all kernels
prior to this commit will produce output on both ttyS0 and tty0
but input will only be taken from tty0. With this patch the serial
console will always be the primary console instead of tty0,
potentially preventing people from getting into their machines in
emergency situations.

Fixes: d03516df8375 ("tty: serial: 8250: add CON_CONSDEV to flags")
Signed-off-by: Herbert Xu <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Clearing FIFOs in RS485 emulation mode causes subsequent transmits to break

When in RS485 emulation mode, __do_stop_tx_rs485() calls
serial8250_clear_fifos(). This not only clears the FIFOs, but also sets
all bits in their control register (UART_FCR) to 0.

One of the effects of this is the disabling of the FIFOs, which turns
them into single-byte holding registers. The rest of the driver doesn't
know this, which results in the lions share of characters passed into a
write call to be dropped.

(I can supply logic analyzer screenshots if necessary)

This fix replaces the serial8250_clear_fifos() call to
serial8250_clear_and_reinit_fifos() - this prevents the "dropped
characters" issue from manifesting again while retaining the requirement
of clearing the RX FIFO after transmission if the SER_RS485_RX_DURING_TX
flag is disabled.

Signed-off-by: Daniel Jedrychowski <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

8250_pci: Fix potential use-after-free in error path

Commit f209fa03fc9d ("serial: 8250_pci: Detach low-level driver during
PCI error recovery") introduces a potential use-after-free in case the
pciserial_init_ports call in serial8250_io_resume fails, which may
happen if a memory allocation fails or if the .init quirk failed for
whatever reason). If this happen, further pci_get_drvdata will return a
pointer to freed memory.

This patch reworks the PCI recovery resume hook to restore the old priv
structure in this case, which should be ok, since the ports were already
detached. Such error during recovery causes us to give up on the
recovery.

Fixes: f209fa03fc9d ("serial: 8250_pci: Detach low-level driver during
PCI error recovery")
Reported-by: Michal Suchanek <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

tty/serial: atmel: RS485 half duplex w/DMA: enable RX after TX is done

When using RS485 in half duplex, RX should be enabled when TX is
finished, and stopped when TX starts.

Before commit 0058f0871efe7b01c6 ("tty/serial: atmel: fix RS485 half
duplex with DMA"), RX was not disabled in atmel_start_tx() if the DMA
was used. So, collisions could happened.

But disabling RX in atmel_start_tx() uncovered another bug:
RX was enabled again in the wrong place (in atmel_tx_dma) instead of
being enabled when TX is finished (in atmel_complete_tx_dma), so the
transmission simply stopped.

This bug was not triggered before commit 0058f0871efe7b01c6
("tty/serial: atmel: fix RS485 half duplex with DMA") because RX was
never disabled before.

Moving atmel_start_rx() in atmel_complete_tx_dma() corrects the problem.

Cc: [email protected]
Reported-by: Gil Weber <[email protected]>
Fixes: 0058f0871efe7b01c6
Tested-by: Gil Weber <[email protected]>
Signed-off-by: Richard Genoud <[email protected]>
Acked-by: Alexandre Belloni <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>