Git Repo - linux.git/log

net_sched: convert tcf_exts from list to pointer array

As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.

The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.

Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: move tc offload macros to pkt_cls.h

struct tcf_exts belongs to filters, should not be visible
to plain tc actions.

Cc: Ido Schimmel <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: fix a typo in tc_for_each_action()

It is harmless because all users pass 'a' to this macro.

Fixes: 00175aec941e ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
Cc: Amir Vadai <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: remove an unnecessary list_del()

This list_del() for tc action is not needed actually,
because we only use this list to chain bulk operations,
therefore should not be carried for latter operations.

Fixes: ec0595cc4495 ("net_sched: get rid of struct tcf_common")
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: remove the leftover cleanup_a()

After refactoring tc_action into tcf_common, we no
longer need to cleanup temporary "actions" in list,
they are permanently stored in the hashtable.

Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2016-08-16

This series contains fixes to e1000e, igb, ixgbe and i40e.

Kshitiz Gupta provides a fix for igb to resolve the PHY delay compensation
math in several functions.

Jarod Wilson provides a fix for e1000e which had to broken up into 2
patches, first is prepares the driver for expanding the list of NICs
that have occasional ~10 hour clock jumps when being used for PTP.
Second patch actually fixes i218 silicon which has been experiencing
the clock jumps while using PTP.

Alex provides 2 patches for ixgbe now that he is back at Intel.  First
fixes setting VLNCTRL.VFE bit, which was left unchanged in earlier patches
which resulted in disabling VLAN filtering for all the VFs.  Second
corrects the support for disabling the VLAN tag filtering via the
feature bit.

Lastly, David fixes i40e which was causing a kernel panic when
non-contiguous traffic classes or traffic classes not starting with TC0,
were configured on a link partner switch.  To fix this, changed the
logic when determining the total number of TCs enabled.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlxsw-fixes'

Jiri Pirko says:

====================
mlxsw: IPv4 UC router fixes

Ido says:
Patches 1-3 fix a long standing problem in the driver's init sequence,
which manifests itself quite often when routing daemons try to configure
an IP address on registered netdevs that don't yet have an associated
vPort.

Patches 4-9 add missing packet traps for the router to work properly and
also fix ordering issue following the recent changes to the driver's init
sequence.

The last patch isn't related to the router, but fixes a general problem
in which under certain conditions packets aren't trapped to CPU.

v1->v2:
- Change order of patch 7
- Add patch 6 following Ilan's comment
- Add patchset name and cover letter
====================

Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Allow packets to be trapped from any PG

When packets enter the device they are classified to a priority group
(PG) buffer based on their PCP value. After their egress port and
traffic class are determined they are moved to the switch's shared
buffer and await transmission, if:

(Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
||
(Ingress{Port}.Usage < Min || Ingress{Port,PG} < Min ||
Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)

Packets scheduled to transmission through CPU port (trapped to CPU) use
traffic class 7, which has a zero maximum and minimum quotas. However,
when such packets arrive from PG 0 they are admitted to the shared
buffer as PG 0 has a non-zero minimum quota.

Allow all packets to be trapped to the CPU - regardless of the PG they
were classified to - by assigning a 10KB minimum quota for CPU port and
TC7.

Fixes: 8e8dfe9fdf06 ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
Reported-by: Tamir Winetroub <[email protected]>
Tested-by: Tamir Winetroub <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Unmap 802.1Q FID before destroying it

Before destroying the 802.1Q FID we should first remove the VID-to-FID
mapping. This makes mlxsw_sp_fid_destroy() symmetric with regards to
mlxsw_sp_fid_create().

Fixes: 14d39461b3f4 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Add missing rollbacks in error path

While going over the code I noticed we are missing two rollbacks in the
port's creation error path. Add them and adjust the place of one of them
in the port's removal sequence so that both are symmetric.

Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: reg: Fix missing op field fill-up

Ralue pack function needs to set op, otherwise it is 0 for add always.

Fixes: d5a1c749d22 ("mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition")
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Trap loop-backed packets

One of the conditions to generate an ICMP Redirect Message is that "the
packet is being forwarded out the same physical interface that it was
received from" (RFC 1812).

Therefore, we need to be able to trap such packets and let the kernel
decide what to do with them.

For each RIF, enable the loop-back filter, which will raise the LBERROR
trap whenever the ingress RIF equals the egress RIF.

Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
Reported-by: Ilan Tayari <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Add missing packet traps

Add the following traps:

1) MTU Error: Trap packets whose size is bigger than the egress RIF's
MTU. If DF bit isn't set, traffic will continue to be routed in slow
path.

2) TTL Error: Trap packets whose TTL expired. This allows traceroute to
work properly.

3) OSPF packets.

Fixes: 7b27ce7bb9cd ("mlxsw: spectrum: Add traps needed for router implementation")
Signed-off-by: Elad Raz <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Mark port as active before registering it

Commit bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of
init sequence") moved ports initialization to the end of the init
sequence, which means ports are the first to be removed during fini.

Since the FDB delayed work is still active when ports are removed it's
possible for it to process FDB notifications of inactive ports,
resulting in a warning message.

Fix that by marking ports as inactive only after unregistering them. The
NETDEV_UNREGISTER event will invoke bridge's driver port removal
sequence that will cause the FDB (and FDB notifications) to be flushed.

Fixes: bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of init sequence")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Create PVID vPort before registering netdevice

After registering a netdevice it's possible for user space applications
to configure an IP address on it. From the driver's perspective, this
means a router interface (RIF) should be created for the PVID vPort.

Therefore, we must create the PVID vPort before registering the
netdevice.

Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Remove redundant errors from the code

Currently, when device configuration fails we emit errors to the kernel
log despite the fact we already get these from the EMAD transaction
layer, so remove them.

In addition to being unnecessary, removing these error messages will
allow us to reuse mlxsw_sp_port_add_vid() to create the PVID vPort
before registering the netdevice.

Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Don't return upon error in removal path

When removing a VLAN filter from the device we shouldn't return upon the
first error we encounter, as otherwise we'll have resources that will
never be freed nor used.

Instead, we should keep trying to free as much resources as possible in
a best effort mode.

Remove the error message as well, since we already get these from the
EMAD transaction code.

Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netfilter: tproxy: properly refcount tcp listeners

inet_lookup_listener() and inet6_lookup_listener() no longer
take a reference on the found listener.

This minimal patch adds back the refcounting, but we might do
this differently in net-next later.

Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Reported-and-tested-by: Denys Fedoryshchenko <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nfnetlink_acct: report overquota to the right netns

We should report the over quota message to the right net namespace
instead of the init netns.

Signed-off-by: Liping Zhang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel.

* tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power_supply: tps65217-charger: fix missing platform_set_drvdata()
  power: reset: hisi-reboot: Unmap region obtained by of_iomap
  power: reset: reboot-mode: fix build error of missing ioremap/iounmap on UM
  power: supply: max17042_battery: fix model download bug.

irqchip/gicv3: Remove disabling redistributor and group1 non-secure interrupts

As per the GICv3 specification, to power down a processor using GICv3
and allow automatic power-on if an interrupt must be sent to a processor,
software must set Enable to zero for all interrupt groups(by writing
to GICC_CTLR or ICC_IGRPEN{0,1}_EL1/3 as appropriate.

When commit 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier")
was introduced there were no firmware implementations(in particular PSCI)
handling this.

Linux kernel may not be aware of the CPU power state details and might
fail to identify the power states that require quiescing the CPU
interface. Even if it can be aware of those details, it can't determine
which CPU power state have been triggered at the platform level and how
the power control is implemented.

This patch make disabling redistributor and group1 non-secure interrupts
in the power down path and re-enabling of redistributor in the power-up
path conditional. It will be handled in the kernel if and only if the
non-secure accesses are permitted to access and modify control registers.
It is left to the platform implementation otherwise.

Cc: Marc Zyngier <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Tested-by: Christopher Covington <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

irqchip/gic: Allow self-SGIs for SMP on UP configurations

On systems where a single CPU is present, the GIC may not support
having SGIs delivered to a target list. In that case, we use the
self-SGI mechanism to allow the interrupt to be delivered locally.

Tested-by: Fabio Estevam <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

md: don't print the same repeated messages about delayed sync operation

This fixes a long-standing bug that caused a flood of messages like:
"md: delaying data-check of md1 until md2 has finished (they share one
or more physical units)"

It can be reproduced like this:
1. Create at least 3 raid1 arrays on a pair of disks, each on different
   partitions.
2. Request a sync operation like 'check' or 'repair' on 2 arrays by
   writing to their md/sync_action attribute files. One operation should
   start and one should be delayed and a message like the above will be
   printed.
3. Issue a write to the third array. Each write will cause 2 copies of
   the message to be printed.

This happens when wake_up(&resync_wait) is called, usually by
md_check_recovery(). Then the delayed sync thread again prints the
message and is put to sleep. This patch adds a check in md_do_sync() to
prevent printing this message more than once for the same pair of
devices.

Reported-by: Sven Koehler <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=151801
Signed-off-by: Artur Paszkiewicz <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>

md: remove obsolete ret in md_start_sync

The ret is not needed anymore since we have already
move resync_start into md_do_sync in commit 41a9a0d.

Reviewed-by: NeilBrown <[email protected]>
Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>

arm64: kernel: avoid literal load of virtual address with MMU off

Literal loads of virtual addresses are subject to runtime relocation when
CONFIG_RELOCATABLE=y, and given that the relocation routines run with the
MMU and caches enabled, literal loads of relocated values performed with
the MMU off are not guaranteed to return the latest value unless the
memory covering the literal is cleaned to the PoC explicitly.

So defer the literal load until after the MMU has been enabled, just like
we do for primary_switch() and secondary_switch() in head.S.

Fixes: 1e48ef7fcc37 ("arm64: add support for building vmlinux as a relocatable PIE binary")
Cc: <[email protected]> # 4.6+
Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

arm64: Fix NUMA build error when !CONFIG_ACPI

Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
enabled, disabling the latter leads to the following build error on
arm64:

arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))

This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for
the arm64_acpi_numa_init() definition.

Fixes: d8b47fca8c23 ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
Reviewed-by: Hanjun Guo <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name

Otherwise, if nfnetlink_log.ko is not loaded, we cannot add rules
to log packets to the userspace when we specify it with arp family,
such as:

  # nft add rule arp filter input log group 0
  <cmdline>:1:1-37: Error: Could not process rule: No such file or
  directory
  add rule arp filter input log group 0
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Liping Zhang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: conntrack: do not dump other netns's conntrack entries via proc

We should skip the conntracks that belong to a different namespace,
otherwise other unrelated netns's conntrack entries will be dumped via
/proc/net/nf_conntrack.

Fixes: 56d52d4892d0 ("netfilter: conntrack: use a single hashtable for all namespaces")
Signed-off-by: Liping Zhang <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

dm raid: support raid0 with missing metadata devices

The raid0 MD personality does not start a raid0 array with any of its
data devices missing.

dm-raid was removing data/metadata device pairs unconditionally if it
failed to read a superblock off the respective metadata device of such
pair, resulting in failure to start arrays with the raid0 personality.

Avoid removing any data/metadata device pairs in case of raid0
(e.g. lvm2 segment type 'raid0_meta') thus allowing MD to start the
array.

Also, avoid region size validation for raid0.

Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork

cgroup_threadgroup_rwsem is acquired in read mode during process exit
and fork.  It is also grabbed in write mode during
__cgroups_proc_write().  I've recently run into a scenario with lots
of memory pressure and OOM and I am beginning to see

systemd

__switch_to+0x1f8/0x350
__schedule+0x30c/0x990
schedule+0x48/0xc0
percpu_down_write+0x114/0x170
__cgroup_procs_write.isra.12+0xb8/0x3c0
cgroup_file_write+0x74/0x1a0
kernfs_fop_write+0x188/0x200
__vfs_write+0x6c/0xe0
vfs_write+0xc0/0x230
SyS_write+0x6c/0x110
system_call+0x38/0xb4

This thread is waiting on the reader of cgroup_threadgroup_rwsem to
exit.  The reader itself is under memory pressure and has gone into
reclaim after fork. There are times the reader also ends up waiting on
oom_lock as well.

__switch_to+0x1f8/0x350
__schedule+0x30c/0x990
schedule+0x48/0xc0
jbd2_log_wait_commit+0xd4/0x180
ext4_evict_inode+0x88/0x5c0
evict+0xf8/0x2a0
dispose_list+0x50/0x80
prune_icache_sb+0x6c/0x90
super_cache_scan+0x190/0x210
shrink_slab.part.15+0x22c/0x4c0
shrink_zone+0x288/0x3c0
do_try_to_free_pages+0x1dc/0x590
try_to_free_pages+0xdc/0x260
__alloc_pages_nodemask+0x72c/0xc90
alloc_pages_current+0xb4/0x1a0
page_table_alloc+0xc0/0x170
__pte_alloc+0x58/0x1f0
copy_page_range+0x4ec/0x950
copy_process.isra.5+0x15a0/0x1870
_do_fork+0xa8/0x4b0
ppc_clone+0x8/0xc

In the meanwhile, all processes exiting/forking are blocked almost
stalling the system.

This patch moves the threadgroup_change_begin from before
cgroup_fork() to just before cgroup_canfork().  There is no nee to
worry about threadgroup changes till the task is actually added to the
threadgroup.  This avoids having to call reclaim with
cgroup_threadgroup_rwsem held.

tj: Subject and description edits.

Signed-off-by: Balbir Singh <[email protected]>
Acked-by: Zefan Li <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected] # v4.2+
Signed-off-by: Tejun Heo <[email protected]>

clocksource/drivers/mips-gic-timer: Make gic_clocksource_of_init() return int

In commit:

d8152bf85d2c0 ("clocksource/drivers/mips-gic-timer: Convert init function to return error")

several return values were added to a void function resulting in the following warnings:

clocksource/mips-gic-timer.c: In function 'gic_clocksource_of_init':
clocksource/mips-gic-timer.c:175:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:183:4: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:190:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:195:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:200:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:211:2: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c: At top level:
clocksource/mips-gic-timer.c:213:1: warning: comparison of distinct pointer types lacks a cast [enabled by default]
clocksource/mips-gic-timer.c: In function 'gic_clocksource_of_init':
clocksource/mips-gic-timer.c:183:18: warning: ignoring return value of 'PTR_ERR', declared with attribute warn_unused_result [-Wunused-result]

Given that the addition of the return values was intentional, it seems
that the conversion of the containing function from void to int was
simply overlooked.

Signed-off-by: Paul Gortmaker <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: d8152bf85d2c ("clocksource/drivers/mips-gic-timer: Convert init function to return error")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

clocksource/drivers/kona: Fix get_counter() error handling

I could not figure out why, but GCC cannot prove that the
kona_timer_init() function always initializes its two outputs,
and we get a warning for the use of the 'lsw' variable later,
which is obviously correct.

drivers/clocksource/bcm_kona_timer.c: In function 'kona_timer_init':
drivers/clocksource/bcm_kona_timer.c:119:13: error: 'lsw' may be used uninitialized in this function [-Werror=maybe-uninitialized]

Slightly reordering the loop makes the warning disappear, after
it becomes more obvious to the compiler that the loop is
always entered on the first iteration.

As pointed out by Ray Jui, there is a related problem in the
way we deal with the loop running into the limit, as we just
keep going there with an invalid counter data, so instead we
now propagate a -ETIMEDOUT result to the caller.

Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Acked-by: Ray Jui <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Link: https://patchwork.kernel.org/patch/9174261/
Signed-off-by: Ingo Molnar <[email protected]>

clocksource/drivers/time-armada-370-xp: Fix the clock reference

While converting the init function to return an error, the wrong clock
was get. This leads to the wrong clock rate and slows down the kernel.
For example, it affects typical boot time:

- without fix: over 1 minute
- with fix: 15 seconds

Tested-by: Stefan Roese <[email protected]>
Tested-by: Ralph Sennhauser <[email protected]>
Signed-off-by: Gregory CLEMENT <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 12549e27c63c ("clocksource/drivers/time-armada-370-xp: Convert init function to return error")
Link: http://lkml.kernel.org/r/[email protected]
[ Refined the changelog. ]
Signed-off-by: Ingo Molnar <[email protected]>

arm64: KVM: report configured SRE value to 32-bit world

After commit b34f2bc ("arm64: KVM: Make ICC_SRE_EL1 access return the
configured SRE value") we report SRE value to 64-bit guest, but 32-bit
one still handled as RAZ/WI what leads to funny promise we do not keep:

"GICv3: GIC: unable to set SRE (disabled at EL2), panic ahead"

Instead, return the actual value of the ICC_SRE_EL1 register that the
guest should see.

[ Tweaked commit message - Christoffer ]

Signed-off-by: Vladimir Murzin <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

arm64: KVM: remove misleading comment on pmu status

Comment about how PMU access is handled is not relavant since v4.6
where proper PMU support was added in.

Signed-off-by: Vladimir Murzin <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

genirq: Correctly configure the trigger on chained interrupts

Commit 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ")
moved the trigger configuration call from the irqdomain mapping to
the interrupt being actually requested.

This patch failed to handle the case where we configure a chained
interrupt, which doesn't get requested through the usual path.

In order to solve this, let's call __irq_set_trigger just before
starting the cascade interrupt. Special care must be taken to
make the flow handler stick, as the .irq_set_type method could
have reset it (it doesn't know we're dealing with a chained
interrupt).

Based on an initial patch by Jon Hunter.

Fixes: 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ")
Reported-by: John Stultz <[email protected]>
Reported-by: Linus Walleij <[email protected]>
Tested-by: John Stultz <[email protected]>
Acked-by: Jon Hunter <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

KVM: arm/arm64: timer: Workaround misconfigured timer interrupt

Similarily to f005bd7e3b84 ("clocksource/arm_arch_timer: Force
per-CPU interrupt to be level-triggered"), make sure we can
survive an interrupt that has been misconfigured as edge-triggered
by forcing it to be level-triggered (active low is assumed, but
the GIC doesn't really care whether this is high or low).

Hopefully, the amount of shouting in the kernel log will convince
the user to do something about their firmware.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

arm64: Document workaround for Cortex-A72 erratum #853709

We already have a workaround for Cortex-A57 erratum #852523,
but Cortex-A72 r0p0 to r0p2 do suffer from the same issue
(known as erratum #853709).

Let's document the fact that we already handle this.

Acked-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

KVM: arm/arm64: Change misleading use of is_error_pfn

When converting a gfn to a pfn, we call gfn_to_pfn_prot, which returns
various kinds of error values. It turns out that is_error_pfn() only
returns true when the gfn was found in a memory slot and could somehow
not be used, but it does not return true if the gfn does not belong to
any memory slot.

Change use to is_error_noslot_pfn() which covers both cases.

Note: Since we already check for kvm_is_error_hva(hva) explicitly in the
caller of this function while holding the kvm->srcu lock protecting the
memory slots, this should never be a problem, but nevertheless this
change is warranted as it shows the intention of the code.

Reported-by: James Hogan <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

block: Fix race triggered by blk_set_queue_dying()

blk_set_queue_dying() can be called while another thread is
submitting I/O or changing queue flags, e.g. through dm_stop_queue().
Hence protect the QUEUE_FLAG_DYING flag change with locking.

Signed-off-by: Bart Van Assche <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

md: do not count journal as spare in GET_ARRAY_INFO

GET_ARRAY_INFO counts journal as spare (spare_disks), which is not
accurate. This patch fixes this.

Reported-by: Yi Zhang <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>

Merge branch 'iomap-fixes-4.8-rc3' into for-next

xfs: remove OWN_AG rmap when allocating a block from the AGFL

When we're really tight on space, xfs_alloc_ag_vextent_small() can
allocate a block from the AGFL and give it to the caller. Since the
caller is never the AGFL-fixing method, we must remove the OWN_AG
reverse mapping because it will clash with whatever rmap the caller
wants to set up. This bug was discovered by running generic/299
repeatedly.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost fixes from Michael Tsirkin:
- test fixes
- a vsock fix

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  tools/virtio: add dma stubs
  vhost/test: fix after swiotlb changes
  vhost/vsock: drop space available check for TX vq
  ringtest: test build fix

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes, minor cleanup and a change to the default
  config"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dasd: fix failing CUIR assignment under LPAR
  s390/pageattr: handle numpages parameter correctly
  s390/dasd: fix hanging device after clear subchannel
  s390/qdio: avoid reschedule of outbound tasklet once killed
  s390/qdio: remove checks for ccw device internal state
  s390/qdio: fix double return code evaluation
  s390/qdio: get rid of spin_lock_irqsave usage
  s390/cio: remove subchannel_id from ccw_device_private
  s390/qdio: obtain subchannel_id via ccw_device_get_schid()
  s390/cio: stop using subchannel_id from ccw_device_private
  s390/config: make the vector optimized crc function builtin
  s390/lib: fix memcmp and strstr
  s390/crc32-vx: Fix checksum calculation for small sizes
  s390: clarify compressed image code path

xfs: (re-)implement FIEMAP_FLAG_XATTR

Use a special read-only iomap_ops implementation to support fiemap on
the attr fork.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: simplify xfs_file_iomap_begin

We'll never get nimap == 0 for a successful return from xfs_bmapi_read,
so don't try to handle it.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

iomap: mark ->iomap_end as optional

No need to implement it for read-only mappings.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

iomap: prepare iomap_fiemap for attribute mappings

By bassing through an -ENOENT, similar to the old XFS implementation of
FIEMAP_FLAG_XATTR.

Signed-off-by: Dave Chinner <[email protected]>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

iomap: fiemap should honor the FIEMAP_FLAG_SYNC flag

The flag is checked as supported, but then we do an unconditional
sync of the file, regardless of whether the flag is set or not. Make
the sync conditional on having the FIEMAP_FLAG_SYNC flag set.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

iomap: remove superflous pagefault_disable from iomap_write_actor

iov_iter_copy_from_user_atomic disables page faults internally, no need to
do it around the call. This also brings the iomap code in line with
the original filemap version.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

iomap: remove superflous mark_page_accessed from iomap_write_actor

This catches up with commit 2457ae ("mm: non-atomically mark page
accessed during page cache allocation where possible"), which
moved the initial access marking into the pagecache allocator.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: store rmapbt block count in the AGF

Track the number of blocks used for the rmapbt in the AGF. When we
get to the AG reservation code we need this counter to quickly
make our reservation during mount.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: don't invalidate whole file on DAX read/write

When we do DAX IO, we try to invalidate the entire page cache held
on the file. This is incorrect as it will trash the entire mapping
tree that now tracks dirty state in exceptional entries in the radix
tree slots.

What we are trying to do is remove cached pages (e.g from reads
into holes) that sit in the radix tree over the range we are about
to write to. Hence we should just limit the invalidation to the
range we are about to overwrite.

Reported-by: Jan Kara <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: fix bogus space reservation in xfs_iomap_write_allocate

The space reservations was without an explaination in commit

    "Add error reporting calls in error paths that return EFSCORRUPTED"

back in 2003.  There is no reason to reserve disk blocks in the
transaction when allocating blocks for delalloc space as we already
reserved the space when creating the delalloc extent.

With this fix we stop running out of the reserved pool in
generic/229, which has happened for long time with small blocksize
file systems, and has increased in severity with the new buffered
write path.

[ dchinner: we still need to pass the block reservation into
  xfs_bmapi_write() to ensure we don't deadlock during AG selection.
  See commit dbd5c8c ("xfs: pass total block res. as total
  xfs_bmapi_write() parameter") for more details on why this is
  necessary. ]

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

xfs: don't assert fail on non-async buffers on ioacct decrement

The buffer I/O accounting mechanism tracks async buffers under I/O. As
an optimization, the buffer I/O count is incremented only once on the
first async I/O for a given hold cycle of a buffer and decremented once
the buffer is released to the LRU (or freed).

xfs_buf_ioacct_dec() has an ASSERT() check for an XBF_ASYNC buffer, but
we have one or two corner cases where a buffer can be submitted for I/O
multiple times via different methods in a single hold cycle. If an async
I/O occurs first, the I/O count is incremented. If a sync I/O occurs
before the hold count drops, XBF_ASYNC is cleared by the time the I/O
count is decremented.

Remove the async assert check from xfs_buf_ioacct_dec() as this is a
perfectly valid scenario. For the purposes of I/O accounting, we really
only care about the buffer async state at I/O submission time.

Discovered-and-analyzed-by: Dave Chinner <[email protected]>
Signed-off-by: Brian Foster <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes the following issues:

   - Missing ULL suffixes for 64-bit constants in sha3.
   - Two caam AEAD regressions.
   - Bogus setkey hooks in non-hmac caam hashes.
   - Missing kbuild dependency for powerpc crc32c"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix non-hmac hashes
  crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC
  crypto: caam - defer aead_set_sh_desc in case of zero authsize
  crypto: caam - fix echainiv(authenc) encrypt shared descriptor
  crypto: sha3 - Add missing ULL suffixes for 64-bit constants

i40e: check for and deal with non-contiguous TCs

The i40e driver was causing a kernel panic when
non-contiguous Traffic Classes, or Traffic Classes not
starting with TC0, were configured on a link partner switch.
i40e does not support non-contiguous TCs.

To fix this, the patch changes the logic when determining
the total number of TCs enabled.  Before, this would use the
highest TC number enabled and assume that all TCs below it were
also enabled.  Now, we create a bitmask of enabled TCs and scan
it to determine not only the number of TCs, but also if the set
of enabled TCs starts at zero and is contiguous.  If not, then
DCB is disabled by only returning one TC.

Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

dm raid: enhance attempt_restore_of_faulty_devices() to support more devices

attempt_restore_of_faulty_devices() is limited to 64 when it should support
the new maximum of 253 when identifying any failed devices. It clears any
revivable devices via an MD personality hot remove and add cylce to allow
for their recovery.

Address by using existing functions to retrieve and update all failed
devices' bitfield members in the dm raid superblocks on all RAID devices
and check for any devices to clear in it.

Whilst on it, don't call attempt_restore_of_faulty_devices() for any MD
personality not providing disk hot add/remove methods (i.e. raid0 now),
because such personalities don't support reviving of failed disks.

Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

ixgbe: Re-enable ability to toggle VLAN filtering

Back when I submitted the GSO code I messed up and dropped the support for
disabling the VLAN tag filtering via the feature bit. This patch
re-enables the use of the NETIF_F_HW_VLAN_CTAG_FILTER to enable/disable the
VLAN filtering independent of toggling promiscuous mode.

Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

dm raid: fix restoring of failed devices regression

'lvchange --refresh RaidLV' causes a mapped device suspend/resume cycle
aiming at device restore and resync after transient device failures. This
failed because flag RT_FLAG_RS_RESUMED was always cleared in the suspend path,
thus the device restore wasn't performed in the resume path.

Solve by removing RT_FLAG_RS_RESUMED from the suspend path and resume
unconditionally. Also, remove superfluous comment from raid_resume().

Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths

When I was adding the code for enabling VLAN promiscuous mode with SR-IOV
enabled I had inadvertently left the VLNCTRL.VFE bit unchanged as I has
assumed there was code in another path that was setting it when we enabled
SR-IOV. This wasn't the case and as a result we were just disabling VLAN
filtering for all the VFs apparently.

Also the previous patches were always clearing CFIEN which was always set
to 0 by the hardware anyway so I am dropping the redundant bit clearing.

Fixes: 16369564915a ("ixgbe: Add support for VLAN promiscuous with SR-IOV")
Signed-off-by: Alexander Duyck <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

dm raid: fix frozen recovery regression

On LVM2 conversions via lvconvert(8), the target keeps mapped devices in
frozen state when requesting RAID devices be resynchronized. This
applies to e.g. adding legs to a raid1 device or taking over from raid0
to raid4 when the rebuild flag's set on the new raid1 legs or the added
dedicated parity stripe.

Also, fix frozen recovery for reshaping as well.

Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

PCI: Use positive flags in pci_alloc_irq_vectors()

Instead of passing negative flags like PCI_IRQ_NOMSI to prevent use of
certain interrupt types, pass positive flags like PCI_IRQ_LEGACY,
PCI_IRQ_MSI, etc., to specify the acceptable interrupt types.

This is based on a number of pending driver conversions that just happend
to be a whole more obvious to read this way, and given that we have no
users in the tree yet it can still easily be done.

I've also added a PCI_IRQ_ALL_TYPES catchall to keep the case of accepting
all interrupt types very simple.

[bhelgaas: changelog, fix PCI_IRQ_AFFINITY doc typo, remove mention of
PCI_IRQ_NOLEGACY]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Alexander Gordeev <[email protected]>

Merge tag 'pinctrl-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"Here are a few pin control fixes for the v4.8 series, nothing special
  about them:

   - Add the missing <linux/io.h> header to the Intel Merrifield driver
     to get rid of build mess.

   - Drop two instances of pinctrl_unregister() called for drivers using
     devm_* resource management.

   - Remove the default debounce time for the AMD driver"

* tag 'pinctrl-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: intel: merrifield: Add missed header
  pinctrl/amd: Remove the default de-bounce time
  pinctrl: pistachio: Drop pinctrl_unregister for devm_ registered device
  pinctrl: meson: Drop pinctrl_unregister for devm_ registered device

perf unwind: Use addr_location::addr instead of ip for entries

This fixes the srcline translation for call chains of user space
applications.

Before we got:

    perf report --stdio --no-children -s sym,srcline -g address
     8.92%  [.] main                                 mandelbrot.h:41
            |
            |--3.70%--main +8390240
            |          __libc_start_main +139950056726769
            |          _start +8388650
            |
            |--2.74%--main +8390189
            |
             --2.08%--main +8390296
                       __libc_start_main +139950056726769
                       _start +8388650

     7.59%  [.] main                                 complex:1326
            |
            |--4.79%--main +8390203
            |          __libc_start_main +139950056726769
            |          _start +8388650
            |
             --2.80%--main +8390219

     7.12%  [.] __muldc3                             libgcc2.c:1945
            |
            |--3.76%--__muldc3 +139950060519490
            |          main +8390224
            |          __libc_start_main +139950056726769
            |          _start +8388650
            |
             --3.32%--__muldc3 +139950060519512
                       main +8390224

With this patch applied, we instead get:

    perf report --stdio --no-children -s sym,srcline -g address
     8.92%  [.] main                                 mandelbrot.h:41
            |
            |--3.70%--main mandelbrot.h:41
            |          __libc_start_main +241
            |          _start +4194346
            |
            |--2.74%--main mandelbrot.h:41
            |
             --2.08%--main mandelbrot.h:41
                       __libc_start_main +241
                       _start +4194346

     7.59%  [.] main                                 complex:1326
            |
            |--4.79%--main complex:1326
            |          __libc_start_main +241
            |          _start +4194346
            |
             --2.80%--main complex:1326

     7.12%  [.] __muldc3                             libgcc2.c:1945
            |
            |--3.76%--__muldc3 libgcc2.c:1945
            |          main mandelbrot.h:39
            |          __libc_start_main +241
            |          _start +4194346
            |
             --3.32%--__muldc3 libgcc2.c:1945
                       main mandelbrot.h:39

Suggested-and-Acked-by: Namhyung Kim <[email protected]>
Signed-off-by: Milian Wolff <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
LPU-Reference: 20160816153926 [email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Merge tag 'perf-urgent-for-mingo-20160815' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix occasional decoding errors when tracing system-wide with
  Intel PT (Adrian Hunter)

- Fix ip compression in Intel PT for some specific packet types not
  present on current hardware (Adrian Hunter)

- Fix annotation of objects with debuginfo files (Anton Blanchard)

- Fix build on Fedora Rawhide (25) wrt using the right header to
  get the major() & minor() definitions in the jitdump code, now
  it is deprecated getting those using sys/types.h, one has to use
  sys/sysmacros.h (Arnaldo Carvalho de Melo)

- Sync arm64/s390 kvm related header files (Arnaldo Carvalho de Melo)

- Check for dup and fdopen failures in 'perf probe' (Colin Ian King,
  Arnaldo Carvalho de Melo)

- Fix showing callchains in pipe mode, i.e.

    perf record -g -o - workload | perf script

  now shows callchains (He Kuang)

- Show proper message when the scripts directory points to some
  invalid location in 'perf script --list' (He Kuang)

- Fix 'perf mem -t store' to record 'cpu/mem-stores/P' events
  again (Jiri Olsa)

- Fix ppc64le build failure when libelf is not present (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

e1000e: fix PTP on e1000_pch_lpt variants

I've got reports that the Intel I-218V NIC in Intel NUC5i5RYH systems used
as a PTP slave experiences random ~10 hour clock jumps, which are resolved
if the same workaround for the 82574 and 82583 is employed, so set the
appropriate flag2 in e1000_pch_lpt_info too.

Reported-by: Rupesh Patel <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000e: factor out systim sanitization

This is prepatory work for an expanding list of adapter families that have
occasional ~10 hour clock jumps when being used for PTP. Factor out the
sanitization function and convert to using a feature (bug) flag, per
suggestion from Jesse Brandeburg.

Littering functional code with device-specific checks is much messier than
simply checking a flag, and having device-specific init set flags as needed.
There are probably a number of other cases in the e1000e code that
could/should be converted similarly.

Suggested-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: fix adjusting PTP timestamps for Tx/Rx latency

Fix PHY delay compensation math in igb_ptp_tx_hwtstamp() and
igb_ptp_rx_rgtstamp. Add PHY delay compensation in
igb_ptp_rx_pktstamp().

In the IGB driver, there are two functions that retrieve timestamps
received by the PHY - igb_ptp_rx_rgtstamp() and igb_ptp_rx_pktstamp().
The previous commit only changed igb_ptp_rx_rgtstamp(), and the change
was incorrect.

There are two instances in which PHY delay compensations should be
made:

- Before the packet transmission over the PHY, the latency between
  when the packet is timestamped and transmission of the packets,
  should be an add operation, but it is currently a subtract.

- After the packets are received from the PHY, the latency between
  the receiving and timestamping of the packets should be a subtract
  operation, but it is currently an add.

Signed-off-by: Kshitiz Gupta <[email protected]>
Fixes: 3f544d2 (igb: adjust ptp timestamps for tx/rx latency)
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

KVM: arm64: ITS: avoid re-mapping LPIs

When a guest wants to map a device-ID/event-ID combination that is
already mapped, we may end up in a situation where an LPI is never
"put", thus never being freed.
Since the GICv3 spec says that mapping an already mapped LPI is
UNPREDICTABLE, lets just bail out early in this situation to avoid
any potential leaks.

Signed-off-by: Andre Przywara <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

block: Fix secure erase

Commit 288dab8a35a0 ("block: add a separate operation type for secure
erase") split REQ_OP_SECURE_ERASE from REQ_OP_DISCARD without considering
all the places REQ_OP_DISCARD was being used to mean either. Fix those.

Signed-off-by: Adrian Hunter <[email protected]>
Fixes: 288dab8a35a0 ("block: add a separate operation type for secure erase")
Signed-off-by: Jens Axboe <[email protected]>

pNFS/flexfiles: Set reasonable default retrans values for the data channel

Prior to this patch, the retrans value was set at 5, meaning that we
could see a maximum retransmission timeout value of more than 6 minutes.
That's a tad high for NFSv3 where the protocol does allow the server to
drop requests at any time.

Since this is a data channel, let's just set retrans to 0, and the default
timeout to 60s. The user can continue to adjust these defaults using the
dataserver_retrans and dataserver_timeo module parameters.

Signed-off-by: Trond Myklebust <[email protected]>

NFS: Allow the mount option retrans=0

We should allow retrans=0 as just meaning that every timeout is a major
timeout, and that there is no increment in the timeout value.

For instance, this means that we would allow TCP users to specify a
flat timeout value of 60s, by specifying "timeo=600,retrans=0" in their
mount option string.

Siged-off-by: Trond Myklebust <[email protected]>

drm/amdgpu: Change GART offset to 64-bit

The GART aperture size can be bigger than 4GB. Therefore the offset
used in amdgpu_gart_bind and amdgpu_gart_unbind must be 64-bit.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup

commit cbaadf0f90d6 ("ASoC: atmel_ssc_dai: refactor the startup and
shutdown") refactored code such that the SSC is reset on every
startup; this breaks duplex audio (e.g. first start audio playback,
then start record, causing the playback to stop/hang)

Fixes: cbaadf0f90d6 (ASoC: atmel_ssc_dai: refactor the startup and shutdown)
Signed-off-by: Christoph Huber <[email protected]>
Signed-off-by: Peter Meerwald-Stadler <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]

PM / hibernate: Fix rtree_next_node() to avoid walking off list ends

rtree_next_node() walks the linked list of leaf nodes to find the next
block of pages in the struct memory_bitmap. If it walks off the end of
the list of nodes, it walks the list of memory zones to find the next
region of memory. If it walks off the end of the list of zones, it
returns false.

This leaves the struct bm_position's node and zone pointers pointing
at their respective struct list_heads in struct mem_zone_bm_rtree.

memory_bm_find_bit() uses struct bm_position's node and zone pointers
to avoid walking lists and trees if the next bit appears in the same
node/zone. It handles these values being stale.

Swap rtree_next_node()s 'step then test' to 'test-next then step',
this means if we reach the end of memory we return false and leave
the node and zone pointers as they were.

This fixes a panic on resume using AMD Seattle with 64K pages:
[    6.868732] Freezing user space processes ... (elapsed 0.000 seconds) done.
[    6.875753] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
[    6.896453] PM: Using 3 thread(s) for decompression.
[    6.896453] PM: Loading and decompressing image data (5339 pages)...
[    7.318890] PM: Image loading progress:   0%
[    7.323395] Unable to handle kernel paging request at virtual address 00800040
[    7.330611] pgd = ffff000008df0000
[    7.334003] [00800040] *pgd=00000083fffe0003, *pud=00000083fffe0003, *pmd=00000083fffd0003, *pte=0000000000000000
[    7.344266] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    7.349825] Modules linked in:
[    7.352871] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G        W I     4.8.0-rc1 #4737
[    7.360512] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
[    7.369109] task: ffff8003c0220000 task.stack: ffff8003c0280000
[    7.375020] PC is at set_bit+0x18/0x30
[    7.378758] LR is at memory_bm_set_bit+0x24/0x30
[    7.383362] pc : [<ffff00000835bbc8>] lr : [<ffff0000080faf18>] pstate: 60000045
[    7.390743] sp : ffff8003c0283b00
[    7.473551]
[    7.475031] Process swapper/0 (pid: 1, stack limit = 0xffff8003c0280020)
[    7.481718] Stack: (0xffff8003c0283b00 to 0xffff8003c0284000)
[    7.800075] Call trace:
[    7.887097] [<ffff00000835bbc8>] set_bit+0x18/0x30
[    7.891876] [<ffff0000080fb038>] duplicate_memory_bitmap.constprop.38+0x54/0x70
[    7.899172] [<ffff0000080fcc40>] snapshot_write_next+0x22c/0x47c
[    7.905166] [<ffff0000080fe1b4>] load_image_lzo+0x754/0xa88
[    7.910725] [<ffff0000080ff0a8>] swsusp_read+0x144/0x230
[    7.916025] [<ffff0000080fa338>] load_image_and_restore+0x58/0x90
[    7.922105] [<ffff0000080fa660>] software_resume+0x2f0/0x338
[    7.927752] [<ffff000008083350>] do_one_initcall+0x38/0x11c
[    7.933314] [<ffff000008b40cc0>] kernel_init_freeable+0x14c/0x1ec
[    7.939395] [<ffff0000087ce564>] kernel_init+0x10/0xfc
[    7.944520] [<ffff000008082e90>] ret_from_fork+0x10/0x40
[    7.949820] Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
[    7.955909] ---[ end trace 0024a5986e6ff323 ]---
[    7.960529] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Here struct mem_zone_bm_rtree's start_pfn has been returned instead of
struct rtree_node's addr as the node/zone pointers are corrupt after
we walked off the end of the lists during mark_unsafe_pages().

This behaviour was exposed by commit 6dbecfd345a6 ("PM / hibernate:
Simplify mark_unsafe_pages()"), which caused mark_unsafe_pages() to call
duplicate_memory_bitmap(), which uses memory_bm_find_bit() after walking
off the end of the memory bitmap.

Fixes: 3a20cb177961 (PM / Hibernate: Implement position keeping in radix tree)
Signed-off-by: James Morse <[email protected]>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

ASoC: compress: Fix leak of a widget list in soc_compr_open_fe

After we have called dpcm_path_get we should make sure to call
dpcm_path_put on all error paths. This was not happening causing the
allocated widget list to be leaked, this patch corrects this by adding a
dpcm_path_put to the error path.

Signed-off-by: Charles Keepax <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

crypto: sha512-mb - fix ctx pointer

1. fix ctx pointer
Use req_ctx which is the ctx for the next job that have
been completed in the lanes instead of the first
completed job rctx, whose completion could have been
called and released.

Signed-off-by: Xiaodong Liu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: sha256-mb - fix ctx pointer and digest copy

1. fix ctx pointer
Use req_ctx which is the ctx for the next job that have
been completed in the lanes instead of the first
completed job rctx, whose completion could have been
called and released.
2. fix digest copy
Use XMM register to copy another 16 bytes sha256 digest
instead of a regular register.

Signed-off-by: Xiaodong Liu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

xhci: don't dereference a xhci member after removing xhci

Remove the hcd after checking for the xhci last quirks, not before.

This caused a hang on a Alpine Ridge xhci based maching which remove
the whole xhci controller when unplugging the last usb device

CC: <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: xhci: Fix panic if disconnect

After a device is disconnected, xhci_stop_device() will be invoked
in xhci_bus_suspend().
Also the "disconnect" IRQ will have ISR to invoke
xhci_free_virt_device() in this sequence.
xhci_irq -> xhci_handle_event -> handle_cmd_completion ->
xhci_handle_cmd_disable_slot -> xhci_free_virt_device

If xhci->devs[slot_id] has been assigned to NULL in
xhci_free_virt_device(), then virt_dev->eps[i].ring in
xhci_stop_device() may point to an invlid address to cause kernel
panic.

virt_dev = xhci->devs[slot_id];
:
if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue)

[] Unable to handle kernel paging request at virtual address 00001a68
[] pgd=ffffffc001430000
[] [00001a68] *pgd=000000013c807003, *pud=000000013c807003,
*pmd=000000013c808003, *pte=0000000000000000
[] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[] CPU: 0 PID: 39 Comm: kworker/0:1 Tainted: G U
[] Workqueue: pm pm_runtime_work
[] task: ffffffc0bc0e0bc0 ti: ffffffc0bc0ec000 task.ti:
ffffffc0bc0ec000
[] PC is at xhci_stop_device.constprop.11+0xb4/0x1a4

This issue is found when running with realtek ethernet device
(0bda:8153).

Signed-off-by: Jim Lin <[email protected]>
Cc: <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: really enqueue zero length TRBs.

Enqueue the first TRB even if full_len is zero.
Without this "adb install <apk>" freezes the system.

Signed-off-by: Alban Browaeys <[email protected]>
Fixes: 86065c2719a5 ("xhci: don't rely on precalculated value of needed trbs in the enqueue loop")
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: always handle "Command Ring Stopped" events

Fix "Command completion event does not match command" errors by always
handling the command ring stopped events.

The command ring stopped event is generated as a result of aborting
or stopping the command ring with a register write. It is not caused
by a command in the command queue, and thus won't have a matching command
in the comman list.

Solve it by handling the command ring stopped event before checking for a
matching command.

In most command time out cases we abort the command ring, and get
a command ring stopped event. The events command pointer will point at
the current command ring dequeue, which in most cases matches the timed
out command in the command list, and no error messages are seen.

If we instead get a command aborted event before the command ring stopped
event, the abort event will increse the command ring dequeue pointer, and
the following command ring stopped events command pointer will point at the
next, not yet queued command. This case triggered the error message

Signed-off-by: Mathias Nyman <[email protected]>
CC: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge branch 'mediatek-fixes'

Sean Wang says:

====================
mediatek: Fix warning and issue

This patch set fixes the following warning and issues

v1 -> v2: Fix message typos and add coverletter

v2 -> v3: Split from the previous series for submitting bug fixes
as a series targeting 'net'
====================

Signed-off-by: David S. Miller <[email protected]>

net: ethernet: mediatek: fix runtime warning raised by inconsistent struct device pointers passed to DMA API

Runtime warning occurs if DMA-API debug feature is enabled that would be
raised by pointers passed to DMA API as arguments to inconsistent struct
device objects, so that the patch makes them usage aligned between DMA
operations such as dma_map_*() and dma_unmap_*() to eliminate the warning.

Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: mediatek: fix flow control settings on GMAC0 is not being enabled properly

Commit 08ef55c6f257acf3bdc6940813f80e8f0f5d90ec
("net-next: mediatek: fix gigabit and flow control advertisement")
had supported proper flow control settings for GMAC1. But for GMAC0,

1.GMAC0 shares the common logic with GMAC1 inside mtk_phy_link_adjust()
to adapt various settings for the target phy.

2.GMAC0 uses fixed-phy to connect to a builtin gigabit switch with
fixed link speed as commit 0c72c50f6f93b0c3daa9ea35d89ab3a933c7b5a0
("net-next: mediatek: add fixed-phy support") describes.

3.However, fixed-phy doesn't enable SUPPORTED_Pause & SUPPORTED_Asym_Pause
supported flag on default that would cause mtk_phy_link_adjust() not to
enable flow control setting on GMAC0 properly and cause packet dropped
when high traffic.

Due to these reasons, the patch adds SUPPORTED_Pause & SUPPORTED_Asym_Pause
supported flags on fixed-phy used by the driver to have proper handling on
the both GMAC with the shared common logic.

Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: mediatek: fix RMII mode and add REVMII supported by GMAC

The patch fixes up the incorrect setup of reduced MII (RMII) on GMAC
and adds the supplement for the setup of reverse MII (REVMII) on GMAC
, and rearranges the error handling for invalid PHY argument.

Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

x86/power/64: Use __pa() for physical address computation

The value of temp_level4_pgt is the physical address of the
top-level page directory, so use __pa() to compute it.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Ingo Molnar <[email protected]>

perf intel-pt: Fix occasional decoding errors when tracing system-wide

In order to successfully decode Intel PT traces, context switch events
are needed from the moment the trace starts. Currently that is ensured
by using the 'immediate' flag which enables the switch event when it is
opened.

However, since commit 86c2786994bd ("perf intel-pt: Add support for
PERF_RECORD_SWITCH") that might not always happen. When tracing
system-wide the context switch event is added to the tracking event
which was not set as 'immediate'. Change that so it is.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected] # v4.4+
Fixes: 86c2786994bd ("perf intel-pt: Add support for PERF_RECORD_SWITCH")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

tools: Sync kvm related header files for arm64 and s390

From a quick look nothing stands out as requiring changes to kvm tools
such as tools/perf/arch/s390/util/kvm-stat.c.

Silences these header checking warnings:

  $ make -C tools/perf
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
  Warning: tools/arch/s390/include/uapi/asm/kvm.h differs from kernel
  Warning: tools/arch/s390/include/uapi/asm/sie.h differs from kernel
  Warning: tools/arch/arm64/include/uapi/asm/kvm.h differs from kernel
  <SNIP>

Cc: Adrian Hunter <[email protected]>
Cc: Alexander Yarygin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Naveen N. Rao <[email protected]>
Cc: Scott Wood <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf probe: Release resources on error when handling exit paths

Cc: Adrian Hunter <[email protected]>
Cc: Colin King <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

power_supply: tps65217-charger: fix missing platform_set_drvdata()

Add missing platform_set_drvdata() in tps65217_charger_probe(), otherwise
calling platform_get_drvdata() in remove returns NULL.

This is detected by Coccinelle semantic patch.

Fixes: 3636859b280c ("power_supply: Add support for tps65217-charger")
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Sebastian Reichel <[email protected]>

KVM: arm64: check for ITS device on MSI injection

When userspace provides the doorbell address for an MSI to be
injected into the guest, we find a KVM device which feels responsible.
Lets check that this device is really an emulated ITS before we make
real use of the container_of-ed pointer.

[ Moved NULL-pointer check to caller of static function
- Christoffer ]

Signed-off-by: Andre Przywara <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

KVM: arm64: ITS: move ITS registration into first VCPU run

Currently we register an ITS device upon userland issuing the CTLR_INIT
ioctl to mark initialization of the ITS as done.
This deviates from the initialization sequence of the existing GIC
devices and does not play well with the way QEMU handles things.
To be more in line with what we are used to, register the ITS(es) just
before the first VCPU is about to run, so in the map_resources() call.
This involves iterating through the list of KVM devices and map each
ITS that we find.

Signed-off-by: Andre Przywara <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic

There are two problems with the current implementation of the MMIO
handlers for the propbaser and pendbaser:

First, the write to the value itself is not guaranteed to be an atomic
64-bit write so two concurrent writes to the structure field could be
intermixed.

Second, because we do a read-modify-update operation without any
synchronization, if we have two 32-bit accesses to separate parts of the
register, we can loose one of them.

By using the atomic cmpxchg64 we should cover both issues above.

Reviewed-by: Andre Przywara <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

tipc: fix NULL pointer dereference in shutdown()

tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
call tipc_node_xmit_skb() on it.

    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
    RIP: 0010:[<ffffffff830bb46b>]  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
    RSP: 0018:ffff8800595bfce8  EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
    RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
    RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
    R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
    R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
    FS:  00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
    DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
    Stack:
     0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
     ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
     0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
    Call Trace:
     [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880
     [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
    RIP  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
     RSP <ffff8800595bfce8>
    ---[ end trace 57b0484e351e71f1 ]---

I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
userspace is equipped to handle that. Anyway, this is better than a GPF
and looks somewhat consistent with other tipc_msg_create() callers.

Signed-off-by: Vegard Nossum <[email protected]>
Acked-by: Ying Xue <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'hv_netvsc-VF-removal-fixes'

Vitaly Kuznetsov says:

====================
hv_netvsc: fixes for VF removal path

Kernel crash is reported after VF is removed and detached from netvsc
device. Turns out we have multiple different (but related) issues on the
VF removal path which I'm trying to address with PATCHes 2-5 of this
series. PATCH1 is required to support the change.

Changes since v1:
- Re-arrange patches in the series to not introduce new issues [David Miller]
- Add PATCH5 which fixes a new issue I discovered while testing.
- Add Haiyang' A-b tags to PATCH1-4

With regards to Stephen's suggestion: I believe that switching to using RCU
and eliminating vf_use_cnt/vf_inject is the right thing to do long-term, we
can either put this on top of this series or do it later in net-next.
====================

Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: fix bonding devices check in netvsc_netdev_event()

Bonding driver sets IFF_BONDING on both master (the bonding device) and
slave (the real NIC) devices and in netvsc_netdev_event() we want to skip
master devices only. Currently, there is an uncertainty when a slave
interface is removed: if bonding module comes first in netdev_chain it
clears IFF_BONDING flag on the netdev and netvsc_netdev_event() correctly
handles NETDEV_UNREGISTER event, but in case netvsc comes first on the
chain it sees the device with IFF_BONDING still attached and skips it. As
we still hold vf_netdev pointer to the device we crash on the next inject.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Acked-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev

We're not guaranteed to see NETDEV_REGISTER/NETDEV_UNREGISTER notifications
only once per VF but we increase/decrease module refcount unconditionally.
Check vf_netdev to make sure we don't take/release it twice. We presume
that only one VF per netvsc device may exist.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Acked-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>