Git Repo - linux.git/log

net/mlx5_core: Firmware commands to support flow counters

Getting packet/byte statistics on flows is done through flow counters.
Implement the firmware commands to alloc, free and query flow counters.

Signed-off-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5_core: Use a macro in mlx5_command_str()

Use a macro instead of copying the OP name.

Signed-off-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: cls_flower: Hardware offloaded filters statistics support

Introduce a new command in ndo_setup_tc() for hardware offloaded
filters, to call the NIC driver, and make it update the statistics.
This will be done before dumping the filter and its statistics.

Signed-off-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: act_gact: Update statistics when offloaded to hardware

Implement the stats_update callback that will be called by NIC drivers
for hardware offloaded filters.

Signed-off-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: Enable netdev drivers to update statistics of offloaded actions

Introduce stats_update callback. netdev driver could call it for offloaded
actions to update the basic statistics (packets, bytes and last use).
Since bstats_update() and bstats_cpu_update() use skb as an argument to
get the counters, _bstats_update() and _bstats_cpu_update(), that get
bytes and packets as arguments, were added.

Signed-off-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'pxa168_eth-perf'

Jisheng Zhang says:

====================
net: pxa168_eth: improve performance

This series is to improve the pxa168_eth driver performance by using
{readl|writel}_relaxed or appropriate memory barriers.
====================

Signed-off-by: David S. Miller <[email protected]>

net: pxa168_eth: Use dma_wmb/rmb where appropriate

Update the pxa168_eth driver to use the dma_rmb/wmb calls instead of the
full barriers in order to improve performance: reduced 97ns/39ns on
average in tx/rx path on Marvell BG4CT platform.

Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: pxa168_eth: use {readl|writel}_relaxed instead of readl/writel

Since appropriate memory barriers are already there, use the relaxed
version to improve performance a bit.

Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vxlan: set mac_header correctly in GPE mode

For VXLAN-GPE, the interface is ARPHRD_NONE, thus we need to reset
mac_header after pulling the outer header.

v2: Put the code to the existing conditional block as suggested by
Shmulik Ladkani.

Fixes: e1e5314de08b ("vxlan: implement GPE")
Signed-off-by: Jiri Benc <[email protected]>
Reviewed-by: Shmulik Ladkani <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'xen-netback-control-ring'

Paul Durrant says:

====================
xen-netback: support for control ring

My recent patch to import an up-to-date include/xen/interface/io/netif.h
from the Xen Project brought in the necessary definitions to support the
new control shared ring and protocol. This patch series updates xen-netback
to support the new ring.

Patch #1 adds the necessary boilerplate to map the control ring and handle
messages. No implementation of the new protocol is included in this patch
so that it can be kept to a reasonable size.

Patch #2 adds the protocol implementation.

Patch #3 adds support for passing has values calculated by xen-netback to
capable frontends.

Patch #4 adds support for accepting hash values calculated by capable
frontends and using them the set the socket buffer hash.
====================

Signed-off-by: David S. Miller <[email protected]>

xen-netback: use hash value from the frontend

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.

Signed-off-by: Paul Durrant <[email protected]>
Acked-by: Wei Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xen-netback: pass hash value to the frontend

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.

Signed-off-by: Paul Durrant <[email protected]>
Acked-by: Wei Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xen-netback: add control protocol implementation

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant <[email protected]>
Cc: Wei Liu <[email protected]>
Acked-by: Wei Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xen-netback: add control ring boilerplate

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant <[email protected]>
Acked-by: Wei Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'cls_u32_hw_sw'

Sridhar Samudrala says:

====================
Enable SW only or HW only offloads with u32 classifier

This set of patches export TCA_CLS_FLAGS_SKIP_HW to userspace and also
introduces another flag TCA_CLS_FLAGS_SKIP_SW. These flags enable offloading
u32 filters to either SW or HW only.

The default semantics with no flags is to add the filter to HW if possible and
also into SW.
With SKIP_HW flag, the filter is only added to SW.
With SKIP_SW flag, the filter is added to HW and an error is returned
to user on failure.
These flags are mutually exclusive.
There was an earlier discussion on these semantics in the following email
thread.
http://thread.gmane.org/gmane.linux.network/401733
====================

Signed-off-by: David S. Miller <[email protected]>

net: cls_u32: Add support for skip-sw flag to tc u32 classifier.

On devices that support TC U32 offloads, this flag enables a filter to be
added only to HW. skip-sw and skip-hw are mutually exclusive flags. By
default without any flags, the filter is added to both HW and SW, but no
error checks are done in case of failure to add to HW. With skip-sw,
failure to add to HW is treated as an error.

Here is a sample script that adds 2 filters, one with skip-sw and the other
with skip-hw flag.

   # add ingress qdisc
   tc qdisc add dev p4p1 ingress

   # enable hw tc offload.
   ethtool -K p4p1 hw-tc-offload on

   # add u32 filter with skip-sw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:1 u32 ht 800: flowid 800:1 \
      skip-sw \
      match ip src 192.168.1.0/24 \
      action drop

   # add u32 filter with skip-hw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:2 u32 ht 800: flowid 800:2 \
      skip-hw \
      match ip src 192.168.2.0/24 \
      action drop

Signed-off-by: Sridhar Samudrala <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sched: Move TCA_CLS_FLAGS_SKIP_HW to uapi header file.

Signed-off-by: Sridhar Samudrala <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'hv_netvsc-races'

Vitaly Kuznetsov says:

====================
hv_netvsc: avoid races on mtu change/set channels

Changes since v1:
- Rebased to net-next [Haiyang Zhang]

Original description:

MTU change and set channels operations are implemented as netvsc device
re-creation destroying internal structures (struct net_device stays). This
is really unfortunate but there is no support from Hyper-V host to do it
in a different way. Such re-creation is unsurprisingly racy, Haiyang
reported a crash when netvsc_change_mtu() is racing with
netvsc_link_change() but I was able to identify additional races upon
investigation. Both netvsc_set_channels() and netvsc_change_mtu() race
against:
1) netvsc_link_change()
2) netvsc_remove()
3) netvsc_send()

To solve these issues without introducing new locks some refactoring is
required. We need to get rid of very complex link graph in all the
internal structures and avoid traveling through structures which are being
removed.
====================

Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: set nvdev link after populating chn_table

Crash in netvsc_send() is observed when netvsc device is re-created on
mtu change/set channels. The crash is caused by dereferencing of NULL
channel pointer which comes from chn_table. The root cause is a mixture
of two facts:
- we set nvdev pointer in net_device_context in alloc_net_device()
before we populate chn_table.
- we populate chn_table[0] only.

The issue could be papered over by checking channel != NULL in
netvsc_send() but populating the whole chn_table and writing the
nvdev pointer afterwards seems more appropriate.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: synchronize netvsc_change_mtu()/netvsc_set_channels() with netvsc_remove()

When netvsc device is removed during mtu change or channels setup we get
into troubles as both paths are trying to remove the device. Synchronize
them with start_remove flag and rtnl lock.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: get rid of struct net_device pointer in struct netvsc_device

Simplify netvsvc pointer graph by getting rid of the redundant ndev
pointer. We can always get a pointer to struct net_device from somewhere
else.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: untangle the pointer mess

We have the following structures keeping netvsc adapter state:
- struct net_device
- struct net_device_context
- struct netvsc_device
- struct rndis_device
- struct hv_device
and there are pointers/dependencies between them:
- struct net_device_context is contained in struct net_device
- struct hv_device has driver_data pointer which points to
  'struct net_device' OR 'struct netvsc_device' depending on driver's
  state (!).
- struct net_device_context has a pointer to 'struct hv_device'.
- struct netvsc_device has pointers to 'struct hv_device' and
  'struct net_device_context'.
- struct rndis_device has a pointer to 'struct netvsc_device'.

Different functions get different structures as parameters and use these
pointers for traveling. The problem is (in addition to keeping in mind
this complex graph) that some of these structures (struct netvsc_device
and struct rndis_device) are being removed and re-created on mtu change
(as we implement it as re-creation of hyper-v device) so our travel using
these pointers is dangerous.

Simplify this to a the following:
- add struct netvsc_device pointer to struct net_device_context (which is
  a part of struct net_device and thus never disappears)
- remove struct hv_device and struct net_device_context pointers from
  struct netvsc_device
- replace pointer to 'struct netvsc_device' with pointer to
  'struct net_device'.
- always keep 'struct net_device' in hv_device driver_data.

We'll end up with the following 'circular' structure:

net_device:
[net_device_context] -> netvsc_device -> rndis_device -> net_device
                      -> hv_device -> net_device

On MTU change we'll be removing the 'netvsc_device -> rndis_device'
branch and re-creating it making the synchronization easier.

There is one additional redundant pointer left, it is struct net_device
link in struct netvsc_device, it is going to be removed in a separate
commit.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: use start_remove flag to protect netvsc_link_change()

netvsc_link_change() can race with netvsc_change_mtu() or
netvsc_set_channels() as these functions destroy struct netvsc_device and
rndis filter. Use start_remove flag for syncronization. As
netvsc_change_mtu()/netvsc_set_channels() are called with rtnl lock held
we need to take it before checking start_remove value in
netvsc_link_change().

Reported-by: Haiyang Zhang <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: move start_remove flag to net_device_context

struct netvsc_device is destroyed on mtu change so keeping the
protection flag there is not a good idea. Move it to struct
net_device_context which is preserved.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phy: add support for a reset-gpio specification

The framework only asserts (for now) that the reset gpio is not active.

Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: Roger Quadros <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

blk-mq: fix undefined behaviour in order_to_size()

When this_order variable in blk_mq_init_rq_map() becomes zero
the code incorrectly decrements the variable and passes the result
to order_to_size() helper causing undefined behaviour:

UBSAN: Undefined behaviour in block/blk-mq.c:1459:27
shift exponent 4294967295 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00072-g33656a1 #22

Fix the code by checking this_order variable for not having the zero
value first.

Reported-by: Meelis Roos <[email protected]>
Fixes: 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism")
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

ibft: Expose iBFT acpi header via sysfs

Some ethernet adapter vendors are supplying products which support optional
(payed license) features. On some adapters this includes a hardware iscsi
initiator.  The same adapters in a normal (no extra licenses) mode of
operation can be used as a software iscsi initiator.  In addition, software
iscsi boot initiators are becoming a standard part of many vendors uefi
implementations.  This is creating difficulties during early boot/install
determining the proper configuration method for these adapters when they
are used as a boot device.

The attached patch creates sysfs entries to expose information from the
acpi header of the ibft table.  This information allows for a method to
easily determining if an ibft table was created by a ethernet card's
firmware or the system uefi/bios.  In the case of a hardware initiator this
information in combination with the pci vendor and device id can be used
to ascertain any vendor specific behaviors that need to be accommodated.

Reviewed-by: Lee Duncan <[email protected]>
Signed-off-by: David Bond <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

iscsi_ibft: Add prefix-len attr and display netmask

The iBFT table only specifies a prefix length, not a netmask.
And the netmask is pretty much pointless for IPv6.
So introduce a new attribute 'prefix-len'.

Some older user-space code might rely on the netmask attribute
being present, so we should always display it.

Changes from v1:
- Combined two patches into one

Changes from v2:
- Cleaned up/corrected wording for patch description

Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Lee Duncan <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Merge branches 'acpi-pci', 'acpi-misc' and 'acpi-tools'

* acpi-pci:
  ACPI,PCI,IRQ: remove SCI penalize function
  ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()
  ACPI,PCI,IRQ: reduce static IRQ array size to 16
  ACPI,PCI,IRQ: reduce resource requirements

* acpi-misc:
  ACPI / sysfs: fix error code in get_status()
  ACPI / device_sysfs: Clean up checkpatch errors
  ACPI / device_sysfs: Change _SUN and _STA show functions error return to EIO
  ACPI / device_sysfs: Add sysfs support for _HRV hardware revision
  arm64: defconfig: Enable ACPI
  ACPI / ARM64: Remove EXPERT dependency for ACPI on ARM64
  ACPI / ARM64: Don't enable ACPI by default on ARM64
  acer-wmi: Use acpi_dev_found()
  eeepc-wmi: Use acpi_dev_found()
  ACPI / utils: Rename acpi_dev_present()

* acpi-tools:
  tools/power/acpi: close file only if it is open

Merge branches 'acpi-numa', 'acpi-tables' and 'acpi-osi'

* acpi-numa:
  ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present

* acpi-tables:
  ACPI / tables: Fix DSDT override mechanism
  ACPI / tables: Convert initrd table override to table upgrade mechanism
  ACPI / x86: Cleanup initrd related code
  ACPI / tables: Move table override mechanisms to tables.c

* acpi-osi:
  ACPI / osi: Collect _OSI handling into one single file
  ACPI / osi: Cleanup coding style issues before creating a separate OSI source file
  ACPI / osi: Cleanup OSI handling code to use bool
  ACPI / osi: Fix default _OSI(Darwin) support
  ACPI / osi: Add acpi_osi=!! to allow reverting acpi_osi=!
  ACPI / osi: Cleanup _OSI("Linux") related code before introducing new support
  ACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings

Conflicts:
drivers/acpi/internal.h

Merge branches 'acpi-drivers', 'acpi-pm', 'acpi-ec' and 'acpi-video'

* acpi-drivers:
  ACPI / GED: make evged.c explicitly non-modular
  ACPI / amba: Remove CLK_IS_ROOT
  ACPI / APD: Remove CLK_IS_ROOT
  ACPI: implement Generic Event Device

* acpi-pm:
  ACPI / PM: Introduce efi poweroff for HW-full platforms without _S5

* acpi-ec:
  ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis
  ACPI 2.0 / ECDT: Enable correct ECDT initialization order
  ACPI 2.0 / ECDT: Remove early namespace reference from EC
  ACPI 2.0 / ECDT: Split EC_FLAGS_HANDLERS_INSTALLED

* acpi-video:
  ACPI / video: mark acpi_video_get_levels() inline
  Thermal / ACPI / video: add INT3406 thermal driver
  ACPI/video: export acpi_video_get_levels
  video / backlight: remove the backlight_device_registered API
  video / backlight: add two APIs for drivers to use

Merge branch 'acpica'

* acpica: (41 commits)
  ACPICA: Update version to 20160422
  ACPICA: Move all ASCII utilities to a common file
  ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support for acpi_hw_write()
  ACPICA: ACPI 2.0, Hardware: Add access_width/bit_offset support in acpi_hw_read()
  ACPICA: Executer: Introduce a set of macros to handle bit width mask generation
  ACPICA: Hardware: Add optimized access bit width support
  ACPICA: Utilities: Add ACPI_IS_ALIGNED() macro
  ACPICA: Renamed some #defined flag constants for clarity
  ACPICA: ACPI 6.0, tools/iasl: Add support for new resource descriptors
  ACPICA: ACPI 6.0: Update _BIX support for new package element
  ACPICA: ACPI 6.1: Support for new PCCT subtable
  ACPICA: Refactor evaluate_object to reduce nesting
  ACPICA: Divergence: remove unwanted spaces for typedef
  ACPICA: Update version to 20160318
  ACPICA: Namespace: Reorder \_SB._INI to make sure it is evaluated before _REG evaluations
  ACPICA: Events: Fix an issue that _REG association can happen before namespace is initialized
  ACPICA: Tables: Fix wrong MLC condition for dynamic table loading
  ACPICA: Interpreter: Fix wrong conditions for acpi_ev_install_region_handlers() invocation
  ACPICA: Hardware: Enhance acpi_hw_validate_register() with access_width/bit_offset awareness
  Utilities: Fix missing parentheses in ACPI_GET_BITS()/ACPI_SET_BITS()
  ...

Merge branches 'pm-avs', 'pm-clk', 'powercap' and 'pm-tools'

* pm-avs:
  PM / AVS: rockchip-io: make io-domains a child of the GRF

* pm-clk:
  PM / clk: ensure we don't allocate a -ve size of count clks

* powercap:
  powercap/intel_rapl: Add support for Kabylake

* pm-tools:
  cpupower: fix potential memory leak
  cpupower: Add cpuidle parts into library
  cpupowerutils: bench: trivial fix of spelling mistake on "average"
  Fix cpupower manpages "NAME" section
  cpupower: bench: parse.c: fix several resource leaks
  Honour user's LDFLAGS

Merge branches 'pm-core' and 'pm-domains'

* pm-core:
  PM / sleep: Drop unused `info' variable
  PM / Runtime: Move ignore_children flag under CONFIG_PM
  PM / Runtime: Fix error path in pm_runtime_force_resume()

* pm-domains:
  PM / Domains: Drop unnecessary wakeup code from pm_genpd_prepare()
  PM / Domains: Remove redundant pm_runtime_get|put*() in pm_genpd_prepare()
  PM / Domains: Remove ->save|restore_state() callbacks
  PM / Domains: Rename pm_genpd_runtime_suspend|resume()
  PM / Domains: Rename stop_ok to suspend_ok for the genpd governor

Merge branch 'pm-devfreq'

* pm-devfreq:
  PM / devfreq: style/typo fixes
  PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus
  PM / devfreq: event: Find the instance of devfreq-event device by using phandle
  PM / devfreq: event: Add new Exynos NoC probe driver
  MAINTAINERS: Add samsung bus frequency driver entry
  PM / devfreq: exynos: Remove unused exynos4/5 busfreq driver
  PM / devfreq: exynos: Add the detailed correlation between sub-blocks and power line
  PM / devfreq: exynos: Update documentation for bus devices using passive governor
  PM / devfreq: exynos: Add support of bus frequency of sub-blocks using passive governor
  PM / devfreq: Add new passive governor
  PM / devfreq: Add new DEVFREQ_TRANSITION_NOTIFIER notifier
  PM / devfreq: Add devfreq_get_devfreq_by_phandle()
  PM / devfreq: exynos: Add documentation for generic exynos bus frequency driver
  PM / devfreq: exynos: Add generic exynos bus frequency driver

Merge branch 'pm-cpuidle'

* pm-cpuidle:
  cpuidle: Replace ktime_get() with local_clock()
  drivers: firmware: psci: use const and __initconst for psci_cpuidle_ops
  soc: qcom: spm: Use const and __initconst for qcom_cpuidle_ops
  ARM: cpuidle: constify return value of arm_cpuidle_get_ops()
  ARM: cpuidle: add const qualifier to cpuidle_ops member in structures
  intel_idle: add BXT support
  cpuidle: Indicate when a device has been unregistered

Merge branch 'pm-cpufreq'

* pm-cpufreq: (63 commits)
  intel_pstate: Clean up get_target_pstate_use_performance()
  intel_pstate: Use sample.core_avg_perf in get_avg_pstate()
  intel_pstate: Clarify average performance computation
  intel_pstate: Avoid unnecessary synchronize_sched() during initialization
  cpufreq: schedutil: Make default depend on CONFIG_SMP
  cpufreq: powernv: del_timer_sync when global and local pstate are equal
  cpufreq: powernv: Move smp_call_function_any() out of irq safe block
  intel_pstate: Clean up intel_pstate_get()
  cpufreq: schedutil: Make it depend on CONFIG_SMP
  cpufreq: governor: Fix handling of special cases in dbs_update()
  cpufreq: intel_pstate: Ignore _PPC processing under HWP
  cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table
  cpufreq: tango: Use generic platdev driver
  cpufreq: Fix GOV_LIMITS handling for the userspace governor
  cpufreq: mvebu: Move cpufreq code into drivers/cpufreq/
  cpufreq: dt: Kill platform-data
  mvebu: Use dev_pm_opp_set_sharing_cpus() to mark OPP tables as shared
  cpufreq: dt: Identify cpu-sharing for platforms without operating-points-v2
  cpufreq: governor: Change confusing struct field and variable names
  cpufreq: intel_pstate: Enable PPC enforcement for servers
  ...

Merge branch 'pm-opp'

* pm-opp:
  PM / OPP: Move CONFIG_OF dependent code in a separate file
  PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
  PM / OPP: pass cpumask by reference
  PM / OPP: Add dev_pm_opp_get_sharing_cpus()
  PM / OPP: Mark cpumask as const in dev_pm_opp_set_sharing_cpus()
  PM / OPP: -ENOSYS is applicable only to syscalls
  PM / OPP: Mark shared-opp for non-dt case
  PM / OPP: Relocate dev_pm_opp_set_sharing_cpus()
  PM / OPP: dev_pm_opp_set_sharing_cpus() doesn't depend on CONFIG_OF
  PM / OPP: Add missing doc style comments
  PM / OPP: Propagate the error returned by _find_opp_table()

locking/rwsem: Fix comment on register clobbering

Document explicitly that %edx can get clobbered on the slow path, on
32-bit kernels. Something I learned the hard way. :-\

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

mmc: mmc: Fix partition switch timeout for some eMMCs

Some eMMCs set the partition switch timeout too low.

Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable. Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.

Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.

The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.

Cc: [email protected]
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sh_mobile_sdhi: enable SDIO IRQs for RCar Gen3

Tested on a Salvator-X board with a Spectec SDW-823 WLAN card.

Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdio: fall back to SDIO 1.0 for broken 1.1 cards

I have two SDIO WLAN cards which specify being SDIO Rev. 1.1 cards but
their FUNCE tuple reports the smaller size of a Rev 1.0 card. So,
enforce 1.0 on these cards to avoid reading the not present registers.
They are not really used anyhow. My cards initialize properly after this
patch.

Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci-st: correct name of sd-uhs-sdr50 property

Correct what appears to be a typo in the name of the sd-uhs-sdr50.

Also fix mixed tab/space indentation.

Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

MAINTAINERS: update entry for TMIO MMC driver

I have some more additions planned for this driver, so I'd like to get
notified of other changes and coordinate them. Drop Ian as maintainer
because he hasn't been involved in development for a while. Thanks for
all the initial work, of course! Also, reflect the recent changes to
the include file layout.

Signed-off-by: Wolfram Sang <[email protected]>
Cc: Ian Molton <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: block: improve logging of handling emmc timeouts

Add some logging to make it clear just how the emmc timeout
was handled.

Signed-off-by: Ken Sumrall <[email protected]>
[AmitP: cherry-picked this Android patch from aosp
common kernel android-4.4]
Signed-off-by: Amit Pundir <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci: removed unneeded function wrappers

After commit d6463f170cf0 ("mmc: sdhci: Remove redundant runtime PM calls"),
some of original sdhci_do_xx() function wrappers becomes meaningless,
so remove them.

Signed-off-by: Dong Aisheng <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

Linux 4.6

locking/rwsem: Fix down_write_killable()

The new signal_pending exit path in __rwsem_down_write_failed_common()
was fingered as breaking his kernel by Tetsuo Handa.

Upon inspection it was found that there are two things wrong with it;

- it forgets to remove WAITING_BIAS if it leaves the list empty, or
- it forgets to wake further waiters that were blocked on the now
removed waiter.

Especially the first issue causes new lock attempts to block and stall
indefinitely, as the code assumes that pending waiters mean there is
an owner that will wake when it releases the lock.

Reported-by: Tetsuo Handa <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Tested-by: Michal Hocko <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: Waiman Long <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-05-14

This series contains updates to i40e and i40evf.

Kevin adds support to disable link on all ports and changes bits set
for telling firmware the PHY needs to be modified by the driver.

Anjali adds a feature to enable/disable all multicast for a trusted
VF.  Added priv-flag knob to configure global true promiscuous
support.

Shannon adds the support code for calling the admin queue API call
aq_set_switch_config().

Mitch modifies the VF, to log a message if an untrusted VF attempts to
configure promiscuous mode, but lies to it and returns everything is ok
instead of returning an error.  Corrects the logic for reporting the
receive packet hash.  Fixed the adding of a broadcast filter for VFs,
since that all VSIs are configured to receive broadcasts as default,
so do not need to add a filter.

Catherine refactors the ethtool get_settings to report the possible
supported link modes from what we know about the current PHY type and
that with the firmware supported PHY types.

Jacob changes the driver to use WARN_ONCE in order to highlight the
issue, but do not display a warning every time when receive hang
message is received.

Akeem corrects receive ptype payload layer for non_tunneled IPv6, when
it should be layer 4 for UDP, instead of layer 3.

Dan Carpenter fixes an uninitialized variable bug.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'bnxt_en-next'

Michael Chan says:

====================
bnxt_en: updates for net-next.

Non-critical bug fixes, improvements, a new ethtool feature, and a new
device ID.

v2: Fixed a bug in bnxt_get_module_eeprom() found by Ben Hutchings.

Ajit Khaparde (2):
  bnxt_en: Add Support for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPRO
  bnxt_en: Report PCIe link speed and width during driver load

Michael Chan (6):
  bnxt_en: Reduce maximum ring pages if page size is 64K.
  bnxt_en: Improve the delay logic for firmware response.
  bnxt_en: Fix length value in dmesg log firmware error message.
  bnxt_en: Simplify and improve unsupported SFP+ module reporting.
  bnxt_en: Add BCM57314 device ID.
  bnxt_en: Use dma_rmb() instead of rmb().

Satish Baddipadige (1):
  bnxt_en: Fix invalid max channel parameter in ethtool -l.
====================

Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Use dma_rmb() instead of rmb().

Use the weaker but more appropriate dma_rmb() to order the reading of
the completion ring.

Suggested-by: Ajit Khaparde <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Add BCM57314 device ID.

Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Simplify and improve unsupported SFP+ module reporting.

The current code is more complicated than necessary and can only report
unsupported SFP+ module if it is plugged in after the device is up.

Rename bnxt_port_module_event() to bnxt_get_port_module_status().  We
already have the current module_status in the link_info structure, so
just check that and report any unsupported SFP+ module status.  Delete
the unnecessary last_port_module_event.  Call this function at the
end of bnxt_open to report unsupported module already plugged in.

Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Fix length value in dmesg log firmware error message.

The len value in the hwrm error message is wrong. Use the properly adjusted
value in the variable len.

Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Improve the delay logic for firmware response.

The current code has 2 problems:

1. The maximum wait time is not long enough.  It is about 60% of the
duration specified by the firmware.  It is calling usleep_range(600, 800)
for every 1 msec we are supposed to wait.

2. The granularity of the delay is too coarse.  Many simple firmware
commands finish in 25 usec or less.

We fix these 2 issues by multiplying the original 1 msec loop counter by
40 and calling usleep_range(25, 40) for each iteration.

There is also a second delay loop to wait for the last DMA word to
complete.  This delay loop should be a very short 5 usec wait.

This change results in much faster bring-up/down time:

Before the patch:

time ip link set p4p1 up

real    0m0.120s
user    0m0.001s
sys     0m0.009s

After the patch:

time ip link set p4p1 up

real    0m0.030s
user    0m0.000s
sys     0m0.010s

Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Reduce maximum ring pages if page size is 64K.

The chip supports 4K/8K/64K page sizes for the rings and we try to
match it to the CPU PAGE_SIZE. The current page size limits for the rings
are based on 4K/8K page size. If the page size is 64K, these limits are
too large. Reduce them appropriately.

Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Report PCIe link speed and width during driver load

Add code to log a message during driver load indicating PCIe link
speed and width.

The log message will look like this:
bnxt_en 0000:86:00.0 eth0: PCIe: Speed 8.0GT/s Width x8

Signed-off-by: Ajit Khaparde <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Add Support for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPRO

Add support to fetch the SFP EEPROM settings from the firmware
and display it via the ethtool -m command. We support SFP+ and QSFP
modules.

v2: Fixed a bug in bnxt_get_module_eeprom() found by Ben Hutchings.

Signed-off-by: Ajit Khaparde <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnxt_en: Fix invalid max channel parameter in ethtool -l.

When there is only 1 MSI-X vector or in INTA mode, tx and rx pre-set
max channel parameters are shown incorrectly in ethtool -l. With only 1
vector, bnxt_get_max_rings() will return -ENOMEM. bnxt_get_channels
should check this return value, and set max_rx/max_tx to 0 if it is
non-zero.

Signed-off-by: Satish Baddipadige <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.

The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.

Conflicts:
drivers/net/wireless/intel/iwlwifi/mvm/tx.c
net/ipv4/ip_gre.c
net/netfilter/nf_conntrack_core.c

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
"Just the missing compat entry for the new pread/writev2"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Use compat version for preadv2 and pwritev2

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix mvneta/bm dependencies, from Arnd Bergmann.

2) RX completion hw bug workaround in bnxt_en, from Michael Chan.

3) Kernel pointer leak in nf_conntrack, from Linus.

4) Hoplimit route attribute limits not enforced properly, from Paolo
    Abeni.

5) qlcnic driver NULL deref fix from Dan Carpenter.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  arm64: bpf: jit JMP_JSET_{X,K}
  net/route: enforce hoplimit max value
  nf_conntrack: avoid kernel pointer value leak in slab name
  drivers: net: xgene: fix register offset
  drivers: net: xgene: fix statistics counters race condition
  drivers: net: xgene: fix ununiform latency across queues
  drivers: net: xgene: fix sharing of irqs
  drivers: net: xgene: fix IPv4 forward crash
  xen-netback: fix extra_info handling in xenvif_tx_err()
  net: mvneta: bm: fix dependencies again
  bnxt_en: Add workaround to detect bad opaque in rx completion (part 2)
  bnxt_en: Add workaround to detect bad opaque in rx completion (part 1)
  qlcnic: potential NULL dereference in qlcnic_83xx_get_minidump_template()

net: switchdev: Drop EXPERIMENTAL from description

Switchdev has been around for quite a while now, putting "EXPERIMENTAL"
in the description is no longer accurate, drop it.

Signed-off-by: Florian Fainelli <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

arm64: bpf: jit JMP_JSET_{X,K}

Original implementation commit e54bcde3d69d ("arm64: eBPF JIT compiler")
had the relevant code paths, but due to an oversight always fail jiting.

As a result, we had been falling back to BPF interpreter whenever a BPF
program has JMP_JSET_{X,K} instructions.

With this fix, we confirm that the corresponding tests in lib/test_bpf
continue to pass, and also jited.

...
[    2.784553] test_bpf: #30 JSET jited:1 188 192 197 PASS
[    2.791373] test_bpf: #31 tcpdump port 22 jited:1 325 677 625 PASS
[    2.808800] test_bpf: #32 tcpdump complex jited:1 323 731 991 PASS
...
[    3.190759] test_bpf: #237 JMP_JSET_K: if (0x3 & 0x2) return 1 jited:1 110 PASS
[    3.192524] test_bpf: #238 JMP_JSET_K: if (0x3 & 0xffffffff) return 1 jited:1 98 PASS
[    3.211014] test_bpf: #249 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:1 120 PASS
[    3.212973] test_bpf: #250 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:1 89 PASS
...

Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
Signed-off-by: Zi Shen Lim <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Yang Shi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/route: enforce hoplimit max value

Currently, when creating or updating a route, no check is performed
in both ipv4 and ipv6 code to the hoplimit value.

The caller can i.e. set hoplimit to 256, and when such route will
be used, packets will be sent with hoplimit/ttl equal to 0.

This commit adds checks for the RTAX_HOPLIMIT value, in both ipv4
ipv6 route code, substituting any value greater than 255 with 255.

This is consistent with what is currently done for ADVMSS and MTU
in the ipv4 code.

Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nf_conntrack: avoid kernel pointer value leak in slab name

The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.

Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.

This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.

Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Overlayfs fixes from Miklos, assorted fixes from me.

  Stable fodder of varying severity, all sat in -next for a while"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: ignore permissions on underlying lookup
  vfs: add lookup_hash() helper
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  get_rock_ridge_filename(): handle malformed NM entries
  ecryptfs: fix handling of directory opening
  atomic_open(): fix the handling of create_error
  fix the copy vs. map logics in blk_rq_map_user_iov()
  do_splice_to(): cap the size before passing to ->splice_read()

i40e: fix an uninitialized variable bug

We removed this initialization but it is required. Let's put it back.

Fixes: 895106a577c4 ('i40e: trivial fixes')
Signed-off-by: Dan Carpenter <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Bump version from 1.5.10 to 1.5.16

Signed-off-by: Bimmy Pujari <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: don't add broadcast filter for VFs

Now that all VSIs are configured to receive broadcasts as default, we
don't need to add a filter. This eliminates an annoying but harmless
error message each time VFs are created or reset.

Change-ID: I4cd6339684df45b0d2722133eeb84c14fa93ea19
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: properly report Rx packet hash

This logic is inverted. If the RXHASH flag is set, then we should go
ahead and call skb_set_hash.

Change-ID: Ib2e30356dced1d3e939c8061ab6ad5bd94197e7c
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: set context to use VSI RSS LUT for SR-IOV

For the SR-IOV VSIs, when the queue filtering section is valid,
the RSS LUT needs to be set to use the VSI specific lookup table
(otherwise it will use the PF RSS LUT table).

Change-ID: Ia9377cc818078238a75c3bdeade1b593a91b3480
Signed-off-by: Ashish Shah <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Correct UDP packet header for non_tunnel-ipv6

This patch corrects Rx ptype payload layer for non_tunneled ipv6. It
should be layer 4 for UDP, instead of layer 3.

Change-ID: I9382e4458ab3c4e58f6d2e9f195d5d4ee513805e
Signed-off-by: Akeem G Abodunrin <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: change Rx hang message into a WARN_ONCE

Use WARN_ONCE in order to highlight the issue, but don't display
a warning every time. The user should be able to see the ethtool counter
we created if necessary to see how often it is occurring.

Change-ID: I40c4ea159819b64a7d33b7f5716749089791533a
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Refactor ethtool get_settings

Previously we were only looking at the FW supported PHY types if link
was down, because we want to be more specific when link is up. This
refactor changes this. When link is down, we still rely on the FW
supported PHY types, but when link is up, we select the possible
supported link modes from what we know about the current PHY type, and
AND that with the FW supported PHY types.

Change-ID: Ice5dad83f2a17932b0b8b59f07439696ad6aa013
Signed-off-by: Catherine Sullivan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: lie to the VF

If an untrusted VF attempts to configure promiscuous mode, log a message
pointing out its naughty behavior. But then, instead of returning an
error to the offender, just lie to it and say everything's OK. It will
continue on its way, thinking it's in promiscuous mode, but receiving no
packets except its own.

Change-ID: I63369215b1720f3c531eedfc06af86ff8c0e3dc8
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Add vf-true-promisc-support priv flag

This patch adds priv-flag knob to configure global true promisc
support. With this patch the user can decide the flavor of
promiscuous that the VFs will see when promiscuous mode is enabled
on the interface. Since this a global setting for the whole device,
the priv-flag is exposed only on the first PF of the device.

The default is true promisc support is off, which means the promisc
mode for the VF will be limited/defport mode.

For the PF, we still will be in limited promisc unless in MFP mode
irrespective of the flavor picked through this knob.

Usage:
On PF0
ethtool --show-priv-flags p261p1
Private flags for p261p1:
MFP                    : off
LinkPolling            : off
flow-director-atr      : on
veb-stats              : off
hw-atr-eviction        : off
vf-true-promisc-support: off

to enable setting true promisc
ethtool --set-priv-flags p261p1 vf-true-promisc-support on

At this point if the VF is set to trust and promisc is enabled
on the VF through
ip link set ... promisc on
The VF/VFs will be able to see ALL ingress traffic

Change-Id: I8fac4b6eb1af9ca77b5376b79c50bdce5055bd94
Signed-off-by: Anjali Singhai Jain <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Implement the API function for aq_set_switch_config

Add the support code for calling the AdminQ API call aq_set_switch_config

Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Add allmulti support for the VF

This patch enables a feature to enable/disable all multicast
for a trusted VF.

Change-Id: I926eba7f8850c8d40f8ad7e08bbe4056bbd3985f
Signed-off-by: Anjali Singhai Jain <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Add support for disabling all link and change bits needed for PHY interactions

Add flag to tell firmware to disable link on all ports.

This patch changes the bits set for telling firmware the PHY needs
to be modified by driver. Without this patch, the setting will only
set that mode for the current port on the device. Because the
MDIO interface is common for the copper device. The command needs to
set the mode for all ports.

Change-ID: I8baa7da91d384291ac95b41ae1a516604f8eb67f
Signed-off-by: Kevin Scott <[email protected]>
Signed-off-by: Carolyn Wyborny <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

Merge branch 'xgene-fixes'

Iyappan Subramanian says:

====================
drivers: net: xgene: Bug fixes

This patch set addresses the following bug fixes that were found during testing.

  1. IPv4 forward test crash
    - drivers: net: xgene: fix IPv4 forward crash

  2. Sharing of irqs
    - drivers: net: xgene: fix sharing of irqs

  3. Ununiform latency across queues
    - drivers: net: xgene: fix ununiform latency across queues

  4. Fix statistics counters race condition
    - drivers: net: xgene: fix statistics counters race condition

  5. Correcting register offset and field lengths
    - drivers: net: xgene: fix register offset

v2: Address review comments from v1
- Defer TSO fix, and reposting all other patches from v1

v1:
- Initial version
====================

Signed-off-by: Iyappan Subramanian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: xgene: fix register offset

This patch fixes SG_RX_DV_GATE_REG_0_ADDR register offset
and ring state field lengths.

Signed-off-by: Iyappan Subramanian <[email protected]>
Tested-by: Toan Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: xgene: fix statistics counters race condition

This patch fixes the race condition on updating the statistics
counters by moving the counters to the ring structure.

Signed-off-by: Iyappan Subramanian <[email protected]>
Tested-by: Toan Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: xgene: fix ununiform latency across queues

This patch addresses ununiform latency across queues by adding
more queues to match with, upto number of CPU cores.

Also, number of interrupts are increased and the channel numbers
are reordered.

Signed-off-by: Iyappan Subramanian <[email protected]>
Tested-by: Toan Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: xgene: fix sharing of irqs

Since hardware doesn't allow sharing of interrupts,
this patch fixes the same by removing IRQF_SHARED flag.

Signed-off-by: Iyappan Subramanian <[email protected]>
Tested-by: Toan Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: xgene: fix IPv4 forward crash

This patch fixes the crash observed during IPv4 forward test by
setting the drop field in the dbptr.

Signed-off-by: Iyappan Subramanian <[email protected]>
Tested-by: Toan Le <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2016-05-13

This series contains updates to e1000e, igb and igbvf.

Steve Shih fixes an issue for disabling auto-negotiation and forcing
speed and duplex settings for non-copper media.

Brian Walsh cleanups some inconsistency in the use of return variables
names to avoid confusion.

Jake cleans up the drivers to use the BIT() macro when it can, which will
future proof the drivers for GCC 6 when it gets released.  Cleaned up
dead code which was never being used.  Also fixed e1000e, where it was
incorrectly restting the SYSTIM registers every time the ioctl was being
run.

Denys Vlasenko fixes an oversight where incvalue variable holds a 32
bit value so we should declare it as such, instead of 64 bits.  Also
fixed an overflow check, where two reads are the same, then it is not
an overflow.

Nathan Sullivan fixes the PTP timestamps for transmit and receive
latency based on the current link speed.

Alexander Duyck adds support for partial GSO segmentation in the case
of tunnels for igb and igbvf.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"During v4.6-rc1 cgroup namespace support was merged.  There is an
  issue where it's impossible to tell whether a given cgroup mount point
  is bind mounted or namespaced.  Serge has been working on the issue
  but it took longer than expected to resolve, so the late pull request.

  Given that it's a completely new feature and the patches don't touch
  anything else, the risk seems acceptable.  However, if this is too
  late, an alternative is plugging new cgroup ns creation for v4.6 and
  retrying for v4.7"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix compile warning
  kernfs: kernfs_sop_show_path: don't return 0 after seq_dentry call
  cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces
  kernfs_path_from_node_locked: don't overwrite nlen

Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:
"CPU hotplug callbacks can invoke DOWN_FAILED w/o preceding
  DOWN_PREPARE which can trigger a WARN_ON() in workqueue.

  The bug has been there for a very long time.  It only triggers if CPU
  down fails at a specific point and I don't think it has adverse
  effects other than the warning messages.  The fix is very low impact"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix rebind bound workers warning

e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl

The e1000e_config_hwtstamp function was incorrectly resetting the SYSTIM
registers every time the ioctl was being run. If you happened to be
running ptp4l and lost the PTP connect (removing cable, or blocking the
UDP traffic for example), then ptp4l will eventually perform a restart
which involves re-requesting timestamp settings. In e1000e this has the
unfortunate and incorrect result of resetting SYSTIME to the kernel
time. Since kernel time is usually in UTC, and PTP time is in TAI, this
results in the leap second being re-applied.

Fix this by extracting the SYSTIME reset out into its own function,
e1000e_ptp_reset, which we call during reset to restore the hardware
registers. This function will (a) restart the timecounter based on the
new system time, (b) restore the previous PPB setting, and (c) restore
the previous hwtstamp settings.

In order to perform (b), I had to modify the adjfreq ptp function
pointer to store the old delta each time it is called. This also has the
side effect of restoring the correct base timinca register correctly.
The driver does not need to explicitly zero the ptp_delta variable since
the entire adapter structure comes zero-initialized.

Reported-by: Brian Walsh <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Brian Walsh <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb/igbvf: Add support for GSO partial

This patch adds support for partial GSO segmentation in the case of
tunnels. Specifically with this change the driver an perform segmentation
as long as the frame either has IPv6 inner headers, or we are allowed to
mangle the IP IDs on the inner header. This is needed because we will not
be modifying any fields from the start of the start of the outer transport
header to the start of the inner transport header as we are treating them
like they are just a block of IP options.

Signed-off-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000e: mark shifted values as unsigned

The E1000_ICH_NVM_SIG_MASK value is shifted, out to the 31st bit, which
is the signed bit for signed constants. Mark these values as unsigned to
prevent compiler warnings and issues on platforms which a different
signed bit implementation.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000e: use BIT() macro for bit defines

This prevents signed bitshift issues when the shift would overwrite the
signed bit, and prevents making this mistake in the future when copying
and modifying code.

Use GENMASK or the unsigned postfix for cases which aren't suitable for
BIT() macro.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igbvf: use BIT() macro instead of shifts

To prevent signed bitshift issues, and improve code readability, use the
BIT() macro. Also make use of GENMASK or the unsigned postfix where this
is more appropriate than BIT()

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igbvf: remove unused variable and dead code

The variable rdlen is set but never used, and thus setting it is dead
code. Remove it.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: adjust PTP timestamps for Tx/Rx latency

Table 7-62 on page 338 of the i210 datasheet lists TX and RX latencies
for the various speeds the chip supports. To give better PTP timestamp
accuracy, adjust the timestamps by the amounts Intel gives based on
current link speed.

Signed-off-by: Nathan Sullivan <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ACPI / video: mark acpi_video_get_levels() inline

A recent patch added a stub function for acpi_video_get_levels when
CONFIG_ACPI_VIDEO is disabled. However, this is marked as 'static'
and causes a warning about an unused function whereever the header
gets included:

In file included from ../drivers/gpu/drm/radeon/radeon_acpi.c:28:0:
include/acpi/video.h:74:12: error: 'acpi_video_get_levels' defined but not used [-Werror=unused-function]

This makes the declaration 'static inline', which gets rid of the
warning.

Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: 059500940def (ACPI/video: export acpi_video_get_levels)
Signed-off-by: Rafael J. Wysocki <[email protected]>

e1000e: e1000e_cyclecounter_read(): do overflow check only if needed

SYSTIMH:SYSTIML registers are incremented by 24-bit value TIMINCA[23..0]

er32(SYSTIML) are probably moderately expensive (they are pci bus reads).
Can we avoid one of them? Yes, we can.

If the SYSTIML value we see is smaller than 0xff000000, the overflow
into SYSTIMH would require at least two increments.

We do two reads, er32(SYSTIML) and er32(SYSTIMH), in this order.

Even if one increment happens between them, the overflow into SYSTIMH
is impossible, and we can avoid doing another er32(SYSTIML) read
and overflow check.

Signed-off-by: Denys Vlasenko <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000e: e1000e_cyclecounter_read(): fix er32(SYSTIML) overflow check

If two consecutive reads of the counter are the same, it is also
not an overflow. "systimel_1 < systimel_2" should be
"systimel_1 <= systimel_2".

Before the patch, we could perform an *erroneous* correction:

Let's say that systimel_1 == systimel_2 == 0xffffffff.
"systimel_1 < systimel_2" is false, we think it's an overflow,
we read "systimeh = er32(SYSTIMH)" which meanwhile had incremented,
and use "(systimeh << 32) + systimel_2" value which is 2^32 too large.

Signed-off-by: Denys Vlasenko <[email protected]>
CC: [email protected]
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000e: e1000e_cyclecounter_read(): incvalue is 32 bits, not 64

"incvalue" variable holds a result of "er32(TIMINCA) &
E1000_TIMINCA_INCVALUE_MASK" and used in "do_div(temp, incvalue)"
as a divisor.

Thus, "u64 incvalue" declaration is probably a mistake.
Even though it seems to be a harmless one, let's fix it.

Signed-off-by: Denys Vlasenko <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>