Git Repo - linux.git/log

seccomp, filter: add and use bpf_prog_create_from_user from seccomp

Seccomp has always been a special candidate when it comes to preparation
of its filters in seccomp_prepare_filter(). Due to the extra checks and
filter rewrite it partially duplicates code and has BPF internals exposed.

This patch adds a generic API inside the BPF code code that seccomp can use
and thus keep it's filter preparation code minimal and better maintainable.
The other side-effect is that now classic JITs can add seccomp support as
well by only providing a BPF_LDX | BPF_W | BPF_ABS translation.

Tested with seccomp and BPF test suites.

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Nicolas Schichan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: filter: add __GFP_NOWARN flag for larger kmem allocs

When seccomp BPF was added, it was discussed to add __GFP_NOWARN
flag for their configuration path as f.e. up to 32K allocations are
more prone to fail under stress. As we're going to reuse BPF API,
add __GFP_NOWARN flags where larger kmalloc() and friends allocations
could fail.

It doesn't make much sense to pass around __GFP_NOWARN everywhere as
an extra argument only for seccomp while we just as well could run
into similar issues for socket filters, where it's not desired to
have a user application throw a WARN() due to allocation failure.

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Nicolas Schichan <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

seccomp: simplify seccomp_prepare_filter and reuse bpf_prepare_filter

Remove the calls to bpf_check_classic(), bpf_convert_filter() and
bpf_migrate_runtime() and let bpf_prepare_filter() take care of that
instead.

seccomp_check_filter() is passed to bpf_prepare_filter() so that it
gets called from there, after bpf_check_classic().

We can now remove exposure of two internal classic BPF functions
previously used by seccomp. The export of bpf_check_classic() symbol,
previously known as sk_chk_filter(), was there since pre git times,
and no in-tree module was using it, therefore remove it.

Joint work with Daniel Borkmann.

Signed-off-by: Nicolas Schichan <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: filter: add a callback to allow classic post-verifier transformations

This is in preparation for use by the seccomp code, the rationale is
not to duplicate additional code within the seccomp layer, but instead,
have it abstracted and hidden within the classic BPF API.

As an interim step, this now also makes bpf_prepare_filter() visible
(not as exported symbol though), so that seccomp can reuse that code
path instead of reimplementing it.

Joint work with Daniel Borkmann.

Signed-off-by: Nicolas Schichan <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mac80211-next-for-davem-2015-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Lots of updates for net-next for this cycle. As usual, we have
a lot of small fixes and cleanups, the bigger items are:
* proper mac80211 rate control locking, to fix some random crashes
   (this required changing other locking as well)
* mac80211 "fast-xmit", a mechanism to reduce, in most cases, the
   amount of code we execute while going from ndo_start_xmit() to
   the driver
* this also clears the way for properly supporting S/G and checksum
   and segmentation offloads
====================

Signed-off-by: David S. Miller <[email protected]>

usbnet: avoid integer overflow in start_xmit

transfer_buffer_length is of type u32. It's therefore wrong to assign it
to a signed integer. This patch avoids the overflow.

It's worth noting that entry->length here is a long; perhaps it would be
beneficial at somepoint to change this to be unsigned as well, if
nothing else relies on its signedness for error conditions or the like.

Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netxen_nic: use spin_[un]lock_bh around tx_clean_lock (2)

This patch should have been part of the previous patch having the
same summary. See http://marc.info/?l=linux-kernel&m=143039470103795&w=2
Unfortunately, I didn't check to see where else this lock was used before
submitting that patch. This should take care of it for netxen_nic, as I
did a thorough search this time.

To recap from the original patch; although testing this driver with
DEBUG_LOCKDEP and DEBUG_SPINLOCK enabled did not produce any traces,
it would be more prudent in the case of tx_clean_lock to use _bh
versions of spin_[un]lock, since this lock is manipulated in both
the process and softirq contexts.

This patch was tested for functionality and regressions with netperf
and DEBUG_LOCKDEP and DEBUG_SPINLOCK enabled.

Signed-off-by: Tony Camuso <[email protected]>
Acked-By: Neil Horman <[email protected]>
Acked-By: Manish Chopra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'tcp-more-reliable-window-probes'

Eric Dumazet says:

====================
tcp: more reliable window probes

This series address a problem caused by small rto_min timers in DC,
leading to either timer storms or early flow terminations.

We also add two new SNMP counters for proper monitoring :
TCPWinProbe and TCPKeepAlive

v2: added TCPKeepAlive counter, as suggested by Yuchung & Neal
====================

Signed-off-by: David S. Miller <[email protected]>

tcp: add TCPWinProbe and TCPKeepAlive SNMP counters

Diagnosing problems related to Window Probes has been hard because
we lack a counter.

TCPWinProbe counts the number of ACK packets a sender has to send
at regular intervals to make sure a reverse ACK packet opening back
a window had not been lost.

TCPKeepAlive counts the number of ACK packets sent to keep TCP
flows alive (SO_KEEPALIVE)

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Acked-by: Nandita Dukkipati <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: adjust window probe timers to safer values

With the advent of small rto timers in datacenter TCP,
(ip route ... rto_min x), the following can happen :

1) Qdisc is full, transmit fails.

   TCP sets a timer based on icsk_rto to retry the transmit, without
   exponential backoff.
   With low icsk_rto, and lot of sockets, all cpus are servicing timer
   interrupts like crazy.
   Intent of the code was to retry with a timer between 200 (TCP_RTO_MIN)
   and 500ms (TCP_RESOURCE_PROBE_INTERVAL)

2) Receivers can send zero windows if they don't drain their receive queue.

   TCP sends zero window probes, based on icsk_rto current value, with
   exponential backoff.
   With /proc/sys/net/ipv4/tcp_retries2 being 15 (or even smaller in
   some cases), sender can abort in less than one or two minutes !
   If receiver stops the sender, it obviously doesn't care of very tight
   rto. Probability of dropping the ACK reopening the window is not
   worth the risk.

Lets change the base timer to be at least 200ms (TCP_RTO_MIN) for these
events (but not normal RTO based retransmits)

A followup patch adds a new SNMP counter, as it would have helped a lot
diagnosing this issue.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: send explicit not supported error in nl compat

The legacy netlink API treated EPERM (permission denied) as
"operation not supported".

Reported-by: Tomi Ollila <[email protected]>
Signed-off-by: Richard Alpe <[email protected]>
Reviewed-by: Erik Hugne <[email protected]>
Reviewed-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: add broadcast link window set/get to nl api

Add the ability to get or set the broadcast link window through the
new netlink API. The functionality was unintentionally missing from
the new netlink API. Adding this means that we also fix the breakage
in the old API when coming through the compat layer.

Fixes: 37e2d4843f9e (tipc: convert legacy nl link prop set to nl compat)
Reported-by: Tomi Ollila <[email protected]>
Signed-off-by: Richard Alpe <[email protected]>
Reviewed-by: Erik Hugne <[email protected]>
Reviewed-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix default link prop regression in nl compat

Default link properties can be set for media or bearer. This
functionality was missed when introducing the NL compatibility layer.

This patch implements this functionality in the compat netlink
layer. It works the same way as it did in the old API. We search for
media and bearers matching the "link name". If we find a matching
media or bearer the link tolerance, priority or window is used as
default for new links on that media or bearer.

Fixes: 37e2d4843f9e (tipc: convert legacy nl link prop set to nl compat)
Reported-by: Tomi Ollila <[email protected]>
Signed-off-by: Richard Alpe <[email protected]>
Reviewed-by: Erik Hugne <[email protected]>
Reviewed-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: Initialize RSS mode for all Ports

Implements t4_init_rss_mode() to initialize the rss_mode for all the ports. If
Tunnel All Lookup isn't specified in the global RSS Configuration, then we need
to specify a default Ingress Queue for any ingress packets which aren't hashed.
We'll use our first ingress queue.

Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'be2net'

Sathya Perla says:

====================
be2net: patch-set

The following patch-set has two new feature additions, and a few
minor fixes and cleanups.
Pls consider applying to the net-next tree. Thanks.

v2 changes:
        a) dropped the "don't enable pause by default" patch
        b) described how the "spoof check" works in patch 1's commit log
c) I had to update our email addresses from "@emulex" to
   "@avagotech". I'll send a separate patch updating the
   maintainers

Patch 1 adds support for the "spoofchk" knob for VFs.
When it is enabled, "spoof checking" is done for both MAC-address
and VLAN. For each VF, the HW ensures that the source MAC address
(or vlan) of every outgoing packet of the VF exists in the MAC-list
(or vlan-list) configured for RX filtering for that VF.
If not, the packet is dropped and an error is reported to the driver
in the TX completion.

Patch 2 improves interrupt moderation on Skyhawk-R chip by using
the EQ-DB mechanism to set a "re-arm to interrupt" delay. Currently
interrupt moderation is adjusted by calculating and configuring an
EQ-delay every second. This is done via a FW-cmd. This patch uses
the EQ_DB facility to calculate and set the interrupt delay every 1ms.
This helps moderating interrupts better when the traffic is bursty.

Patch 3 adds L3/L4 error accounting to BE3 VFs, by passing L3/4 error
packets to the network stack.

Patch 4 adds an extra FW-cmd error value check in the driver to identify
an "out of vlan filters" scenario.

Patch 5 stops enabling pause by default as this setting fails in
some HW-configs where priority pause is enabled in FW. If the user
tries to do the same, an appropriate error is returned via ethtool.

Patch 5 posts the full RXQ in be_open() to prevent packet drops due to
bursty traffic when the interface is enabled.

Patch 6 refactors the be_check_ufi_compatibility() routine, that checks
to see if a UFI file meant for a lower rev of a chip is being flashed
on a higher rev, to make it simpler.

Patch 7 replaces the usage of !be_physfn() macro with be_virtfn()
that is already avialble in the driver.

Patch 8 updates the year in the copyright text to 2015.

Path 9 bumps up the driver version to 10.6.02.
====================

Signed-off-by: David S. Miller <[email protected]>

be2net: update the driver version to 10.6.0.2

Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: update copyright year to 2015

Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: use be_virtfn() instead of !be_physfn()

Use be_virtfn() to determine a VF instead of !be_physfn() for better
readability.

Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: simplify UFI compatibility checking

The code in be_check_ufi_compatibility() checks to see if a UFI file meant
for a lower rev of a chip is being flashed on a higher rev, which is
disallowed. This patch re-writes the code needed for this check in a much
simpler manner.

Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: post full RXQ on interface enable

When an RXQ is created in be_open(), the driver currently posts only
64 buffers. This sometimes results in packet drops when there is a traffic
burst as soon as the interface is enabled.
This patch fixes this problem by posting the full RXQ on interface enable.

Signed-off-by: Suresh Reddy <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: check for INSUFFICIENT_VLANS error

When the FW runs out of vlan filters it can either return an
INSUFFICIENT_RESOURCES error or an INSUFFICIENT_VLANS error.
The driver currently checks only for the former error value.
This patch adds a check for the latter value too.

Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: receive pkts with L3, L4 errors on VFs

Currently pkts with L3 or L4 errors received on PFs are not dropped
by the adapter, but instead sent to the stack. This helps the network stack
to better reflect error statistics. This was not being done on BE3 VFs.
This patch fixes this for BE3 VFs.

Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: set interrupt moderation for Skyhawk-R using EQ-DB

Currently adaptive interrupt moderation is set by calculating
and configuring an EQ-delay every second. This is done via
a FW-cmd. But, on Skyhawk-R a "re-arm to interrupt" delay
can be set while ringing the EQ-DB. This patch uses this
facility to calculate and set the interrupt delay every 1ms.
This helps moderating interrupts better when the traffic
is bursty.

Signed-off-by: Padmanabh Ratnakar <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: add support for spoofchk setting

This patch adds support for spoofchk configuration for VFs.
When it is enabled, "spoof checking" is done for both MAC-address and VLAN.
For each VF, the HW ensures that the source MAC address (or vlan) of
every outgoing packet exists in the MAC-list (or vlan-list) configured
for RX filtering for that VF. If not, the packet is dropped and an error
is reported to the driver in the TX completion; this is reflected in the
"tx_spoof_check_err" ethtool counter.
This feature is supported in Skyhawk FW version 10.6.31.0 and above.

Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: xgene_enet: Set hardware dependency

The xgene_enet driver is only useful on X-Gene SoC.

Signed-off-by: Jean Delvare <[email protected]>
Cc: Iyappan Subramanian <[email protected]>
Cc: Keyur Chudgar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: amd-xgbe: Add hardware dependency

The amd-xgbe driver currently only works with the Seattle SoC, which
is ARM64 architecture, so there is no point in building this driver on
other architectures except for build testing purpose. The dependency
list can be updated later if the driver ever supports other
architectures.

Signed-off-by: Jean Delvare <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'sfc-next'

Shradha Shah says:

====================
sfc: Enabling EF10 Vf's, set up vswitching and bind the SFC driver to the VF's

This set of patches makes way for the implementation of EF10
SR-IOV driver starting with some cleanup code.
NIC specific SR-IOV functions are moved to their own header
and netdev_ops are made generic instead of being NIC specific

Next in line comes the patch to enable VF's using sriov_configure.
VEB vswitching hierarchy is set up next followed by patches to
prepare sfc driver to bind to enabled VF's

This is followed by patch to support use of shared RSS contexts
which makes VF's use shared RSS contexts in all cases.

Patch series ends with a patch to bind the sfc driver to the
enabled VF's which creates network interfaces corresponding to
the VF's.

Coming up soon are the patches to set_vf_mac, set_vf_config,
set_vf_vlan, vf_spoofcheck, etc.

These patches have been tested with and without CONFIG_SFC_SRIOV.
In the case of CONFIG_SFC_SRIOV=y enabling of VF's using
sriov_configure is also tested. The enabled VF's bind to the
installed sfc driver succesfully to create network interfaces.
In the case of CONFIG_SFC_SRIOV=n enabling of VF's using
sriov_configure returns the correct error message:
"Function not implemented".
====================

Signed-off-by: David S. Miller <[email protected]>

sfc: Bind the sfc driver to any available VF's

Add the device ID of the VF to the PCI device ID table.

Added a boolean flag is_vf in efx_nic_type to differentiate
between a VF and PF at probe time. This flag is useful in later
patches while setting MAC address specially in the
PCI-passthrough case.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Add use of shared RSS contexts.

Allow PFs to allocate shared RSS contexts if we exhaust our
exclusive RSS contexts. Make VFs use shared RSS contexts in
all cases.
Spruce up error handling so that the shadow copy of the RSS
table is updated after successful update, rather than in all
cases, so that we report the actual contents of the RSS table
after a failure to set it, rather than what we'd like it to be.

Populate context_size parameter when vacuously allocating RSS
context of size 1.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Cope with permissions enforcement added to firmware for SR-IOV

* Accept EPERM in some simple cases, the following cases are handled:
1) efx_mcdi_read_assertion()
Unprivileged PCI functions aren't allowed to GET_ASSERTS.
We return success as it's up to the primary PF to deal with asserts.
2) efx_mcdi_mon_probe() in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to read sensor info, and
worrying about sensor data is the primary PF's job.
3) phy_op->reconfigure() in efx_init_port() and efx_reset_up()
Unprivileged functions aren't allowed to MC_CMD_SET_LINK, they just have
to accept the settings (including flow-control, which is what
efx_init_port() is worried about) they've been given.
4) Fallback to GET_WORKAROUNDS in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to set workarounds. So if
efx_mcdi_set_workaround() fails EPERM, use efx_mcdi_get_workarounds()
to find out if workaround_35388 is enabled.
5) If DRV_ATTACH gets EPERM, try without specifying fw-variant
Unprivileged PCI functions have to use a FIRMWARE_ID of 0xffffffff
(MC_CMD_FW_DONT_CARE).
6) Don't try to exit_assertion unless one had fired
Previously we called efx_mcdi_exit_assertion even if
efx_mcdi_read_assertion had received MC_CMD_GET_ASSERTS_FLAGS_NO_FAILS.
This is unnecessary, and the resulting MC_CMD_REBOOT, even if the
AFTER_ASSERTION flag made it a no-op, would fail EPERM for unprivileged
PCI functions.
So make efx_mcdi_read_assertion return whether an assert happened, and only
call efx_mcdi_exit_assertion if it has.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: manually allocate and free vadaptors

To be able to use MC_CMD_VADAPTOR_SET_MAC, vadaptors must be
manually allocated and freed as automatic vadaptors will disappear
when their reference_count reaches zero, which must happen before
the MAC address is changed.

Vadaptors are allocated and freed in the vswitching_probe/remove
functions for PFs and VFs, and this means that vadaptors are restored
correctly following an MC reboot or other reset when required.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: create vports for VFs and assign random MAC addresses

The parent PF creates vports for all its child VFs and adds MAC
addresses to these.  When the VF driver loads, it can make an MCDI
call to get the MAC address that the parent PF assigned it.

The parent PF also assigns a mac address to its own vport because
implicit creation of a vAdaptor will only work on evb ports with
MAC addresses assigned.

The vport MAC address needs to be stored in the PF's nic_data
struct as it can later be changed on the vadaptor (and its net_dev
struct). When removing a vport the original MAC address must be
deleted.

A new flag is needed in the VF data structure to identify whether
a vport has been assigned to the VF.  This is to determine whether
it needs to be un-assigned before freeing the vport.  Also,
attempting to un-assign a vport which is not assigned will result
in an EALREADY error.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Prepare to bind the sfc driver to the VF.

Added efx_nic_type structure for VF.
Mapped a different BAR for VF as it uses BAR 0 for memory.
Added functions sriov_init and sriov_fini.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: get the PF number and record in nic_data

Use MC_CMD_GET_FUNCTION_INFO to record the PF number in nic_data.
This will be needed when assigned vports to VFs.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: create VEB vswitch and vport above default firmware setup

Adds functions to allocate and free vswitches and vports; vadaptors
are automatically allocated and freed when TX/RX queues are
initialised and finalised. This vswitching structure is only created
if the firmware supports it, so a check that full-featured firmware
is running is performed first.

If the MC resets, the vswitching infrastructure will need to be
recreated, so mark the "must_probe_vswitching" flag when an MC reboot
is detected.

Don't try to create a vswitch if vf-count=0

This allocation of vswitches and vports does not currently support
configuring VLAN tags, but that can be added in a future change.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: record the PF's vport ID in nic_data

The default port ID of EVB_PORT_ID_ASSIGNED is a "magic" number
for the MCFW to select the physical port of the PF. If other
vswitches and vports are created on top of the default firmware
configuration, the ID of the newly created vport is then required
when passed to MCDI commands. Currently, this doesn't happen so
the vport_id is never changed, but a subsequent patch will change
this behaviour so that other vswitches and vports are created.

The vport_id recorded in nic_data is only relevant for PFs.
VFs will have their vports created by their parent PF, and in
that case the parent PF will record the vport ID of each VF.
For a VF, nic_data->vport_id is expected to remain at the default
value.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Record [rt]x_dpcpu_fw_id in EF10 nic_data

The (future) code to add/remove vswitches and vports will be
dependent on the firmware variant.
To simplify the checking of the firmware variant, record
values for rx_dpcpu_fw_id and tx_dpcpu_fw_id in EF10 nic_data.

There was only one place where this was previously used:
efx_mcdi_print_fwver() in ethtool.c.
The MC_CMD_GET_CAPABILITIES can be replaced and the values from
nic_data used instead.

Note that the printing of "?" if the MC command fails or if the
outlength is incorrect no longer apply, because errors are returned
in efx_ef10_init_datapath_caps() in both of these cases.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Use MCDI to set FILTER_OP_IN_TX_DOMAIN

The TX_DOMAIN field is currently reserved but its safer to set
it to 0 for future compatibility.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Enable VF's via a write to the sysfs file sriov_numvfs

This patch adds support for the use of sriov_configure on EF10
to enable Virtual Functions while the driver is loaded.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Move and rename efx_vf struct to siena_vf

The efx_vf struct contains Siena-specific fields for VFs,
so rename to siena_vf.
Also move it into the siena_nic_data struct, as EF10 will
track its VFs in its own ef10_nic_data, storing much less
information about them since VFDI is no longer used.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: Own header for nic-specific sriov functions, single instance of netdev_ops and sriov removed from Falcon code

By putting all the efx_{siena,ef10}_sriov_* declarations in
{siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code.
Also fixes up an instance of this, where mcdi.c was calling
efx_siena_sriov_flr.

The single instance of netdev_ops should call general high level
functions that can then call something adapter specific in efx_nic_type.
We should only do adapter specialisation via efx_nic_type.

Removal of sriov functionality from the Falcon code means that tests
are needed for the presence of some callbacks.

Signed-off-by: Shradha Shah <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: fix a use-after-free in tc_ctl_tfilter()

When tcf_destroy() returns true, tp could be already destroyed,
we should not use tp->next after that.

For long term, we probably should move tp list to list_head.

Fixes: 1e052be69d04 ("net_sched: destroy proto tp when all filters are gone")
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers/net/usb: Add support for 'Lenovo OneLink Pro Dock'

This device is sold as 'Lenovo OneLink Pro Dock'.
Chipset is RTL8153 and works with r8152.

Signed-off-by: Vasily Titskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'dsa-next'

Andrew Lunn says:

====================
More Marvell DSA refactring and fixup

This patch setup continues the refactoring and cleanup of the Marvell
DSA drivers.

Patch #1 Centralizes the duplicated parts of port setup and global
setup into the shared mv88e6xxx.

Patch #2 Centralizes looping over the ports setting them up

Patch #3 Uses mnemonics for the remaining register access in the
drivers.

Patch #4 The 6172 is actually a member of the 6352 family. This moves
the probe code into the correct driver.

Patch #5 Adds more members of the 6171 family to the 6171 driver. The
new devices are untested.

Patch #6 The 6185 is a member of the 6131 family. Add it to the probe
code of the 6131 driver.

Patch #7 and Patch #8 Simply the mutex's in mv88e6xxx.c. The SMI bus
is the bottleneck, not the granularity of the mutex's so simply the
code down to a single mutex.

Patch #8 Fixes a false positive lockdep splat, due to nested uses of
MDIO busses.

Patch #9 Fixes another false positive lockdep splat with the transmit
queue because of stacked Ethernet devices.
====================

Signed-off-by: David S. Miller <[email protected]>

net: dsa: Add lockdep class to tx queues to avoid lockdep splat

DSA stacks an Ethernet device on top of an Ethernet device. This can
cause false positive lockdep splats for the transmit queue:
Acked-by: Florian Fainelli <[email protected]>
=============================================
[ INFO: possible recursive locking detected ]
4.0.0-rc7-01838-g70621a215fc7 #386 Not tainted
---------------------------------------------
kworker/0:0/4 is trying to acquire lock:
(_xmit_ETHER#2){+.-...}, at: [<c040e95c>] sch_direct_xmit+0xa8/0x1fc

but task is already holding lock:
(_xmit_ETHER#2){+.-...}, at: [<c03f4208>] __dev_queue_xmit+0x4d4/0x56c

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0
       ----
  lock(_xmit_ETHER#2);
  lock(_xmit_ETHER#2);

To avoid this, walk the tq queues of the dsa slaves and set a lockdep
class.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Fix false positive lockdep splat

DSA can have nested MDIO busses, where the Ethernet MDIO bus is used
to access an MDIO bus within the switch which has the PHYs connected
to it. This nesting causes lockdep to give false positives. Use
mutex_lock_nested() to avoid this.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Replace stats mutex with SMI mutex

The SMI bus is the bottleneck in all switch operations, not the
granularity of locks. Replace the stats mutex by the SMI mutex to make
the locking concept simpler.

The REG_READ/REG_WRITE macros cannot be used while holding the SMI
mutex, since they try to acquire it. Replace with calls to the
appropriate function which does not try to get the mutex.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Replace PHY mutex by SMI mutex

The SMI bus is the bottleneck in all switch operations, not the
granularity of locks. Replace the PHY mutex by the SMI mutex to make
the locking concept simpler.

The REG_READ/REG_WRITE macros cannot be used while holding the SMI
mutex, since they try to acquire it. Replace with calls to the
appropriate function which does not try to get the mutex.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6131: Add support for mv88e6185

The mv88e6185 is part of the family that the mv88e6131 driver
supports. Add it to the probe function, and set the number of ports.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6171: Add other members of the family

The 6171 is one member of the family 6171/6175/6350/6351. Add the
other family members to the driver.

Not tested on these new devices.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Move mv88e6172 support into mv88e6352 family driver

The mv88e6172 is part of the mv88e6352 family of devices. Move support
for it out of the mv88e6171 driver into the mv88e6352, which results
in some simplifications to the code.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Converting remaining registers to mnemonics

Use defines for registers, shifts and bits in the remaining register
accesses in the individual drivers, in order to aid readability.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Centralize setting up ports

Now that setting up a port is identical for all switches, centralisers
the code looping over all the ports to set them up.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Centralise global and port setup code into mv88e6xxx.

The port setup code in the individual drivers is identical for 6123,
6171, and 6352, and very similar in 6131. Move it all into mv88e6xxx,
using the chip families to differentiate on features.

Similarly, the global setup is also very similar. Move the majority
into mv8e6xxx.

The chips themselves fall into families. Add helpers which uses the
device IDs to determine if a device is a member of a family or not.
Add some additional device IDs to the existing list, to make these
helper functions more complete. However these IDs are not yet added to
the probe functions.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: macb: Handle the RXUBR interrupt on all devices

The same hardware issue the at91 must work around applies to at least the
Zynq ethernet, and possibly more devices. The driver also needs to handle
the RXUBR interrupt since it turns it on with MACB_RX_INT_FLAGS anyway.

Signed-off-by: Nathan Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'rds'

Sowmini Varadhan says:

====================
net/rds: RDS-TCP robustness fixes

This patch-set contains bug fixes for state-recovery at the RDS
layer when the underlying transport is TCP and the TCP state at one
of the endpoints is reset

V2 changes: DaveM comments to reduce memory footprint, follow
            NFS/RPC model where possible. Added test-case #3

Without the changes in this set, when one of the endpoints is reset,
the existing code does not correctly clean up RDS socket state for stale
connections, resulting in some unstable, timing-dependant behavior on
the wire, including an infinite exchange of 3WHs back-and-forth, and a
resulting potential to never converge RDS state.

Test cases used to verify the changes in this set are:

1. Start rds client/server applications on two participating nodes,
   node1 and node2. After at least one packet has been sent (to establish
   the TCP connection), restart the rds_tcp module on the client, and
   now resend packets. Tcpdump should show server sending a FIN for the
   "old" client port, and clean connection establishment/exchange for
   the new client port.

2. At the end of step 1, restart rds srever on node2, and start client on
   node1, make sure using tcpdump, 'netstat -an|grep 16385' that
   packets flow correctly.

3. start RDS client/server application on two participating nodes, and
   repeat steps 1 and 2, but this time, simulate node failure by doing
   "ifconfig <intf> down", so no FIN is sent.
====================

Signed-off-by: David S. Miller <[email protected]>

net/rds: RDS-TCP: only initiate reconnect attempt on outgoing TCP socket.

When the peer of an RDS-TCP connection restarts, a reconnect
attempt should only be made from the active side of the TCP
connection, i.e. the side that has a transient TCP port
number. Do not add the passive side of the TCP connection
to the c_hash_node and thus avoid triggering rds_queue_reconnect()
for passive rds connections.

Signed-off-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: RDS-TCP: Always create a new rds_sock for an incoming connection.

When running RDS over TCP, the active (client) side connects to the
listening ("passive") side at the RDS_TCP_PORT. After the connection
is established, if the client side reboots (potentially without even
sending a FIN) the server still has a TCP socket in the esablished
state. If the server-side now gets a new SYN comes from the client
with a different client port, TCP will create a new socket-pair, but
the RDS layer will incorrectly pull up the old rds_connection (which
is still associated with the stale t_sock and RDS socket state).

This patch corrects this behavior by having rds_tcp_accept_one()
always create a new connection for an incoming TCP SYN.
The rds and tcp state associated with the old socket-pair is cleaned
up via the rds_tcp_state_change() callback which would typically be
invoked in most cases when the client-TCP sends a FIN on TCP restart,
triggering a transition to CLOSE_WAIT state. In the rarer event of client
death without a FIN, TCP_KEEPALIVE probes on the socket will detect
the stale socket, and the TCP transition to CLOSE state will trigger
the RDS state cleanup.

Signed-off-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: Fixed source specific default route handling.

If there are only IPv6 source specific default routes present, the
host gets -ENETUNREACH on e.g. connect() because ip6_dst_lookup_tail
calls ip6_route_output first, and given source address any, it fails,
and ip6_route_get_saddr is never called.

The change is to use the ip6_route_get_saddr, even if the initial
ip6_route_output fails, and then doing ip6_route_output _again_ after
we have appropriate source address available.

Note that this is '99% fix' to the problem; a correct fix would be to
do route lookups only within addrconf.c when picking a source address,
and never call ip6_route_output before source address has been
populated.

Signed-off-by: Markus Stenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
Here are a couple of important Bluetooth & mac802154 fixes for 4.1:

- mac802154 fix for crypto algorithm allocation failure checking
- mac802154 wpan phy leak fix for error code path
- Fix for not calling Bluetooth shutdown() if interface is not up

Let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <[email protected]>

m32r: make flush_cpumask non-volatile.

We cast away the volatile, but really, why make it volatile at all?
We already do a mb() inside the cpumask_empty() loop.

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mnt: Fix fs_fully_visible to verify the root directory is visible

This fixes a dumb bug in fs_fully_visible that allows proc or sys to
be mounted if there is a bind mount of part of /proc/ or /sys/ visible.

Cc: [email protected]
Reported-by: Eric Windisch <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"A couple of fixes for bugs caught while digging in fs/namei.c.  The
  first one is this cycle regression, the second is 3.11 and later"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  path_openat(): fix double fput()
  namei: d_is_negative() should be checked before ->d_seq validation

path_openat(): fix double fput()

path_openat() jumps to the wrong place after do_tmpfile() - it has
already done path_cleanup() (as part of path_lookupat() called by
do_tmpfile()), so doing that again can lead to double fput().

Cc: [email protected] # v3.11+
Signed-off-by: Al Viro <[email protected]>

namei: d_is_negative() should be checked before ->d_seq validation

Fetching ->d_inode, verifying ->d_seq and finding d_is_negative() to
be true does *not* mean that inode we'd fetched had been NULL - that
holds only while ->d_seq is still unchanged.

Shift d_is_negative() checks into lookup_fast() prior to ->d_seq
verification.

Reported-by: Steven Rostedt <[email protected]>
Tested-by: Steven Rostedt <[email protected]>
Signed-off-by: Al Viro <[email protected]>

Merge branch 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fix from Chris Mason:
"When an arm user reported crashes near page_address(page) in my new
  code, it became clear that I can't be trusted with GFP masks.  Filipe
  beat me to the patch, and I'll just be in the corner with my dunce cap
  on"

* 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix wrong mapping flags for free space inode

Merge tag 'dm-4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
"Two additional fixes for changes introduced via DM during the 4.1
  merge window.

  The first reverts a dm-crypt change that wasn't correct.  The second
  fixes a device format regression that impacted userspace"

* tag 'dm-4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  init: fix regression by supporting devices with major:minor:offset format
  Revert "dm crypt: fix deadlock when async crypto algorithm returns -EBUSY"

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A collection of fixes since the merge window;

   - fix for a double elevator module release, from Chao Yu.  Ancient bug.

   - the splice() MORE flag fix from Christophe Leroy.

   - a fix for NVMe, fixing a patch that went in in the merge window.
     From Keith.

   - two fixes for blk-mq CPU hotplug handling, from Ming Lei.

   - bdi vs blockdev lifetime fix from Neil Brown, fixing and oops in md.

   - two blk-mq fixes from Shaohua, fixing a race on queue stop and a
     bad merge issue with FUA writes.

   - division-by-zero fix for writeback from Tejun.

   - a block bounce page accounting fix, making sure we inc/dec after
     bouncing so that pre/post IO pages match up.  From Wang YanQing"

* 'for-linus' of git://git.kernel.dk/linux-block:
  splice: sendfile() at once fails for big files
  blk-mq: don't lose requests if a stopped queue restarts
  blk-mq: fix FUA request hang
  block: destroy bdi before blockdev is unregistered.
  block:bounce: fix call inc_|dec_zone_page_state on different pages confuse value of NR_BOUNCE
  elevator: fix double release of elevator module
  writeback: use |1 instead of +1 to protect against div by zero
  blk-mq: fix CPU hotplug handling
  blk-mq: fix race between timeout and CPU hotplug
  NVMe: Fix VPD B0 max sectors translation

Merge tag 'gpio-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Here is a bunch of GPIO fixes that I collected since -rc1, nothing
  controversial, nothing special:

   - fix a memory leak for GPIO hotplug.

   - fix a signedness bug in the ACPI GPIO pin validation.

   - driver fixes: Qualcomm SPMI and OMAP MPUIO IRQ issues"

* tag 'gpio-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: omap: Fix regression for MPUIO interrupts
  gpio: sysfs: fix memory leaks and device hotplug
  pinctrl: qcom-spmi-gpio: Fix input value report
  pinctrl: qcom-spmi-gpio: Fix output type configuration
  gpiolib: change gpio pin from unsigned to signed in acpi callback

Merge tag 'mmc-4.1-rc2' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Don't access RPMB partitions for normal read/write
   - Fix hibernation restore sequence

  MMC host:
   - dw_mmc: Fix card detection for non removable cards
   - dw_mmc: Fix sglist issue in 32-bit mode
   - sh_mmcif: Fix timeout value for command request"

* tag 'mmc-4.1-rc2' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: dw_mmc: dw_mci_get_cd check MMC_CAP_NONREMOVABLE
  mmc: dw_mmc: init desc in dw_mci_idmac_init
  mmc: card: Don't access RPMB partitions for normal read/write
  mmc: sh_mmcif: Fix timeout value for command request
  mmc: core: add missing pm event in mmc_pm_notify to fix hib restore

Merge tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"The newly added ftrace_print_array_seq() function had a bug in it.
  Luckily, the only user of it didn't make the 4.1 merge window.

  But the helper function should be fixed before 4.2 when the users
  start coming in"

* tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Make ftrace_print_array_seq compute buf_len

ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for exynos5250-snow

The Marvell mwifiex driver prevents the system to enter into a suspend
state if the card power is not preserved during a suspend/resume cycle.

So Suspend-to-RAM and Suspend-to-idle are failing on Exynos5250 Snow.

Add the keep-power-in-suspend Power Management property to the SDIO/MMC
node so the mwifiex suspend handler doesn't fail and the system is able
to enter into a suspend state.

Signed-off-by: Javier Martinez Canillas <[email protected]>
Reviewed-by: Doug Anderson <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>

ARM: dts: Fix typo in trip point temperature for exynos5420/5440

Remove the extra zero in the "cpu-crit-0" trip point for exynos5420
and exynos5440.

Signed-off-by: Abhilash Kesavan <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>

ARM: dts: add 'rtc_src' clock to rtc node for exynos4412-odroid boards

The Exynos4412 SoC has a s3c6410 RTC where the source clock
is now a mandatory property.
This patch fixes probe failure of s3c-rtc on Odroid-X2/U2/U3 boards.

Signed-off-by: Markus Reichl <[email protected]>
Tested-by: Tobias Jakobi <[email protected]>
Reviewed-by: Chanwoo Choi <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Tested-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>

ARM: dts: Make DP a consumer of DISP1 power domain on Exynos5420

Commit ea08de16eb1b ("ARM: dts: Add DISP1 power domain for exynos5420")
added a device node for the Exynos5420 DISP1 power domain but dit not
make the DP controller a consumer of that power domain.

This causes an "Unhandled fault: imprecise external abort" error if the
exynos-dp driver tries to access the DP controller registers and the PD
was turned off. This lead to a kernel panic and a complete system hang.

Make the DP controller device node a consumer of the DISP1 power domain
to ensure that the PD is turned on when the exynos-dp driver is probed.

Fixes: ea08de16eb1b ("ARM: dts: Add DISP1 power domain for exynos5420")
Signed-off-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>

MAINTAINERS: add Conexant Digicolor machines entry

This adds Baruch as the maintainer for the Digicolor platform.

Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

MAINTAINERS: socfpga: update the git repo for SoCFPGA

The git tree at rocketboards.org is going away. Update the entry to reflect
the address of the new location. Also add an entry for all the socfpga_*
dts files.

Signed-off-by: Dinh Nguyen <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

drm/tegra: Don't use vblank_disable_immediate on incapable driver.

Tegra would not only need a hardware vblank counter that
increments at leading edge of vblank, but also support
for instantaneous high precision vblank timestamp queries, ie.
a proper implementation of dev->driver->get_vblank_timestamp().

Without these, there can be off-by-one errors during vblank
disable/enable if the scanout is inside vblank at en/disable
time, and additionally clients will never see any useable
vblank timestamps when querying via drmWaitVblank ioctl. This
would negatively affect swap scheduling under X11 and Wayland.

Signed-off-by: Mario Kleiner <[email protected]>
Acked-by: Thierry Reding <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

Merge tag 'drm-amdkfd-fixes-2015-05-07' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes

- Add missing initialization of SDMA vm register when creating an SDMA queue
- Don't report local memory size, as we don't support local memory allocation
  yet.
- Allow to unregister process with exisiting queues. Until now we blocked
  it with BUG_ON, which was also an error by itself.

* tag 'drm-amdkfd-fixes-2015-05-07' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: Initialize sdma vm when creating sdma queue
  drm/amdkfd: Don't report local memory size
  drm/amdkfd: allow unregister process with queues

Merge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Mostly stability fixes for UVD and VCE, plus a few other bug and regression
fixes.

* 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: stop trying to suspend UVD sessions
  drm/radeon: more strictly validate the UVD codec
  drm/radeon: make UVD handle checking more strict
  drm/radeon: make VCE handle check more strict
  drm/radeon: fix userptr lockup
  drm/radeon: fix userptr BO unpin bug v3
  drm/radeon: don't setup audio on asics that don't support it
  drm/radeon: disable semaphores for UVD V1 (v2)

md/raid5: fix handling of degraded stripes in batches.

There is no need for special handling of stripe-batches when the array
is degraded.

There may be if there is a failure in the batch, but STRIPE_DEGRADED
does not imply an error.

So don't set STRIPE_BATCH_ERR in ops_run_io just because the array is
degraded.
This actually causes a bug: the STRIPE_DEGRADED flag gets cleared in
check_break_stripe_batch_list() and so the bitmap bit gets cleared
when it shouldn't.

So in check_break_stripe_batch_list(), split the batch up completely -
again STRIPE_DEGRADED isn't meaningful.

Also don't set STRIPE_BATCH_ERR when there is a write error to a
replacement device. This simply removes the replacement device and
requires no extra handling.

Signed-off-by: NeilBrown <[email protected]>

md/raid5: fix allocation of 'scribble' array.

As the new 'scribble' array is sized based on chunk size,
we need to make sure the size matches the largest of 'old'
and 'new' chunk sizes when the array is undergoing reshape.

We also potentially need to resize it even when not resizing
the stripe cache, as chunk size can change without changing
number of devices.

So move the 'resize' code into a separate function, and
consider old and new sizes when allocating.

Signed-off-by: NeilBrown <[email protected]>
Fixes: 46d5b785621a ("raid5: use flex_array for scribble data")

md/raid5: don't record new size if resize_stripes fails.

If any memory allocation in resize_stripes fails we will return
-ENOMEM, but in some cases we update conf->pool_size anyway.

This means that if we try again, the allocations will be assumed
to be larger than they are, and badness results.

So only update pool_size if there is no error.

This bug was introduced in 2.6.17 and the patch is suitable for
-stable.

Fixes: ad01c9e3752f ("[PATCH] md: Allow stripes to be expanded in preparation for expanding an array")
Cc: [email protected] (v2.6.17+)
Signed-off-by: NeilBrown <[email protected]>

md/raid5: avoid reading parity blocks for full-stripe write to degraded array

When performing a reconstruct write, we need to read all blocks
that are not being over-written .. except the parity (P and Q) blocks.

The code currently reads these (as they are not being over-written!)
unnecessarily.

Signed-off-by: NeilBrown <[email protected]>
Fixes: ea664c8245f3 ("md/raid5: need_this_block: tidy/fix last condition.")

md/raid5: more incorrect BUG_ON in handle_stripe_fill.

It is not incorrect to call handle_stripe_fill() when
a batch of full-stripe writes is active.
It is, however, a BUG if fetch_block() then decides
it needs to actually fetch anything.

So move the 'BUG_ON' to where it belongs.

Signed-off-by: NeilBrown <[email protected]>
Fixes: 59fc630b8b5f ("RAID5: batch adjacent full stripe write")

md/raid5: new alloc_stripe() to allocate an initialize a stripe.

The new batch_lock and batch_list fields are being initialized in
grow_one_stripe() but not in resize_stripes(). This causes a crash
on resize.

So separate the core initialization into a new function and call it
from both allocation sites.

Signed-off-by: NeilBrown <[email protected]>
Fixes: 59fc630b8b5f ("RAID5: batch adjacent full stripe write")

md-raid0: conditional mddev->queue access to suit dm-raid

This patch is a prerequisite for dm-raid "raid0" support to allow
dm-raid to access the MD RAID0 personality doing unconditional
accesses to mddev->queue, which is NULL in case of dm-raid stacked on
top of MD.

Most of the conditional mddev->queue accesses made it to upstream but
this missing one, which prohibits md raid0 to set disk stack limits
(being done in dm core in case of md underneath dm).

Signed-off-by: Heinz Mauelshagen <[email protected]>
Tested-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: NeilBrown <[email protected]>

mmc: dw_mmc: dw_mci_get_cd check MMC_CAP_NONREMOVABLE

When non-removable is used for emmc, MMC_CAP_NONREMOVABLE should
also be checked, otherwise detection fail since present=0

Signed-off-by: Zhangfei Gao <[email protected]>
Signed-off-by: Jaehoon Chung <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: dw_mmc: init desc in dw_mci_idmac_init

Set 0 to des1 in 32bit case.
Otherwise the random value of des1 will be used in
dw_mci_translate_sglist: IDMAC_SET_BUFFER1_SIZE(desc, length)

Signed-off-by: Fei Wang <[email protected]>
Signed-off-by: Zhangfei Gao <[email protected]>
Signed-off-by: Jaehoon Chung <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

Merge tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"These include three regression fixes (PCI resources management,
  ACPI/PNP device enumeration, ACPI SBS on MacBook) and two ACPI
  documentation fixes related to GPIO.

  Specifics:

   - Fix for a PCI resources management regression introduced during the
     4.0 cycle and related to the handling of ACPI resources'
     Producer/Consumer flags that turn out to be useless (Jiang Liu)

   - Fix for a MacBook regression related to the Smart Battery Subsystem
     (SBS) driver causing various problems (stalls on boot, failure to
     detect or report battery) to happen and introduced during the 3.18
     cycle (Chris Bainbridge)

   - Fix for an ACPI/PNP device enumeration regression introduced during
     the 3.16 cycle caused by failing to include two PNP device IDs into
     the list of IDs that PNP device objects need to be created for
     (Witold Szczeponik)

   - Fixes for two minor mistakes in the ACPI GPIO properties
     documentation (Antonio Ospite, Rafael J Wysocki)"

* tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus

Merge branches 'acpi-resources', 'acpi-battery', 'acpi-doc' and 'acpi-pnp'

* acpi-resources:
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus

* acpi-battery:
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook

* acpi-doc:
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources

* acpi-pnp:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration

Merge tag 'for-f2fs-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:
"Fix a performance regression and a bug"

* tag 'for-f2fs-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: fix wrong error hanlder in f2fs_follow_link
Revert "f2fs: enhance multi-threads performance"

ARM: multi_v7_defconfig: Select more FSL SoCs

Select IMX50, IMX6SX and LS1021A SoC support.

Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

MAINTAINERS: replace an AT91 maintainer

As some help is needed from an active maintainer, replace Andrew Victor
by Alexandre Belloni in the ARM/Atmel MAINTAINERS' entry (aka AT91).
Add an entry to the CREDITS file.

Thanks Andrew for the great role you played during the early days of this
product family.

Signed-off-by: Nicolas Ferre <[email protected]>
Acked-by: Andrew Victor <[email protected]>
Acked-by: Alexandre Belloni <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

drivers: CCI: fix used_mask init in validate_group()

Currently in validate_group(), there is a static initializer
for fake_pmu.used_mask which is based on CPU_BITS_NONE but
the used_mask array size is based on CCI_PMU_MAX_HW_EVENTS.
CCI_PMU_MAX_HW_EVENTS is not based on NR_CPUS, so CPU_BITS_NONE
is not correct and will cause a build failure if NR_CPUS
is set high enough to make CPU_BITS_NONE larger than used_mask.

Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Mark Salter <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'stericsson-fixes-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into fixes

Merge "Ux500 fixes" from Linus Walleij:

This fixes an MMC/SD configuration issue present for some time
in the Ux500 DT but triggered by proper error handling in v4.1-rc1.

* tag 'stericsson-fixes-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
  ARM: ux500: Enable GPIO regulator for SD-card for snowball
  ARM: ux500: Enable GPIO regulator for SD-card for HREF boards
  ARM: ux500: Move GPIO regulator for SD-card into board DTSs

Merge tag 'renesas-fixes-for-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Merge "Renesas ARM Based SoC Fixes for v4.1" from Simon Horman:

* Fix adv7511 IRQ sensing on koelsch board

* tag 'renesas-fixes-for-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: koelsch: Fix adv7511 IRQ sensing

Merge tag 'omap-for-v4.1/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Merge "omap fixes against v4.1-rc1" from Tony Lindgren:

Fixes for omaps, mostly a fix for power power consumption
creeping up during idle, and two l3-noc device fixes:

- Fix power consumption creeping up with I2C4 staying on
- Fix n900 microphone bias voltages
- Fix dra7 l3-noc for host clock
- Fix omap5 l3-noc id address decoding

The rest are all just minor dts fixes:

- Fix changed EXTCON_USB_GPIO_USB in defconfig
- Fix missing isp and iva #iommu-cells property
- Various beagle x15 dts fixes for pre-production changes
- Fix am437x-sk display dts entries

* tag 'omap-for-v4.1/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  bus: omap_l3_noc: Fix master id address decoding for OMAP5
  bus: omap_l3_noc: Fix offset for DRA7 CLK1_HOST_CLK1_2 instance
  ARM: dts: dra7: Fix efuse register size for ABB
  ARM: dts: am57xx-beagle-x15: Switch GPIO fan number
  ARM: dts: am57xx-beagle-x15: Switch UART mux pins
  ARM: dts: am437x-sk: reduce col-scan-delay-us
  ARM: dts: am437x-sk: fix for new newhaven display module revision
  ARM: dts: am57xx-beagle-x15: Fix RTC aliases
  ARM: dts: am57xx-beagle-x15: Fix IRQ type for mcp7941x
  ARM: dts: omap3: Add #iommu-cells to isp and iva iommu
  ARM: omap2plus_defconfig: Enable EXTCON_USB_GPIO
  ARM: dts: OMAP3-N900: Add microphone bias voltages
  ARM: OMAP2+: Fix omap off idle power consumption creeping up

Merge tag 'arm-soc/for-4.1/maintainers' of http://github.com/broadcom/stblinux into fixes

Merge "MAINTAINERS update for Broadcom SoCs for 4.1 #2" from Florian Fainelli:

This pull request contains 3 changes to the MAINTAINERS file for Broadcom SoCs:

- add Ray and Scott for mach-bcm
- remove Christian for mach-bcm
- remove Marc for brcmstb

* tag 'arm-soc/for-4.1/maintainers' of http://github.com/broadcom/stblinux:
  MAINTAINERS: Update brcmstb entry
  MAINTAINERS: Remove Christian Daudt for mach-bcm
  MAINTAINERS: Update mach-bcm maintainers list

Merge tag 'mvebu-fixes-4.1' of git://git.infradead.org/linux-mvebu into fixes

Pull "mvebu fix for 4.1" from Gregory CLEMENT:

Disable the unused internal RTC in the dts of the OpenBlock AX3

* tag 'mvebu-fixes-4.1' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: armada-xp-openblocks-ax3-4: Disable internal RTC