Git Repo - linux.git/log

ipv4: Fix ip_queue_xmit to pass sk into ip_local_out_sk

After a packet has been encapsulated by a tunnel we should use the
tunnel sockets local multicast loopback flag to control if the
encapsulated packet should be locally loopback back.

Pass sk into ip_local_out_sk so that in the rare case we are dealing
with a tunneled packet whose tunnel destination address is a multicast
address the kernel properly decides to loopback this packet.

In practice I don't think this matters as ip_queue_xmit is used by
tcp, l2tp and sctp none of which I am aware of uses ip level
multicasting as they are all point to point communications protocols.
Let's fix this before someone uses ip_queue_xmit for a tunnel protocol
that does use multicast.

Fixes: aad88724c9d5 ("ipv4: add a sock pointer to dst->output() path.")
Fixes: b0270e91014d ("ipv4: add a sock pointer to ip_queue_xmit()")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: Fix ip_local_out_sk by passing the sk into __ip_local_out_sk

In the rare case where sk != skb->sk ip_local_out_sk arranges
to call dst->output differently if the skb is queued or not.
This is a bug.

Fix this bug by passing the sk parameter of ip_local_out_sk through
from ip_local_out_sk to __ip_local_out_sk (skipping __ip_local_out).

Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-10-07

This series contains updates to i40e and i40evf only.

Paul updates i40e to simply increase the amount of time we wait for a
reset to complete since we have seen in some rare occasions the reset
can take longer to complete.

Shannon updates the driver to turn on Wake-on-LAN by default if it is
enabled in the hardware config to begin with, rather than always disable
it and wait for the user to expressly turn it on.  Added new device id's
and support for future devices.  Fixed a possible type compare problem
between a size and possible negative number.  Also fixed a shift value
that was wrong, which ended up with a bad bitmask.  Did general house
cleaning of the driver to cleanup several low lying fruit in the
driver.  Fixed an issue where new unicast address's would be added to
the VSI list and then immediately removed and would never actually
make it down to the hardware.  Resolved the issue by removing the
separation from unicast and multicast in the search for filters to be
deleted.

Mitch fixes an issue where the hardware would continue to access the
memory formerly used by the rings for a VF which have been removed,
causing memory corruption or DMAR errors.  To relieve this condition,
explicitly stop all rings associated with each VF before releasing its
resources.  Also fixed a panic if the driver is unable to enable MSI-X
or its unable to acquire enough vectors, so propagate interrupt
allocation failure information to the calling function.  Cleaned up
opcode that is not required.

Carolyn extends the size of the test available for the interrupt names
so that all the descriptive data available for the Flow Director
interrupts is not truncated.

Catherine fixes an issue where there was a possibility of speed getting
set to 0 if advertised is set to 0 (which is the case when autoneg is
disabled).

Jesse fixes the checksum on big endian machines, so added code to swap
it correctly.  Also fixed a bug in the return from get_link_status()
where only true or false was being returned, but false could mean
multiple things.  So allow the caller to get all the return values
in the call chain bubbled back to the source so that the reason for
the failure does not get lost.

Anjali adds statistics to keep track of how many times we ask the stack
to linearize the SKB because the hardware cannot handle SKBs with more
than 8 frags per segment/single packet.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge tag 'regmap-offload-update-bits' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

regmap: Allow buses to provide a custom update_bits() operation

Some buses provide a native _update_bits() operation which for uncached
registers is faster than doing a read/modify/write cycle as it is a
single bus transaction. Add support for implementing this to regmap.

Bluetooth: 6lowpan: Remove unnecessary chan_get() function

The chan_get() function just adds unnecessary indirection to calling
the chan_create() call. The only added value it gives is the chan->ops
assignment, but that can equally well be done in the calling code.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: 6lowpan: Rename confusing 'pchan' variables

The typical convention when having both a child and a parent channel
variable is to call the former 'chan' and the latter 'pchan'. When
there's only one variable it's called chan. Rename the 'pchan'
variables in the 6lowpan code to follow this convention.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: 6lowpan: Remove unnecessary chan_open() function

All the chan_open() function now does is to call chan_create() so it
doesn't really add any value.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: 6lowpan: Remove redundant BT_CONNECTED assignment

The L2CAP core code makes sure of setting the channel state to
BT_CONNECTED, so there's no need for the implementation code (6lowpan
in this case) to do it.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: 6lowpan: Remove redundant (and incorrect) MPS assignments

The L2CAP core code already sets the local MPS to a sane value. The
remote MPS value otoh comes from the remote side so there's no point
in trying to hard-code it to any value.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: 6lowpan: Fix imtu & omtu values

The omtu value is determined by the remote peer so there's no point in
trying to hard-code it to any value. The IPSP specification otoh gives
a more reasonable value for the imtu, i.e. 1280.

Signed-off-by: Johan Hedberg <[email protected]>
Acked-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>

Bluetooth: Enforce packet types in hci_recv_frame driver function

When calling the hci_recv_frame driver function check for valid packet
types that the core should process. This should catch issues with
drivers trying to feed vendor packet types through this interface.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: bpa10x: Use h4_recv_buf helper for frame reassembly

The manually coded frame reassembly is actually broken. The h4_recv_buf
helper from the UART driver is a perfect fit for frame reassembly for
this driver. So just export that function and use it here as well.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: bpa10x: Add support for set_diag driver callback

The BPA-10x devices support tracing operation. Use the set_diag driver
callback to allow enabling and disabling that functionality.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: bpa10x: Read revision information in setup stage

For debugging pruposes, read the revision string of the BPA-10x devices
and print it. For example one of the latest devices respond with the
string SNIF_102,BB930,02/01/18,10:37:56.

  < HCI Command: Vendor (0x3f|0x000e) plen 1
          07                                               .
  > HCI Event: Command Complete (0x0e) plen 49
        Vendor (0x3f|0x000e) ncmd 1
          Status: Success (0x00)
          53 4e 49 46 5f 31 30 32 2c 42 42 39 33 30 2c 30  SNIF_102,BB930,0
          32 2f 30 31 2f 31 38 2c 31 30 3a 33 37 3a 35 36  2/01/18,10:37:56
          00 00 00 00 00 00 00 00 00 00 00 00 00           .............

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: Fix interaction of HCI_QUIRK_RESET_ON_CLOSE and HCI_AUTO_OFF

When the controller requires the HCI Reset command to be send when
closing the transport, the HCI_AUTO_OFF needs to be accounted for. The
current code tries to actually do that, but the flag gets cleared to
early. So store its value and use it that stored value instead of
checking for a flag that is always cleared.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: hci_bcm: Enable support for set_diag driver callback

The set_diag driver callback allows enabling and disabling the vendor
specific diagnostic information. Since Broadcom chips have support for
a dedicated LM_DIAG channel, hook it up accordingly.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: Add debugfs entry for setting vendor diagnostic mode

This adds a new debugfs entry for enabling and disabling the vendor
diagnostic mode. It is only exposed for drivers that provide the
set_diag driver callback and actually have an option for vendor
specific diagnostic information.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: hci_bcm: Enable parsing of LM_DIAG messages

The Broadcom UART based controllers can send LM_DIAG messages with the
identifier 0x07 inside the HCI stream. These messages are 63 octets in
size and have no variable payload or length indicator.

This patch adds correct parsing information for the h4_recv_buf handler
and in case these packets are received, they are forwarded to the
Bluetooth core via hci_recv_diag interface.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: Add support for vendor specific diagnostic channel

Introduce hci_recv_diag function for HCI drivers to allow sending vendor
specific diagnostic messages into the Bluetooth core stack. The messages
are not processed, but they are forwarded to the monitor channel and can
be retrieved by user space diagnostic tools.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

Bluetooth: Send index information updates to monitor channel

The Bluetooth public device address might change during controller setup
and it makes it a lot simpler for monitoring tools if they just get told
what the new address is. In addition include the manufacturer / company
information of the controller. That allows for easy vendor specific HCI
command and event handling.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

i40e/i40evf: remove unused opcode

This opcode is not required. VFs that program RSS through the firmware
do it by interacting directly with the firmware, and do not need to use
the virtual channel for this functionality.

Change-ID: Iaf17d2600e28ff1b6be8653f2fe9df1facd23b0e
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40evf: propagate interrupt allocation failure

Lower level functions are properly reporting errors, and higher-level
functions are correctly responding to errors, but the errors aren't
actually getting through. Typically, the middle-manager function seems
to want to shield its boss from any bad news.

This change fixes a panic if the driver is unable to enable MSI-X or is
unable to acquire enough vectors.

Change-ID: Ifd5787ce92519a5d97e4b465902db930d97b71a1
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Additional checks for CEE APP priority validity

The firmware has added additional status information to allow software
to determine if the APP priority for FCoE/iSCSI/FIP is valid or not in
CEE DCBX mode.

This patch adds to support those additional checks and will only add
applications to the software table that have oper and sync bits set
without any error.

Change-ID: I0a76c52427dadf97d4dba4538a3068d05e4eb56b
Signed-off-by: Neerav Parikh <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: Add a stat to keep track of linearization count

Keep track of how many times we ask the stack to linearize the
skb because the HW cannot handle skbs with more than 8 frags per
segment/single packet.

Change-ID: If455452060963a769bbe6112cba952e79e944b52
Signed-off-by: Anjali Singhai Jain <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: fix unicast mac address add

When using something like "ip maddr add ..." to add another unicast mac
address to the netdev, the mac address comes into the set_rx_mode handler
in the multicast list whether it is a unicast or multicast address.
This was confusing the code when it was trying to search for addresses
that needed to be deleted from the VSI, because it was looking for the
VSI unicast address in the netdev unicast list.  The result was that a
new unicast address would get added to the VSI list and then immediately
removed, and would never actually make it down into the hardware.

This patch removes the separation from unicast and multicast in the search
for filters to be deleted.  It also simplifies the logic a little with a
jump to the bottom of the loop when an address is found.  Now it doesn't
matter which netdev list the address is hiding in, we'll check them all.

Change-ID: Ie3685a92427ae7d2212bf948919ce295bc7a874c
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fix bug in return from get_link_status and avoid spurious link messages

Previously, the driver could call this function and have only true/false
returned, but false could mean multiple things like failure to read
or link was down. This change allows the caller to get all return values
in the call chain bubbled back to the source, which keeps information about
failures from being lost.

Also, in some unlikely scenarios, the firmware can become slow to respond
to admin queue (AQ) queries for link state. Should the AQ time out,
the driver can detect the state and avoid a link change when there
may have been none.

Change-ID: Ib2ac38407b7880750fb891b392fa77457fe6c21c
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: add little endian conversion for checksum

The checksum is not correct on big endian machines so add code to swap it
correctly.

Change-ID: Ic92b886d172a2cbe49f5d7eee1bc78e447023c7b
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Paul M Stillwell Jr <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: give up the __func__

During early development, we added the function name to all of the error
strings to make debugging simpler. Now that we've released the driver,
our users should have more comprehensible error messages. So tear the
roof off and give up the __func__. Ow.

Change-ID: I7e1766252c7a032b9af6520da6aff536bdfd533c
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Never let speed get set to 0 in get_settings

In ethtool, there is a possibility of speed getting set to 0
if advertise is set to 0 (which it is when autoneg is disabled).
We never want this to happen as the firmware will actually attempt
to set the speed to 0 sending link down, so add an extra check
to make sure this doesn't happen.

Change-ID: I62e0eeee2cbf043d8e6f5c9c9f0b92794e877f01
Signed-off-by: Catherine Sullivan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Fix for truncated interrupt name

This patch extends the size of the text available for the interrupt names.
Without this patch, all the descriptive data available for the Flow
Director interrupts is truncated.

Change-ID: I2ac458f23ac3b4ea8f1edf73edc283b1d3704c7f
Signed-off-by: Carolyn Wyborny <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: assure clean asq status report

There was a possibility where the asq_last_status could get through without
update and thus report a previous error. I don't think we've actually seen
this happen, but this patch will help make sure it doesn't.

Change-ID: I9e33927052a5ee6ea21f80b66d4c4b76c2760b17
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Christopher Pau <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: make i40e_init_pf_fcoe to void

i40e_init_pf_fcoe() didn't return anything except 0, it prints enough
error info already, and no driver logic depends on the return value,
so this can be void.

Change-ID: Ie6afad849857d87a7064c42c3cce14c74c2f29d8
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fix bad CEE status shift value

Fix a shift value that was wrong, ending up with a bad bitmask. Also add
a blank line between two sets of #defines for better readability.

Change-ID: I3e41fa2a2ab904d3a4e6cbf13972ab0036a10601
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: fix a potential type compare issue

Rework an if expression to assure there is no type compare problem between
a size and a possible negative number.

Change-ID: I4921fcc96abfcf69490efce020a9e4007f251c99
Reported-by: Helin Zhang <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e/i40evf: add driver support for new device ids

Early addition of new a device id.

Change-ID: I61a8c8556fdf4f5714be4e4089689e374f30293c
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: stop VF rings

Explicitly stop the rings belonging to each VF when disabling SR-IOV.
Even though the VFs were gone, and the associated VSIs were removed, the
rings were not stopped, and in some circumstances the hardware would
continue to access the memory formerly used by the rings, causing memory
corruption or DMAR errors, both of which would lead to general malaise
of the kernel.

To relieve this condition, explicitly stop all the rings associated with
each VF before releasing its resources.

Change-ID: I78c05d562c66e7b594b7e48d67860f49b3e5b6ec
Signed-off-by: Mitch Williams <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: enable WoL operation if config bit show WoL capable

The driver was disabling Wake-on-LAN by default and waiting for the user
to expressly turn it on. This patch has the driver turning on WoL from
the start if enabled in the hardware config, which matches the behavior
of our other drivers.

Change-ID: I43faedb907f8ba4d1a61b72a7c86072b97af12b1
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Increase the amount of time we wait for reset to be done

In some rare cases the reset can take longer to complete so increase the
amount of time we wait.

Change-ID: Ib5628ec54b526a811ee33d1214fe763226406671
Signed-off-by: Paul M Stillwell Jr <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

tcp: ensure prior synack rtx behavior with small backlogs

Some applications use a listen() backlog of 1.

Prior kernels were silently enforcing a qlen_log of 4, so that we were
sending up to /proc/sys/net/ipv4/tcp_synack_retries SYNACK messages.

Fixes: ef547f2ac16b ("tcp: remove max_qlen_log")
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv4: tcp.c Fixed an assignment coding style issue

Fixed an assignment coding style issue

Signed-off-by: Yuvaraja Mariappan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 's390-net'

Ursula Braun says:

====================
s390: qeth patches for net-next

here are some s390 related patches for net-next. The qeth patches
are performance optimizations in the driver. The qdio patch corrects
a warning condition.
====================

Signed-off-by: David S. Miller <[email protected]>

s390/qdio: fix WARN_ON_ONCE condition

If HiperSockets Completion Queueing is enabled, qdio always
issues a warning, since the condition is always met.
This patch fixes the condition in WARN_ON_ONCE that was always
true.

Signed-off-by: Eugene Crosser <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: optimize MAC handling in rx_mode callback

In layer2 mode of the qeth driver, MAC address lists
from struct net_device require mapping to the OSA-card.
The existing implementation is inefficient for lists with
more than several MAC addresses, since for every
ndo_set_rx_mode callback it removes all MAC addresses first,
and then registers the current MAC address list.
This patch changes implementation of ndo_set_rx_mode callback
in qeth, only performing hardware registration/removal for
new/deleted addresses. To shorten lookup of MAC addresses
registered addresses are kept in a hashtable instead of a
linear list.

Signed-off-by: Lakhvich Dmitriy <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Reviewed-by: Eugene Crosser <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Tested-by: Christian Borntraeger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: switch to napi_gro_receive

Add support for GRO (generic receive offload) in the layer 2
part of device driver qeth. This results in a performance
improvement when GRO and RX is turned on.

Signed-off-by: Thomas Richter <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'bridge-netlink-port-attrs'

Nikolay Aleksandrov says:

====================
bridge: netlink: complete port attribute support

This is the second set that completes the bridge port's netlink support and
makes everything from sysfs available via netlink. I've used sysfs as a
guide of what and how to set again. I've tested setting/getting every
option and also this time tested enabling KASAN. Again there're a few long
line warnings about the ifla attribute names in br_port_info_size() but
as the previous set - it's good to know what's been accounted for.
====================

Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: add support for port's multicast_router attribute

Add IFLA_BRPORT_MULTICAST_ROUTER to allow setting/getting port's
multicast_router via netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: allow to flush port's fdb

Add IFLA_BRPORT_FLUSH to allow flushing port's fdb similar to sysfs's
flush.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's timer values

Add the following attributes in order to export port's timer values:
IFLA_BRPORT_MESSAGE_AGE_TIMER, IFLA_BRPORT_FORWARD_DELAY_TIMER and
IFLA_BRPORT_HOLD_TIMER.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's topology_change_ack and config_pending

Add IFLA_BRPORT_TOPOLOGY_CHANGE_ACK and IFLA_BRPORT_CONFIG_PENDING to
allow getting port's topology_change_ack and config_pending respectively
via netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's id and number

Add IFLA_BRPORT_(ID|NO) to allow getting port's port_id and port_no
respectively via netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's designated cost and port

Add IFLA_BRPORT_DESIGNATED_(COST|PORT) to allow getting the port's
designated cost and port respectively via netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's bridge id

Add IFLA_BRPORT_BRIDGE_ID to allow getting the designated bridge id via
netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: export port's root id

Add IFLA_BRPORT_ROOT_ID to allow getting the designated root id via
netlink.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Lookup actual route when oif is VRF device

If the user specifies a VRF device in a get route query the custom route
pointing to the VRF device is returned:

    $ ip route ls table vrf-red
    unreachable default
    broadcast 10.2.1.0 dev eth1  proto kernel  scope link  src 10.2.1.2
    10.2.1.0/24 dev eth1  proto kernel  scope link  src 10.2.1.2
    local 10.2.1.2 dev eth1  proto kernel  scope host  src 10.2.1.2
    broadcast 10.2.1.255 dev eth1  proto kernel  scope link  src 10.2.1.2

    $ ip route get oif vrf-red 10.2.1.40
    10.2.1.40 dev vrf-red
        cache

Add the flags to skip the custom route and go directly to the FIB. With
this patch the actual route is returned:

    $ ip route get oif vrf-red 10.2.1.40
    10.2.1.40 dev eth1  src 10.2.1.2
        cache

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mac80211-next-for-davem-2015-10-05' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
For the current cycle, we have the following right now:
* many internal fixes, API improvements, cleanups, etc.
* full AP client state tracking in cfg80211/mac80211 from Ayala
* VHT support (in mac80211) for mesh
* some A-MSDU in A-MPDU support from Emmanuel
* show current TX power to userspace (from Rafał)
* support for netlink dump in vendor commands (myself)
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'l3mdev_saddr_op'

David Ahern says:

====================
net: Add saddr op to l3mdev and vrf

First 2 patches are re-sends of patches that got lost in the ethosphere
Tuesday; they were part of the first round of l3mdev conversions.
Next 3 handle the source address lookup for raw and datagram sockets
bound to a VRF device.

The conversion to the get_saddr op also fixes locally originated TCP
packets showing up at the VRF device. The use of the FLOWI_FLAG_L3MDEV_SRC
flag in ip_route_connect_init was causing locally generated packets
to skip the VRF device.

v2
- rebased to top of net-next per device delete fix and hash based
multipath patches
====================

Signed-off-by: David S. Miller <[email protected]>

net: Add l3mdev saddr lookup to raw_sendmsg

ping originated on box through a VRF device is showing up in tcpdump
without a source address:
    $ tcpdump -n -i vrf-blue
    08:58:33.311303 IP 0.0.0.0 > 10.2.2.254: ICMP echo request, id 2834, seq 1, length 64
    08:58:33.311562 IP 10.2.2.254 > 10.2.2.2: ICMP echo reply, id 2834, seq 1, length 64

Add the call to l3mdev_get_saddr to raw_sendmsg.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add source address lookup op for VRF

Add operation to l3mdev to lookup source address for a given flow.
Add support for the operation to VRF driver and convert existing
IPv4 hooks to use the new lookup.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Refactor path selection in __ip_route_output_key_hash

VRF device needs the same path selection following lookup to set source
address. Rather than duplicating code, move existing code into a
function that is exported to modules.

Code move only; no functional change.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add netif_is_l3_slave

IPv6 addrconf keys off of IFF_SLAVE so can not use it for L3 slave.
Add a new private flag and add netif_is_l3_slave function for checking
it.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Fix vti use case with oif in dst lookups for IPv6

It occurred to me yesterday that 741a11d9e4103 ("net: ipv6: Add
RT6_LOOKUP_F_IFACE flag if oif is set") means that xfrm6_dst_lookup
needs the FLOWI_FLAG_SKIP_NH_OIF flag set. This latest commit causes
the oif to be considered in lookups which is known to break vti. This
explains why 58189ca7b274 did not the IPv6 change at the time it was
submitted.

Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gianfar: Add WAKE_UCAST and "wake-on-filer" support

This enables eTSEC's filer (Rx parser) and the FGPI Rx
interrupt (Filer General Purpose Interrupt) as a wakeup
source event.

Upon entering suspend state, the eTSEC filer is given
a rule to match incoming L2 unicast packets. A packet
matching the rule will be enqueued in the Rx ring and
a FGPI Rx interrupt will be asserted by the filer to
wakeup the system. Other packet types will be dropped.
On resume the filer table is restored to the content
before entering suspend state.
The set of rules from gfar_filer_config_wol() could be
extended to implement other WoL capabilities as well.

The "fsl,wake-on-filer" DT binding enables this capability
on certain platforms that feature the necessary power
management infrastructure, targeting mainly printing and
imaging applications.
(refer to Power Management section of the SoC Ref Man)

Cc: Li Yang <[email protected]>
Cc: Zhao Chenhui <[email protected]>
Signed-off-by: Claudiu Manoil <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

powerpc: dts: p1022si: Add fsl,wake-on-filer for eTSEC

Enable the "wake-on-filer" (aka. wake on user defined packet)
wake on lan capability for the eTSEC ethernet nodes.

Cc: Li Yang <[email protected]>
Cc: Zhao Chenhui <[email protected]>
Signed-off-by: Claudiu Manoil <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

doc: dt: net: Add fsl,wake-on-filer for eTSEC

Add the "fsl,wake-on-filer" property for eTSEC nodes to
indicate that the system has the power management
infrastructure needed to be able to wake up the system
via FGPI (filer, aka. h/w rx parser) interrupt.

Cc: Li Yang <[email protected]>
Cc: Zhao Chenhui <[email protected]>
Signed-off-by: Claudiu Manoil <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ovs-ipv6-tunnel'

Jiri Benc says:

====================
openvswitch: add IPv6 tunneling support

This builds on the previous work that added IPv6 support to lwtunnels and
adds IPv6 tunneling support to ovs.

To use IPv6 tunneling, there needs to be a metadata based tunnel net_device
created and added to the ovs bridge. Currently, only vxlan is supported by
the kernel, with geneve to follow shortly. There's no need nor intent to add
a support for this into the vport-vxlan (etc.) compat layer.

v3: dropped the last two patches added in v2.
====================

Signed-off-by: David S. Miller <[email protected]>

openvswitch: netlink attributes for IPv6 tunneling

Add netlink attributes for IPv6 tunnel addresses. This enables IPv6 support
for tunnels.

Signed-off-by: Jiri Benc <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

openvswitch: add tunnel protocol to sw_flow_key

Store tunnel protocol (AF_INET or AF_INET6) in sw_flow_key. This field now
also acts as an indicator whether the flow contains tunnel data (this was
previously indicated by tun_key.u.ipv4.dst being set but with IPv6 addresses
in an union with IPv4 ones this won't work anymore).

The new field was added to a hole in sw_flow_key.

Signed-off-by: Jiri Benc <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: netlink: make br_fill_info's frame size smaller

When KASAN is enabled the frame size grows > 2048 bytes and we get a
warning, so make it smaller.
net/bridge/br_netlink.c: In function 'br_fill_info':
>> net/bridge/br_netlink.c:1110:1: warning: the frame size of 2160 bytes
>> is larger than 2048 bytes [-Wframe-larger-than=]

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add support for filtering neigh dump by device index

Add support for filtering neighbor dumps by device by adding the
NDA_IFINDEX attribute to the dump request.

Signed-off-by: David Ahern <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-10-03

This series contains updates to i40e and i40evf, some of which are to
resolve more Red Hat bugzilla issues.

Jiang Liu updates the i40e and i40evf drivers to use numa_mem_id()
instead of numa_node_id() to get the nearest node with memory which
better supports memoryless nodes.

Anjali fixes an issue from Dan Carpenter <[email protected]>,
to resolve a memory leak in X722 RSS configuration path, where we should
free the memory allocated before exiting.

Shannon modifies the drivers to ensure we have the spinlocks before we
clear the ARQ and ASQ management registers.  In addition, we widen the
locked portion insert a sanity check to ensure we are working with safe
register values.

Mitch fixes an issue where under certain circumstances, we can get an
extra VF_RESOURCES message from the PF driver at runtime.  When this
occurs, we need to parse it because our VSI may have changed and that
will affect the relationship with the PF driver.  But this parsing also
blows away our current MAC address, so resolve the issue by restoring
the current MAC address from the netdev struct after we parse the
resource message.
====================

Signed-off-by: David S. Miller <[email protected]>

net: dsa: better error reporting

Add additional error reporting to the generic DSA code, so it's easier
to debug when things go wrong. This was useful when initially bringing
up 88e6176 on a new board.

Signed-off-by: Russell King <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: remove link polling

The link status is polled by the generic phy layer, there's no need to
duplicate that polling with additional polling. This additional polling
adds additional MDIO traffic, and races with the generic phy layer,
resulting in missing or duplicated link status messages.

Tested-by: Andrew Lunn <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Bluetooth: btbcm: Read the local name in setup stage

The Broadcom Bluetooth controllers have the chip name included in the
ROM firmware or later in the patchram firmware. For debugging purposes
read the local name and print it out. This is only done during setup
stage and only once before loading the firmware and once after loading
the firmware.

For the Broadcom based controllers from Apple, the name is only read once
after determining the chip id.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>

regmap: Allow installing custom reg_update_bits function

This commit allows installing a custom reg_update_bits function for cases where
the hardware provides a mechanism to set or clear register bits without a
read/modify/write cycle. Such is the case with the Microchip ENCX24J600.

If a custom reg_update_bits function is provided, it will only be used against
volatile registers.

Signed-off-by: Jon Ringle <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

Revert "regmap: Allow installing custom reg_update_bits function"

This reverts commit 7741c373cf3ea1f5383fa97fb7a640a429d3dd7c.

Revert "net: Microchip encx24j600 driver"

This reverts commit 04fbfce7a222327b97ca165294ef19f0faa45960.

Revert "net: encx24j600_exit() can be static"

This reverts commit 9886ce2b9d4e5a8bb3d78d0f7eff3c0f1ed58d67.

ipv4: Fix compilation errors in fib_rebalance

This fixes

net/built-in.o: In function `fib_rebalance':
fib_semantics.c:(.text+0x9df14): undefined reference to `__divdi3'

and

net/built-in.o: In function `fib_rebalance':
net/ipv4/fib_semantics.c:572: undefined reference to `__aeabi_ldivmod'

Fixes: 0e884c78ee19 ("ipv4: L3 hash-based multipath")
Signed-off-by: Peter Nørlund <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

RDS: IB: split mr pool to improve 8K messages performance

8K message sizes are pretty important usecase for RDS current
workloads so we make provison to have 8K mrs available from the pool.
Based on number of SG's in the RDS message, we pick a pool to use.

Also to make sure that we don't under utlise mrs when say 8k messages
are dominating which could lead to 8k pull being exhausted, we fall-back
to 1m pool till 8k pool recovers for use.

This helps to at least push ~55 kB/s bidirectional data which
is a nice improvement.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: use max_mr from HCA caps than max_fmr

All HCA drivers seems to popullate max_mr caps and few of
them do both max_mr and max_fmr.

Hence update RDS code to make use of max_mr.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: mark rds_ib_fmr_wq static

Fix below warning by marking rds_ib_fmr_wq static

net/rds/ib_rdma.c:87:25: warning: symbol 'rds_ib_fmr_wq' was not declared. Should it be static?

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: use already available pool handle from ibmr

rds_ib_mr already keeps the pool handle which it associates
with. Lets use that instead of round about way of fetching
it from rds_ib_device.

No functional change.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: fix the rds_ib_fmr_wq kick call

RDS IB mr pool has its own workqueue 'rds_ib_fmr_wq', so we need
to use queue_delayed_work() to kick the work. This was hurting
the performance since pool maintenance was less often triggered
from other path.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: handle rds_ibdev release case instead of crashing the kernel

Just in case we are still handling the QP receive completion while the
rds_ibdev is released, drop the connection instead of crashing the kernel.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: split send completion handling and do batch ack

Similar to what we did with receive CQ completion handling, we split
the transmit completion handler so that it lets us implement batched
work completion handling.

We re-use the cq_poll routine and makes use of RDS_IB_SEND_OP to
identify the send vs receive completion event handler invocation.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: IB: ack more receive completions to improve performance

For better performance, we split the receive completion IRQ handler. That
lets us acknowledge several WCE events in one call. We also limit the WC
to max 32 to avoid latency. Acknowledging several completions in one call
instead of several calls each time will provide better performance since
less mutual exclusion locks are being performed.

In next patch, send completion is also split which re-uses the poll_cq()
and hence the code is moved to ib_cm.c

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: use rds_send_xmit() state instead of RDS_LL_SEND_FULL

In Transport indepedent rds_sendmsg(), we shouldn't make decisions based
on RDS_LL_SEND_FULL which is used to manage the ring for RDMA based
transports. We can safely issue rds_send_xmit() and the using its
return value take decision on deferred work. This will also fix
the scenario where at times we are seeing connections stuck with
the LL_SEND_FULL bit getting set and never cleared.

We kick krdsd after any time we see -ENOMEM or -EAGAIN from the
ring allocation code.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

RDS: defer the over_batch work to send worker

Current process gives up if its send work over the batch limit.
The work queue will get kicked to finish off any other requests.
This fixes remainder condition from commit 443be0e5affe ("RDS: make
sure not to loop forever inside rds_send_xmit").

The restart condition is only for the case where we reached to
over_batch code for some other reason so just retrying again
before giving up.

While at it, make sure we use already available 'send_batch_count'
parameter instead of magic value. The batch count threshold value
of 1024 came via commit 443be0e5affe ("RDS: make sure not to loop
forever inside rds_send_xmit"). The idea is to process as big a
batch as we can but at the same time we don't hold other waiting
processes for send. Hence back-off after the send_batch_count
limit (1024) to avoid soft-lock ups.

Signed-off-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Santosh Shilimkar <[email protected]>

mac80211: make ieee80211_new_mesh_header return unsigned

The function returns always non-negative values.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

ebpf: include perf_event only where really needed

Commit ea317b267e9d ("bpf: Add new bpf map type to store the pointer
to struct perf_event") added perf_event.h to the main eBPF header, so
it gets included for all users. perf_event.h is actually only needed
from array map side, so lets sanitize this a bit.

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Kaixu Xia <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

For ARMv7 with UDIV instruction support, generate an UDIV instruction
followed by an MLS instruction.

For other ARM variants, generate code calling a C wrapper similar to
the jit_udiv() function used for BPF_ALU | BPF_DIV instructions.

Some performance numbers reported by the test_bpf module (the duration
per filter run is reported in nanoseconds, between "jitted:<x>" and
"PASS":

ARMv7 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2196 PASS
ARMv7 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 104 PASS
ARMv5 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2176 PASS
ARMv5 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 1104 PASS
ARMv5 kirkwood nojit: test_bpf: #3 DIV_MOD_KX jited:0 1103 PASS
ARMv5 kirkwood jit: test_bpf: #3 DIV_MOD_KX jited:1 311 PASS

Signed-off-by: Nicolas Schichan <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'asix-rx-mem-handling'

Mark Craske says:

====================
Improve ASIX RX memory allocation error handling

The ASIX RX handler algorithm is weak on error handling.
There is a design flaw in the ASIX RX handler algorithm because the
implementation for handling RX Ethernet frames for the DUB-E100 C1 can
have Ethernet frames spanning multiple URBs. This means that payload data
from more than 1 URB is sometimes needed to fill the socket buffer with a
complete Ethernet frame. When the URB with the start of an Ethernet frame
is received then an attempt is made to allocate a socket buffer. If the
memory allocation fails then the algorithm sets the buffer pointer member
to NULL and the function exits (no crash yet). Subsequently, the RX hander
is called again to process the next URB which assumes there is a socket
buffer available and the kernel crashes when there is no buffer.

This patchset implements an improvement to the RX handling algorithm to
avoid a crash when no memory is available for the socket buffer.

The patchset will apply cleanly to the net-next master branch but the
created kernel has not been tested. The driver was tested on ARM kernels
v3.8 and v3.14 for a commercial product.
====================

Signed-off-by: David S. Miller <[email protected]>

asix: Continue processing URB if no RX netdev buffer

Avoid a loss of synchronisation of the Ethernet Data header 32-bit
word due to a failure to get a netdev socket buffer.

The ASIX RX handling algorithm returned 0 upon a failure to get
an allocation of a netdev socket buffer. This causes the URB
processing to stop which potentially causes a loss of synchronisation
with the Ethernet Data header 32-bit word. Therefore, subsequent
processing of URBs may be rejected due to a loss of synchronisation.
This may cause additional good Ethernet frames to be discarded
along with outputting of synchronisation error messages.

Implement a solution which checks whether a netdev socket buffer
has been allocated before trying to copy the Ethernet frame into
the netdev socket buffer. But continue to process the URB so that
synchronisation is maintained. Therefore, only a single Ethernet
frame is discarded when no netdev socket buffer is available.

Signed-off-by: Dean Jenkins <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

asix: On RX avoid creating bad Ethernet frames

When RX Ethernet frames span multiple URB socket buffers,
the data stream may suffer a discontinuity which will cause
the current Ethernet frame in the netdev socket buffer
to be incomplete. This frame needs to be discarded instead
of appending unrelated data from the current URB socket buffer
to the Ethernet frame in the netdev socket buffer. This avoids
creating a corrupted Ethernet frame in the netdev socket buffer.

A discontinuity can occur when the previous URB socket buffer
held an incomplete Ethernet frame due to truncation or a
URB socket buffer containing the end of the Ethernet frame
was missing.

Therefore, add a sanity test for when an Ethernet frame
spans multiple URB socket buffers to check that the remaining
bytes of the currently received Ethernet frame point to
a good Data header 32-bit word of the next Ethernet
frame. Upon error, reset the remaining bytes variable to
zero and discard the current netdev socket buffer.
Assume that the Data header is located at the start of
the current socket buffer and attempt to process the next
Ethernet frame from there. This avoids unnecessarily
discarding a good URB socket buffer that contains a new
Ethernet frame.

Signed-off-by: Dean Jenkins <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

asix: Simplify asix_rx_fixup_internal() netdev alloc

The code is checking that the Ethernet frame will fit into a
netdev allocated socket buffer within the constraints of MTU size,
Ethernet header length plus VLAN header length.

The original code was checking rx->remaining each loop of the while
loop that processes multiple Ethernet frames per URB and/or Ethernet
frames that span across URBs. rx->remaining decreases per while loop
so there is no point in potentially checking multiple times that the
Ethernet frame (remaining part) will fit into the netdev socket buffer.

The modification checks that the size of the Ethernet frame will fit
the netdev socket buffer before allocating the netdev socket buffer.
This avoids grabbing memory and then deciding that the Ethernet frame
is too big and then freeing the memory.

Signed-off-by: Dean Jenkins <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

asix: Tidy-up 32-bit header word synchronisation

Tidy-up the Data header 32-bit word synchronisation logic in
asix_rx_fixup_internal() by removing redundant logic tests.

The code is looking at the following cases of the Data header
32-bit word that is present before each Ethernet frame:

a) all 32 bits of the Data header word are in the URB socket buffer
b) first 16 bits of the Data header word are at the end of the URB
socket buffer
c) last 16 bits of the Data header word are at the start of the URB
socket buffer eg. split_head = true

Note that the lifetime of rx->split_head exists outside of the
function call and is accessed per processing of each URB. Therefore,
split_head being true acts on the next URB to be processed.

To check for b) the offset will be 16 bits (2 bytes) from the end of
the buffer then indicate split_head is true.
To check for c) split_head must be true because the first 16 bits
have been found.
To check for a) else c)

Note that the || logic of the old code included the state
(skb->len - offset == sizeof(u16) && rx->split_head) which is not
possible because the split_head cannot be true whilst checking for b).
This is because the split_head indicates that the first 16 bits have
been found and that is not possible whilst checking for the first 16
bits. Therefore simplify the logic.

Signed-off-by: Dean Jenkins <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

asix: Rename remaining and size for clarity

The Data header synchronisation is easier to understand
if the variables "remaining" and "size" are renamed.

Therefore, the lifetime of the "remaining" variable exists
outside of asix_rx_fixup_internal() and is used to indicate
any remaining pending bytes of the Ethernet frame that need
to be obtained from the next socket buffer. This allows an
Ethernet frame to span across multiple socket buffers.

"size" is now local to asix_rx_fixup_internal() and contains
the size read from the Data header 32-bit word.

Add "copy_length" to hold the number of the Ethernet frame
bytes (maybe a part of a full frame) that are to be copied
out of the socket buffer.

Signed-off-by: Dean Jenkins <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bpf, seccomp: prepare for upcoming criu support

The current ongoing effort to dump existing cBPF seccomp filters back
to user space requires to hold the pre-transformed instructions like
we do in case of socket filters from sk_attach_filter() side, so they
can be reloaded in original form at a later point in time by utilities
such as criu.

To prepare for this, simply extend the bpf_prog_create_from_user()
API to hold a flag that tells whether we should store the original
or not. Also, fanout filters could make use of that in future for
things like diag. While fanout filters already use bpf_prog_destroy(),
move seccomp over to them as well to handle original programs when
present.

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Tycho Andersen <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Tested-by: Tycho Andersen <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vrf: fix a kernel warning

This fixes:

tried to remove device ip6gre0 from (null)
------------[ cut here ]------------
kernel BUG at net/core/dev.c:5219!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
CPU: 3 PID: 161 Comm: kworker/u8:2 Not tainted 4.3.0-rc2+ #1142
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: netns cleanup_net
task: ffff8800d784a9c0 ti: ffff8800d74a4000 task.ti: ffff8800d74a4000
RIP: 0010:[<ffffffff817f0797>]  [<ffffffff817f0797>] __netdev_adjacent_dev_remove+0x40/0xec
RSP: 0018:ffff8800d74a7a98  EFLAGS: 00010282
RAX: 000000000000002a RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88011adcf701 RSI: ffff88011adccbf8 RDI: ffff88011adccbf8
RBP: ffff8800d74a7ab8 R08: 0000000000000001 R09: 0000000000000000
R10: ffffffff81d190ff R11: 00000000ffffffff R12: ffff8800d599e7c0
R13: 0000000000000000 R14: ffff8800d599e890 R15: ffffffff82385e00
FS:  0000000000000000(0000) GS:ffff88011ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ffd6f003000 CR3: 000000000220c000 CR4: 00000000000006e0
Stack:
  0000000000000000 ffff8800d599e7c0 0000000000000b00 ffff8800d599e8a0
  ffff8800d74a7ad8 ffffffff817f0861 0000000000000000 ffff8800d599e7c0
  ffff8800d74a7af8 ffffffff817f088f 0000000000000000 ffff8800d599e7c0
Call Trace:
  [<ffffffff817f0861>] __netdev_adjacent_dev_unlink+0x1e/0x35
  [<ffffffff817f088f>] __netdev_adjacent_dev_unlink_neighbour+0x17/0x41
  [<ffffffff817f56e6>] netdev_upper_dev_unlink+0x6c/0x13d
  [<ffffffff81674a3d>] vrf_del_slave+0x26/0x7d
  [<ffffffff81674ac3>] vrf_device_event+0x2f/0x34
  [<ffffffff81098c40>] notifier_call_chain+0x75/0x9c
  [<ffffffff81098fa2>] raw_notifier_call_chain+0x14/0x16
  [<ffffffff817ee129>] call_netdevice_notifiers_info+0x52/0x59
  [<ffffffff817f179d>] call_netdevice_notifiers+0x13/0x15
  [<ffffffff817f6f18>] rollback_registered_many+0x14f/0x24f
  [<ffffffff817f70f2>] unregister_netdevice_many+0x19/0x64
  [<ffffffff819a2455>] ip6gre_exit_net+0x163/0x177
  [<ffffffff817eb019>] ops_exit_list+0x44/0x55
  [<ffffffff817ebcb7>] cleanup_net+0x193/0x226
  [<ffffffff81091e1c>] process_one_work+0x26c/0x4d8
  [<ffffffff81091d20>] ? process_one_work+0x170/0x4d8
  [<ffffffff81092296>] worker_thread+0x1df/0x2c2
  [<ffffffff810920b7>] ? process_scheduled_works+0x2f/0x2f
  [<ffffffff810920b7>] ? process_scheduled_works+0x2f/0x2f
  [<ffffffff81097a20>] kthread+0xd4/0xdc
  [<ffffffff810bc523>] ? trace_hardirqs_on_caller+0x17d/0x199
  [<ffffffff8109794c>] ? __kthread_parkme+0x83/0x83
  [<ffffffff81a5240f>] ret_from_fork+0x3f/0x70
  [<ffffffff8109794c>] ? __kthread_parkme+0x83/0x83

Fixes: 93a7e7e837af ("net: Remove the now unused vrf_ptr")
Cc: David Ahern <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>