David Ahern [Tue, 5 Jul 2016 01:47:41 +0000 (18:47 -0700)]
net: vrf: Add support for PREROUTING rules on vrf device
Add support for PREROUTING rules with skb->dev set to the vrf device.
INPUT rules are already allowed. Provides symmetry with the output path
which allows POSTROUTING rules.
Paul Gortmaker [Mon, 4 Jul 2016 21:50:58 +0000 (17:50 -0400)]
connector: make cn_proc explicitly non-modular
The Kconfig controlling build of this code is currently:
drivers/connector/Kconfig:config PROC_EVENTS
drivers/connector/Kconfig: bool "Report process events to userspace"
...meaning that it currently is not being built as a module by anyone.
Lets remove the two modular references, so that when reading the driver
there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.
net: hns: fix return value check in hns_dsaf_get_cfg()
In case of error, function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
This patchset enables IPv4 unicast routing in the Mellanox Spectrum ASIC
switch driver. This builds upon the work that was done by a couple of
previous patchsets.
Patches 1,2,6 add a couple of dependencies outside the driver. Namely, the
ability to propagate ndo_neigh_construct()/destroy() through stacked devices and
a notification whenever DELAY_PROBE_TIME changes. When propagated down, the
ndos allow drivers to add and remove neighbour entries from their private
neighbour table. The DELAY_PROBE_TIME notification gives drivers the ability to
correctly configure their polling interval for neighbour activity, so that
active neighbour won't be marked as STALE.
Patches 3-5,7-8 add the neighbour offloading infrastructure, where patch 7 uses
the DELAY_PROBE_TIME notification in order to correctly configure the device's
polling interval. Patch 8 finally programs neighbours to the device's table
based on NEIGH_UPDATE notifications, so that directly connected routes can
be used.
Patches 9-16 build upon the previous patches and extend the router with
remote routes (nexthop) support.
====================
Now, the driver sends arp probes for all unresolved neighbours that are
currently a nexthop for some route on the system. The job is set
periodically every 5 seconds.
mlxsw: spectrum_router: Add the nexthop neigh activity update
For nexthop neighbours we need to make kernel to think there is a traffic
flowing to them preventing it from going to stale state. Otherwise
kernel would stale it and eventually the neigh would be removed from HW
and nexthop as well. That would reduce ECMP group in HW.
Implement next-hop routing offload including ECMP. To make it possible,
introduce next-hop group entity. This entity keeps track of resolved
neighbours and updates HW adjacency table accordingly. Note that HW
next-hops are stored in this adjacency table, in form of MAC.
mlxsw: Introduce simplistic KVD linear area manager
This is a very simple manager for KVD linear area. Currently, the
allocator will either allocate a single entry from pre-defined sub-area,
or in case more than one entry is needed, it will allocate 32-entry chunk
in other pre-defined sub-area.
mlxsw: spectrum_router: Offload neighbours based on NUD state change
Listen to any NEIGH_UPDATE events sent and program the device
accordingly. If NUD state is VALID and neighbour isn't yet offloaded,
then program it into the device's table. Otherwise, just edit its
parameters.
If NUD state machine transitioned neighbour out of VALID state and it's
present in the device's table, then remove it.
Note that the device is programmed in delayed work, as the netevent
notification chain is atomic and prevents us from going to sleep.
mlxsw: spectrum_router: Periodically update the kernel's neigh table
As previously explained, the driver should periodically poll the device
for neighbours activity according to the configured DELAY_PROBE_TIME.
This will prevent active neighbours from staying in STALE state for long
periods of time.
During init configure the polling interval according to the
DELAY_PROBE_TIME used in the default table. In addition, register a
netevent notification block, so that the interval is updated whenever
DELAY_PROBE_TIME changes.
Using the computed interval schedule a delayed work, which will update
the kernel via neigh_event_send() on any active neighbour since the last
delayed work.
neigh: Send a notification when DELAY_PROBE_TIME changes
When the data plane is offloaded the traffic doesn't go through the
networking stack. Therefore, after first resolving a neighbour the NUD
state machine will transition it from REACHABLE to STALE until it's
finally deleted by the garbage collector.
To prevent such situations the offloading driver should notify the NUD
state machine on any neighbours that were recently used. The driver's
polling interval should be set so that the NUD state machine can
function as if the traffic wasn't offloaded.
Currently, there are no in-tree drivers that can report confirmation for
a neighbour, but only 'used' indication. Therefore, the polling interval
should be set according to DELAY_FIRST_PROBE_TIME, as a neighbour will
transition from REACHABLE state to DELAY (instead of STALE) if "a packet
was sent within the last DELAY_FIRST_PROBE_TIME seconds" (RFC 4861).
Send a netevent whenever the DELAY_FIRST_PROBE_TIME changes - either via
netlink or sysctl - so that offloading drivers can correctly set their
polling interval.
The RAUHT register is used to configure and query the Unicast Host Table
in devices that implement the Algorithmic LPM. In other words, it is
used to configure neighbour entries in the device.
We need to hold some private data for every neigh entry. It would be
possible to do it using neigh_priv_len/ndo_neigh_construct/
ndo_neigh_destroy however only for the port device itself. That would not
work for stacked devices like bridge/team/bond. So introduce a private
neigh table. Hook onto ndos neigh_construct/destroy and add/remove
table entry according to that.
net: introduce default neigh_construct/destroy ndo calls for L2 upper devices
L2 upper device needs to propagate neigh_construct/destroy calls down to
lower devices. Do this by defining default ndo functions and use them in
team, bond, bridge and vlan.
Instead of taking one interrupt per packet transmitted, re-use the same
NAPI context to free transmitted buffers. Since we are no longer in hard
IRQ context replace dev_kfree_skb_irq() by dev_kfree_skb().
Pad the SKB to the minimum length of ETH_ZLEN by using skb_put_padto()
and take this operation out of the critical section since there is no
need to check any HW resources before doing that.
net: r6040: Increase statistics upon transmit completion
r6040_xmit() is increasing transmit statistics during transmission while
this may still fail, do this in r6040_tx() where we complete transmitted
buffers instead.
This series adds Ethernet ethtool ntuple steering 'ethtool -N|U' and exposes two more
counter sets to Ethtool statistics, RDMA vport and global flow control statistics.
We start from three refactoring patches of the flow steering infrastructure
- mlx5_add_flow_rule will now receive mlx5 flow spec to simplify and reduce
number of parameters
- All low level steering objects are now wrapped in mlx5_flow_steering structure
for better encapsulation
- Flow steering object will now be removed properly and generically rather than
traversing on a well-known steering tree objects
Patch#4 adds the infrastructure and the data structures needed for the ethtool ntuple
steering, all implemented in a new file 'en_fs_ethtool.c'. Add the support for set_rxnfc
ethtool callback to add/remove/replace a flow spec of ethter type L2.
Patch#5 adds the support for L3/L4 flow specs and a higher priority in favor for L3/L4
rules when interleaving with L2 rules.
Patch#6 adds the support for get_rxnfc ethtool callback.
Patch#7,8 adds RDMA vport and global flow control statistics.
Gal Pressman [Mon, 4 Jul 2016 14:23:12 +0000 (17:23 +0300)]
net/mlx5e: Expose flow control counters to ethtool
Just like per prio counters, the global flow counters are queried from
per priority counters register.
Global flow control counters are stored in priority 0 PFC counters.
Maor Gottlieb [Mon, 4 Jul 2016 14:23:09 +0000 (17:23 +0300)]
net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering
Add support to add flow steering rules with ethtool
of L3/L4 flow types (ip4/tcp4/udp4).
Those rules will be in higher priority than l2 flow rules, in order
to prefer more specific rules.
Maor Gottlieb [Mon, 4 Jul 2016 14:23:08 +0000 (17:23 +0300)]
net/mlx5e: Add ethtool flow steering support
Implement etrhtool set_rxnfc callback to support ethtool flow spec
direct steering. This patch adds only the support of ether flow type
spec. L3/L4 flow specs support will be added in downstream patches.
Maor Gottlieb [Mon, 4 Jul 2016 14:23:07 +0000 (17:23 +0300)]
net/mlx5: Properly remove all steering objects
Instead of explicitly cleaning up the well known parts of the steering
tree, we use the generic tree structure to traverse for cleanup.
No functional changes.
Maor Gottlieb [Mon, 4 Jul 2016 14:23:06 +0000 (17:23 +0300)]
net/mlx5: Introduce mlx5_flow_steering structure
Instead of having all steering private name spaces and
steering module fields flat in mlx5_core_priv, we wrap
them in mlx5_flow_steering for better modularity and
API exposure.
Commit 8067302973a1 ("net-next: mediatek: add support for IRQ grouping")
adds handling for irq 1 and 2 to the uninit function but did not remove
irq 0 which is not used since irq grouping was introduced. Fix this by
removing the superfluous call to free_irq().
David S. Miller [Tue, 5 Jul 2016 06:33:59 +0000 (23:33 -0700)]
Merge tag 'batadv-next-for-davem-20160704' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature patchset includes the following changes:
- Cleanup work by Markus Pargmann and Sven Eckelmann (six patches)
- Initial Netlink support by Matthias Schiffer (two patches)
- Throughput Meter implementation by Antonio Quartulli, a kernel-space
traffic generator to estimate link speeds. This feature is useful on
low-end WiFi APs where running iperf or netperf from userspace
gives wrong results due to heavy userspace/kernelspace overhead.
(two patches)
- API clean-up work by Antonio Quartulli (one patch)
====================
qeth: delete napi struct when removing a qeth device
A qeth_card contains a napi_struct linked to the net_device during
device probing. This struct must be deleted when removing the qeth
device, otherwise Panic on oops can occur when qeth devices are
repeatedly removed and added.
David S. Miller [Tue, 5 Jul 2016 01:25:16 +0000 (18:25 -0700)]
Merge branch 'mlxsw-fib-offload'
Jiri Pirko says:
====================
mlxsw: Implement basic FIB offload and router interfaces
Introduce LPM trees management including virtual router management for HW.
Implement basic FIB offloading using switchdev FIB objects. For now only support
local routes and direct routes (next-hop support will be introduced in
a follow-up patchset).
Introduce router interfaces in patches 10-14.
====================
mlxsw: spectrum: Enable L3 interfaces on top of bridge devices
As with the previously introduced L3 interfaces, listen to 'inetaddr'
notifications sent for bridges devices configured on top of the port
netdevs and create / destroy router interfaces (RIFs) accordingly.
This also includes VLAN devices configured on top of the VLAN-aware
bridge.
The RIFs will be destroyed either when the last IP address is removed or
when the underlying FID is is destroyed.
mlxsw: spectrum: Configure FIDs based on bridge events
Before introducing support for L3 interfaces on top of the VLAN-aware
bridge we need to add some missing infrastructure.
Such an interface can either be the bridge device itself or a VLAN
device on top of it. In the first case the router interface (RIF) is
associated with FID 1, which is created whenever the first port netdev
joins the bridge. We currently assume the default PVID is 1 and that
it's already created, as it seems reasonable. This can be extended in
the future.
However, in the second case it's entirely possible we've yet to create a
matching FID. This can happen if the VLAN device was configured before
making any bridge port member in the VLAN.
Prevent such ordering problems by using the VLAN device's CHANGEUPPER
event to configure the FID. Make the VLAN device hold a reference to the
FID and prevent it from being destroyed even if none of the port netdevs
is using it.
Previous commit deprecated the vFIDs used to get traffic to the CPU
('port_vfids'). Thus, we now use the vFIDs as god intended and the
artificial split is no longer needed.
mlxsw: spectrum: Introduce support for router interfaces
Up until now we only supported bridged interfaces. Packets ingressing
through the switch ports were either classified to FIDs (in the case of
the VLAN-aware bridge) or vFIDs (in the case of VLAN-unaware bridges).
The packets were then forwarded according to the FDB. Routing was done
entirely in slowpath, by splitting the vFID range in two and using the
lower 0.5K vFIDs as dummy bridges that simply flooded all incoming
traffic to the CPU.
Instead, allow packets to be routed in the device by creating router
interfaces (RIFs) that will direct them to the router block.
Specifically, the RIFs introduced here are Sub-port RIFs used for VLAN
devices and port netdevs. Packets ingressing from the {Port / LAG ID, VID}
with which the RIF was programmed with will be assigned to a special
kind of FIDs called rFIDs and from there directed to the router.
Create a RIF whenever the first IPv4 address was programmed on a VLAN /
LAG / port netdev. Destroy it upon removal of the last IPv4 address.
Receive these notifications by registering for the 'inetaddr'
notification chain. A non-zero (10) priority is used for the
notification block, so that RIFs will be created before routes are
offloaded via FIB code.
Note that another trigger for RIF destruction are CHANGEUPPER
notifications causing the underlying FID's reference count to go down to
zero. This can happen, for example, when a VLAN netdev with an IP address
is put under bridge. While this configuration doesn't make sense it does
cause the device and the kernel to get out of sync when the netdev is
unbridged. We intend to address this in the future, hopefully in current
cycle.
Finally, Remove the lower 0.5K vFIDs, as they are deprecated by the RIFs,
which will trap packets according to their DIP.
mlxsw: spectrum: Edit RIF properties based on netdev events
We are just about to introduce router interfaces (RIFs), but before that
we need to be able update the device with the correct RIF attributes
whenever they change for the netdev the RIF is backing. Two such
attributes are MTU and MAC.
The MAC is used both to set the source MAC of packets egressing from the
RIF and also to program an FDB rule that will direct packets to the
router block.
Use the existing netdevice notification block and respond to CHANGEADDR
and CHANGEMTU accordingly. Store both attributes in the RIF struct
in case we need to revert to old attributes following a failed update.
mlxsw: spectrum: Add couple of lower device helper functions
Add functions that iterate over lower devices and find port device.
As a dependency add netdev_for_each_all_lower_dev and
netdev_for_each_all_lower_dev_rcu macro with
netdev_all_lower_get_next and netdev_all_lower_get_next_rcu shelpers.
Also, add functions to return mlxsw struct according to lower device
found and mlxsw_port struct with a reference to lower device.
Implement ipv4 FIB entries addition and removal. Initially, we support
local and broadcast routes using "ip2me" trap action.
Also, unicast routes without nexthop are supported using "local" action.
Virtual router is a construct used inside HW. In this implementation
we map kernel tables to virtual routers one to one. Introduce management
logic to create virtual routers when needed and destroy in case they are
no longer in use. According to that, call into LPM tree management.
Each virtual router is always bound to one LPM tree.
mlxsw: spectrum_router: Implement LPM trees management
Introduce basic LPM tree management allowing to share the trees in
between tables if the used prefixes in the tables are the same.
Build the tree structure according to the used prefixes. Although it is
not optimal for many use cases, this initial implementation does only
simple linear left-tree. More advanced structures will be introduced
later on, possibly including mechanisms to change trees on the fly.
Shadow FIB is needed in order to hold additional information for FIB
entries and keep track of used prefixes. That is needed for the LPM tree
construction to be introduced later on in this set.
Jason Wang [Mon, 4 Jul 2016 05:53:38 +0000 (13:53 +0800)]
tun: fix build warnings
Stephen Rothwell reports a build warnings(powerpc ppc64_defconfig)
drivers/net/tun.c: In function 'tun_do_read.part.5':
/home/sfr/next/next/drivers/net/tun.c:1491:6: warning: 'err' may be
used uninitialized in this function [-Wmaybe-uninitialized]
int err;
This is because tun_ring_recv() may return an uninitialized err, fix this.
David S. Miller [Mon, 4 Jul 2016 23:15:32 +0000 (16:15 -0700)]
Merge branch 'liquidio-next'
Raghu Vatsavayi says:
====================
liquidio updates and bug fixes
Following V2 patchset contains updates and bug fixes for
liquidio driver. This patchset also has changes that you
suggested in vxlan code. Please apply the patches in following
order as some of the patches depend on earlier patches in the
series.
====================
cdc_ncm: workaround for EM7455 "silent" data interface
Several Lenovo users have reported problems with their Sierra
Wireless EM7455 modem. The driver has loaded successfully and
the MBIM management channel has appeared to work, including
establishing a connection to the mobile network. But no frames
have been received over the data interface.
The problem affects all EM7455 and MC7455, and is assumed to
affect other modems based on the same Qualcomm chipset and
baseband firmware.
Testing narrowed the problem down to what seems to be a
firmware timing bug during initialization. Adding a short sleep
while probing is sufficient to make the problem disappear.
Experiments have shown that 1-2 ms is too little to have any
effect, while 10-20 ms is enough to reliably succeed.
Daniel Borkmann [Sat, 2 Jul 2016 23:28:47 +0000 (01:28 +0200)]
bpf: add bpf_get_hash_recalc helper
If skb_clear_hash() was invoked due to mangling of relevant headers and
BPF program needs skb->hash later on, we can add a helper to trigger hash
recalculation via bpf_get_hash_recalc().
The helper will return the newly retrieved hash directly, but later access
can also be done via skb context again through skb->hash directly (inline)
without needing to call the helper once more.
John Fastabend [Sat, 2 Jul 2016 21:13:13 +0000 (14:13 -0700)]
net: samples: pktgen mode samples/tests for qdisc layer
This adds samples for pktgen to use with new mode to inject pkts into
the qdisc layer. This also doubles as nice test cases to test any
patches against qdisc layer.
John Fastabend [Sat, 2 Jul 2016 21:12:54 +0000 (14:12 -0700)]
net: pktgen: support injecting packets for qdisc testing
Add another xmit_mode to pktgen to allow testing xmit functionality
of qdiscs. The new mode "queue_xmit" injects packets at
__dev_queue_xmit() so that qdisc is called.
Philippe Reynes [Sun, 3 Jul 2016 15:33:56 +0000 (17:33 +0200)]
net: ethernet: bcmgenet: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Philippe Reynes [Sat, 2 Jul 2016 18:06:51 +0000 (20:06 +0200)]
net: ethernet: arc: emac: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Philippe Reynes [Sat, 2 Jul 2016 12:06:14 +0000 (14:06 +0200)]
net: ethernet: ixp4xx_eth: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Philippe Reynes [Sat, 2 Jul 2016 23:14:20 +0000 (01:14 +0200)]
net: ethernet: smsc: smsc911x: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Philippe Reynes [Sat, 2 Jul 2016 22:05:04 +0000 (00:05 +0200)]
net: ethernet: lantiq_etop: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Philippe Reynes [Sat, 2 Jul 2016 21:36:59 +0000 (23:36 +0200)]
net: ethernet: cavium: octeon: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Silent a few smatch warnings about indentation.
This include the removal of a 'return' statement in 'resource_tracker.c'.
This 'return' will still be performed when breaking out of the
corresponding 'switch' block.
ALSA: timer: Fix negative queue usage by racy accesses
The user timer tu->qused counter may go to a negative value when
multiple concurrent reads are performed since both the check and the
decrement of tu->qused are done in two individual locked contexts.
This results in bogus read outs, and the endless loop in the
user-space side.
The fix is to move the decrement of the tu->qused counter into the
same spinlock context as the zero-check of the counter.
batman-adv: split routing API data structure in subobjects
The routing API data structure contains several function
pointers that can easily be grouped together based on the
component they work with.
Split the API in subobjects in order to improve definition readability.
At the same time, remove the "bat_" prefix from the API object and
its fields names. These are batman-adv private structs and there is no
need to always prepend such prefix, which only makes function invocations
much much longer.