Florian Fainelli [Thu, 10 May 2018 20:17:30 +0000 (13:17 -0700)]
net: phy: phylink: Release link GPIO
We are not releasing the link GPIO descriptor with gpiod_put() which results in
subsequent probing to get -EBUSY when calling fwnode_get_named_gpiod(). Fix this
by doing the release in phylink_destroy().
Fixes: 9525ae83959b ("phylink: add phylink infrastructure") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Florian Fainelli [Thu, 10 May 2018 20:17:29 +0000 (13:17 -0700)]
net: phy: phylink: Use gpiod_get_value_cansleep()
The GPIO provider for the link GPIO line might require the use of the
_cansleep() API, utilize that. This is safe to do since we run in workqueue
context.
Fixes: 9525ae83959b ("phylink: add phylink infrastructure") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
There was a regression at some point from the intended functionality of
commit f60c3704e87d ("bonding: Fix alb mode to only use first level
vlans.")
Given the return value vlan_get_encap_level() we need to store the nest
level of the bond device, and then compare the vlan's encap level to
this. Without this, this check always fails and learning packets are
never sent.
In addition, this same commit caused a regression in the behavior of
balance_alb, which requires learning packets be sent for all interfaces
using the slave's mac in order to load balance properly. For vlan's
that have not set a user mac, we can send after checking one bit.
Otherwise we need send the set mac, albeit defeating rx load balancing
for that vlan.
Make sure multicast, broadcast, and zero mac's cannot be the output of rlb
updates, which should all be directed arps. Receive load balancing will be
collapsed if any of these happen, as the switch will broadcast.
tracing: Fix regex_match_front() to not over compare the test string
The regex match function regex_match_front() in the tracing filter logic,
was fixed to test just the pattern length from testing the entire test
string. That is, it went from strncmp(str, r->pattern, len) to
strcmp(str, r->pattern, r->len).
The issue is that str is not guaranteed to be nul terminated, and if r->len
is greater than the length of str, it can access more memory than is
allocated.
The solution is to add a simple test if (len < r->len) return 0.
Jann Horn [Fri, 11 May 2018 00:19:01 +0000 (02:19 +0200)]
compat: fix 4-byte infoleak via uninitialized struct field
Commit 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to
native counterparts") removed the memset() in compat_get_timex(). Since
then, the compat adjtimex syscall can invoke do_adjtimex() with an
uninitialized ->tai.
If do_adjtimex() doesn't write to ->tai (e.g. because the arguments are
invalid), compat_put_timex() then copies the uninitialized ->tai field
to userspace.
Daniel Jurgens [Mon, 26 Mar 2018 18:35:29 +0000 (13:35 -0500)]
net/mlx5: Free IRQs in shutdown path
Some platforms require IRQs to be free'd in the shutdown path. Otherwise
they will fail to be reallocated after a kexec.
Fixes: 8812c24d28f4 ("net/mlx5: Add fast unload support in shutdown flow") Signed-off-by: Daniel Jurgens <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
David Howells [Thu, 10 May 2018 22:26:00 +0000 (23:26 +0100)]
rxrpc: Fix error reception on AF_INET6 sockets
AF_RXRPC tries to turn on IP_RECVERR and IP_MTU_DISCOVER on the UDP socket
it just opened for communications with the outside world, regardless of the
type of socket. Unfortunately, this doesn't work with an AF_INET6 socket.
Fix this by turning on IPV6_RECVERR and IPV6_MTU_DISCOVER instead if the
socket is of the AF_INET6 family.
Without this, kAFS server and address rotation doesn't work correctly
because the algorithm doesn't detect received network errors.
David Howells [Thu, 10 May 2018 22:26:00 +0000 (23:26 +0100)]
rxrpc: Fix missing start of call timeout
The expect_rx_by call timeout is supposed to be set when a call is started
to indicate that we need to receive a packet by that point. This is
currently put back every time we receive a packet, but it isn't started
when we first send a packet. Without this, the call may wait forever if
the server doesn't deign to reply.
Fix this by setting the timeout upon a successful UDP sendmsg call for the
first DATA packet. The timeout is initiated only for initial transmission
and not for subsequent retries as we don't want the retry mechanism to
extend the timeout indefinitely.
Fixes: a158bdd3247b ("rxrpc: Fix call timeouts") Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
Petr Machata [Thu, 10 May 2018 13:29:46 +0000 (15:29 +0200)]
rocker: Postpone filtering of !added_by_user FDB
Breaking out of the switch in rocker_switchdev_event() still ends up
scheduling work, except an ill-defined one. This leads to an OOPS cited
below. Fix by postponing the check until rocker_switchdev_event_work().
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Suggested-by: Vivien Didelot <[email protected]> Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
====================
mlxsw: Support VLAN devices in mirroring offloads
Petr says:
When offloading "tc action mirred mirror", there are several scenarios
where VLAN devices can show up, that mlxsw can offload on Spectrum
machines.
I) A direct mirror to a VLAN device on top of a front-panel port device
(commonly referred to as "RSPAN")
II) VLAN device in egress path of a packet when resolving a mirror to
gretap or ip6gretap netdevice.
Specifically in the latter case, the following are the cases that can be
offloaded:
IIa) VLAN device directly above a physical device.
IIb) A VLAN-unaware bridge where the egress device is as in IIa.
IIc) VLAN device on top of a VLAN-aware bridge where the egress device
is a physical device.
This patch set implements all the above cases.
First, in patch #1, br_vlan_get_info() is extended to allow bridge
master argument.
Case I is then implemented in patches #2 and #3, case II in patch #4.
Note that handling of VLAN protocol is not included. In case I, mirrored
packets may end up being double-tagged, and it might be reasonable for
the outer tag to be an 802.1ad. However, the protocol type configuration
would have to be put on the same VLAN netdevice that represents normal
VLAN traffic, and mlxsw currently ignores this setting in that case. Thus
this support was left out and the encapsulation always uses 802.1q
protocol.
====================
Petr Machata [Thu, 10 May 2018 10:13:06 +0000 (13:13 +0300)]
mlxsw: spectrum_span: Support VLAN under mirror-to-gretap
When mirroring to a gretap or ip6gretap device, allow the underlay
packet path to include VLAN devices. The following configurations are
supported in underlay:
- vlan over phys
- vlan-unaware bridge where the egress device is vlan over phys
- vlan over vlan-aware bridge where the egress device is phys
Petr Machata [Thu, 10 May 2018 10:13:05 +0000 (13:13 +0300)]
mlxsw: spectrum_span: Support mirror-to-VLAN
Offload "tc action mirred mirror" to a device that is a vlan device on
top of a front-panel port device. The hardware encapsulates the mirrored
packets in a VLAN tag. That includes the case that the mirrored traffic
is already VLAN-tagged--in that case the monitor traffic will be
double-tagged, just like in the software path.
Petr Machata [Thu, 10 May 2018 10:13:03 +0000 (13:13 +0300)]
net: bridge: Allow bridge master in br_vlan_get_info()
Mirroring offload in mlxsw needs to check that a given VLAN is allowed
to ingress the bridge device. br_vlan_get_info() is the function that is
used for this, however currently it only supports bridge port devices.
Extend it to support bridge masters as well.
Xin Long [Thu, 10 May 2018 09:34:13 +0000 (17:34 +0800)]
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
In Commit 1f45f78f8e51 ("sctp: allow GSO frags to access the chunk too"),
it held the chunk in sctp_ulpevent_make_rcvmsg to access it safely later
in recvmsg. However, it also added sctp_chunk_put in fail_mark err path,
which is only triggered before holding the chunk.
syzbot reported a use-after-free crash happened on this err path, where
it shouldn't call sctp_chunk_put.
Jon Maxwell [Thu, 10 May 2018 06:53:51 +0000 (16:53 +1000)]
tcp: Add mark for TIMEWAIT sockets
This version has some suggestions by Eric Dumazet:
- Use a local variable for the mark in IPv6 instead of ctl_sk to avoid SMP
races.
- Use the more elegant "IP4_REPLY_MARK(net, skb->mark) ?: sk->sk_mark"
statement.
- Factorize code as sk_fullsock() check is not necessary.
Aidan McGurn from Openwave Mobility systems reported the following bug:
"Marked routing is broken on customer deployment. Its effects are large
increase in Uplink retransmissions caused by the client never receiving
the final ACK to their FINACK - this ACK misses the mark and routes out
of the incorrect route."
Currently marks are added to sk_buffs for replies when the "fwmark_reflect"
sysctl is enabled. But not for TW sockets that had sk->sk_mark set via
setsockopt(SO_MARK..).
Fix this in IPv4/v6 by adding tw->tw_mark for TIME_WAIT sockets. Copy the the
original sk->sk_mark in __inet_twsk_hashdance() to the new tw->tw_mark location.
Then progate this so that the skb gets sent with the correct mark. Do the same
for resets. Give the "fwmark_reflect" sysctl precedence over sk->sk_mark so that
netfilter rules are still honored.
Joe Perches [Thu, 10 May 2018 06:24:07 +0000 (23:24 -0700)]
net: ipv4: remove define INET_CSK_DEBUG and unnecessary EXPORT_SYMBOL
INET_CSK_DEBUG is always set and only is used for 2 pr_debug calls.
EXPORT_SYMBOL(inet_csk_timer_bug_msg) is only used by these 2
pr_debug calls and is also unnecessary as the exported string can
be used directly by these calls.
The hyper-v transparent bonding should have used master_dev_link.
The netvsc device should look like a master bond device not
like the upper side of a tunnel.
This makes the semantics the same so that userspace applications
looking at network devices see the correct master relationshipship.
Fixes: 0c195567a8f6 ("netvsc: transparent VF management") Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Thu, 10 May 2018 21:34:50 +0000 (17:34 -0400)]
Merge tag 'mac80211-for-davem-2018-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
We only have a few fixes this time:
* WMM element validation
* SAE timeout
* add-BA timeout
* docbook parsing
* a few memory leaks in error paths
====================
Felix Manlunas [Wed, 9 May 2018 18:31:31 +0000 (11:31 -0700)]
liquidio: monitor all of Octeon's cores in watchdog thread
The liquidio_watchdog kernel thread is watching over only 12 cores of the
Octeon CN23XX; it's neglecting the other 4 cores that are present in the
CN2360. Fix it by defining LIO_MAX_CORES as 16.
David S. Miller [Thu, 10 May 2018 21:31:03 +0000 (17:31 -0400)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-05-09
This series contains updates to fm10k only.
Jake provides all the changes in the series, starting with adding
support for accelerated MACVLAN devices. Reduced code duplication by
implementing a macro to be used when setting up the type specific
macros. Avoided potential bugs with stats by using a macro to calculate
the array size when passing to ensure that the size is correct.
v2: changed macro reference '#' with __stringify() as suggested by
Joe Perches to patch 2 of the series. Also made sure the updated
series of patches is actually pushed to my kernel.org tree
====================
Eric Dumazet [Wed, 9 May 2018 17:05:46 +0000 (10:05 -0700)]
net/ipv6: fix lock imbalance in ip6_route_del()
WARNING: lock held when returning to user space!
4.17.0-rc3+ #37 Not tainted
syz-executor1/27662 is leaving the kernel with locks still held!
1 lock held by syz-executor1/27662:
#0: 00000000f661aee7 (rcu_read_lock){....}, at: ip6_route_del+0xea/0x13f0 net/ipv6/route.c:3206
BUG: scheduling while atomic: syz-executor1/27662/0x00000002
INFO: lockdep is turned off.
Modules linked in:
Kernel panic - not syncing: scheduling while atomic
Davide Caratti [Wed, 9 May 2018 16:45:42 +0000 (18:45 +0200)]
tc-testing: fix tdc tests for 'bpf' action
- correct a typo in the value of 'matchPattern' of test 282d, potentially
causing false negative
- allow errors when 'teardown' executes '$TC action flush action bpf' in
test 282d, to fix false positive when it is run with act_bpf unloaded
- correct the value of 'matchPattern' in test e939, causing false positive
in case the BPF JIT is enabled
Fixes: 440ea4ae1828 ("tc-testing: add selftests for 'bpf' action") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Lucas Bates <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yunsheng Lin [Wed, 9 May 2018 16:24:40 +0000 (17:24 +0100)]
net: hns3: fix for cleaning ring problem
The head or tail in hardware is not longer valid when resetting,
current hns3_clear_all_ring use them to clean the ring, which
will cause problem during resetting.
This patch fixes it by using next_to_use and next_to_clean in
the ring struct.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yunsheng Lin [Wed, 9 May 2018 16:24:39 +0000 (17:24 +0100)]
net: hns3: remove add/del_tunnel_udp in hns3_enet module
The add/del_tunnel_udp is not implemented in hclge_main moulde,
the NETIF_F_RX_UDP_TUNNEL_PORT feature bit is added automatically
by stack when ndo_udp_tunnel_add is not null in dev->netdev_ops.
This patch removes the add/del_tunnel_udp related function, for
we do not support this feature now.
Yunsheng Lin [Wed, 9 May 2018 16:24:38 +0000 (17:24 +0100)]
net: hns3: Fix for setting mac address when resetting
When hns3_init_mac_addr is called during reset process, it will
get the mac address from NCL_CONFIG and set it to hardware. If
user has changed the mac address, then the mac address set by
user is lost during resetting.
This patch fixes it by not getting the mac address from NCL_CONFIG
when resetting.
Fixes: 424eb834a9be ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: David S. Miller <[email protected]>
====================
net: dsa: mv88e6xxx: cleanup Global Control 2 register
The mv88e6xxx driver still writes arbitrary values in the Global 1
Control 2 register at setup, which layout differs a lot between chips.
This results in an inconsistent configuration, for example with the
Remote Management Unit (RMU).
The first patch adds an operation for the Cascade Port bits, the second
patch sets the device number in the device mapping function and the
third patch adds an operation to correctly disable the RMU.
====================
Vivien Didelot [Wed, 9 May 2018 15:38:51 +0000 (11:38 -0400)]
net: dsa: mv88e6xxx: add RMU disable op
The RMU mode bits moved a lot within the Global Control 2 register of
the Marvell switch families. Add an .rmu_disable op to support at least
3 known alternatives.
Vivien Didelot [Wed, 9 May 2018 15:38:50 +0000 (11:38 -0400)]
net: dsa: mv88e6xxx: set device number
All Marvell switches supported by mv88e6xxx have to set their device
number in the Global Control 2 register. Extract this in a read then
write function, called from the device mapping setup code.
Vivien Didelot [Wed, 9 May 2018 15:38:49 +0000 (11:38 -0400)]
net: dsa: mv88e6xxx: add a cascade port op
Only the 88E6185 family has bits 15:12 Cascade Port bits in the Global
Control 2 register. Hence inconsistent values are actually written in
this register for other families.
Add a .set_cascade_port operation to isolate the 88E6185 case, and call
it from the device mapping setup function.
Moshe Shemesh [Wed, 9 May 2018 15:35:13 +0000 (18:35 +0300)]
net/mlx4_en: Verify coalescing parameters are in range
Add check of coalescing parameters received through ethtool are within
range of values supported by the HW.
Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
users through ethtool. The ethtool support up to 32 bit value for each.
However, mlx4 modify cq limits the coalescing time parameter and
coalescing frames parameters to 16 bits.
Return out of range error if user tries to set these parameters to
higher values.
Change type of sample-interval and adaptive_rx_coal parameters in mlx4
driver to u32 as the ethtool holds them as u32 and these parameters are
not limited due to mlx4 HW.
Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC') Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Thu, 10 May 2018 20:08:26 +0000 (16:08 -0400)]
Merge branch 'mlx4-misc-next'
Tariq Toukan says:
====================
mlx4_core misc for 4.18
This patchset contains misc enhancements from the team
to the mlx4 Core driver.
Patch 1 by Eran adds driver version report in FW.
Patch 2 by Yishai implements suspend/resume PCI callbacks.
Patch 3 extends the range of an existing module param from boolean to numerical.
Series generated against net-next commit: 53a7bdfb2a27 dt-bindings: dsa: Remove unnecessary #address/#size-cells
====================
Tariq Toukan [Wed, 9 May 2018 15:29:04 +0000 (18:29 +0300)]
net/mlx4_core: Use msi_x module param to limit num of MSI-X irqs
Extend the boolean interpretation of msi_x module parameter
to numerical, as follows:
0 - Don't use MSI-X.
1 - Use MSI-X, driver decides the num of MSI-X irqs.
>=2 - Use MSI-X, limit number of MSI-X irqs to msi_x.
In SRIOV, this limits the number of MSI-X irqs per VF.
Paolo Abeni [Wed, 9 May 2018 10:42:34 +0000 (12:42 +0200)]
udp: fix SO_BINDTODEVICE
Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
In absence of VRF devices, after commit fb74c27735f0 ("net:
ipv4: add second dif to udp socket lookups") the dif mismatch
isn't fatal anymore for UDP socket lookup with non null
sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.
This changeset addresses the issue making the dif match mandatory
again in the above scenario.
Reported-by: Damir Mansurov <[email protected]> Fixes: fb74c27735f0 ("net: ipv4: add second dif to udp socket lookups") Fixes: 1801b570dd2a ("net: ipv6: add second dif to udp socket lookups") Signed-off-by: Paolo Abeni <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Hangbin Liu [Wed, 9 May 2018 10:06:44 +0000 (18:06 +0800)]
ipv4: reset fnhe_mtu_locked after cache route flushed
After route cache is flushed via ipv4_sysctl_rtcache_flush(), we forget
to reset fnhe_mtu_locked in rt_bind_exception(). When pmtu is updated
in __ip_rt_update_pmtu(), it will return directly since the pmtu is
still locked. e.g.
+ ip netns exec client ping 10.10.1.1 -c 1 -s 1400 -M do
PING 10.10.1.1 (10.10.1.1) 1400(1428) bytes of data.
>From 10.10.0.254 icmp_seq=1 Frag needed and DF set (mtu = 0)
Mohammed Gamal [Wed, 9 May 2018 08:17:34 +0000 (10:17 +0200)]
hv_netvsc: Fix net device attach on older Windows hosts
On older windows hosts the net_device instance is returned to
the caller of rndis_filter_device_add() without having the presence
bit set first. This would cause any subsequent calls to network device
operations (e.g. MTU change, channel change) to fail after the device
is detached once, returning -ENODEV.
Instead of returning the device instabce, we take the exit path where
we call netif_device_attach()
Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Signed-off-by: Mohammed Gamal <[email protected]> Reviewed-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
nfp: flower: remove headroom from max MTU calculation
Since commit 29a5dcae2790 ("nfp: flower: offload phys port MTU change") we
take encapsulation headroom into account when calculating the max allowed
MTU. This is unnecessary as the max MTU advertised by firmware should have
already accounted for encap headroom.
Subtracting headroom twice brings the max MTU below what's necessary for
some deployments.
Commit 161d82de1ff8 ("net: bridge: Notify about !added_by_user FDB
entries") causes the below oops when bringing up a slave interface,
because dsa_port_fdb_add is still scheduled, but with a NULL address.
To fix this, keep the dsa_slave_switchdev_event function agnostic of the
notified info structure and handle the added_by_user flag in the
specific dsa_slave_switchdev_event_work function.
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Signed-off-by: Vivien Didelot <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Jon Maloy [Wed, 9 May 2018 00:59:41 +0000 (02:59 +0200)]
tipc: clean up removal of binding table items
In commit be47e41d77fb ("tipc: fix use-after-free in tipc_nametbl_stop")
we fixed a problem caused by premature release of service range items.
That fix is correct, and solved the problem. However, it doesn't address
the root of the problem, which is that we don't lookup the tipc_service
-> service_range -> publication items in the correct hierarchical
order.
In this commit we try to make this right, and as a side effect obtain
some code simplification.
David S. Miller [Thu, 10 May 2018 19:22:36 +0000 (15:22 -0400)]
Merge branch 'qed-rdma-fixes'
Michal Kalderon says:
====================
qed*: Rdma fixes
This patch series include two fixes for bugs related to rdma.
The first has to do with loading the driver over an iWARP
device.
The second fixes a previous commit that added proper link
indication for iWARP / RoCE.
====================
Michal Kalderon [Tue, 8 May 2018 18:29:19 +0000 (21:29 +0300)]
qede: Fix gfp flags sent to rdma event node allocation
A previous commit 4609adc27175 ("qede: Fix qedr link update")
added a flow that could allocate rdma event objects from an
interrupt path (link notification). Therefore the kzalloc call
should be done with GFP_ATOMIC.
====================
net: Update static keys to modern api
The following patches update pretty much all core net static key users
to the modern api. Changes are mostly trivial conversion without affecting
any semantics. The motivation is a resend of patches 1 and 2 from a while[1]
back, and the rest are added patches, specific for -net.
Applies against today's linux-next. Compile tested only.
Linus Torvalds [Thu, 10 May 2018 18:42:01 +0000 (11:42 -0700)]
Merge tag 'for-4.17/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- a stable fix for DM integrity to use kvfree
- fix for a 4.17-rc1 change to dm-bufio's buffer alignment
- fixes for a few sparse warnings
- remove VLA usage in DM mirror target
- improve DM thinp Documentation for the "read_only" feature
* tag 'for-4.17/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: update Documentation to clarify when "read_only" is valid
dm mirror: remove VLA usage
dm: fix some sparse warnings and whitespace in dax methods
dm cache background tracker: fix sparse warning
dm bufio: fix buffer alignment
dm integrity: use kvfree for kvmalloc'd memory
Jose Abreu [Fri, 4 May 2018 09:01:38 +0000 (10:01 +0100)]
net: stmmac: Add support for U32 TC filter using Flexible RX Parser
This adds support for U32 filter by using an HW only feature called
Flexible RX Parser. This allow us to match any given packet field with a
pattern and accept/reject or even route the packet to a specific DMA
channel.
Right now we only support acception or rejection of frame and we only
support simple rules. Though, the Parser has the flexibility of jumping to
specific rules as an if condition so complex rules can be established.
This is only supported in GMAC5.10+.
The following commands can be used to test this code:
1) Setup an ingress qdisk:
# tc qdisc add dev eth0 handle ffff: ingress
2) Setup a filter (e.g. filter by IP):
# tc filter add dev eth0 parent ffff: protocol ip u32 match ip \
src 192.168.0.3 skip_sw action drop
In every tests performed we always used the "skip_sw" flag to make sure
only the RX Parser was involved.
Nisar Sayed [Wed, 2 May 2018 15:39:17 +0000 (21:09 +0530)]
microchip_t1: Add driver for Microchip LAN87XX T1 PHYs
Add driver for Microchip LAN87XX T1 PHYs
This patch support driver for Microchp T1 PHYs.
There will be followup patches to this driver to support T1 PHY
features such as cable diagnostics, signal quality indicator(SQI),
sleep and wakeup (TC10) support.
Lukas Wunner [Wed, 9 May 2018 12:43:43 +0000 (14:43 +0200)]
can: hi311x: Work around TX complete interrupt erratum
When sending packets as fast as possible using "cangen -g 0 -i -x", the
HI-3110 occasionally latches the interrupt pin high on completion of a
packet, but doesn't set the TXCPLT bit in the INTF register. The INTF
register contains 0x00 as if no interrupt has occurred. Even waiting
for a few milliseconds after the interrupt doesn't help.
Work around this apparent erratum by instead checking the TXMTY bit in
the STATF register ("TX FIFO empty"). We know that we've queued up a
packet for transmission if priv->tx_len is nonzero. If the TX FIFO is
empty, transmission of that packet must have completed.
Note that this is congruent with our handling of received packets, which
likewise gleans from the STATF register whether a packet is waiting in
the RX FIFO, instead of looking at the INTF register.
Lukas Wunner [Wed, 9 May 2018 12:38:43 +0000 (14:38 +0200)]
can: hi311x: Acquire SPI lock on ->do_get_berr_counter
hi3110_get_berr_counter() may run concurrently to the rest of the driver
but neglects to acquire the lock protecting access to the SPI device.
As a result, it and the rest of the driver may clobber each other's tx
and rx buffers.
We became aware of this issue because transmission of packets with
"cangen -g 0 -i -x" frequently hung. It turns out that agetty executes
->do_get_berr_counter every few seconds via the following call stack:
agetty listens to netlink messages in order to update the login prompt
when IP addresses change (if /etc/issue contains \4 or \6 escape codes):
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=e36deb6424e8
It's a useful feature, though it seems questionable that it causes CAN
bit error statistics to be queried.
Be that as it may, if hi3110_get_berr_counter() is invoked while a frame
is sent by hi3110_hw_tx(), bogus SPI transfers like the following may
occur:
=> 12 00 (hi3110_get_berr_counter() wanted to transmit
EC 00 to query the transmit error counter,
but the first byte was overwritten by
hi3110_hw_tx_frame())
=> EA 00 3E 80 01 FB (hi3110_hw_tx_frame() wanted to transmit a
frame, but the first byte was overwritten by
hi3110_get_berr_counter() because it wanted
to query the receive error counter)
This sequence hangs the transmission because the driver believes it has
sent a frame and waits for the interrupt signaling completion, but in
reality the chip has never sent away the frame since the commands it
received were malformed.
Fix by acquiring the SPI lock in hi3110_get_berr_counter().
I've scrutinized the entire driver for further unlocked SPI accesses but
found no others.
PCI / PM: Check device_may_wakeup() in pci_enable_wake()
Commit 0847684cfc5f0 (PCI / PM: Simplify device wakeup settings code)
went too far and dropped the device_may_wakeup() check from
pci_enable_wake() which causes wakeup to be enabled during system
suspend, hibernation or shutdown for some PCI devices that are not
allowed by user space to wake up the system from sleep (or power off).
As a result of this, excessive power is drawn by some of the affected
systems while in sleep states or off.
Restore the device_may_wakeup() check in pci_enable_wake(), but make
sure that the PCI bus type's runtime suspend callback will not call
device_may_wakeup() which is about system wakeup from sleep and not
about device wakeup from runtime suspend.
Ying Xue [Tue, 8 May 2018 13:44:06 +0000 (21:44 +0800)]
tipc: eliminate KMSAN uninit-value in strcmp complaint
When we get link properties through netlink interface with
tipc_nl_node_get_link(), we don't validate TIPC_NLA_LINK_NAME
attribute at all, instead we directly use it. As a consequence,
KMSAN detected the TIPC_NLA_LINK_NAME attribute was an uninitialized
value, and then posted the following complaint:
Sun Lianwen [Tue, 8 May 2018 01:49:38 +0000 (09:49 +0800)]
net/9p: correct some comment errors in 9p file system code
There are follow comment errors:
1 The function name is wrong in p9_release_pages() comment.
2 The function name and variable name is wrong in p9_poll_workfn() comment.
3 There is no variable dm_mr and lkey in struct p9_trans_rdma.
4 The function name is wrong in rdma_create_trans() comment.
5 There is no variable initialized in struct virtio_chan.
6 The variable name is wrong in p9_virtio_zc_request() comment.
David S. Miller [Thu, 10 May 2018 12:17:55 +0000 (08:17 -0400)]
Merge tag 'mlx5-updates-2018-05-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-05-07
mlx5 core driver misc cleanups and updates:
- fix spelling mistake: "modfiy" -> "modify"
- Cleanup unused field in Work Queue parameters
- dump_command mailbox length printed
- Refactor num of blocks in mailbox calculation
- Decrease level of prints about non-existent MKEY
- remove some extraneous spaces in indentations
====================
Ilya Dryomov [Fri, 4 May 2018 14:57:31 +0000 (16:57 +0200)]
ceph: fix iov_iter issues in ceph_direct_read_write()
dio_get_pagev_size() and dio_get_pages_alloc() introduced in commit b5b98989dc7e ("ceph: combine as many iovec as possile into one OSD
request") assume that the passed iov_iter is ITER_IOVEC. This isn't
the case with splice where it ends up poking into the guts of ITER_BVEC
or ITER_PIPE iterators, causing lockups and crashes easily reproduced
with generic/095.
Rather than trying to figure out gap alignment and stuff pages into
a page vector, add a helper for going from iov_iter to a bio_vec array
and make use of the new CEPH_OSD_DATA_TYPE_BVECS code.
Ilya Dryomov [Thu, 3 May 2018 14:10:09 +0000 (16:10 +0200)]
ceph: fix rsize/wsize capping in ceph_direct_read_write()
rsize/wsize cap should be applied before ceph_osdc_new_request() is
called. Otherwise, if the size is limited by the cap instead of the
stripe unit, ceph_osdc_new_request() would setup an extent op that is
bigger than what dio_get_pages_alloc() would pin and add to the page
vector, triggering asserts in the messenger.
Boris Brezillon [Fri, 4 May 2018 19:24:31 +0000 (21:24 +0200)]
mtd: rawnand: Make sure we wait tWB before polling the STATUS reg
NAND chips require a bit of time to take the NAND operation into
account and set the BUSY bit in the STATUS reg. Make sure we don't poll
the STATUS reg too early in nand_soft_waitrdy().
Dave Airlie [Thu, 10 May 2018 03:48:52 +0000 (13:48 +1000)]
Merge branch 'linux-4.17' of git://github.com/skeggsb/linux into drm-fixes
Two nouveau crasher/deadlock fixes.
* 'linux-4.17' of git://github.com/skeggsb/linux:
drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
drm/nouveau/ttm: don't dereference nvbo::cli, it can outlive client
Lyude Paul [Wed, 2 May 2018 23:38:48 +0000 (19:38 -0400)]
drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
Currently; we're grabbing all of the modesetting locks before adding MST
connectors to fbdev. This isn't actually necessary, and causes a
deadlock as well:
======================================================
WARNING: possible circular locking dependency detected
4.17.0-rc3Lyude-Test+ #1 Tainted: G O
------------------------------------------------------
kworker/1:0/18 is trying to acquire lock: 00000000c832f62d (&helper->lock){+.+.}, at: drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
but task is already holding lock: 00000000942e28e2 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8e/0x1c0 [drm]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
other info that might help us debug this:
Chain exists of:
&helper->lock --> crtc_ww_class_acquire --> crtc_ww_class_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(crtc_ww_class_mutex);
lock(crtc_ww_class_acquire);
lock(crtc_ww_class_mutex);
lock(&helper->lock);
Taking example from i915, the only time we need to hold any modesetting
locks is when changing the port on the mstc, and in that case we only
need to hold the connection mutex.
Dave Airlie [Thu, 10 May 2018 01:28:46 +0000 (11:28 +1000)]
Merge branch 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A little bigger than normal since this is two weeks of fixes.
- Atom firmware table updates for vega12
- Fix fallout from huge page support
- Fix up smu7 power profile interface to be consistent with vega
- Misc other fixes
* 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/pp: Refine the output of pp_power_profile_mode on VI
drm/amdgpu: Switch to interruptable wait to recover from ring hang.
drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pages
drm/amd/display: Use kvzalloc for potentially large allocations
drm/amd/display: Don't return ddc result and read_bytes in same return value
drm/amd/display: Add get_firmware_info_v3_2 for VG12
drm/amd: Add BIOS smu_info v3_3 required struct def.
drm/amd/display: Add VG12 ASIC IDs