Patrick McHardy [Fri, 19 Apr 2013 02:04:28 +0000 (02:04 +0000)]
net: vlan: prepare for 802.1ad VLAN filtering offload
Change the rx_{add,kill}_vid callbacks to take a protocol argument in
preparation of 802.1ad support. The protocol argument used so far is
always htons(ETH_P_8021Q).
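The callback shape described above can be pictured with a minimal user-space sketch. It is not the kernel's definition: `struct fake_vlan_ops`, `fake_rx_add_vid`, `fake_rx_kill_vid` and the byte-per-VID filter table are hypothetical stand-ins. Only the idea of passing the tag protocol in network byte order comes from the commit message.

```c
#include <arpa/inet.h>
#include <stdint.h>

#ifndef ETH_P_8021Q
#define ETH_P_8021Q  0x8100 /* 802.1Q C-tag EtherType */
#endif
#ifndef ETH_P_8021AD
#define ETH_P_8021AD 0x88A8 /* 802.1ad S-tag EtherType */
#endif

#define VLAN_N_VID 4096

/* Hypothetical ops table: both callbacks now receive the tag protocol. */
struct fake_vlan_ops {
	int (*rx_add_vid)(uint8_t *filter, uint16_t proto_be, uint16_t vid);
	int (*rx_kill_vid)(uint8_t *filter, uint16_t proto_be, uint16_t vid);
};

/* Hypothetical driver that can only filter C-tags in "hardware". */
static int fake_rx_add_vid(uint8_t *filter, uint16_t proto_be, uint16_t vid)
{
	if (proto_be != htons(ETH_P_8021Q) || vid >= VLAN_N_VID)
		return -1; /* unsupported tag protocol or VID out of range */
	filter[vid] = 1;
	return 0;
}

static int fake_rx_kill_vid(uint8_t *filter, uint16_t proto_be, uint16_t vid)
{
	if (proto_be != htons(ETH_P_8021Q) || vid >= VLAN_N_VID)
		return -1;
	filter[vid] = 0;
	return 0;
}

static const struct fake_vlan_ops fake_ops = {
	.rx_add_vid  = fake_rx_add_vid,
	.rx_kill_vid = fake_rx_kill_vid,
};
```

A caller that so far only ever passed the VID would now pass htons(ETH_P_8021Q) as well, and a driver without S-tag support can reject htons(ETH_P_8021AD) explicitly.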
Patrick McHardy [Fri, 19 Apr 2013 02:04:27 +0000 (02:04 +0000)]
net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*
Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
service provider tagging (STAGs) and require the distinction for hardware that
does not support accelerating both.
David S. Miller [Fri, 19 Apr 2013 18:19:07 +0000 (14:19 -0400)]
Merge branch 'intel'
Jeff Kirsher says:
====================
This series contains updates to ixgbe and igb.
The ixgbe changes contain 2 patches from the community, one of which is a
fix from akepner for an issue where netif_running() in shutdown was
not done under rtnl_lock. The other community fix from Joe Perches
cleans up #ifdef CONFIG_DEBUG_FS which is no longer necessary. The
last ixgbe patch, from Jacob Keller, adds support for WoL on 82599
SFP+ LOM.
The remaining patches are against igb, 10 of which were previously
submitted in a pull request where changes were requested.
The following igb patches:
igb: Support for 100base-fx SFP
igb: Support to read and export SFF-8472/8079 data
are v2 based on feedback from Dan Carpenter and Ben Hutchings in
the previous pull request.
The largest set of changes are in my patch to cleanup code comments
and whitespace to align the igb driver with the networking style of
code comments. While cleaning up the code comments, fixed several
other whitespace/checkpatch.pl code formatting issues.
Other notable igb patches make EEE capable devices query the PHY to
determine what the link partner is advertising, add support for
i354 devices and add support for spoofchk config.
====================
Jeff Kirsher [Sat, 23 Feb 2013 07:29:56 +0000 (07:29 +0000)]
igb: Fix code comments and whitespace
Aligns the multi-line code comments with the desired style for the
networking tree. Also cleaned up whitespace issues found during the
cleanup of code comments (i.e. remove unnecessary blank lines,
use tabs where possible, properly wrap lines and keep strings on a
single line)
Alexander Duyck [Tue, 12 Feb 2013 02:31:01 +0000 (02:31 +0000)]
igb: Use rx/tx_itr_setting when setting up initial value of itr
It turns out that the InterruptThrottleRate module parameter was only
having the effect of locking the ITR at the starting ITR value. This was
because the values stored in rx_itr_setting and tx_itr_setting were being
ignored when configuring the initial itr_val of the q_vector.
Alexander Duyck [Thu, 7 Feb 2013 08:55:46 +0000 (08:55 +0000)]
igb: Pull adapter out of main path in igb_xmit_frame_ring
We only need the adapter pointer in the case of ptp. As such we can pull the
adapter out of the main path and place it inside the if statement to avoid
the temptation of accessing the adapter pointer in the fast path.
Alexander Duyck [Fri, 1 Feb 2013 08:56:47 +0000 (08:56 +0000)]
igb: Mask off check of frag_off as we only want fragment offset
We were incorrectly checking the entire frag_off field when we only wanted the
fragment offset. As a result we were not pulling in TCP headers when the DNF
flag was set.
To correct that we will now check for frag off using the IP_OFFSET mask.
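The difference between the two checks can be shown with a small user-space sketch. The helper names `is_fragment_buggy` and `is_fragment_fixed` are hypothetical; the flag and mask values follow the IPv4 header layout (3 flag bits followed by a 13-bit fragment offset).

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define IP_DF     0x4000 /* "don't fragment" flag */
#define IP_MF     0x2000 /* "more fragments" flag */
#define IP_OFFSET 0x1FFF /* 13-bit fragment offset */

/* Old check: any bit set in frag_off, including DF, looks like a fragment. */
static bool is_fragment_buggy(uint16_t frag_off_be)
{
	return frag_off_be != 0;
}

/* Fixed check: only a non-zero fragment offset means "not the first fragment". */
static bool is_fragment_fixed(uint16_t frag_off_be)
{
	return (frag_off_be & htons(IP_OFFSET)) != 0;
}
```

With the old check an unfragmented packet carrying only the don't-fragment flag is treated like a later fragment, so its TCP header is not pulled in; with the mask applied, only packets with a real offset are skipped.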
akepner [Wed, 13 Mar 2013 14:54:58 +0000 (14:54 +0000)]
ixgbe: in shutdown, do netif_running() under rtnl_lock
During shutdown it's possible for __dev_close() (which holds
rtnl_lock) to clear the __LINK_STATE_START bit, and for ixgbe
to then read that bit (without holding rtnl_lock), and then
fail to free irqs, etc. The result is a crash like this:
David S. Miller [Thu, 18 Apr 2013 19:00:59 +0000 (15:00 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to ixgbe only.
v2- Dropped the following 2 patches from the series:
ixgbe: Support using build_skb in the case that jumbo frames are disabled
ixgbe: walk pci-e bus to find minimum width
Ben Hutchings found a bug with Alex's patch, so that patch was dropped
permanently. Jacob's "walk PCIe bus" patch is being re-worked for
a more generic solution so that other drivers can benefit.
In the remaining patches...
Alex provides a fix where we were incorrectly checking the entire frag_off
field when we only wanted the fragment offset. Alex also cleans up
the check for PAGE_SIZE, since the default configuration allocates 32K
for all buffers.
Emil provides a change to the calculation of eerd so that it is consistent
between the read and write functions by using | instead of +.
Jacob adds support for displaying PCIe Gen3 link speed, which was
previously missing from the ixgbe driver. He also provides a patch
to clean up ixgbe_get_bus_info_generic to call some conversion
functions, which are used also in another patch provided by Jacob.
Jacob modifies the driver to enable certain devices (which have an
internal switch) to read from the physical slot rather than reading
data from the internal switch.
Don provides a couple of fixes (which are more appropriate for net-next),
one of which resolves an issue where ixgbe was only turning on the laser
when the adapter was up which caused issues for those who wanted to
access the MNG firmware while the port was in a down state. The other
fix is for WoL when currently linked at 1G. Lastly Don bumps the driver
version to keep the in-kernel driver up to date with the current functionality.
====================
Don Skidmore [Thu, 28 Feb 2013 08:08:44 +0000 (08:08 +0000)]
ixgbe: Fix 1G link WoL
We reset during the shutdown path which will reset AUTOC register. This
would change LMS to 10G. If we were currently linked at 1G we will lose
link, which is a bad thing if we wanted WoL to work. For the fix I needed
to know if WoL is supported so I created a new bool in the ixgbe_hw struct.
If this is set we will not allow the reset to change the current LMS value
in AUTOC.
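The effect of the new flag can be pictured with a small user-space sketch. The field position, the mask value and the helper name `autoc_after_reset` are assumptions made for illustration and are not taken from the driver; the only behaviour drawn from the message above is that the LMS field is carried over the reset when WoL is supported.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the LMS field inside AUTOC (3 bits starting at bit 13). */
#define AUTOC_LMS_SHIFT 13
#define AUTOC_LMS_MASK  (UINT32_C(0x7) << AUTOC_LMS_SHIFT)

/*
 * Return the AUTOC value to use after a reset.
 * autoc_before: value read before the reset
 * autoc_reset:  value the reset would leave behind
 */
static uint32_t autoc_after_reset(uint32_t autoc_before, uint32_t autoc_reset,
				  bool wol_supported)
{
	if (!wol_supported)
		return autoc_reset;

	/* Keep every bit from the reset value except LMS, which is restored. */
	return (autoc_reset & ~AUTOC_LMS_MASK) | (autoc_before & AUTOC_LMS_MASK);
}
```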
Don Skidmore [Thu, 21 Feb 2013 03:00:04 +0000 (03:00 +0000)]
ixgbe: fix MNG FW support when adapter not up
We were only turning the laser on when the adapter was up. This
causes issues for those who wanted to access the MNG FW while the
port was in a down state. This patch makes sure the laser is turned
on in probe and remains on even after the port is brought down.
Jacob Keller [Tue, 9 Apr 2013 07:20:09 +0000 (07:20 +0000)]
ixgbe: enable devices with internal switch to read pci parent
This patch modifies the driver to enable certain devices, which have an internal
switch, to read data from the physical slot rather than reading data from the
internal switch. The internal switch will always report the same PCI width and
speed, which is not useful compared to knowing the width and speed of the slot
the physical card is plugged into.
Jacob Keller [Fri, 15 Feb 2013 09:18:15 +0000 (09:18 +0000)]
ixgbe: create conversion functions from link_status to bus/speed
This patch cleans up ixgbe_get_bus_info_generic to call some conversion
functions, which are used also in a follow on patch that needs to convert
between the link_status PCIe config values into ixgbe's internal enum
representations.
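A rough sketch of what such conversion helpers might look like in user space follows. The enum and function names are hypothetical and the driver's own names may differ; the field layout (current link speed in bits 3:0, negotiated link width in bits 9:4) is that of the PCIe Link Status register.

```c
#include <stdint.h>

#define PCI_EXP_LNKSTA_CLS       0x000f /* current link speed field */
#define PCI_EXP_LNKSTA_NLW       0x03f0 /* negotiated link width field */
#define PCI_EXP_LNKSTA_NLW_SHIFT 4

/* Hypothetical internal representation, in MT/s. */
enum bus_speed {
	BUS_SPEED_UNKNOWN = 0,
	BUS_SPEED_2500    = 2500, /* Gen1 */
	BUS_SPEED_5000    = 5000, /* Gen2 */
	BUS_SPEED_8000    = 8000, /* Gen3 */
};

static enum bus_speed convert_bus_speed(uint16_t link_status)
{
	switch (link_status & PCI_EXP_LNKSTA_CLS) {
	case 1:  return BUS_SPEED_2500;
	case 2:  return BUS_SPEED_5000;
	case 3:  return BUS_SPEED_8000;
	default: return BUS_SPEED_UNKNOWN;
	}
}

/* Returns the lane count (1, 2, 4 or 8), or 0 if the value is not recognised. */
static unsigned int convert_bus_width(uint16_t link_status)
{
	unsigned int width = (link_status & PCI_EXP_LNKSTA_NLW) >>
			     PCI_EXP_LNKSTA_NLW_SHIFT;

	switch (width) {
	case 1: case 2: case 4: case 8:
		return width;
	default:
		return 0;
	}
}
```

Keeping the decoding in two small helpers lets any caller that has a raw link status word, whether read from the device itself or from a parent slot, reuse the same mapping.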
Alexander Duyck [Sat, 9 Feb 2013 01:19:55 +0000 (01:19 +0000)]
ixgbe: Drop check for PAGE_SIZE from ixgbe_xmit_frame_ring
The check for PAGE_SIZE is pointless now that the default configuration is to
allocate 32K for all buffers. Since the Tx descriptor limit is 16K we can
just drop the check and always compare the descriptors to the maximum size
supported.
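The arithmetic behind "compare the descriptors to the maximum size supported" can be sketched as a simple round-up division. The macro and function names below are illustrative stand-ins, assuming a 16K per-descriptor limit as stated in the message.

```c
#include <stddef.h>

/* Assumed per-descriptor data limit: 16K, as stated in the commit message. */
#define MAX_DATA_PER_TXD ((size_t)1 << 14)

/* Number of Tx descriptors needed for a buffer of len bytes (round up). */
static unsigned int txd_use_count(size_t len)
{
	return (unsigned int)((len + MAX_DATA_PER_TXD - 1) / MAX_DATA_PER_TXD);
}
```

Under this sketch a 32K buffer always needs two descriptors, so there is nothing left for a PAGE_SIZE comparison to decide.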
Alexander Duyck [Fri, 1 Feb 2013 08:56:41 +0000 (08:56 +0000)]
ixgbe: Mask off check of frag_off as we only want fragment offset
We were incorrectly checking the entire frag_off field when we only wanted the
fragment offset. As a result we were not pulling in TCP headers when the DNF
flag was set.
To correct that we will now check for frag off using the IP_OFFSET mask.
David S. Miller [Wed, 17 Apr 2013 18:18:43 +0000 (14:18 -0400)]
Merge branch 'tipc-ipoib'
Patrick McHardy says:
====================
The following patchset adds support for running TIPC over InfiniBand.
The patchset consists of three parts (+ a minor fix for the ethernet media
type):
- Preparation: removal of the unused str2addr callback and move of the
bcast_addr from struct tipc_media to struct tipc_bearer. This is necessary
because InfiniBand doesn't have a fixed broadcast address like ethernet,
so it needs to be initialized with the device's broadcast address when
the bearer is enabled
- Introduction of a TIPC InfiniBand media type. A new media type is needed
to deal with the different address sizes
- Support for ETH_P_TIPC in IPoIB
Since the last posting I've addressed all feedback I received and rebased
to the current net-next tree.
I consider these patches ready for merging. Since they mainly affect TIPC
code, I'd propose to have them either go through the TIPC tree or through
Dave directly (not sure how TIPC patches are managed).
====================
Patrick McHardy [Wed, 17 Apr 2013 06:18:29 +0000 (06:18 +0000)]
IPoIB: add support for TIPC protocol
Support TIPC in the IPoIB driver. Since IPoIB now keeps track of its own
neighbour entries and doesn't require the packet to have a dst_entry
anymore, the only necessary changes are to:
- not drop multicast TIPC packets because of the unknown ethernet type
- handle unicast TIPC packets similar to IPv4/IPv6 unicast packets
in ipoib_start_xmit().
An alternative would be to remove all ethertype limitations since they're
not necessary anymore, all TIPC needs to know about is ARP and RARP since
it wants to always perform "path find", even if a path is already known.
Patrick McHardy [Wed, 17 Apr 2013 06:18:28 +0000 (06:18 +0000)]
tipc: add InfiniBand media type
Add InfiniBand media type based on the ethernet media type.
The only real difference is that in case of InfiniBand, we need the entire
20 bytes of space reserved for media addresses, so the TIPC media type ID is
not explicitly stored in the packet payload.
Patrick McHardy [Wed, 17 Apr 2013 06:18:26 +0000 (06:18 +0000)]
tipc: move bcast_addr from struct tipc_media to struct tipc_bearer
Some network protocols, like InfiniBand, don't have a fixed broadcast
address but one that depends on the configuration. Move the bcast_addr
to struct tipc_bearer and initialize it with the broadcast address of
the network device when the bearer is enabled.
Daniel Borkmann [Tue, 16 Apr 2013 11:07:17 +0000 (11:07 +0000)]
net: sctp: sctp_ulpq: remove 'malloced' struct member
The structure sctp_ulpq is embedded into sctp_association and never
separately allocated, also ulpq->malloced is always 0, so that
kfree() is never called. Therefore, remove this code.
Daniel Borkmann [Tue, 16 Apr 2013 11:07:16 +0000 (11:07 +0000)]
net: sctp: sctp_bind_addr: remove dead code
The sctp_bind_addr structure has a 'malloced' member that is
always set to 0, thus in sctp_bind_addr_free() the kfree()
part can never be called. This part is embedded into
sctp_ep_common anyway and never alloced.
Daniel Borkmann [Tue, 16 Apr 2013 11:07:12 +0000 (11:07 +0000)]
net: sctp: sctp_outq: remove 'malloced' from its struct
sctp_outq is embedded into sctp_association, and thus never
kmalloced in any way. Also, malloced is always 0, thus kfree()
is never called. Therefore, remove that dead piece of code.
Daniel Borkmann [Tue, 16 Apr 2013 11:07:11 +0000 (11:07 +0000)]
net: sctp: sctp_inq: remove dead code
sctp_inq is never kmalloced, since it's integrated into sctp_ep_common
and only initialized from eps and assocs. Therefore, remove the dead
code from there.
Daniel Borkmann [Tue, 16 Apr 2013 11:07:10 +0000 (11:07 +0000)]
net: sctp: sctp_ssnmap: remove 'malloced' element from struct
sctp_ssnmap_init() can only be called from sctp_ssnmap_new()
where malloced is always set to 1. Thus, when we call
sctp_ssnmap_free() the test for map->malloced evaluates always
to true.
David S. Miller [Wed, 17 Apr 2013 17:30:32 +0000 (13:30 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:
====================
A number of improvements for net-next/3.10.
Highlights include:
* Properly exposing linux/openvswitch.h to userspace after the uapi
changes.
* Simplification of locking. It immediately makes things simpler to
reason about and avoids holding RTNL mutex for longer than
necessary. In the near future it will also enable tunnel
registration and more fine-grained locking.
* Miscellaneous cleanups and simplifications.
====================
commit 7b7a2bbb690 (atl1: Remove unneeded PM_OPS definitions) removed the
definition of atl1_suspend for the !CONFIG_PM_SLEEP case.
So only call atl1_suspend() when CONFIG_PM_SLEEP is defined and fix the
following build error from randconfig:
drivers/net/ethernet/atheros/atlx/atl1.c: In function 'atl1_shutdown':
drivers/net/ethernet/atheros/atlx/atl1.c:2888:2: error: implicit declaration of function 'atl1_suspend' [-Werror=implicit-function-declaration]
David S. Miller [Tue, 16 Apr 2013 20:43:39 +0000 (16:43 -0400)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
Marc Kleine-Budde says:
====================
this is a pull-request for net-next/master. It consists of a patch by
Oliver Hartkopp. In this patch he cleans up the sja1000 header file by
using a common prefix for all sja1000 defines.
====================
pch_gbe: minor: report the actual error on MTU change
If we can't _up() after changing the MTU, report the actual error instead
of -ENOMEM. It can be really misleading because pch_gbe is usually used in
scenarios where the amount of memory is really small, so -ENOMEM hides the
real cause.
vxlan: Allow setting destination to unicast address.
This patch allows setting VXLAN destination to unicast address.
It allows that VXLAN can be used as peer-to-peer tunnel without
multicast.
v4: generalize struct vxlan_dev, "gaddr" is replaced with vxlan_rdst.
"GROUP" attribute is replaced with "REMOTE".
These changes are based on David Stevens's comments.
v3: move the new attribute REMOTE to the end of the enum list,
based on Stephen Hemminger's comments.
v2: use a new attribute REMOTE instead of GROUP, based on
Cong Wang's comments.
Daniel Borkmann [Tue, 16 Apr 2013 01:57:46 +0000 (01:57 +0000)]
packet: minor: add generic tpacket_uhdr to access packet headers
There is no need to add a dozen unions each time at the start
of the function. So, do this once and use it instead. Thus, we
can remove some duplicate code and make it more readable.
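The idea of one shared union in place of per-function unions can be sketched in user space as follows. The header structs here are simplified, hypothetical stand-ins for the real tpacket headers, and `frame_len` is an illustrative helper, not a kernel function.

```c
#include <stdint.h>

/* Simplified stand-ins for two header versions with different layouts. */
struct hdr_v1 {
	unsigned long tp_status;
	unsigned int  tp_len;
};

struct hdr_v2 {
	uint32_t tp_status;
	uint32_t tp_len;
};

/* Declared once and reused by every function that touches a frame. */
union frame_uhdr {
	struct hdr_v1 *h1;
	struct hdr_v2 *h2;
	void          *raw;
};

enum frame_version { FRAME_V1, FRAME_V2 };

/* Read the length field through whichever view matches the version. */
static uint32_t frame_len(void *frame, enum frame_version version)
{
	union frame_uhdr h;

	h.raw = frame;
	switch (version) {
	case FRAME_V1:
		return h.h1->tp_len;
	case FRAME_V2:
		return h.h2->tp_len;
	}
	return 0;
}
```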
Dilip Daya [Tue, 16 Apr 2013 01:39:07 +0000 (01:39 +0000)]
sctp: Add buffer utilization fields to /proc/net/sctp/assocs
This patch adds the following fields to /proc/net/sctp/assocs output:
- sk->sk_wmem_alloc as "wmema" (transmit queue bytes committed)
- sk->sk_wmem_queued as "wmemq" (persistent queue size)
- sk->sk_sndbuf as "sndbuf" (size of send buffer in bytes)
- sk->sk_rcvbuf as "rcvbuf" (size of receive buffer in bytes)
When small DATA chunks containing 136 bytes data are sent the TX_QUEUE
(assoc->sndbuf_used) reaches a maximum of 40.9% of sk_sndbuf value when
peer.rwnd = 0. This was diagnosed from sk_wmem_alloc value reaching maximum
value of sk_sndbuf.
TX_QUEUE (assoc->sndbuf_used), sk_wmem_alloc and sk_wmem_queued values are
incremented in sctp_set_owner_w() for outgoing data chunks. Having access to
the above values in /proc/net/sctp/assocs will provide a better understanding
of SCTP buffer management.
With patch applied, example output when peer.rwnd = 0
The work item is scheduled from the interrupt handler but is not
cancelled when the driver is unloaded, so it is never removed
from the global workqueue. Call cancel_work_sync() when the
driver is removed (rmmod'ed).
at86rf230: change irq handling to prevent lockups with edge type irq
Implemented separate irq handling for edge and level type interrupt
configuration. For edge type interrupts calls to disable_irq_nosync()
and enable_irq() are removed. The at86rf230 resets the irq line only
after the irq status register is read. Disabling the irq can lock the
driver in situations where an irq is set by the radio while the driver
is still reading the frame buffer.
With irq_type configuration set to 0 the original behavior is
preserved.
Additionally, the irq filter register is set to filter out all unused
interrupts and the irq status register is read in the probe
function to clear the irq line.
Signed-off-by: Sascha Herrmann <[email protected]>
Conflicts:
drivers/net/ieee802154/at86rf230.c
Signed-off-by: David S. Miller <[email protected]>
Add option to at86rf230 platform data to configure the type of the
interrupt used by the driver. The irq polarity of the device will
be configured accordingly.
SIMPLE_DEV_PM_OPS macro can handle !CONFIG_PM_SLEEP case nicely, so there is no
need to define PM_OPS for both CONFIG_PM_SLEEP and !CONFIG_PM_SLEEP cases.
That patch fixed a define conflict between the SH architecture and the sja1000
driver, by adding a prefix to one macro only. This patch consistently renames
the prefix of the SJA1000 controller registers from "REG_" to "SJA1000_".
Currently OVS uses a combination of the genl and rtnl locks to protect
datapath state. This was done because of networking stack locking,
but it has complicated locking and there are a few lock ordering
issues with new tunneling protocols.
The following patch simplifies locking by introducing a new ovs mutex,
which is now used to protect the entire ovs state.
David S. Miller [Mon, 15 Apr 2013 20:10:53 +0000 (16:10 -0400)]
Merge branch 'sync_multiple'
Vlad Yasevich says:
====================
The current dev_[uc|mc]_sync() API correctly syncs the
addresses to the first device only. Any subsequent calls to sync will
not do anything since the synched variable will be set. This
variable is used as an optimization to skip over addresses that have
already been synched.
There are some devices (ex: team) that attempt to sync to several
devices. There is other work in progress that needs this to work correctly.
The short series introduces dev_[uc|mc]_sync_multiple(), which
allows multiple calls to sync to multiple different devices. The original
API is left alone and still has the limitation.
====================
team: Use new sync_multiple api to sync devices addresses.
The team driver attempts to sync addresses to each of the port
devices; however, the current api doesn't really perform the sync
for any device after the first one. Switch to using the new api
that will actually sync the addresses to all ports.
net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
The current implementation of dev_uc_sync/unsync() assumes that there is
a strict 1-to-1 relationship between the source and destination of the sync.
In other words, once an address has been synced to a destination device, it
will not be synced to any other device through the sync API.
However, there are some virtual devices that aggregate a number of lower
devices and need to sync addresses to all of them. The current
API falls short there.
This patch introduces a new dev_uc_sync_multiple() api that can be called
in the above circumstances and allows sync to work for every invocation.
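The limitation and the fix can be illustrated with a deliberately simplified user-space model. Everything here is hypothetical (the structs, the `synced` flag, the `sync_cnt` counter and both function names); it only mirrors the behaviour described above, where the original sync marks an address as done after the first destination while the new variant works for every invocation.

```c
#include <stdbool.h>

#define MAX_ADDRS 4

/* One address on the upper (aggregating) device. */
struct upper_addr {
	bool synced;   /* model of the 1-to-1 "already synced" marker */
	int  sync_cnt; /* model of a per-address count of destinations */
};

struct upper_dev {
	struct upper_addr addr[MAX_ADDRS];
	int count;
};

/* A lower port; has[i] is true once address i was pushed to it. */
struct port_dev {
	bool has[MAX_ADDRS];
};

/* Model of the original API: an address is synced to one destination only. */
static void model_sync(struct upper_dev *from, struct port_dev *to)
{
	for (int i = 0; i < from->count; i++) {
		if (from->addr[i].synced)
			continue; /* skipped for every port after the first */
		to->has[i] = true;
		from->addr[i].synced = true;
	}
}

/* Model of the new API: every destination receives every address. */
static void model_sync_multiple(struct upper_dev *from, struct port_dev *to)
{
	for (int i = 0; i < from->count; i++) {
		if (to->has[i])
			continue; /* already present on this particular port */
		to->has[i] = true;
		from->addr[i].sync_cnt++;
	}
}
```

In this model, calling model_sync() for two ports leaves the second port empty, whereas model_sync_multiple() populates both.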
Daniel Borkmann [Mon, 15 Apr 2013 03:27:17 +0000 (03:27 +0000)]
net: sctp: remove sctp_ep_common struct member 'malloced'
There is actually no need to keep this member in the structure, because
after init it's always 1 anyway, thus kfree is always called. This seems to
be an ancient leftover from the very initial implementation from 2.5
times. Only in case the initialization of an association fails, we leave
base.malloced as 0, but we nevertheless kfree it in the error path in
sctp_association_new().
Mike Rapoport [Sat, 13 Apr 2013 23:21:51 +0000 (23:21 +0000)]
vxlan: don't bypass encapsulation for multi- and broadcasts
The multicast and broadcast packets may have RTCF_LOCAL set in rt_flags
and therefore will be sent out bypassing encapsulation. This breaks
delivery of packets sent to the vxlan multicast group.
Disabling encapsulation bypass for multicasts and broadcasts fixes the
issue.
Commit 10b96f7306e5 (``tcp_memcontrol: remove a redundant statement
in tcp_destroy_cgroup()'') says ``We read the value but make no use
of it.'', but forgot to remove the variable declaration as well. This
was a follow-up commit of 3f1346193 (``memcg: decrement static keys
at real destroy time'') that removed the read of variable 'val'.
This fixes therefore:
CC net/ipv4/tcp_memcontrol.o
net/ipv4/tcp_memcontrol.c: In function ‘tcp_destroy_cgroup’:
net/ipv4/tcp_memcontrol.c:67:6: warning: unused variable ‘val’ [-Wunused-variable]
Daniel Borkmann [Sun, 14 Apr 2013 08:08:13 +0000 (08:08 +0000)]
net: sock: make sock_tx_timestamp void
Currently, sock_tx_timestamp() always returns 0. The comment that
describes the sock_tx_timestamp() function wrongly says that it
returns an error when an invalid argument is passed (from commit 20d4947353be, ``net: socket infrastructure for SO_TIMESTAMPING'').
Make the function void, so that we can also remove all the unneeded
if conditions that check for such a _non-existent_ error case in the
output path.
Mike Rapoport [Sat, 13 Apr 2013 23:21:39 +0000 (23:21 +0000)]
vxlan: use htonl when snooping for loopback address
Currently "bridge fdb show dev vxlan0" lists loopback address as
"1.0.0.127". Using htonl(INADDR_LOOPBACK) rather than passing it
directly to vxlan_snoop fixes the problem.
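The byte-order issue can be reproduced in user space with a short sketch. `fmt_be32_addr` is a hypothetical helper that, like the consumer of the snooped address, expects its argument in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Format a 32-bit IPv4 address that is expected in network byte order. */
static const char *fmt_be32_addr(uint32_t addr_be, char *buf, size_t len)
{
	struct in_addr a;

	a.s_addr = addr_be;
	return inet_ntop(AF_INET, &a, buf, (socklen_t)len);
}
```

Passing htonl(INADDR_LOOPBACK) yields "127.0.0.1" on any host. Passing INADDR_LOOPBACK directly, which is a host-order constant, yields "1.0.0.127" on a little-endian machine, matching the symptom quoted above.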
Joe Perches [Sat, 13 Apr 2013 19:03:17 +0000 (19:03 +0000)]
fec: Convert printks to netdev_<level>
Use a more current logging message style.
Convert the printks where a struct net_device is available to
netdev_<level>. Convert the other printks to pr_<level> and
add pr_fmt where appropriate.
As network adapters supporting PTP are becoming more common, machines with
many NICs suddenly have many PHCs, too. The current limit of eight /dev/ptp*
char devices (and thus, 8 network interfaces with PHC) is insufficient. Let
the ptp driver allocate the char devices dynamically.
Tested with 28 PHCs, removing and re-adding some of them.
Thanks to Ben Hutchings for advice leading to simpler and cleaner patch.
Eric Dumazet [Fri, 12 Apr 2013 11:31:52 +0000 (11:31 +0000)]
tcp: GSO should be TSQ friendly
I noticed that TSQ (TCP Small queues) was less effective when TSO is
turned off, and GSO is on. If BQL is not enabled, TSQ has then no
effect.
It turns out the GSO engine frees the original gso_skb at the time the
fragments are generated and queued to the NIC.
We should instead call the tcp_wfree() destructor for the last fragment,
to keep the flow control as intended in TSQ. This effectively limits
the number of queued packets on qdisc + NIC layers.
net: mv643xx_eth: remove deprecated inet_lro support
With recent support for GRO, there is no need to keep both LRO and
GRO. This patch therefore removes the deprecated inet_lro support
from mv643xx_eth. This work is based on an experimental patch
provided by Eric Dumazet and Willy Tarreau.
Fixes the following warnings:
drivers/net/vxlan.c:406:6: warning: symbol 'vxlan_fdb_free' was not declared. Should it be static?
drivers/net/vxlan.c:1111:37: warning: Using plain integer as NULL pointer
It causes build regressions, as per Stephen Rothwell:
====================
After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:
net/core/netprio_cgroup.c:250:29: error: static declaration of 'net_prio_subsys' follows non-static declaration
include/linux/cgroup_subsys.h:71:1: note: previous declaration of 'net_prio_subsys' was here
====================
Jason Wang [Wed, 10 Apr 2013 23:32:22 +0000 (23:32 +0000)]
tuntap: initialize vlan_features
The vlan_features was zero, which prevents vlan GSO packets from being
transmitted to userspace. This is suboptimal, so enable this by initializing
vlan_features for tuntap.
Netperf shows better performance of guest receiving since vlan TSO works for
tuntap:
before:
netperf -H 192.168.5.4
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.4 ()
port 0 AF_INET : demo
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
87380  16384  16384   10.01   2786.67
after:
netperf -H 192.168.5.4
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.4 ()
port 0 AF_INET : demo
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
Jason Wang [Wed, 10 Apr 2013 23:32:21 +0000 (23:32 +0000)]
virtio-net: initialize vlan_features
There's nothing that prevents passing the device features of virtio_net to its
vlan device. So this patch simply passes those to vlan device to benefit from
advanced features.
Netperf shows better sending performance for vlan device since TSO can work on
vlan now.
before:
netperf -H 192.168.5.2
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.2 ()
port 0 AF_INET : demo
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
87380  16384  16384   10.00   4162.35
after:
netperf -H 192.168.5.2
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.5.2 ()
port 0 AF_INET : demo
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
net: mv643xx_eth: add shared clk and cleanup existing clk handling
This patch adds an optional shared block clock to avoid lockups on
clock gated controllers. Besides the new clock, clock handling for
existing clocks is cleaned up and moved to devm_clk_get. Device
tree binding documentation is updated for the new clocks property.
Jason Wang [Wed, 10 Apr 2013 20:50:48 +0000 (20:50 +0000)]
vhost_net: remove tx polling state
After commit 2b8b328b61c799957a456a5a8dab8cc7dea68575 (vhost_net: handle polling
errors when setting backend), we in fact track the polling state through
poll->wqh, so there's no need to duplicate the work with an extra
vhost_net_polling_state. So this patch removes this and makes the code simpler.
This patch also removes all the tx starting/stopping code in the tx path according
to Michael's suggestion.
Netperf test shows almost the same result in stream test, but gets improvements
on TCP_RR tests (both zerocopy and copy), especially on low load cases.
Tested between multiqueue kvm guest and external host with two direct
connected 82599s.