Peter Pan(潘卫平) [Thu, 12 May 2011 15:46:56 +0000 (15:46 +0000)]
net:set valid name before calling ndo_init()
In commit 1c5cae815d19 (net: call dev_alloc_name from register_netdevice),
a bug of bonding was involved, see example 1 and 2.
In register_netdevice(), the name of net_device is not valid until
dev_get_valid_name() is called. But dev->netdev_ops->ndo_init(that is
bond_init) is called before dev_get_valid_name(),
and it uses the invalid name of net_device.
I think register_netdevice() should make sure that the name of net_device is
valid before calling ndo_init().
example 1:
modprobe bonding
ls /proc/net/bonding/bond%d
David Decotigny [Thu, 12 May 2011 20:28:04 +0000 (20:28 +0000)]
stmmac: don't go through ethtool to start auto-negotiation
The driver used to call phy's ethtool configuration routine to start
auto-negotiation. This change has it call directly phy's routine to
start auto-negotiation.
The initial version was hiding phy_start_aneg() return value,
this patch returns it (<0 upon error).
Tested: module compiles, tested on STM HDK7108 STB.
Julia Lawall [Fri, 13 May 2011 04:15:39 +0000 (04:15 +0000)]
drivers/isdn/hisax: Drop unused list
The file st5481_init.c locally defines and initializes the adapter_list
variable, but does not use it for anything. Removing the list makes it
possible to remove the list field from the st5481_adapter data structure.
In the function probe_st5481, it also makes it possible to free the locally
allocated adapter value on an error exit.
Vasiliy Kulikov [Fri, 13 May 2011 10:01:00 +0000 (10:01 +0000)]
net: ipv4: add IPPROTO_ICMP socket kind
This patch adds IPPROTO_ICMP socket kind. It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges. In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping. In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).
Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/
A new ping socket is created with
socket(PF_INET, SOCK_DGRAM, PROT_ICMP)
Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.
Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.
ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.
ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).
socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range". It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets. Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100 4294967295" would enable it for the users, but not daemons.
The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).
Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping
For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro. A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/
Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.
All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.
PATCH v3:
- switched to flowi4.
- minor changes to be consistent with raw sockets code.
PATCH v2:
- changed ping_debug() to pr_debug().
- removed CONFIG_IP_PING.
- removed ping_seq_fops.owner field (unused for procfs).
- switched to proc_net_fops_create().
- switched to %pK in seq_printf().
PATCH v1:
- fixed checksumming bug.
- CAP_NET_RAW may not create icmp sockets anymore.
RFC v2:
- minor cleanups.
- introduced sysctl'able group range to restrict socket(2).
The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.
bonding,llc: Fix structure sizeof incompatibility for some PDUs
With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof
operator on structure returns value greater than expected. In cases when the
structure is used for mapping PDU fields it may lead to unexpected results
(such as holes and alignment problems in skb data). __packed prevents this
undesired behavior.
John W. Linville [Fri, 13 May 2011 13:23:47 +0000 (09:23 -0400)]
ssb: fix pcicore build breakage
drivers/ssb/main.c:1336: error: 'SSB_PCICORE_BCAST_ADDR' undeclared (first use in this function)
drivers/ssb/main.c:1337: error: 'SSB_PCICORE_BCAST_DATA' undeclared (first use in this function)
drivers/ssb/main.c:1349: error: 'struct ssb_pcicore' has no member named 'dev'
Frank Blaschka [Thu, 12 May 2011 18:45:02 +0000 (18:45 +0000)]
qeth: add OSA concurrent hardware trap
This patch improves FFDC (first failure data capture) by requesting
a hardware trace in case the device driver, the hardware or a user
detects an error.
Frank Blaschka [Thu, 12 May 2011 18:45:01 +0000 (18:45 +0000)]
qeth: convert to hw_features part 2
Set rx csum default to hw checksumming again.
Remove sysfs interface for rx csum (checksumming) and TSO (large_send).
With the new hw_features it does not work to keep the old sysfs
interface in parallel. Convert options.checksum_type to new hw_features.
Added code to take FW dump.
o Driver queries FW at the init time and gets the dump template
o It takes FW dump as per the dump template
o Level of FW dump (and its size) is configured via dump flag
Sathya Perla [Thu, 12 May 2011 19:32:16 +0000 (19:32 +0000)]
be2net: fix mbox polling for signal reception
Sending mbox cmds require multiple steps of writing to the DB register and polling
for an ack. Gettting interrupted in the middle by a signal breaks the mbox protocol.
Use msleep() to not get interrupted.
Added code to take FW dump via ethtool. Dump level can be controlled via setting the
dump flag. A get function is provided to query the current setting of the dump flag.
Dump data is obtained from the driver via a separate get function.
Changes from v3:
Fixed buffer length issue in ethtool_get_dump_data function.
Updated kernel doc for ethtool_dump struct and get_dump_flag function.
Changes from v2:
Provided separate commands for get flag and data.
Check for minimum of the two buffer length obtained via ethtool and driver and
use that for dump buffer
Pass up the driver return error codes up to the caller.
Added kernel doc comments.
Eliad Peller [Fri, 13 May 2011 08:57:12 +0000 (11:57 +0300)]
wl12xx_sdio: declare support for NL80211_WOW_TRIGGER_ANYTHING trigger
Since wowlan requires the ability to stay awake while the host
is suspended, declare support for NL80211_WOW_TRIGGER_ANYTHING
if the MMC_PM_KEEP_POWER capability is being supported.
Eliad Peller [Fri, 13 May 2011 08:57:11 +0000 (11:57 +0300)]
wl12xx: prevent scheduling while suspending (WoW enabled)
When WoW is enabled, the interface will stay up and the chip will
be powered on, so we have to flush/cancel any remaining work, and
prevent the irq handler from scheduling a new work until the system
is resumed.
Add 2 new flags:
* WL1271_FLAG_SUSPENDED - the system is (about to be) suspended.
* WL1271_FLAG_PENDING_WORK - there is a pending irq work which
should be scheduled when the system is being resumed.
In order to wake-up the system while getting an irq, we initialize
the device as wakeup device, and calling pm_wakeup_event() upon
getting the interrupt (while the system is about to be suspended)
Since commit e9df2e8fd8fbc9 (Use appropriate sock tclass setting for
routing lookup) we lost ability to properly add ECN codemarks to ipv6
TCP frames.
It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the
ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch,
what's checked is inet6_sk(sk)->tclass, which is a completely different
field.
Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322
[Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros
by inline functions for clarity.
Alexey Orishko [Fri, 6 May 2011 03:01:30 +0000 (03:01 +0000)]
CDC NCM: Add mising short packet in cdc_ncm driver
Changes:
- while making NTB, driver shall check if device dwNtbOutMaxSize is higher than
host value and shall add a short packet if this is the case
- previous temporary patch for this issue is replaced by this one
Julian Anastasov [Tue, 10 May 2011 12:46:05 +0000 (12:46 +0000)]
ipvs: Remove all remaining references to rt->rt_{src,dst}
Remove all remaining references to rt->rt_{src,dst}
by using dest->dst_saddr to cache saddr (used for TUN mode).
For ICMP in FORWARD hook just restrict the rt_mode for NAT
to disable LOCALNODE. All other modes do not allow
IP_VS_RT_MODE_RDR, so we should be safe with the ICMP
forwarding. Using cp->daddr as replacement for rt_dst
is safe for all modes except BYPASS, even when cp->dest is
NULL because it is cp->daddr that is used to assign cp->dest
for sync-ed connections.
Mahesh Bandewar [Sun, 8 May 2011 06:51:48 +0000 (06:51 +0000)]
tg3: Allow ethtool to enable/disable loopback.
This patch adds tg3_set_features() to handle loopback mode. Currently the
capability is added for the devices which support internal MAC loopback mode.
So when enabled, it enables internal-MAC loopback.
Amit Virdi [Thu, 12 May 2011 01:04:40 +0000 (01:04 +0000)]
net/irda/ircomm_tty.c: Use flip buffers to deliver data
use tty_insert_flip_string and tty_flip_buffer_push to deliver incoming data
packets from the IrDA device instead of delivering the packets directly to the
line discipline. Following later approach resulted in warning "Sleeping function
called from invalid context".
Yi Zou [Mon, 9 May 2011 11:53:27 +0000 (11:53 +0000)]
net: group FCoE related feature flags
Michał Mirosław's patch (http://patchwork.ozlabs.org/patch/94421/) fixes the
issue (http://patchwork.ozlabs.org/patch/94188/) about not populating FCoE related
flags correctly on vlan devices. However, only NETIF_F_FCOE_CRC is part of the
NETIF_F_ALL_TX_OFFLOADS right now, where weed NETIF_F_FCOE_MTU and NETIF_F_FSO
as well.
Therefore, add NETIF_F_ALL_FCOE to indicate feature flags used by FCoE TX offloads.
These include NETIF_F_FCOE_CRC, NETIF_F_FCOE_MTU, and NETIF_F_FSO and add them to
be part of NETIF_F_ALL_TX_OFFLOADS. This would eventually make sure all FCoE needed
flags are populated properly to vlan devices.
Michał Mirosław [Fri, 6 May 2011 07:56:29 +0000 (07:56 +0000)]
net: Fix vlan_features propagation
Fix VLAN features propagation for devices which change vlan_features.
For this to work, driver needs to make sure netdev_features_changed()
gets called after the change (it is e.g. after ndo_set_features()).
Side effect is that a user might request features that will never
be enabled on a VLAN device.
Eric Dumazet [Thu, 12 May 2011 21:46:56 +0000 (17:46 -0400)]
garp: remove last synchronize_rcu() call
When removing last vlan from a device, garp_uninit_applicant() calls
synchronize_rcu() to make sure no user can still manipulate struct
garp_applicant before we free it.
Use call_rcu() instead, as a step to further net_device dismantle
optimizations.
Add the temporary garp_cleanup_module() function to make sure no pending
call_rcu() are left at module unload time [ this will be removed when
kfree_rcu() is available ]
vmxnet3: Use single tx queue when CONFIG_PCI_MSI not defined
Resending this patch with few changes.
Avoid multiple queues when MSI or MSI-X not available
Limit number of Tx queues to 1 if MSI/MSI-X support is not configured in
the kernel. This will make number of tx and rx queues equal when MSI/X
is not configured thus providing better performance.
Eric Dumazet [Wed, 11 May 2011 18:22:36 +0000 (18:22 +0000)]
l2tp: fix potential rcu race
While trying to remove useless synchronize_rcu() calls, I found l2tp is
indeed incorrectly using two of such calls, but also bumps tunnel
refcount after list insertion.
tunnel refcount must be incremented before being made publically visible
by rcu readers.
This fix can be applied to 2.6.35+ and might need a backport for older
kernels, since things were shuffled in commit fd558d186df2c
(l2tp: Split pppol2tp patch into separate l2tp and ppp parts)
Luciano Coelho [Thu, 12 May 2011 14:07:55 +0000 (17:07 +0300)]
wl12xx: prevent sched_scan when not idle or not in station mode
The current firmware only supports scheduled scan in station mode and
when idle. To prevent the firmware from crashing, return -EOPNOTSUPP
when sched_scan start is called in an invalid state.
Shahar Levi [Wed, 11 May 2011 09:12:56 +0000 (12:12 +0300)]
wl12xx: add IEEE80211_HW_SPECTRUM_MGMT bit to the hw flags
Set the spectrum management bit in the hw flags so that mac80211 will
set the WLAN_CAPABILITY_SPECTRUM_MGMT bit in association requests
(which in practice means that we support 802.11h spectrum management).
Shahar Levi [Wed, 11 May 2011 08:14:22 +0000 (11:14 +0300)]
wl12xx: Don't filter beacons that include changed HT IEs
This patch adds a beacon filter rule to pass up the beacons that
contain changed HT information elements. These beacons need to be
passed to mac80211 so that it can act on such changes.
ne-h8300: Fix regression caused during net_device_ops conversion
Changeset dcd39c90290297f6e6ed8a04bb20da7ac2b043c5 ("ne-h8300: convert to
net_device_ops") broke ne-h8300 by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in ne-h8300.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.
hydra: Fix regression caused during net_device_ops conversion
Changeset 5618f0d1193d6b051da9b59b0e32ad24397f06a4 ("hydra: convert to
net_device_ops") broke hydra by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in hydra.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.
zorro8390: Fix regression caused during net_device_ops conversion
Changeset b6114794a1c394534659f4a17420e48cf23aa922 ("zorro8390: convert to
net_device_ops") broke zorro8390 by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in zorro8390.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.
Johannes Berg [Thu, 12 May 2011 11:38:50 +0000 (13:38 +0200)]
mac80211: mesh: move some code to make it static
There's no need to have table functions in one
file and all users in another, move the functions
to the right file and make them static. Also move
a static variable to the beginning of the file to
make it easier to find.
Luciano Coelho [Thu, 12 May 2011 13:28:29 +0000 (16:28 +0300)]
cfg80211/mac80211: avoid bounce back mac->cfg->mac on sched_scan_stopped
When sched_scan_stopped was called by the driver, mac80211 calls
cfg80211, which in turn was calling mac80211 back with a flag
"driver_initiated". This flag was used so that mac80211 would do the
necessary cleanup but would not call the driver. This was enough to
prevent the bounce back between the driver and mac80211, but not
between mac80211 and cfg80211.
To fix this, we now do the cleanup in mac80211 before calling
cfg80211. To help with locking issues, the workqueue was moved from
cfg80211 to mac80211.
Johannes Berg [Thu, 12 May 2011 13:07:15 +0000 (15:07 +0200)]
mac80211: fix another key non-race
The code here is only not racy because all the
places that assign the pointers it uses are
holding the sta_mtx as well as the key_mtx and
so can't race against this because this code
holds the sta_mtx. But that's not intuitive,
so fix it to hold the key_mtx.
Johannes Berg [Thu, 12 May 2011 12:31:49 +0000 (14:31 +0200)]
mac80211: make key locking clearer
The code in ieee80211_del_key() doesn't acquire the
key_mtx properly when it dereferences the keys. It
turns out that isn't actually necessary since the
key_mtx itself seems to be redundant since all key
manipulations are done under the RTNL, but as long
as we have the key_mtx we should use it the right
way too.
The code here to RCU-dereference a pointer that's
on the stack is totally pointless, RCU isn't magic
(like say Java's weak references are), so the code
can't work like whoever wrote it thought it might.
Remove it so readers don't get confused. Note that
it seems that a bug is there anyway: I don't see
any code that cancels the timer when a mesh path
struct is destroyed.
The XB113 cards are single band, 5 GHz-only, but the
default settings were configured to assume it was dual
band. Users of these cards then would see 2.4 GHz channels
but you would never get any scan results from these channels
given that the radio is not present.
Daniel Halperin [Wed, 11 May 2011 02:00:45 +0000 (19:00 -0700)]
mac80211: fix contention time computation in minstrel, minstrel_ht
When transmitting a frame, the transmitter waits a random number of
slots between 0 and cw. Thus, the contention time is (cw / 2) * t_slot
which we can represent instead as (cw * t_slot) >> 1. Also fix a few
other accounting bugs around contention time, and add comments.
Johannes Berg [Mon, 9 May 2011 16:41:15 +0000 (18:41 +0200)]
cfg80211: restrict AP beacon intervals
Multiple virtual AP interfaces can currently try
to use different beacon intervals, but that just
leads to problems since it won't actually be done
that way by drivers. Return an error in this case
to make sure it won't be done wrong.
Also, ignore attempts to change the DTIM period
or beacon interval during the lifetime of the BSS.
Ben Hutchings [Wed, 11 May 2011 16:41:18 +0000 (17:41 +0100)]
sfc: Always map MCDI shared memory as uncacheable
We enabled write-combining for memory-mapped registers in commit 65f0b417dee94f779ce9b77102b7d73c93723b39, but inhibited it for the
MCDI shared memory where this is not supported. However,
write-combining mappings also allow read-reordering, which may also
be a problem.
I found that when an SFC9000-family controller is connected to an
Intel 3000 chipset, and write-combining is enabled, the controller
stops responding to PCIe read requests during driver initialisation
while the driver is polling for completion of an MCDI command. This
results in an NMI and system hang. Adding read memory barriers
between all reads to the shared memory area appears to reduce but not
eliminate the probability of this.
We have not yet established whether this is a bug in our BIU or in the
PCIe bridge. For now, work around by mapping the shared memory area
separately.
net/mac80211/cfg.c: In function ‘sta_apply_parameters’:
net/mac80211/cfg.c:746: error: ‘struct sta_info’ has no member named ‘plink_state’
make[1]: *** [net/mac80211/cfg.o] Error 1
make: *** [net/mac80211/mac80211.ko] Error 2
Start/stop TX queue is controlled by TX queue "used" counter.
It is incremented while WRBs are posted to TX queue and
decremented when TX completions are received. This counter was
getting decremented before HW is informed about processing of TX
completions. As used counter is decremented, transmit function
posts new WRBs and creates completion queue full scenario in HW.
In Lancer if a frame is DMAed partially due to lack of RX buffers,
an error completion is sent with packet size as zero and num_recvd
indicating number of used buffers. These buffers need to be freed
and packet dropped.
Luciano Coelho [Wed, 11 May 2011 14:09:37 +0000 (17:09 +0300)]
cfg80211/nl80211: add interval attribute for scheduled scans
Introduce NL80211_ATTR_SCHED_SCAN_INTERVAL as a required attribute for
NL80211_CMD_START_SCHED_SCAN. This value informs the driver at which
intervals the scheduled scan cycles should be executed.
Luciano Coelho [Wed, 11 May 2011 14:09:36 +0000 (17:09 +0300)]
mac80211: add support for HW scheduled scan
Implement support for HW scheduled scan. The mac80211 code doesn't perform
scheduled scans itself, but calls the driver to start and stop scheduled
scans.
This patch also creates a trace event class to be used by drv_hw_scan
and the new drv_sched_scan_start and drv_sched_stop functions, in
order to avoid duplicate code.
Luciano Coelho [Wed, 11 May 2011 14:09:35 +0000 (17:09 +0300)]
cfg80211/nl80211: add support for scheduled scans
Implement new functionality for scheduled scan offload. With this feature we
can scan automatically at certain intervals.
The idea is that the hardware can perform scan automatically and filter on
desired results without waking up the host unnecessarily.
Add NL80211_CMD_START_SCHED_SCAN and NL80211_CMD_STOP_SCHED_SCAN
commands to the nl80211 interface. When results are available they are
reported by NL80211_CMD_SCHED_SCAN_RESULTS events. The userspace is
informed when the scheduled scan has stopped with a
NL80211_CMD_SCHED_SCAN_STOPPED event, which can be triggered either by
the driver or by a call to NL80211_CMD_STOP_SCHED_SCAN.