Nimrod Andy [Thu, 12 Jun 2014 00:16:19 +0000 (08:16 +0800)]
net: fec: Enable IP header hardware checksum
IP header checksum is calculated by the network layer by default.
To support software TSO, it is better to let the HW calculate the
IP header checksum.
The FEC hw checksum feature requires the checksum field in the frame
to be zero, otherwise the calculated checksum is not correct.
For a segmented TCP packet, the HW calculates the IP header checksum again,
which doesn't have any impact. For SW TSO, the HW-calculated checksum brings
better performance.
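The zero-field requirement above can be illustrated with a minimal user-space sketch of the standard ones' complement header checksum. This is not the FEC hardware logic; the header bytes and the helper name are illustrative assumptions.

```python
import struct

def ipv4_header_checksum(header: bytes) -> int:
    """Ones' complement sum of 16-bit big-endian words, then complemented."""
    if len(header) % 2:
        raise ValueError("header length must be even")
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Illustrative 20-byte header with the checksum field (bytes 10-11) zeroed.
zeroed = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
csum = ipv4_header_checksum(zeroed)

# Same header, but the field was already filled in before the calculation.
prefilled = zeroed[:10] + struct.pack("!H", csum) + zeroed[12:]
stale = ipv4_header_checksum(prefilled)
```

With the field zeroed the result is the real checksum; summing over a header whose field is already filled in yields a different value, which is why the field has to be cleared first.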
Linus Lüssing [Wed, 11 Jun 2014 23:41:24 +0000 (01:41 +0200)]
bridge: fix compile error when compiling without IPv6 support
Some fields in "struct net_bridge" aren't available when compiling the
kernel without IPv6 support. Therefore add a check/macro to skip the
complaining code sections in that case.
With some specific configuration (VT6105M on Soekris 5510 and depending
on the device at the other end), fragmented packets were not transmitted
when forcing 100 full-duplex with autoneg disabled.
This fix now writes the chip's full-duplex register when forcing full or
half-duplex, not only when autoneg is enabled.
Linus Torvalds [Thu, 12 Jun 2014 17:30:18 +0000 (10:30 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
"This the bunch that sat in -next + lock_parent() fix. This is the
minimal set; there's more pending stuff.
In particular, I really hope to get acct.c fixes merged this cycle -
we need that to deal sanely with delayed-mntput stuff. In the next
pile, hopefully - that series is fairly short and localized
(kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
iov_iter work. Most of prereqs for ->splice_write with sane locking
order are there and Kent's dio rewrite would also fit nicely on top of
this pile"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
lock_parent: don't step on stale ->d_parent of all-but-freed one
kill generic_file_splice_write()
ceph: switch to iter_file_splice_write()
shmem: switch to iter_file_splice_write()
nfs: switch to iter_splice_write_file()
fs/splice.c: remove unneeded exports
ocfs2: switch to iter_file_splice_write()
->splice_write() via ->write_iter()
bio_vec-backed iov_iter
optimize copy_page_{to,from}_iter()
bury generic_file_aio_{read,write}
lustre: get rid of messing with iovecs
ceph: switch to ->write_iter()
ceph_sync_direct_write: stop poking into iov_iter guts
ceph_sync_read: stop poking into iov_iter guts
new helper: copy_page_from_iter()
fuse: switch to ->write_iter()
btrfs: switch to ->write_iter()
ocfs2: switch to ->write_iter()
xfs: switch to ->write_iter()
...
David S. Miller [Thu, 12 Jun 2014 17:28:49 +0000 (10:28 -0700)]
Merge branch 'bnx2x'
Yuval Mintz says:
====================
bnx2x: Bug fixes patch series
This patch series contains various bug fixes - 2 link related fixes,
one sriov-related issue and an additional fix for a theoretical bug
on new boards.
Please consider applying these patches to `net'.
====================
Ariel Elior [Thu, 12 Jun 2014 04:55:32 +0000 (07:55 +0300)]
bnx2x: Enlarge the dorq threshold for VFs
A malicious VF might try to starve the other VFs & PF by creating
continuous doorbell floods. In order to negate this, HW has a threshold of
doorbells per client, which will stop the client doorbells from arriving
if crossed.
The threshold currently configured for VFs is too low - under extreme traffic
scenarios, it's possible for a VF to reach the threshold and thus for its
fastpath to stop working.
Yuval Mintz [Thu, 12 Jun 2014 04:55:31 +0000 (07:55 +0300)]
bnx2x: Check for UNDI in uncommon branch
If L2FW utilized by the UNDI driver has the same version number as that
of the regular FW, a driver loading after UNDI and receiving an uncommon
answer from management will mistakenly assume the loaded FW matches its
own requirement and try to exit the flow via FLR.
Xufeng Zhang [Thu, 12 Jun 2014 02:53:36 +0000 (10:53 +0800)]
sctp: Fix sk_ack_backlog wrap-around problem
Consider the scenario:
For a TCP-style socket, while processing the COOKIE_ECHO chunk in
sctp_sf_do_5_1D_ce(), after it has passed a series of sanity checks,
a new association is created in sctp_unpack_cookie(). If some later
processing fails, sctp_association_free() is called to free the
previously allocated association, and sctp_association_free() decrements
sk_ack_backlog for this socket. Since the initial value of sk_ack_backlog
is 0, the decrement wraps it around to 65535. If we then try to establish
new associations on the same socket, ABORT is triggered because sctp deems
the accept queue full.
Fix this issue by only decrementing sk_ack_backlog for associations in
the endpoint's list.
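A minimal sketch of the wrap-around, modelling the counter as 16 bits wide because the message reports 65535 after the decrement. The helper names, the "greater than" fullness test and the limit of 128 are assumptions for illustration, not the kernel code.

```python
U16_MASK = 0xFFFF  # the counter wraps to 65535, so treat it as 16 bits wide

def dec_u16(value: int) -> int:
    """Decrement with unsigned 16-bit wrap-around."""
    return (value - 1) & U16_MASK

def free_association(backlog: int, on_endpoint_list: bool, guarded: bool) -> int:
    """Return the backlog after freeing one association.

    guarded=True models the fix: only associations that were on the
    endpoint's list are counted, so only those are subtracted.
    """
    if guarded and not on_endpoint_list:
        return backlog
    return dec_u16(backlog)

def acceptq_is_full(backlog: int, max_backlog: int) -> bool:
    # Assumed fullness test for illustration.
    return backlog > max_backlog

buggy = free_association(0, on_endpoint_list=False, guarded=False)
fixed = free_association(0, on_endpoint_list=False, guarded=True)
```

Here `buggy` wraps to the maximum value and makes the queue look full for any realistic limit, while `fixed` stays at zero.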
Jean Delvare [Wed, 11 Jun 2014 16:35:56 +0000 (18:35 +0200)]
hwmon: (lm85) Drop generic detection
Generic detection leads to too many false positives, so drop it. FWIW
sensors-detect does not have such generic detection. If the user wants
to force the driver to bind to a not yet supported chip, he/she can
still do so using sysfs attribute new_device.
Fabio Baltieri [Sun, 8 Jun 2014 21:06:24 +0000 (22:06 +0100)]
hwmon: (ina2xx) Cast to s16 on shunt and current regs
All devices supported by ina2xx are bidirectional and report the
measured shunt voltage and power values as signed 16-bit values, but the
current driver implementation caches all registers as u16, leading
to an incorrect sign extension when reporting to userspace in
ina2xx_get_value().
This patch fixes the problem by casting the signed registers to s16.
Tested on an INA219.
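The effect of the cast can be sketched in user space as follows. The raw register value is a made-up example and as_s16() is a hypothetical helper standing in for the C cast.

```python
def as_s16(raw: int) -> int:
    """Reinterpret the low 16 bits of raw as a two's complement value."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw

raw_shunt = 0xFF38                  # hypothetical cached register content
unsigned_reading = raw_shunt        # what a u16 cache reports
signed_reading = as_s16(raw_shunt)  # what the s16 cast reports
```

A small negative shunt voltage cached as u16 shows up as a large positive number; reinterpreting it as s16 restores the sign.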
Lidong Zhong [Thu, 12 Jun 2014 15:26:14 +0000 (10:26 -0500)]
dlm: keep listening connection alive with sctp mode
The connection struct with nodeid 0 is the listening socket,
not a connection to another node. The sctp resend function
was not checking that the nodeid was valid (non-zero), so it
would mistakenly get and resend on the listening connection
when nodeid was zero.
* pm-cpufreq:
cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR
cpufreq: tegra: update comment for clarity
cpufreq: intel_pstate: Remove duplicate CPU ID check
cpufreq: Mark CPU0 driver with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'
cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
cpufreq: ppc-corenet-cpu-freq: do_div use quotient
Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
cpufreq: Tegra: implement intermediate frequency callbacks
cpufreq: add support for intermediate (stable) frequencies
Thomas Gleixner [Wed, 11 Jun 2014 23:59:15 +0000 (23:59 +0000)]
ALSA: intel8x0: Use ktime and ktime_get()
do_posix_clock_monotonic_gettime() is a leftover from the initial
posix timer implementation which maps to ktime_get_ts() and returns
the monotonic time in a timespec.
Use ktime based ktime_get() and use the ktime_delta_us() function to
calculate the delta instead of open coding the timespec math.
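For illustration, the difference between open-coded timespec math and a scalar nanosecond delta can be sketched like this. It is a user-space model with made-up timestamps, not the kernel helpers themselves.

```python
NSEC_PER_SEC = 1_000_000_000

def timespec_delta_us(start, stop):
    """Open-coded delta on (sec, nsec) pairs, with a manual borrow."""
    sec = stop[0] - start[0]
    nsec = stop[1] - start[1]
    if nsec < 0:                 # borrow one second when nsec underflows
        sec -= 1
        nsec += NSEC_PER_SEC
    return sec * 1_000_000 + nsec // 1000

def scalar_delta_us(start_ns, stop_ns):
    """Same delta on scalar nanosecond timestamps: one subtraction."""
    return (stop_ns - start_ns) // 1000

start = (5, 900_000_000)
stop = (6, 100_000_500)
open_coded = timespec_delta_us(start, stop)
scalar = scalar_delta_us(start[0] * NSEC_PER_SEC + start[1],
                         stop[0] * NSEC_PER_SEC + stop[1])
```

Both give the same result; the scalar form simply has no borrow case to get wrong.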
Mengdong Lin [Thu, 12 Jun 2014 06:42:25 +0000 (14:42 +0800)]
ALSA: hda - verify pin:converter connection on unsol event for HSW and VLV
This patch will verify the pin's converter selection for an active stream
when an unsol event reports this pin becomes available again after a display
mode change or hot-plug event.
For Haswell+ and Valleyview: display mode change or hot-plug can change the
transcoder:port connection and make all the involved audio pins share the 1st
converter. So the stream using the 1st converter will flow to multiple pins
but active streams using other converters will fail. This workaround
is to ensure the pin selects the right converter and an assigned converter is
not shared by other unused pins.
Wang, Xiaoming [Thu, 12 Jun 2014 22:47:07 +0000 (18:47 -0400)]
ALSA: compress: Cancel the optimization of compiler and fix the size of struct for all platform.
Disable the compiler's padding of struct snd_compr_avail, whose
size is 0x1c in a 32bit kernel but 0x20 in a 64bit kernel when
the compiler is free to pad it. That makes the struct incompatible
between 32bit and 64bit. So add packed to fix the size of struct
snd_compr_avail to 0x1c for all platforms.
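The size difference can be reproduced in user space with Python's struct module. The field layout used here (one 64-bit field followed by five 32-bit fields) is an assumption chosen to match the 0x1c size quoted above; it is not copied from the header.

```python
import struct

# Assumed layout: one 64-bit field followed by five 32-bit fields.
FIELDS = "QIIIII"

# "=" selects standard sizes with no alignment padding: the packed layout.
packed_size = struct.calcsize("=" + FIELDS)

# "@" selects native alignment; the trailing "0Q" pads the total size to
# the alignment of a 64-bit integer, as a C compiler does for a struct.
native_size = struct.calcsize("@" + FIELDS + "0Q")
```

The packed size is the same everywhere, while the native size depends on how strictly the host aligns 64-bit integers, which is the mismatch the patch removes.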
Brian Norris [Thu, 17 Apr 2014 07:21:46 +0000 (00:21 -0700)]
blackfin: defconfigs: add MTD_SPI_NOR (new dependency for M25P80)
These defconfigs contain the CONFIG_M25P80 symbol, which is now
dependent on the MTD_SPI_NOR symbol. Add CONFIG_MTD_SPI_NOR to the
relevant defconfigs.
At the same time, drop the now-nonexistent CONFIG_MTD_CHAR symbol.
Arnd Bergmann [Thu, 5 Jun 2014 21:14:41 +0000 (23:14 +0200)]
mmc: simplify SDHCI Kconfig dependencies
We have a number of front-end drivers for SDHCI_PLTFM, some of them
use 'select MMC_SDHCI_PLTFM', others use 'depends on'. This is
inconsistent and confusing, and in one case has also led to a
build error because of incomplete dependencies:
warning: (MMC_SDHCI_PXAV3 && MMC_SDHCI_PXAV2 && MMC_SDHCI_BCM_KONA) selects MMC_SDHCI_PLTFM which has unmet direct dependencies (MMC && MMC_SDHCI)
drivers/built-in.o: In function `sdhci_sirf_resume':
:(.text+0xaaacb4): undefined reference to `sdhci_resume_host'
drivers/built-in.o: In function `sdhci_sirf_suspend':
:(.text+0xaaacf8): undefined reference to `sdhci_suspend_host'
drivers/built-in.o: In function `sdhci_sirf_probe':
:(.text+0xaaaf44): undefined reference to `sdhci_add_host'
:(.text+0xaaaf50): undefined reference to `sdhci_remove_host'
This changes Kconfig to use 'depends on MMC_SDHCI_PLTFM' for all these
cases, to fix the build error and make the logic more logical.
Arnd Bergmann [Thu, 5 Jun 2014 21:14:40 +0000 (23:14 +0200)]
mmc: omap: don't select TPS65010
The MMC host driver should not select the pmic driver, since that
may have other dependencies, notably i2c in this case. It's not
clear what the exact requirement of the driver is, but to preserve
the behavior, this patch changes the 'select' into 'depends on',
meaning you now have to turn on TPS65010 explicitly and then
MMC_OMAP.
Arnd Bergmann [Thu, 5 Jun 2014 21:14:39 +0000 (23:14 +0200)]
mmc: mvsdio: avoid compiler warning
gcc correctly points out that hw_state can be used uninitialized
in the mvsd_setup_data() function. This rearranges the function
to ensure it always contains a proper value.
Arnd Bergmann [Thu, 5 Jun 2014 21:14:38 +0000 (23:14 +0200)]
mmc: atmel-mci: include asm/cacheflush.h
This avoids a build error due to the use of flush_dcache_page.
drivers/mmc/host/atmel-mci.c: In function 'atmci_read_data_pio':
drivers/mmc/host/atmel-mci.c:1870:5: error: implicit declaration of function 'flush_dcache_page' [-Werror=implicit-function-declaration]
flush_dcache_page(sg_page(sg));
^
Stephen Boyd [Tue, 10 Jun 2014 18:27:19 +0000 (11:27 -0700)]
mmc: sdhci-msm: Fix fallout from sdhci refactoring
The sdhci core was refactored recently and some of those
refactorings required changes in every sdhci platform driver.
Those updates happened around the same time as when the msm
driver was merged so the refactorings missed the msm driver.
Hook in the basic library functions so that we can boot apq8074
dragonboards again instead of crashing when we try to jump to
NULL function pointers.
Michal Marek [Wed, 11 Jun 2014 11:53:48 +0000 (13:53 +0200)]
powerpc: Avoid circular dependency with zImage.%
The rule to create the final images uses a zImage.% pattern.
Unfortunately, this also matches the names of the zImage.*.lds linker
scripts, which appear as a dependency of the final images. This somehow
worked when $(srctree) used to be an absolute path, but now the pattern
matches too much. List only the images from $(image-y) as the target of
the rule, to avoid the circular dependency.
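A rough model of why the pattern matches too much, assuming make-style matching where '%' stands for any non-empty stem. The file names below are illustrative, not the actual contents of $(image-y).

```python
def matches(pattern: str, target: str) -> bool:
    """Make-style pattern match: '%' stands for a non-empty stem."""
    prefix, _, suffix = pattern.partition("%")
    stem_len = len(target) - len(prefix) - len(suffix)
    return (stem_len > 0
            and target.startswith(prefix)
            and target.endswith(suffix))

# Hypothetical names: one final image and two linker scripts.
candidates = ["zImage.pseries", "zImage.coff.lds", "zImage.lds"]

# A pattern rule claims every candidate, including the linker scripts.
by_pattern = [t for t in candidates if matches("zImage.%", t)]

# An explicit target list claims only the images it names.
image_y = ["zImage.pseries"]
by_list = [t for t in candidates if t in image_y]
```

Restricting the rule to an explicit list keeps the linker scripts from becoming targets of the rule that depends on them.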
Tony Lindgren [Mon, 2 Jun 2014 23:13:46 +0000 (16:13 -0700)]
gpio: of: Fix handling for deferred probe for -gpio suffix
Commit dd34c37aa3e (gpio: of: Allow -gpio suffix for property names)
added parsing for both -gpio and -gpios suffix but also changed
the handling for deferred probe unintentionally. Because of the
looping the second name will now return -ENOENT instead of
-EPROBE_DEFER. Fix the issue by breaking out of the loop if
-EPROBE_DEFER is encountered.
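The loop behaviour described above can be sketched as follows. This is a user-space model, not the gpiolib code: find_gpio(), lookup() and the "reset" property name are hypothetical, and EPROBE_DEFER is defined locally because Python's errno module does not expose it.

```python
import errno

EPROBE_DEFER = 517  # kernel-internal errno value, defined here by hand
SUFFIXES = ("gpios", "gpio")

def find_gpio(con_id, lookup, stop_on_defer=True):
    """Try each suffix in turn; lookup() returns >= 0 or a negative errno."""
    result = -errno.ENOENT
    for suffix in SUFFIXES:
        result = lookup(f"{con_id}-{suffix}")
        if result >= 0:
            break                      # found a descriptor
        if stop_on_defer and result == -EPROBE_DEFER:
            break                      # keep the deferral, do not retry
    return result

def lookup(name):
    # The "-gpios" property exists but its provider is not ready yet;
    # the "-gpio" property does not exist at all.
    table = {"reset-gpios": -EPROBE_DEFER}
    return table.get(name, -errno.ENOENT)

old_result = find_gpio("reset", lookup, stop_on_defer=False)
new_result = find_gpio("reset", lookup)
```

Without the early exit, the second name overwrites the deferral with -ENOENT and the caller never retries the probe.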
Al Viro [Thu, 12 Jun 2014 04:29:13 +0000 (00:29 -0400)]
lock_parent: don't step on stale ->d_parent of all-but-freed one
Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished. The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.
We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().
Al Viro [Sat, 5 Apr 2014 08:27:08 +0000 (04:27 -0400)]
->splice_write() via ->write_iter()
iter_file_splice_write() - a ->splice_write() instance that gathers the
pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
it to ->write_iter(). A bunch of simple cases converted to that...
Linus Torvalds [Thu, 12 Jun 2014 04:10:33 +0000 (21:10 -0700)]
Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
"Main excitement is a virtio_scsi fix for alloc holding spinlock on the
abort path, which I refuse to CC stable since (1) I discovered it
myself, and (2) it's been there forever with no reports"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio_scsi: don't call virtqueue_add_sgs(... GFP_NOIO) holding spinlock.
virtio-rng: fixes for device registration/unregistration
virtio-rng: fix boot with virtio-rng device
virtio-rng: support multiple virtio-rng devices
virtio_ccw: introduce device_lost in virtio_ccw_device
virtio: virtio_break_device() to mark all virtqueues broken.
Linus Torvalds [Thu, 12 Jun 2014 00:08:16 +0000 (17:08 -0700)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost infrastructure updates from Michael S. Tsirkin:
"This reworks vhost core dropping unnecessary RCU uses in favor of VQ
mutexes which are used on fast path anyway. This fixes worst-case
latency for users which change the memory mappings a lot. Memory
allocation for vhost-net now supports fallback on vmalloc (same as for
vhost-scsi) this makes it possible to create the device on systems
where memory is very fragmented, with slightly lower performance"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: move memory pointer to VQs
vhost: move acked_features to VQs
vhost: replace rcu with mutex
vhost-net: extend device allocation to vmalloc
Pull arch/tile changes from Chris Metcalf:
"These mostly just address smaller issues reported to me"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch: tile: kernel: unaligned.c: Cleaning up uninitialized variables
drivers/tty/hvc/hvc_tile.c: use PTR_ERR_OR_ZERO
replace strict_strto* call with kstrto*
tile: Update comments for generic idle conversion
tile: cleanup the comment in init_pgprot
tile: use BOOTMEM_DEFAULT instead of magic number 0 for reserve_bootmem flags
Linus Torvalds [Wed, 11 Jun 2014 23:09:14 +0000 (16:09 -0700)]
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"Most of this is cleaning up various driver sysfs permissions so we can
re-add the perm check (we unified the module param and sysfs checks,
but the module ones were stronger so we weakened them temporarily).
Param parsing gets documented, and also "--" now forces args to be
handed to init (and ignored by the kernel).
Module NX/RO protections get tightened: we now set them before calling
parse_args()"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: set nx before marking module MODULE_STATE_COMING.
samples/kobject/: avoid world-writable sysfs files.
drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
drivers/staging/speakup/: avoid world-writable sysfs files.
drivers/regulator/virtual: avoid world-writable sysfs files.
drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
speakup: fix incorrect perms on speakup_acntsa.c
cpumask.h: silence warning with -Wsign-compare
Documentation: Update kernel-parameters.tx
param: hand arguments after -- straight to init
modpost: Fix resource leak in read_dump()
Doug Ledford [Wed, 11 Jun 2014 14:38:03 +0000 (10:38 -0400)]
net/core: Add VF link state control policy
Commit 1d8faf48c7 (net/core: Add VF link state control) added VF link state
control to the netlink VF nested structure, but failed to add a proper entry
for the new structure into the VF policy table. Add the missing entry so
the table and the actual data copied into the netlink nested struct are in
sync.
Florian Westphal [Wed, 11 Jun 2014 18:35:18 +0000 (20:35 +0200)]
net_sched: drr: warn when qdisc is not work conserving
The DRR scheduler requires that items on the active list are work
conserving, i.e. do not hold on to skbs for throttling purposes, etc.
Attaching e.g. tbf renders DRR useless because all other classes on the
active list are delayed as well.
So, warn users that this configuration won't work as expected; we
already do this in a couple of other qdiscs, see e.g.
David S. Miller [Wed, 11 Jun 2014 22:46:17 +0000 (15:46 -0700)]
Merge branch 'inet_csums'
Tom Herbert says:
====================
net: Checksum offload changes - Part IV
I am working on overhauling RX checksum offload. Goals of this effort
are:
- Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code
What is in this fourth patch set:
- Preserve CHECKSUM_COMPLETE instead of changing it to
CHECKSUM_UNNECESSARY. This allows correct reuse in validating multiple
csums in a packet.
- When SW needs to compute the packet checksum, save it as
CHECKSUM_COMPLETE. Also mark that checksum was computed by SW.
- Add skb_gro_postpull_rcsum to udp and vxlan to make GRO work with
CHECKSUM_COMPLETE.
v2: Removed patch setting skb_encapsulation when validating checksum
in tcp_gro_receive
Please review carefully and test if possible, mucking with basic
checksum functions is always a little precarious :-)
====================
Tom Herbert [Wed, 11 Jun 2014 01:54:19 +0000 (18:54 -0700)]
net: Save software checksum complete
In skb_checksum_complete, if we need to compute the checksum for the
packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
Subsequent checksum verification can use this.
Also, added csum_complete_sw flag to distinguish between software and
hardware generated checksum complete, we should always be able to trust
the software computation.
Tom Herbert [Wed, 11 Jun 2014 01:54:13 +0000 (18:54 -0700)]
net: Preserve CHECKSUM_COMPLETE at validation
Currently when the first checksum in a packet is validated using
CHECKSUM_COMPLETE, ip_summed is overwritten to be CHECKSUM_UNNECESSARY
so that any subsequent checksums in the packet are not correctly
validated.
This patch adds csum_valid flag in sk_buff and uses that to indicate
validated checksum instead of setting CHECKSUM_UNNECESSARY. The bit
is set accordingly in the skb_checksum_validate_* functions. The flag
is checked in skb_checksum_complete, so that validation is communicated
between checksum_init and checksum_complete sequence in TCP and UDP.
David S. Miller [Wed, 11 Jun 2014 22:44:42 +0000 (15:44 -0700)]
Merge branch 'qlcnic-next'
Shahed Shaikh says:
====================
This series contains an enhancement in the area of firmware minidump collection
and optimization of ring count validation function.
Please apply this series to net-next.
====================
Shahed Shaikh [Wed, 11 Jun 2014 18:09:11 +0000 (14:09 -0400)]
qlcnic: Pre-allocate DMA buffer used for minidump collection
Pre-allocate the physically contiguous DMA buffer used for
minidump collection at driver load time, rather than at
run time, to minimize allocation failures. Driver will allocate
the buffer at load time if PEX DMA support capability is indicated
by the adapter.
Dmitry Popov [Wed, 11 Jun 2014 11:09:14 +0000 (15:09 +0400)]
ip_vti: fix sparse warnings for VTI_ISVTI
This patch fixes the following sparse warnings:
net/ipv4/ip_tunnel.c:245:53: warning: restricted __be16 degrades to integer
net/ipv4/ip_vti.c:321:19: warning: incorrect type in assignment (different base types)
net/ipv4/ip_vti.c:321:19: expected restricted __be16 [addressable] [assigned] [usertype] i_flags
net/ipv4/ip_vti.c:321:19: got int
net/ipv4/ip_vti.c:447:24: warning: incorrect type in assignment (different base types)
net/ipv4/ip_vti.c:447:24: expected restricted __be16 [usertype] i_flags
net/ipv4/ip_vti.c:447:24: got int
Since VTI_ISVTI is always used with ip_tunnel_parm->i_flags (which is __be16),
we can __force cast VTI_ISVTI to __be16 in header file.
Dan Carpenter [Wed, 11 Jun 2014 08:16:51 +0000 (11:16 +0300)]
drivers: net: davinci_cpdma: double free on error
We recently changed the kzalloc() to devm_kzalloc() so freeing "ctlr"
here could lead to a double free.
Fixes: e194312854ed ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().')
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller
Since the term eBPF is used anyway on mailing list discussions, let's
also document that in the main BPF documentation file and replace a
couple of occurrences with eBPF terminology to be more clear.
Eric Dumazet [Tue, 10 Jun 2014 13:43:01 +0000 (06:43 -0700)]
ipv4: fix a race in ip4_datagram_release_cb()
Alexey gave an AddressSanitizer[1] report that finally gave a good hint
at the origin of various problems already reported by Dormando
in the past [2]
Problem comes from the fact that UDP can have a lockless TX path, and
concurrent threads can manipulate sk_dst_cache, while another thread
is holding the socket lock and calls __sk_dst_set() in
ip4_datagram_release_cb() (this was added in linux-3.8)
It seems that all we need to do is to use sk_dst_check() and
sk_dst_set() so that all the writers hold same spinlock
(sk->sk_dst_lock) to prevent corruptions.
TCP stack does not need this protection, as all sk_dst_cache writers hold
the socket lock.
Daniel Borkmann [Tue, 10 Jun 2014 10:31:10 +0000 (12:31 +0200)]
net: filter: add test_bpf module under MAINTAINERS' networking section
Add lib/test_bpf.c entry to maintainers file under networking.
All changes were posted via netdev for review, so make sure
other people Cc it as well when they call get_maintainer.pl.
Jon Cooper [Wed, 11 Jun 2014 13:33:08 +0000 (14:33 +0100)]
sfc: PIO:Restrict to 64bit arch and use 64-bit writes.
Fixes:ee45fd92c739
("sfc: Use TX PIO for sufficiently small packets")
The linux net driver uses memcpy_toio() in order to copy into
the PIO buffers.
Even on a 64bit machine this causes 32bit accesses to a write-
combined memory region.
There are hardware limitations that mean that only 64bit
naturally aligned accesses are safe in all cases.
Because the region is write-combined memory, two 32bit accesses
may be coalesced to form a 64bit access that is not 64bit aligned.
Solution was to open-code the memory copy routines using pointers
and to only enable PIO for x86_64 machines.
Not tested on platforms other than x86_64 because this patch
disables the PIO feature on other platforms.
Compile-tested on x86 to ensure that works.
The WARN_ON_ONCE() code in the previous version of this patch
has been moved into the internal sfc debug driver as the
assertion was unnecessary in the upstream kernel code.
This bug fix applies to v3.13 and v3.14 stable branches.
David S. Miller [Wed, 11 Jun 2014 22:23:03 +0000 (15:23 -0700)]
Merge branch 'bridge-next'
Toshiaki Makita says:
====================
bridge: 802.1ad vlan protocol support
Currently bridge vlan filtering doesn't work fine with 802.1ad protocol.
Only if a bridge is configured without pvid, the bridge receives only
802.1ad tagged frames and no STP is used, it will work.
Otherwise:
- If pvid is configured, it can put only 802.1Q tags but cannot put 802.1ad
tags.
- If 802.1Q and 802.1ad tagged frames arrive in mixture, it applies filtering
regardless of their protocols.
- While an 802.1ad bridge should use another mac address for STP BPDU and
should forward customer's BPDU frames, it can't.
Thus, we can't properly handle frames once 802.1ad is used.
Handling 802.1ad is useful if we want to allow stacked vlans to be used,
e.g., guest VMs want to use vlan tags and the host also wants to segregate
guest's traffic from other guests' by vlan tags.
Here is the image describing how to configure a bridge to filter VMs traffic.
This patch set enables us to set vlan protocols per bridge.
This tries to implement a bridge like S-VLAN component in IEEE 802.1Q-2011
spec.
Note that there is another possible implementation that sets vlan protocols
per port. Some HW switches seem to take that approach.
However, I think the per-bridge approach is better, because:
- I think the typical usage of an 802.1ad bridge is segregating 802.1Q tagged
traffic (like what is described above), and this doesn't need the ability to
set protocols per port. Also, if a bridge has many ports and it supports
per-port setting, we might have to do much more extra configuration to
change the protocols of all ports.
- I assume that the main purpose of setting the protocol per port is to assign S-VID
according to C-VID, or to realize two logical bridges (one is an 802.1Q
filtering bridge and the other is an 802.1ad filtering bridge) in one bridge.
The former usually needs additional features such as vlan id mapping, and
is likely to make bridge's code complicated. If a user wants, such enhanced
features can be accomplished by a combination of multiple bridges, so it is
not absolutely necessary to implement these features in a bridge itself.
The latter is simply unnecessary because we can easily make two bridges of
which one is an 802.1Q bridge and the other is an 802.1ad bridge.
Here is an example of the enhanced feature that we can realize by using
multiple bridges and veth interfaces. This way is documented in
IEEE 802.1Q-2011 clause 15.4 (C-tagged service interface).
In this configuration, we can map C-VIDs to any S-VID.
For example;
C-VID 10 and 20 to S-VID 100
C-VID 30 to S-VID 110
This is achieved through the 802.1Q bridge that forwards C-tagged frames to
proper ports of the 802.1ad bridge.
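The example mapping above can be written down as a small lookup. This is only a sketch of the resulting tag stack, assuming the usual TPID values 0x88a8 for the S-tag and 0x8100 for the C-tag; stack_tags() is a hypothetical helper, not part of the patch set.

```python
TPID_8021AD = 0x88A8  # S-tag
TPID_8021Q = 0x8100   # C-tag

# C-VID to S-VID mapping from the example.
CVID_TO_SVID = {10: 100, 20: 100, 30: 110}

def stack_tags(c_vid: int):
    """Return the (TPID, VID) tags, outermost first, for a C-tagged frame."""
    s_vid = CVID_TO_SVID[c_vid]
    return [(TPID_8021AD, s_vid), (TPID_8021Q, c_vid)]

tags_for_20 = stack_tags(20)
tags_for_30 = stack_tags(30)
```

Several C-VIDs can share one S-VID, which is what the 802.1Q bridge achieves by forwarding them to the same port of the 802.1ad bridge.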
Changes:
v1 -> v2:
- Make the way to forward bridge group addresses more generic by introducing
new mask, group_fwd_mask_required.
RFC -> v1:
- Add S-TAG tx offload.
- Remove a fix around stacked vlan which has already been fixed.
- Take into account Bridge Group Addresses.
- Separate handling of protocol-mismatch from br_vlan_get_tag().
- Change the way to set vlan_proto from netlink to sysfs because no other
existing configuration per bridge can be set by netlink.
====================
Toshiaki Makita [Tue, 10 Jun 2014 11:59:25 +0000 (20:59 +0900)]
bridge: Support 802.1ad vlan filtering
This enables us to change the vlan protocol for vlan filtering.
We come to be able to filter frames on the basis of 802.1ad vlan tags
through a bridge.
This also changes br->group_addr if it has not been set by user.
This is needed for an 802.1ad bridge.
(See IEEE 802.1Q-2011 8.13.5.)
Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad
bridge can forward the Nearest Customer Bridge group addresses except
for br->group_addr, which should be passed to higher layer.
To change the vlan protocol, write a protocol in sysfs:
# echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol
Toshiaki Makita [Tue, 10 Jun 2014 11:59:24 +0000 (20:59 +0900)]
bridge: Prepare for forwarding another bridge group addresses
If a bridge is an 802.1ad bridge, it must forward another bridge group
addresses (the Nearest Customer Bridge group addresses).
(For details, see IEEE 802.1Q-2011 8.6.3.)
As user might not want group_fwd_mask to be modified by enabling 802.1ad,
introduce a new mask, group_fwd_mask_required, which indicates addresses
the bridge wants to forward. This will be set by enabling 802.1ad.
Toshiaki Makita [Tue, 10 Jun 2014 11:59:23 +0000 (20:59 +0900)]
bridge: Prepare for 802.1ad vlan filtering support
This enables a bridge to have vlan protocol information and allows vlan
tag manipulation (retrieve, insert and remove tags) according to the vlan
protocol.
Arnd Bergmann [Tue, 10 Jun 2014 08:34:36 +0000 (10:34 +0200)]
net: xen-netback: include linux/vmalloc.h again
commit e9ce7cb6b107 ("xen-netback: Factor queue-specific data into
queue struct") added a use of vzalloc/vfree to interface.c, but
removed the #include <linux/vmalloc.h> statement at the same time,
which causes this build error:
drivers/net/xen-netback/interface.c: In function 'xenvif_free':
drivers/net/xen-netback/interface.c:754:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
vfree(vif->queues);
^
cc1: some warnings being treated as errors
Using phy_drivers_register/_unregister functions is proper way to
handle multiple PHY drivers registration. For Realtek PHY drivers
module, it fixes incomplete current error-handlings up and adds
missed unregistration for the RTL8201CP driver.
net: sh_eth: Fix timing of RACT setting in sh_eth_rx()
This patch fixes an issue that we cannot use nfs rootfs correctly
on r8a7790 when the command below runs on a host PC.
$ sudo ping -f -l 8 $BOARD_IP_ADDR
Since the driver sets the RACT to 1 in the first while loop of
sh_eth_rx(), the controller accepts a next frame into the next RX
descriptor during the while loop. But the first while loop
doesn't allocate a next skb. So, this patch removes the RACT setting
in the first while loop of sh_eth_rx().
net: sh_eth: Fix receive packet "exceeded" condition in sh_eth_rx()
This patch fixes the packet "exceeded" condition in sh_eth_rx() when
RACT in an RX descriptor is not set and the "quota" is 0.
Otherwise, kernel panic happens because the "&n->poll_list" is deleted
twice in sh_eth_poll() which calls napi_complete() and net_rx_action().
net/core/filter.c: In function '__sk_run_filter':
net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Jon Paul Maloy [Mon, 9 Jun 2014 16:08:18 +0000 (11:08 -0500)]
tipc: fix potential bug in function tipc_backlog_rcv
In commit 4f4482dcd9a0606a30541ff165ddaca64748299b ("tipc: compensate
for double accounting in socket rcv buffer") we access 'truesize' of
a received buffer after it might have been released by the function
filter_rcv().
In this commit we correct this by reading the value of 'truesize' to
the stack before delivering the buffer to filter_rcv().
David S. Miller [Wed, 11 Jun 2014 21:59:21 +0000 (14:59 -0700)]
Merge branch 'mlx4'
Amir Vadai says:
====================
cpumask,net: affinity hint helper function
This patchset sets an affinity hint to steer IRQs onto the
same NUMA node as the one where the card resides, as discussed in
http://www.spinics.net/lists/netdev/msg271497.html
If the number of IRQs allocated is greater than the number of local NUMA cores, all
local cores will be used first, and the rest of the IRQs will be on a remote
NUMA node.
If there is no NUMA support, IRQs and cores will be mapped 1:1.
Since the utility function to calculate the mapping could be useful in other mq
drivers in the kernel, it was added to cpumask.[ch].
This patchset was tested and applied on top of net-next, since the first
consumer is a network device (mlx4_en), over commit fff1f59 ("mac802154:
llsec: add forgotten list_del_rcu in key removal").
====================
Yuval Atias [Mon, 9 Jun 2014 07:24:39 +0000 (10:24 +0300)]
net/mlx4_en: Use affinity hint
The “affinity hint” mechanism is used by the user space
daemon, irqbalance, to indicate a preferred CPU mask for IRQs.
irqbalance can use this hint to balance the IRQs between the
CPUs indicated by the mask.
We want the HCA to preferentially map the IRQs it uses to NUMA cores
close to it. To accomplish this, we use cpumask_set_cpu_local_first(), which
sets the affinity hint according to the following policy:
first it maps IRQs to “close” NUMA cores; if these are exhausted, the
remaining IRQs are mapped to “far” NUMA cores.
Amir Vadai [Mon, 9 Jun 2014 07:24:38 +0000 (10:24 +0300)]
cpumask: Utility function to set n'th cpu - local cpu first
This function sets the n'th cpu, taking local cpus first.
For example, on a 16-core server where the even-numbered cpus are local, it
yields the following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set
Currently this function will be used by multi-queue networking devices to
calculate the irq affinity mask, such that as many local cpus as
possible will be utilized to handle the mq device irqs.
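The "local first" ordering in the example above can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel implementation: `nth_cpu_local_first()` is a hypothetical helper that takes a plain 32-bit mask of local cpus instead of a NUMA node and a cpumask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the cpu chosen for index n when local cpus are taken first,
 * or -1 if n is out of range. Bit i of local_mask is set if cpu i is local.
 * Hypothetical helper; the kernel function works on struct cpumask.
 */
static int nth_cpu_local_first(unsigned int n, uint32_t local_mask,
			       unsigned int ncpus)
{
	unsigned int cpu, seen = 0;
	int pass;

	if (ncpus > 32)
		return -1;

	/* Pass 0 walks the local cpus, pass 1 walks the remaining ones. */
	for (pass = 0; pass < 2; pass++) {
		int want_local = (pass == 0);

		for (cpu = 0; cpu < ncpus; cpu++) {
			int is_local = (int)((local_mask >> cpu) & 1u);

			if (is_local != want_local)
				continue;
			if (seen == n)
				return (int)cpu;
			seen++;
		}
	}
	return -1;
}
```

With `local_mask = 0x5555` (even cpus local) and `ncpus = 16`, indexes 0 to 7 select the even cpus in order and indexes 8 to 15 select the odd ones, matching the table in the commit message.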
Linus Torvalds [Wed, 11 Jun 2014 21:26:21 +0000 (14:26 -0700)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management update from Zhang Rui:
"Specifics:
- fix a bug in Exynos thermal driver, which overwrites the hardware
trip point threshold when updating software trigger levels and
results in emergency shutdown. From: Tushar Behera.
- add thermal sensor support for Armada 375 and 38x SoCs. From
Ezequiel Garcia.
- add TMU (Thermal Management Unit) support for Exynos5260 and
Exynos5420 SoCs. From Naveen Krishna Chatradhi.
- add support for the additional digital temperature sensors in the
Intel SoCs like Bay Trail. From: Srinivas Pandruvada.
- a couple of cleanups and small fixes from Jingoo Han, Bartlomiej
Zolnierkiewicz, Geert Uytterhoeven, Jacob Pan, Paul Walmsley and
Lan,Tianyu"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (21 commits)
thermal: spear: remove unnecessary OOM messages
thermal: exynos: remove unnecessary OOM messages
thermal: rcar: remove unnecessary OOM messages
thermal: armada: Support Armada 380 SoC
thermal: armada: Support Armada 375 SoC
thermal: armada: Allow to specify an 'inverted readout' sensor
thermal: armada: Pass the platform_device to init_sensor()
thermal: armada: Add generic infrastructure to handle the sensor
thermal: armada: Add infrastructure to support generic formulas
thermal: armada: Rename armada_thermal_ops struct
thermal/intel_powerclamp: add newer cpu ids
thermal: rcar: Use pm_runtime_put() i.s.o. pm_runtime_put_sync()
thermal: samsung: Only update available threshold limits
Thermal/int3403: Fix thermal hysteresis unit conversion
thermal: Intel SoC DTS thermal
thermal: samsung: Add TMU support for Exynos5260 SoCs
thermal: samsung: Add TMU support for Exynos5420 SoCs
thermal: samsung: change base_common to more meaningful base_second
thermal: samsung: replace inten_ bit fields with intclr_
thermal: offer Samsung thermal support only when ARCH_EXYNOS is defined
...
Linus Torvalds [Wed, 11 Jun 2014 21:06:55 +0000 (14:06 -0700)]
Merge tag 'pwm/for-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm changes from Thierry Reding:
"The majority of these changes are cleanups and fixes across all
drivers. Redundant error messages are removed and more PWM
controllers set the .can_sleep flag to signal that they can't be used
in atomic context.
Support is added for the Broadcom Kona family of SoCs and the Intel
LPSS driver can now probe PCI devices in addition to ACPI devices.
Upon shutdown, the pwm-backlight driver will now power off the
backlight. It also uses the new descriptor-based GPIO API for more
concise GPIO handling.
A large chunk of these changes also converts platforms to use the
lookup mechanism rather than relying on the global number space to
reference PWM devices. This is largely in preparation for more
unification and cleanups in future patches. Eventually it will allow
the legacy PWM API to be removed"
* tag 'pwm/for-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (38 commits)
pwm: fsl-ftm: set pwm_chip can_sleep flag
pwm: ab8500: Fix wrong value shift for disable/enable PWM
pwm: samsung: do not set manual update bit in pwm_samsung_config
pwm: lp3943: Set pwm_chip can_sleep flag
pwm: atmel: set pwm_chip can_sleep flag
pwm: mxs: set pwm_chip can_sleep flag
pwm: tiehrpwm: inline accessor functions
pwm: tiehrpwm: don't build PM related functions when not needed
pwm-backlight: retrieve configured PWM period
leds: leds-pwm: retrieve configured PWM period
ARM: pxa: hx4700: use PWM_LOOKUP to initialize struct pwm_lookup
ARM: shmobile: armadillo: use PWM_LOOKUP to initialize struct pwm_lookup
ARM: OMAP3: Beagle: use PWM_LOOKUP to initialize struct pwm_lookup
pwm: modify PWM_LOOKUP to initialize all struct pwm_lookup members
ARM: pxa: hx4700: initialize all the struct pwm_lookup members
ARM: OMAP3: Beagle: initialize all the struct pwm_lookup members
pwm: renesas-tpu: remove unused struct tpu_pwm_platform_data
ARM: shmobile: armadillo: initialize all struct pwm_lookup members
pwm: add period and polarity to struct pwm_lookup
pwm: twl: Really disable twl6030 PWMs
...
Lukas Czerner [Wed, 11 Jun 2014 16:28:43 +0000 (12:28 -0400)]
dm thin: update discard_granularity to reflect the thin-pool blocksize
DM thinp already checks whether the discard_granularity of the data
device is a factor of the thin-pool block size. But when using the
dm-thin-pool's discard passdown support, DM thinp was not selecting the
max of the underlying data device's discard_granularity and the
thin-pool's block size.
Update set_discard_limits() to set discard_granularity to the max of
these values. This enables blkdev_issue_discard() to properly align the
discards that are sent to the DM thin device on a full block boundary.
As such each discard will now cover an entire DM thin-pool block and the
block will be reclaimed.
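A rough user-space sketch of why the larger granularity matters: once the granularity equals the pool block size, a discard range trimmed to that granularity always covers whole blocks. The helper names below are hypothetical, and the rounding is only an illustration, not the block layer's actual algorithm.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the larger of the data device's granularity and the pool block size (bytes). */
static uint64_t pick_discard_granularity(uint64_t dev_granularity,
					 uint64_t pool_block_size)
{
	return dev_granularity > pool_block_size ? dev_granularity
						 : pool_block_size;
}

/*
 * Shrink the byte range [*start, *end) to whole units of granularity.
 * Returns 1 if at least one full unit remains, 0 otherwise.
 * Assumes *start + granularity does not overflow.
 */
static int align_discard(uint64_t *start, uint64_t *end, uint64_t granularity)
{
	uint64_t s, e;

	if (granularity == 0)
		return 0;
	s = (*start + granularity - 1) / granularity * granularity; /* round up */
	e = *end / granularity * granularity;                       /* round down */
	if (s >= e)
		return 0;
	*start = s;
	*end = e;
	return 1;
}
```

For a 4 KiB device granularity and a 64 KiB pool block, the chosen granularity is 64 KiB, so a discard of bytes 1000 to 200000 shrinks to the three full blocks between 65536 and 196608.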
dm bio prison: implement per bucket locking in the dm_bio_prison hash table
Split the single per bio-prison lock by using per bucket locking. Per
bucket locking benefits both dm-thin and dm-cache targets by reducing
bio-prison lock contention.
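Per bucket locking can be sketched in user-space C as follows. This is a minimal illustration, assuming a fixed-size table and pthread mutexes in place of kernel locks; `struct prison`, `struct cell` and the function names are hypothetical and do not mirror the dm_bio_prison code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_BUCKETS 64	/* hypothetical table size, a power of two */

struct cell {
	unsigned long key;
	struct cell *next;
};

struct bucket {
	pthread_mutex_t lock;	/* protects only this bucket's chain */
	struct cell *head;
};

struct prison {
	struct bucket buckets[NR_BUCKETS];
};

static void prison_init(struct prison *p)
{
	size_t i;

	for (i = 0; i < NR_BUCKETS; i++) {
		pthread_mutex_init(&p->buckets[i].lock, NULL);
		p->buckets[i].head = NULL;
	}
}

/* Insert the cell unless its key is already held; returns 1 if inserted. */
static int prison_insert(struct prison *p, struct cell *c)
{
	struct bucket *b = &p->buckets[c->key & (NR_BUCKETS - 1)];
	struct cell *it;
	int inserted = 1;

	pthread_mutex_lock(&b->lock);	/* other buckets stay unlocked */
	for (it = b->head; it; it = it->next) {
		if (it->key == c->key) {
			inserted = 0;
			break;
		}
	}
	if (inserted) {
		c->next = b->head;
		b->head = c;
	}
	pthread_mutex_unlock(&b->lock);
	return inserted;
}
```

Two threads working on keys that hash to different buckets never wait on each other here, which is the contention reduction the commit message refers to.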
vhost-scsi: Include prot_bytes into expected data transfer length
This patch updates vhost_scsi_get_tag() to accept the combined
expected data transfer length + T10 PI bytes as the value passed
into target_submit_cmd().
This is required now that target-core logic in commit 14ef9200
expects to subtract se_cmd->prot_length from se_cmd->data_length.
Sagi Grimberg [Wed, 11 Jun 2014 09:09:59 +0000 (12:09 +0300)]
TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
In various areas of the code, it is assumed that
se_cmd->data_length describes pure data. In case
protection information exists over the wire
(the protect bits are on), the target core re-calculates
the data length from the CDB and the backend device
block size (instead of each transport peeking in the CDB).
Modify the loopback device to include protection information
in the transferred data length (like other scsi transports).
Sagi Grimberg [Wed, 11 Jun 2014 09:09:58 +0000 (12:09 +0300)]
libiscsi, iser: Adjust data_length to include protection information
In case protection information exists over the wire, the
iscsi header data length is required to include it.
Use protection-information-aware scsi helpers to set
the correct transfer length.
In order to avoid breakage, remove the iser transfer length
checks for each task, as they are not always true and
are somewhat redundant anyway.
Sagi Grimberg [Wed, 11 Jun 2014 09:09:57 +0000 (12:09 +0300)]
scsi_cmnd: Introduce scsi_transfer_length helper
In case protection information exists on the wire,
scsi transports should include it in the transfer
byte count (even if protection information does not
exist in the host memory space). This helper
computes the total transfer length from the scsi
command data length and protection attributes.
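The computation described above can be sketched in user-space C. This is an illustration, assuming one 8-byte T10 PI tuple per sector; `transfer_length()` and `enum prot_on_wire` are hypothetical names, and the real helper takes a scsi command rather than raw numbers.

```c
#include <assert.h>
#include <stdint.h>

#define PI_TUPLE_BYTES 8u	/* T10 PI: 8 bytes of protection info per sector */

/* Hypothetical stand-in for the command's protection attributes. */
enum prot_on_wire { PROT_NOT_ON_WIRE = 0, PROT_ON_WIRE = 1 };

/* Total bytes on the wire: data plus one PI tuple per sector when present. */
static uint32_t transfer_length(uint32_t data_len, uint32_t sector_size,
				enum prot_on_wire prot)
{
	if (prot == PROT_NOT_ON_WIRE || sector_size == 0)
		return data_len;
	return data_len + (data_len / sector_size) * PI_TUPLE_BYTES;
}
```

For a 4096-byte transfer on 512-byte sectors, eight tuples are added, giving 4160 bytes; without protection information on the wire the length stays at 4096.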