Brian King [Thu, 29 Jan 2015 21:54:40 +0000 (15:54 -0600)]
sd: Fix max transfer length for 4k disks
The following patch fixes an issue observed with 4k sector disks
where the max_hw_sectors attribute was getting set too large in
sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units
of SCSI logical blocks and queue_max_hw_sectors is in units of
512 byte blocks, on a 4k sector disk, every time we went through
sd_revalidate_disk, we were taking the current value of
queue_max_hw_sectors and increasing it by a factor of 8. Fix
this by only shifting sdkp->max_xfer_blocks.
Mike Christie [Wed, 28 Jan 2015 09:46:53 +0000 (03:46 -0600)]
scsi: fix device handler detach oops
This fixes a regression caused by commit 1d5203 ("scsi: handle more device
handler setup/teardown in common code").
The bug is that the alua detach() callout will try to access the
sddev->scsi_dh_data, but we have already set it to NULL. This patch
moves the clearing of that field to after detach() is called.
Eric Dumazet [Fri, 30 Jan 2015 05:35:05 +0000 (21:35 -0800)]
ipv4: tcp: get rid of ugly unicast_sock
In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock")
I tried to address contention on a socket lock, but the solution
I chose was horrible :
commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside
of TCP stack") addressed a selinux regression.
commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1")
took care of another regression.
commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression.
commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
was another shot in the dark.
Really, just use a proper socket per cpu, and remove the skb_orphan()
call, to re-enable flow control.
This solves a serious problem with FQ packet scheduler when used in
hostile environments, as we do not want to allocate a flow structure
for every RST packet sent in response to a spoofed packet.
NeilBrown [Sun, 1 Feb 2015 23:44:29 +0000 (10:44 +1100)]
md/raid5: fix another livelock caused by non-aligned writes.
If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.
As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full-page, a recontruct-write cycle
is not possible.
This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.
However since commit 67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will active, and those circumstances
never set STRIPE_PREREAD_ACTIVE.
So: in handle_stripe_dirtying, if neither rmw or rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE be set
after a suitable delay.
Linus Torvalds [Sun, 1 Feb 2015 21:20:47 +0000 (13:20 -0800)]
Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"One more week's worth of fixes. Worth pointing out here are:
- A patch fixing detaching of iommu registrations when a device is
removed -- earlier the ops pointer wasn't managed properly
- Another set of Renesas boards get the same GIC setup fixup as
others have in previous -rcs
- Serial port aliases fixups for sunxi. We did the same to tegra but
we caught that in time before the merge window due to more machines
being affected. Here it took longer for anyone to notice.
- A couple more DT tweaks on sunxi
- A follow-up patch for the mvebu coherency disabling in last -rc
batch"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
ARM: sunxi: dt: Fix aliases
ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline
ARM: dts: sun6i: ippo-q8h-v5: Fix serial0 alias
ARM: dts: sunxi: Fix usb-phy support for sun4i/sun5i
Linus Torvalds [Sun, 1 Feb 2015 20:23:32 +0000 (12:23 -0800)]
sched: don't cause task state changes in nested sleep debugging
Commit 8eb23b9f35aa ("sched: Debug nested sleeps") added code to report
on nested sleep conditions, which we generally want to avoid because the
inner sleeping operation can re-set the thread state to TASK_RUNNING,
but that will then cause the outer sleep loop not actually sleep when it
calls schedule.
However, that's actually valid traditional behavior, with the inner
sleep being some fairly rare case (like taking a sleeping lock that
normally doesn't actually need to sleep).
And the debug code would actually change the state of the task to
TASK_RUNNING internally, which makes that kind of traditional and
working code not work at all, because now the nested sleep doesn't just
sometimes cause the outer one to not block, but will cause it to happen
every time.
In particular, it will cause the cardbus kernel daemon (pccardd) to
basically busy-loop doing scheduling, converting a laptop into a heater,
as reported by Bruno Prémont. But there may be other legacy uses of
that nested sleep model in other drivers that are also likely to never
get converted to the new model.
This fixes both cases:
- don't set TASK_RUNNING when the nested condition happens (note: even
if WARN_ONCE() only _warns_ once, the return value isn't whether the
warning happened, but whether the condition for the warning was true.
So despite the warning only happening once, the "if (WARN_ON(..))"
would trigger for every nested sleep.
- in the cases where we knowingly disable the warning by using
"sched_annotate_sleep()", don't change the task state (that is used
for all core scheduling decisions), instead use '->task_state_change'
that is used for the debugging decision itself.
(Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
suggested change to use 'task_state_change' as part of the test)
Olof Johansson [Sun, 1 Feb 2015 16:51:12 +0000 (08:51 -0800)]
Merge tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Merge "Third Round of Renesas ARM Based SoC Fixes for v3.19" from Simon Horman:
* Instantiate GIC from C board code in legacy builds on r8a7790 and r8a73a4
* tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
Eric Dumazet [Fri, 30 Jan 2015 01:30:12 +0000 (17:30 -0800)]
net: sched: fix panic in rate estimators
Doing the following commands on a non idle network device
panics the box instantly, because cpu_bstats gets overwritten
by stats.
tc qdisc add dev eth0 root <your_favorite_qdisc>
... some traffic (one packet is enough) ...
tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc>
Lets play safe and not use an union : percpu 'pointers' are mostly read
anyway, and we have typically few qdiscs per host.
Signed-off-by: Eric Dumazet <[email protected]> Cc: John Fastabend <[email protected]> Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe") Signed-off-by: David S. Miller <[email protected]>
Haiyang Zhang [Thu, 29 Jan 2015 20:34:49 +0000 (12:34 -0800)]
hyperv: Fix the error processing in netvsc_send()
The existing code frees the skb in EAGAIN case, in which the skb will be
retried from upper layer and used again.
Also, the existing code doesn't free send buffer slot in error case, because
there is no completion message for unsent packets.
This patch fixes these problems.
(Please also include this patch for stable trees. Thanks!)
Also, one documentation update (da9063), so some devicetrees can now
be verified"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sh_mobile: terminate DMA reads properly
i2c: Only include slave support if selected
i2c: s3c2410: fix ABBA deadlock by keeping clock prepared
i2c: slave-eeprom: fix boundary check when using sysfs
i2c: st: Rename clock reference to something that exists
DT: i2c: Add devices handled by the da9063 MFD driver
Linus Torvalds [Sat, 31 Jan 2015 03:49:44 +0000 (19:49 -0800)]
Merge tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are two tiny patches, one fixing up the drivers/Kconfig file, and
one adding a MAINTAINERS entry for the UIO git tree"
* tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
drivers/Kconfig: remove duplicate entry for soc
MAINTAINERS: add git url entry for UIO
Linus Torvalds [Sat, 31 Jan 2015 03:44:56 +0000 (19:44 -0800)]
Merge tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
"Here are two tiny staging tree fixes. One for the nvec driver to
resolve a reported problem, and one to add a MAINTAINERS entry for the
Android drivers"
* tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
MAINTAINERS: add Android driver entries
staging: nvec: specify a platform-device base id
Linus Torvalds [Sat, 31 Jan 2015 03:35:35 +0000 (19:35 -0800)]
Merge tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes and quirk additions for 3.19-rc7.
All have been in linux-next for a while with no reported problems"
* tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: Add OTG PET device to TPL
usb-storage/SCSI: blacklist FUA on JMicron 152d:2566 USB-SATA controller
uas: Add no-report-opcodes quirk for Simpletech devices with id 4971:8017
storage: Revise/fix quirk for 04E6:000F SCM USB-SCSI converter
usb: phy: never defer probe in non-OF case
usb: dwc2: call dwc2_is_controller_alive() under spinlock
Software writes poison data into the descriptor bytes[15:8] and upon
receiving the interrupt, if those bytes are overwritten by the hardware with
the valid data, software also reads bytes[7:0] and executes receive/tx
completion logic.
If the CPU executes the above two reads in out of order fashion, then the
bytes[7:0] will have older data and causing the kernel panic. We have to
force the order of the reads and thus this patch introduces read memory
barrier between these reads.
David S. Miller [Sat, 31 Jan 2015 02:03:58 +0000 (18:03 -0800)]
Merge branch 'vlan_get_protocol'
Toshiaki Makita says:
====================
Fix checksum error when using stacked vlan
When I was testing 802.1ad, I found several drivers don't take into
account 802.1ad or multiple vlans when retrieving L3 (IP/IPv6) or
L4 (TCP/UDP) protocol for checksum offload.
It is mainly due to vlan_get_protocol(), which extracts ether type only
when it is tagged with single 802.1Q. When 802.1ad is used or there are
multiple vlans, it extracts vlan protocol and drivers cannot determine
which L3/L4 protocol is used.
Those drivers, most of which have IP_CSUM/IPV6_CSUM features, get L3/L4
header-offset by software, so it seems that their checksum offload works
with multiple vlans if we can parse protocols correctly.
(They know mac header length, and probably don't care about what is in it.)
And another thing, some of Intel's drivers seem to use skb->protocol where
vlan_get_protocol() is more suitable.
I tested that at least igb/igbvf on I350 works with this patch set.
Note:
We can hand a double tagged packet with CHECKSUM_PARTIAL to a HW driver
by creating a vlan device on a bridge device and enabling vlan_filtering
of the bridge with 802.1ad protocol.
====================
Toshiaki Makita [Thu, 29 Jan 2015 11:37:10 +0000 (20:37 +0900)]
ixgbevf: Fix checksum error when using stacked vlan
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
ixgbevf_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use first->protocol instead of skb->protocol to get the proper network
protocol.
Toshiaki Makita [Thu, 29 Jan 2015 11:37:09 +0000 (20:37 +0900)]
ixgbe: Fix checksum error when using stacked vlan
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
ixgbe_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use vlan_get_protocol() to get the proper network protocol.
Toshiaki Makita [Thu, 29 Jan 2015 11:37:08 +0000 (20:37 +0900)]
igbvf: Fix checksum error when using stacked vlan
When a skb has multiple vlans and it is CHECKSUM_PARTIAL,
igbvf_tx_csum() fails to get the network protocol and checksum related
descriptor fields are not configured correctly because skb->protocol
doesn't show the L3 protocol in this case.
Use vlan_get_protocol() to get the proper network protocol.
Toshiaki Makita [Thu, 29 Jan 2015 11:37:07 +0000 (20:37 +0900)]
net: Fix vlan_get_protocol for stacked vlan
vlan_get_protocol() could not get network protocol if a skb has a 802.1ad
vlan tag or multiple vlans, which caused incorrect checksum calculation
in several drivers.
Fix vlan_get_protocol() to retrieve network protocol instead of incorrect
vlan protocol.
As the logic is the same as skb_network_protocol(), create a common helper
function __vlan_get_protocol() and call it from existing functions.
net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param
When making use of RFC5061, section 4.2.4. for setting the primary IP
address, we're passing a wrong parameter header to param_type2af(),
resulting always in NULL being returned.
At this point, param.p points to a sctp_addip_param struct, containing
a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation
id. Followed by that, as also presented in RFC5061 section 4.2.4., comes
the actual sctp_addr_param, which also contains a sctp_paramhdr, but
this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that
param_type2af() can make use of. Since we already hold a pointer to
addr_param from previous line, just reuse it for param_type2af().
Pablo Neira [Thu, 29 Jan 2015 09:51:53 +0000 (10:51 +0100)]
netlink: fix wrong subscription bitmask to group mapping in
The subscription bitmask passed via struct sockaddr_nl is converted to
the group number when calling the netlink_bind() and netlink_unbind()
callbacks.
The conversion is however incorrect since bitmask (1 << 0) needs to be
mapped to group number 1. Note that you cannot specify the group number 0
(usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP
since this is rejected through -EINVAL.
This problem became noticeable since 97840cb ("netfilter: nfnetlink:
fix insufficient validation in nfnetlink_bind") when binding to bitmask
(1 << 0) in ctnetlink.
There is a race in the MIPS fork code which allows the child to get a
stale copy of parent MSA/FPU/DSP state that is active in hardware
registers when the fork() is called. This is because copy_thread() saves
the live register state into the child context only if the hardware is
currently in use, apparently on the assumption that the hardware state
cannot have been saved and disabled since the initial duplication of the
task_struct. However preemption is certainly possible during this
window.
An example sequence of events is as follows:
1) The parent userland process puts important data into saved floating
point registers ($f20-$f31), which are then dirty compared to the
process' stored context.
2) The parent process calls fork() which does a clone system call.
3) In the kernel, do_fork() -> copy_process() -> dup_task_struct() ->
arch_dup_task_struct() (which uses the weakly defined default
implementation). This duplicates the parent process' task context,
which includes a stale version of its FP context from when it was
last saved, probably some time before (1).
4) At some point before copy_process() calls copy_thread(), such as when
duplicating the memory map, the process is desceduled. Perhaps it is
preempted asynchronously, or perhaps it sleeps while blocked on a
mutex. The dirty FP state in the FP registers is saved to the parent
process' context and the FPU is disabled.
5) When the process is rescheduled again it continues copying state
until it gets to copy_thread(), which checks whether the FPU is in
use, so that it can copy that dirty state to the child process' task
context. Because of the deschedule however the FPU is not in use, so
the child process' context is left with stale FP context from the
last time the parent saved it (some time before (1)).
6) When the new child process is scheduled it reads the important data
from the saved floating point register, and ends up doing a NULL
pointer dereference as a result of the stale data.
This use of saved floating point registers across function calls can be
triggered fairly easily by explicitly using inline asm with a current
(MIPS R2) compiler, but is far more likely to happen unintentionally
with a MIPS R6 compiler where the FP registers are more likely to get
used as scratch registers for storing non-fp data.
It is easily fixed, in the same way that other architectures do it, by
overriding the implementation of arch_dup_task_struct() to sync the
dirty hardware state to the parent process' task context *prior* to
duplicating it, rather than copying straight to the child process' task
context in copy_thread(). Note, the FPU hardware is not disabled so the
parent process may continue executing with the live register context,
but now the child process is guaranteed to have an identical copy of it
at that point.
David Daney [Tue, 6 Jan 2015 18:42:23 +0000 (10:42 -0800)]
MIPS: Fix C0_Pagegrain[IEC] support.
The following commits:
5890f70f15c52d (MIPS: Use dedicated exception handler if CPU supports RI/XI exceptions) 6575b1d4173eae (MIPS: kernel: cpu-probe: Detect unique RI/XI exceptions)
break the kernel for *all* existing MIPS CPUs that implement the
CP0_PageGrain[IEC] bit. They cause the TLB exception handlers to be
generated without the legacy execute-inhibit handling, but never set
the CP0_PageGrain[IEC] bit to activate the use of dedicated exception
vectors for execute-inhibit exceptions. The result is that upon
detection of an execute-inhibit violation, we loop forever in the TLB
exception handlers instead of sending SIGSEGV to the task.
If we are generating TLB exception handlers expecting separate
vectors, we must also enable the CP0_PageGrain[IEC] feature.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
The series of IPoIB bug fixes that went into 3.19-rc1 introduce
regressions, and after trying to sort things out, we decided to revert
to 3.18's IPoIB driver and get things right for 3.20.
Linus Torvalds [Fri, 30 Jan 2015 22:34:55 +0000 (14:34 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also an event groups fix, two PMU driver
fixes and a CPU model variant addition"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Tighten (and fix) the grouping condition
perf/x86/intel: Add model number for Airmont
perf/rapl: Fix crash in rapl_scale()
perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization
perf probe: Fix probing kretprobes
perf symbols: Introduce 'for' method to iterate over the symbols with a given name
perf probe: Do not rely on map__load() filter to find symbols
perf symbols: Introduce method to iterate symbols ordered by name
perf symbols: Return the first entry with a given name in find_by_name method
perf annotate: Fix memory leaks in LOCK handling
perf annotate: Handle ins parsing failures
perf scripting perl: Force to use stdbool
perf evlist: Remove extraneous 'was' on error message
Commit 842dfc11ea9a ("MIPS: Fix build with binutils 2.24.51+") in v3.18
enabled -msoft-float and sprinkled ".set hardfloat" where necessary to
use FP instructions. However it missed enable_restore_fp_context() which
since v3.17 does a ctc1 with inline assembly, causing the following
assembler errors on Mentor's 2014.05 toolchain:
{standard input}: Assembler messages:
{standard input}:2913: Error: opcode not supported on this processor: mips32r2 (mips32r2) `ctc1 $2,$31'
scripts/Makefile.build:257: recipe for target 'arch/mips/kernel/traps.o' failed
Fix that to use the new write_32bit_cp1_register() macro so that ".set
hardfloat" is automatically added when -msoft-float is in use.
James Hogan [Fri, 30 Jan 2015 15:40:19 +0000 (15:40 +0000)]
MIPS: mipsregs.h: Add write_32bit_cp1_register()
Add a write_32bit_cp1_register() macro to compliment the
read_32bit_cp1_register() macro. This is to abstract whether .set
hardfloat needs to be used based on GAS_HAS_SET_HARDFLOAT.
The implementation of _read_32bit_cp1_register() .sets mips1 due to
failure of gas v2.19 to assemble cfc1 for Octeon (see commit 25c300030016 ("MIPS: Override assembler target architecture for
octeon.")). I haven't copied this over to _write_32bit_cp1_register() as
I'm uncertain whether it applies to ctc1 too, or whether anybody cares
about that version of binutils any longer.
Linus Torvalds [Fri, 30 Jan 2015 21:46:04 +0000 (13:46 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and UDF fix from Jan Kara:
"A fix for UDF to properly free preallocated blocks and a fix for quota
so that Q_GETQUOTA quotactl reports correct numbers for XFS filesystem
(and similarly Q_XGETQUOTA quotactl works properly for other
filesystems)"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units
udf: Release preallocation on last writeable close
Linus Torvalds [Fri, 30 Jan 2015 18:45:24 +0000 (10:45 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"The ARM changes are largish, but not too scary. And a simple fix for
x86 (bug introduced in 3.19)"
(Paolo sayus these are the "Final" fixes. We'll see).
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: check LAPIC presence when building apic_map
arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault
arm/arm64: KVM: Invalidate data cache on unmap
arm/arm64: KVM: Use set/way op trapping to track the state of the caches
Linus Torvalds [Fri, 30 Jan 2015 18:41:26 +0000 (10:41 -0800)]
Merge tag 'iommu-fixes-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two small fixes for the Tegra GART IOMMU driver:
- provide a .map_sg function for iommu_ops
- do not register Tegra GART driver as a workaround because of issues
with it when used from DRM code"
* tag 'iommu-fixes-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/tegra: gart: Provide default ->map_sg() callback
iommu/tegra: gart: Do not register with bus
Linus Torvalds [Fri, 30 Jan 2015 18:34:24 +0000 (10:34 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull intel and dp mst drm fixes from Dave Airlie:
"Intel had a few more fixes lined up and no point me sitting on them,
along with a DP MST fix from Rob for a race at undock + vt switch"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: fix fb-helper vs MST dangling connector ptrs (v2)
drm/i915: BDW Fix Halo PCI IDs marked as ULT.
drm/i915: Fix and clean BDW PCH identification
drm/i915: Only fence tiled region of object.
drm/i915: fix inconsistent brightness after resume
drm/i915: Init PPGTT before context enable
Wolfram Sang [Thu, 29 Jan 2015 17:33:16 +0000 (18:33 +0100)]
i2c: sh_mobile: terminate DMA reads properly
DMA read requests could miss proper termination, so two more bytes would
have been read via PIO overwriting the end of the buffer with wrong
data. Make DMA stop handling more readable while we are here.
Filip Brozovic [Fri, 30 Jan 2015 11:58:24 +0000 (12:58 +0100)]
ASoC: sgtl5000: Use shift mask when setting codec mode
Shift the I2S mode value by the necessary amount before writing the
registers. This makes Right Justified and PCM mode work in addition to
the default Left Justified mode.
Johan Hovold [Mon, 26 Jan 2015 11:02:46 +0000 (12:02 +0100)]
gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low
Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when setting the
gpio-line polarity.
Fixes: 0769746183ca ("gpiolib: add support for changing value polarity
in sysfs") Cc: stable <[email protected]> # v2.6.33 Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
Dave Airlie [Fri, 30 Jan 2015 03:32:24 +0000 (13:32 +1000)]
Merge tag 'drm-intel-fixes-2015-01-29' of git://anongit.freedesktop.org/drm-intel into drm-fixes
misc i915 fixes, mostly all stable material as well.
* tag 'drm-intel-fixes-2015-01-29' of git://anongit.freedesktop.org/drm-intel:
drm/i915: BDW Fix Halo PCI IDs marked as ULT.
drm/i915: Fix and clean BDW PCH identification
drm/i915: Only fence tiled region of object.
drm/i915: fix inconsistent brightness after resume
drm/i915: Init PPGTT before context enable
Rob Clark [Mon, 26 Jan 2015 15:11:08 +0000 (10:11 -0500)]
drm: fix fb-helper vs MST dangling connector ptrs (v2)
VT switch back/forth from console to xserver (for example) has potential
to go horribly wrong if a dynamic DP MST connector ends up in the saved
modeset that is restored when switching back to fbcon.
When removing a dynamic connector, don't forget to clean up the saved
state.
v1: original
v2: null out set->fb if no more connectors to avoid making i915 cranky
Julian Anastasov [Thu, 18 Dec 2014 20:41:23 +0000 (22:41 +0200)]
ipvs: rerouting to local clients is not needed anymore
commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP")
from 2.6.37 introduced ip_route_me_harder() call for responses to
local clients, so that we can provide valid rt_src after SNAT.
It was used by TCP to provide valid daddr for ip_send_reply().
After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to
ip_send_reply()." from 3.0 this rerouting is not needed anymore
and should be avoided, especially in LOCAL_IN.
Fixes 3.12.33 crash in xfrm reported by Florian Wiessner:
"3.12.33 - BUG xfrm_selector_match+0x25/0x2f6"
Linus Torvalds [Thu, 29 Jan 2015 23:41:27 +0000 (15:41 -0800)]
Merge tag 'dm-3.19-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"One stable fix for a dm-cache 3.19-rc6 regression and one stable fix
for dm-thin:
- fix DM cache metadata open/lookup error paths to properly use
ERR_PTR and IS_ERR (fixes: 3.19-rc6 "stable" commit 9b1cc9f251)
- fix DM thin-provisioning to disallow userspace from sending
messages to the thin-pool if the pool is in READ_ONLY or FAIL mode
since no metadata changes are allowed in these modes"
* tag 'dm-3.19-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: don't allow messages to be sent to a pool target in READ_ONLY or FAIL mode
dm cache: fix missing ERR_PTR returns and handling
Linus Torvalds [Thu, 29 Jan 2015 23:18:12 +0000 (15:18 -0800)]
Merge tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- Stable fix for a NFSv4.1 Oops on mount
- Stable fix for an O_DIRECT deadlock condition
- Fix an issue with submounted volumes and fake duplicate inode
numbers"
* tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix use of nfs_attr_use_mounted_on_fileid()
NFSv4.1: Fix an Oops in nfs41_walk_client_list
nfs: fix dio deadlock when O_DIRECT flag is flipped
Linus Torvalds [Thu, 29 Jan 2015 23:13:56 +0000 (15:13 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"These paches from Ilya finally squash a race condition with layered
images that he's been chasing for a while"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
rbd: fix rbd_dev_parent_get() when parent_overlap == 0
David S. Miller [Thu, 29 Jan 2015 23:08:27 +0000 (15:08 -0800)]
Merge branch 'arm-build-fixes'
Arnd Bergmann says:
====================
net: driver fixes from arm randconfig builds
These four patches are fallout from test builds on ARM. I have a
few more of them in my backlog but have not yet confirmed them
to still be valid.
The first three patches are about incomplete dependencies on
old drivers. One could backport them to the beginning of time
in theory, but there is little value since nobody would run into
these problems.
The final patch is one I had submitted before together with the
respective pcmcia patch but forgot to follow up on that. It's
still a valid but relatively theoretical bug, because the previous
behavior of the driver was just as broken as what we have in
mainline.
====================
A recent patch tried to work around a valid warning for the use of a
deprecated interface by blindly changing from the old
pcmcia_request_exclusive_irq() interface to pcmcia_request_irq().
This driver has an interrupt handler that is not currently aware
of shared interrupts, but can be easily converted to be.
At the moment, the driver reads the interrupt status register
repeatedly until it contains only zeroes in the interesting bits,
and handles each bit individually.
This patch adds the missing part of returning IRQ_NONE in case none
of the bits are set to start with, so we can move on to the next
interrupt source.
Signed-off-by: Arnd Bergmann <[email protected]> Fixes: 5f5316fcd08ef7 ("am2150: Update nmclan_cs.c to use update PCMCIA API") Signed-off-by: David S. Miller <[email protected]>
Arnd Bergmann [Wed, 28 Jan 2015 14:15:03 +0000 (15:15 +0100)]
net: lance,ni64: don't build for ARM
The ni65 and lance ethernet drivers manually program the ISA DMA
controller that is only available on x86 PCs and a few compatible
systems. Trying to build it on ARM results in this error:
ni65.c: In function 'ni65_probe1':
ni65.c:496:62: error: 'DMA1_STAT_REG' undeclared (first use in this function)
((inb(DMA1_STAT_REG) >> 4) & 0x0f)
^
ni65.c:496:62: note: each undeclared identifier is reported only once for each function it appears in
ni65.c:497:63: error: 'DMA2_STAT_REG' undeclared (first use in this function)
| (inb(DMA2_STAT_REG) & 0xf0);
The DMA1_STAT_REG and DMA2_STAT_REG registers are only defined for
alpha, mips, parisc, powerpc and x86, although it is not clear
which subarchitectures actually have them at the correct location.
This patch for now just disables it for ARM, to avoid randconfig
build errors. We could also decide to limit it to the set of
architectures on which it does compile, but that might look more
deliberate than guessing based on where the drivers build.
Arnd Bergmann [Wed, 28 Jan 2015 14:15:02 +0000 (15:15 +0100)]
net: wan: add missing virt_to_bus dependencies
The cosa driver is rather outdated and does not get built on most
platforms because it requires the ISA_DMA_API symbol. However
there are some ARM platforms that have ISA_DMA_API but no virt_to_bus,
and they get this build error when enabling the ltpc driver.
drivers/net/wan/cosa.c: In function 'tx_interrupt':
drivers/net/wan/cosa.c:1768:3: error: implicit declaration of function 'virt_to_bus'
unsigned long addr = virt_to_bus(cosa->txbuf);
^
The same problem exists for the Hostess SV-11 and Sealevel Systems 4021
drivers.
This adds another dependency in Kconfig to avoid that configuration.
Arnd Bergmann [Wed, 28 Jan 2015 14:15:01 +0000 (15:15 +0100)]
net: cs89x0: always build platform code if !HAS_IOPORT_MAP
The cs89x0 driver can either be built as an ISA driver or a platform
driver, the choice is controlled by the CS89x0_PLATFORM Kconfig
symbol. Building the ISA driver on a system that does not have
a way to map I/O ports fails with this error:
drivers/built-in.o: In function `cs89x0_ioport_probe.constprop.1':
:(.init.text+0x4794): undefined reference to `ioport_map'
:(.init.text+0x4830): undefined reference to `ioport_unmap'
This changes the Kconfig logic to take that option away and
always force building the platform variant of this driver if
CONFIG_HAS_IOPORT_MAP is not set. This is the only correct
choice in this case, and it avoids the build error.
Hemmo Nieminen [Thu, 15 Jan 2015 21:01:59 +0000 (23:01 +0200)]
MIPS: Fix kernel lockup or crash after CPU offline/online
As printk() invocation can cause e.g. a TLB miss, printk() cannot be
called before the exception handlers have been properly initialized.
This can happen e.g. when netconsole has been loaded as a kernel module
and the TLB table has been cleared when a CPU was offline.
Call cpu_report() in start_secondary() only after the exception handlers
have been initialized to fix this.
Without the patch the kernel will randomly either lockup or crash
after a CPU is onlined and the console driver is a module.
Florian Westphal [Wed, 28 Jan 2015 09:56:04 +0000 (10:56 +0100)]
ppp: deflate: never return len larger than output buffer
When we've run out of space in the output buffer to store more data, we
will call zlib_deflate with a NULL output buffer until we've consumed
remaining input.
When this happens, olen contains the size the output buffer would have
consumed iff we'd have had enough room.
This can later cause skb_over_panic when ppp_generic skb_put()s
the returned length.
Aaro Koskinen [Thu, 15 Jan 2015 21:01:58 +0000 (23:01 +0200)]
MIPS: OCTEON: fix kernel crash when offlining a CPU
octeon_cpu_disable() will unconditionally enable interrupts when called.
We can assume that the routine is always called with interrupts disabled,
so just delete the incorrect local_irq_disable/enable().
The patch fixes the following crash when offlining a CPU:
Marc Zyngier [Mon, 5 Jan 2015 21:13:24 +0000 (21:13 +0000)]
arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault
When handling a fault in stage-2, we need to resync I$ and D$, just
to be sure we don't leave any old cache line behind.
That's very good, except that we do so using the *user* address.
Under heavy load (swapping like crazy), we may end up in a situation
where the page gets mapped in stage-2 while being unmapped from
userspace by another CPU.
At that point, the DC/IC instructions can generate a fault, which
we handle with kvm->mmu_lock held. The box quickly deadlocks, user
is unhappy.
Instead, perform this invalidation through the kernel mapping,
which is guaranteed to be present. The box is much happier, and so
am I.
Marc Zyngier [Fri, 19 Dec 2014 16:48:06 +0000 (16:48 +0000)]
arm/arm64: KVM: Invalidate data cache on unmap
Let's assume a guest has created an uncached mapping, and written
to that page. Let's also assume that the host uses a cache-coherent
IO subsystem. Let's finally assume that the host is under memory
pressure and starts to swap things out.
Before this "uncached" page is evicted, we need to make sure
we invalidate potential speculated, clean cache lines that are
sitting there, or the IO subsystem is going to swap out the
cached view, loosing the data that has been written directly
into memory.
Marc Zyngier [Fri, 19 Dec 2014 16:05:31 +0000 (16:05 +0000)]
arm/arm64: KVM: Use set/way op trapping to track the state of the caches
Trying to emulate the behaviour of set/way cache ops is fairly
pointless, as there are too many ways we can end-up missing stuff.
Also, there is some system caches out there that simply ignore
set/way operations.
So instead of trying to implement them, let's convert it to VA ops,
and use them as a way to re-enable the trapping of VM ops. That way,
we can detect the point when the MMU/caches are turned off, and do
a full VM flush (which is what the guest was trying to do anyway).
This allows a 32bit zImage to boot on the APM thingy, and will
probably help bootloaders in general.
David S. Miller [Thu, 29 Jan 2015 22:20:16 +0000 (14:20 -0800)]
Merge branch 'netns'
Nicolas Dichtel says:
====================
netns: audit netdevice creation with IFLA_NET_NS_[PID|FD]
When one of these attributes is set, the netdevice is created into the netns
pointed by IFLA_NET_NS_[PID|FD] (see the call to rtnl_create_link() in
rtnl_newlink()). Let's call this netns the dest_net. After this creation, if the
newlink handler exists, it is called with a netns argument that points to the
netns where the netlink message has been received (called src_net in the code)
which is the link netns.
Hence, with one of these attributes, it's possible to create a x-netns
netdevice.
Here is the result of my code review:
- all ip tunnels (sit, ipip, ip6_tunnels, gre[tap][v6], ip_vti[6]) does not
really allows to use this feature: the netdevice is created in the dest_net
and the src_net is completely ignored in the newlink handler.
- VLAN properly handles this x-netns creation.
- bridge ignores src_net, which seems fine (NETIF_F_NETNS_LOCAL is set).
- CAIF subsystem is not clear for me (I don't know how it works), but it seems
to wrongly use src_net. Patch #1 tries to fix this, but it was done only by
code review (and only compile-tested), so please carefully review it. I may
miss something.
- HSR subsystem uses src_net to parse IFLA_HSR_SLAVE[1|2], but the netdevice has
the flag NETIF_F_NETNS_LOCAL, so the question is: does this netdevice really
supports x-netns? If not, the newlink handler should use the dest_net instead
of src_net, I can provide the patch.
- ieee802154 uses also src_net and does not have NETIF_F_NETNS_LOCAL. Same
question: does this netdevice really supports x-netns?
- bonding ignores src_net and flag NETIF_F_NETNS_LOCAL is set, ie x-netns is not
supported. Fine.
- CAN does not support rtnl/newlink, ok.
- ipvlan uses src_net and does not have NETIF_F_NETNS_LOCAL. After looking at
the code, it seems that this drivers support x-netns. Am I right?
- macvlan/macvtap uses src_net and seems to have x-netns support.
- team ignores src_net and has the flag NETIF_F_NETNS_LOCAL, ie x-netns is not
supported. Ok.
- veth uses src_net and have x-netns support ;-) Ok.
- VXLAN didn't properly handle this. The link netns (vxlan->net) is the src_net
and not dest_net (see patch #2). Note that it was already possible to create a
x-netns vxlan before the commit f01ec1c017de ("vxlan: add x-netns support")
but the nedevice remains broken.
To summarize:
- CAIF patch must be carefully reviewed
- for HSR, ieee802154, ipvlan: is x-netns supported?
====================
Nicolas Dichtel [Mon, 26 Jan 2015 21:28:14 +0000 (22:28 +0100)]
vxlan: setup the right link netns in newlink hdlr
Rename the netns to src_net to avoid confusion with the netns where the
interface stands. The user may specify IFLA_NET_NS_[PID|FD] to create
a x-netns netndevice: IFLA_NET_NS_[PID|FD] points to the netns where the
netdevice stands and src_net to the link netns.
Note that before commit f01ec1c017de ("vxlan: add x-netns support"), it was
possible to create a x-netns vxlan netdevice, but the netdevice was not
operational.
Fixes: f01ec1c017de ("vxlan: add x-netns support") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Nicolas Dichtel [Mon, 26 Jan 2015 21:28:13 +0000 (22:28 +0100)]
caif: remove wrong dev_net_set() call
src_net points to the netns where the netlink message has been received. This
netns may be different from the netns where the interface is created (because
the user may add IFLA_NET_NS_[PID|FD]). In this case, src_net is the link netns.
It seems wrong to override the netns in the newlink() handler because if it
was not already src_net, it means that the user explicitly asks to create the
netdevice in another netns.
CC: Sjur Brændeland <[email protected]> CC: Dmitry Tarnyagin <[email protected]> Fixes: 8391c4aab1aa ("caif: Bugfixes in CAIF netdevice for close and flow control") Fixes: c41254006377 ("caif-hsi: Add rtnl support") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
karl beldan [Thu, 29 Jan 2015 10:10:22 +0000 (11:10 +0100)]
lib/checksum.c: fix build for generic csum_tcpudp_nofold
Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.
Linus Torvalds [Thu, 29 Jan 2015 19:48:36 +0000 (11:48 -0800)]
Merge tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This batch ended up being larger than wished, but there is nothing to
worry too much there.
Most of commits are for ASoC, a compress NULL dereference fix, a fix
for probe error handling, and the rest are device-specific fixes. In
addition, we have a fix for a long-standing but of seq-dummy driver,
which just cuts off the buggy part in the end"
* tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq-dummy: remove deadlock-causing events on close
ASoC: omap-mcbsp: Correct CBM_CFS dai format configuration
ASoC: soc-compress.c: fix NULL dereference
ASoC: rt286: set the same format for dac and adc
ASoC: wm8904: fix runtime warning
ASoC: simple-card: Fix crash in asoc_simple_card_unref()
ASoC: fsl: imx-wm8962: Set the card owner field
ASoC: pcm512x: Fix DSP program selection
ASoC: rt5677: Modify the behavior that updates the PLL parameter.
ASoC: fsl_ssi: Fix irq error check
ASoC: rockchip: i2s: applys rate symmetry for CPU DAI
ASoC: Intel: Add NULL checks for the stream pointer
ASoC: wm8960: Fix capture sample rate from 11250 to 11025
ASoC: adi: Add missing return statement.
ASoC: Intel: Don't change offset of block allocator during fixed allocate
ASoC: ts3a227e: Check and report jack status at probe
ASoC: fsl_esai: Fix incorrect xDC field width of xCCR registers
Linus Torvalds [Thu, 29 Jan 2015 19:43:43 +0000 (11:43 -0800)]
Merge tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull final pin control fix from Linus Walleij:
"A late pin control fix for the v3.19 series: The AT91 gpio controller
would miss wakeup events, this single fix make it work properly"
[ "Final"? Yeah, I'll believe that once I've actually released 3.19 ;) - Linus ]
* tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: at91: allow to have disabled gpio bank
Linus Torvalds [Thu, 29 Jan 2015 19:15:17 +0000 (11:15 -0800)]
vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS
The stack guard page error case has long incorrectly caused a SIGBUS
rather than a SIGSEGV, but nobody actually noticed until commit fee7e49d4514 ("mm: propagate error from stack expansion even for guard
page") because that error case was never actually triggered in any
normal situations.
Now that we actually report the error, people noticed the wrong signal
that resulted. So far, only the test suite of libsigsegv seems to have
actually cared, but there are real applications that use libsigsegv, so
let's not wait for any of those to break.
Laurent Pinchart [Fri, 23 Jan 2015 14:21:49 +0000 (16:21 +0200)]
arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
Commit 4bb25789ed28228a ("arm: dma-mapping: plumb our iommu mapping ops
into arch_setup_dma_ops") moved the setting of the DMA operations from
arm_iommu_attach_device() to arch_setup_dma_ops() where the DMA
operations to be used are selected based on whether the device is
connected to an IOMMU. However, the IOMMU detection scheme requires the
IOMMU driver to be ported to the new IOMMU of_xlate API. As no driver
has been ported yet, this effectively breaks all IOMMU ARM users that
depend on the IOMMU being handled transparently by the DMA mapping API.
Fix this by restoring the setting of DMA IOMMU ops in
arm_iommu_attach_device() and splitting the rest of the function into a
new internal __arm_iommu_attach_device() function, called by
arch_setup_dma_ops().
Linus Torvalds [Thu, 29 Jan 2015 18:51:32 +0000 (10:51 -0800)]
vm: add VM_FAULT_SIGSEGV handling support
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.
That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.
However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.
To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.
This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.
Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Ming Lei [Thu, 29 Jan 2015 12:17:27 +0000 (20:17 +0800)]
blk-mq: release mq's kobjects in blk_release_queue()
The kobject memory inside blk-mq hctx/ctx shouldn't have been freed
before the kobject is released because driver core can access it freely
before its release.
We can't do that in all ctx/hctx/mq_kobj's release handler because
it can be run before blk_cleanup_queue().
Given mq_kobj shouldn't have been introduced, this patch simply moves
mq's release into blk_release_queue().
Arnd Bergmann [Wed, 28 Jan 2015 16:58:33 +0000 (17:58 +0100)]
ARM: 8298/1: ARM_KERNMEM_PERMS only works with MMU enabled
The recently added ARM_KERNMEM_PERMS feature works by manipulating
the kernel page tables, which obviously requires an MMU. Trying
to enable this feature when the MMU is disabled results in a lot
of compile errors in mm/init.c, so let's add a Kconfig dependency
to avoid that case.
Nicolas Pitre [Tue, 27 Jan 2015 15:10:42 +0000 (16:10 +0100)]
ARM: 8294/1: ATAG_DTB_COMPAT: remove the DT workspace's hardcoded 64KB size
There is currently a hardcoded limit of 64KB for the DTB to live in and
be extended with ATAG info. Some DTBs have outgrown that limit:
$ du -b arch/arm/boot/dts/omap3-n900.dtb
70212 arch/arm/boot/dts/omap3-n900.dtb
Furthermore, the actual size passed to atags_to_fdt() included the stack
size which is obviously wrong.
The initial DTB size is known, so use it to size the allocated workspace
with a 50% growth assumption and relocate the temporary stack above that.
This is also clamped to 32KB min / 1MB max for robustness against bad
DTB data.
Will Deacon [Fri, 16 Jan 2015 17:01:43 +0000 (18:01 +0100)]
ARM: 8288/1: dma-mapping: don't detach devices without an IOMMU during teardown
When tearing down the DMA ops for a device via of_dma_deconfigure, we
unconditionally detach the device from its IOMMU domain. For devices
that aren't actually behind an IOMMU, this produces a "Not attached"
warning message on the console.
This patch changes the teardown code so that we don't detach from the
IOMMU domain when there isn't an IOMMU dma mapping to start with.
Magnus Damm [Thu, 29 Jan 2015 07:25:32 +0000 (16:25 +0900)]
ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
As of commit 9a1091ef0017c40a ("irqchip: gic: Support hierarchy irq
domain."), the Lager legacy board support is known to be broken.
The IRQ numbers of the GIC are now virtual, and no longer match the
hardcoded hardware IRQ numbers in the legacy platform board code.
To fix this issue specific to non-multiplatform r8a7790 and Lager:
1) Instantiate the GIC from platform board code and also
2) Skip over the DT arch timer as well as
3) Force delay setup based on DT CPU frequency
With these 3 fixes in place interrupts on Lager are now unbroken.
Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to
him for the initial work.
Eric Dumazet [Wed, 28 Jan 2015 13:47:11 +0000 (05:47 -0800)]
tcp: ipv4: initialize unicast_sock sk_pacing_rate
When I added sk_pacing_rate field, I forgot to initialize its value
in the per cpu unicast_sock used in ip_send_unicast_reply()
This means that for sch_fq users, RST packets, or ACK packets sent
on behalf of TIME_WAIT sockets might be sent to slowly or even dropped
once we reach the per flow limit.
Signed-off-by: Eric Dumazet <[email protected]> Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing") Signed-off-by: David S. Miller <[email protected]>
karl beldan [Wed, 28 Jan 2015 09:58:11 +0000 (10:58 +0100)]
lib/checksum.c: fix carry in csum_tcpudp_nofold
The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.
This patch avoids calling rtnl_notify if the device ndo_bridge_getlink
handler does not return any bytes in the skb.
Alternately, the skb->len check can be moved inside rtnl_notify.
For the bridge vlan case described in 92081, there is also a fix needed
in bridge driver to generate a proper notification. Will fix that in
subsequent patch.
David S. Miller [Thu, 29 Jan 2015 06:19:09 +0000 (22:19 -0800)]
Merge branch 'tcp_stretch_acks'
Neal Cardwell says:
====================
fix stretch ACK bugs in TCP CUBIC and Reno
This patch series fixes the TCP CUBIC and Reno congestion control
modules to properly handle stretch ACKs in their respective additive
increase modes, and in the transitions from slow start to additive
increase.
This finishes the project started by commit 9f9843a751d0a2057 ("tcp:
properly handle stretch acks in slow start"), which fixed behavior for
TCP congestion control when handling stretch ACKs in slow start mode.
Motivation: In the Jan 2015 netdev thread 'BW regression after "tcp:
refine TSO autosizing"', Eyal Perry documented a regression that Eric
Dumazet determined was caused by improper handling of TCP stretch
ACKs.
Background: LRO, GRO, delayed ACKs, and middleboxes can cause "stretch
ACKs" that cover more than the RFC-specified maximum of 2
packets. These stretch ACKs can cause serious performance shortfalls
in common congestion control algorithms, like Reno and CUBIC, which
were designed and tuned years ago with receiver hosts that were not
using LRO or GRO, and were instead ACKing every other packet.
Testing: at Google we have been using this approach for handling
stretch ACKs for CUBIC datacenter and Internet traffic for several
years, with good results.
v2:
* fixed return type of tcp_slow_start() to be u32 instead of int
====================
Neal Cardwell [Thu, 29 Jan 2015 01:01:39 +0000 (20:01 -0500)]
tcp: fix timing issue in CUBIC slope calculation
This patch fixes a bug in CUBIC that causes cwnd to increase slightly
too slowly when multiple ACKs arrive in the same jiffy.
If cwnd is supposed to increase at a rate of more than once per jiffy,
then CUBIC was sometimes too slow. Because the bic_target is
calculated for a future point in time, calculated with time in
jiffies, the cwnd can increase over the course of the jiffy while the
bic_target calculated as the proper CUBIC cwnd at time
t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only
increases on jiffy tick boundaries.
So since the cnt is set to:
ca->cnt = cwnd / (bic_target - cwnd);
as cwnd increases but bic_target does not increase due to jiffy
granularity, the cnt becomes too large, causing cwnd to increase
too slowly.
For example:
- suppose at the beginning of a jiffy, cwnd=40, bic_target=44
- so CUBIC sets:
ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10
- suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai()
increases cwnd to 41
- so CUBIC sets:
ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13
So now CUBIC will wait for 13 packets to be ACKed before increasing
cwnd to 42, insted of 10 as it should.
The fix is to avoid adjusting the slope (determined by ca->cnt)
multiple times within a jiffy, and instead skip to compute the Reno
cwnd, the "TCP friendliness" code path.
Neal Cardwell [Thu, 29 Jan 2015 01:01:38 +0000 (20:01 -0500)]
tcp: fix stretch ACK bugs in CUBIC
Change CUBIC to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().
In addition, because we are now precisely accounting for stretch ACKs,
including delayed ACKs, we can now remove the delayed ACK tracking and
estimation code that tracked recent delayed ACK behavior in
ca->delayed_ack.