]> Git Repo - linux.git/log
linux.git
10 years agogue: Call remcsum_adjust
Tom Herbert [Tue, 25 Nov 2014 19:21:20 +0000 (11:21 -0800)]
gue: Call remcsum_adjust

Change remote checksum offload to call remcsum_adjust. This also
eliminates the optimization to skip an IP header as part of the
adjustment (really does not seem to be much of a win).

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet: Add remcsum_adjust as common function for remote checksum offload
Tom Herbert [Tue, 25 Nov 2014 19:21:19 +0000 (11:21 -0800)]
net: Add remcsum_adjust as common function for remote checksum offload

This function does the work to update a checksum field as part of
remote checksum offload.

remcsum_adjust does the following:

1) Subtract out the calculated checksum from the beginning of the
   packet (ptr arg) to the start offset.
2) Adjust the checksum field indicated by offset based on the modified
   checksum value from above step.
3) Return the difference in the old checksum field value and the
   new one. The caller will use this to update skb->csum and NAPI csum.

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agopkt_sched: fq: increase max delay from 125 ms to one second
Eric Dumazet [Tue, 25 Nov 2014 16:57:29 +0000 (08:57 -0800)]
pkt_sched: fq: increase max delay from 125 ms to one second

FQ/pacing has a clamp of delay of 125 ms, to avoid some possible harm.

It turns out this delay is too small to allow pacing low rates :
Some ISP setup very aggressive policers as low as 16kbit.

Now TCP stack has spurious rtx prevention, it seems safe to increase
this fixed parameter, without adding a qdisc attribute.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Yang Yingliang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet/mlx4_core: Limit count field to 24 bits in qp_alloc_res
Jack Morgenstein [Tue, 25 Nov 2014 09:54:31 +0000 (11:54 +0200)]
net/mlx4_core: Limit count field to 24 bits in qp_alloc_res

Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.

Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.

As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.

Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests"
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoMerge branch 'bcm_sf2'
David S. Miller [Wed, 26 Nov 2014 17:04:14 +0000 (12:04 -0500)]
Merge branch 'bcm_sf2'

Florian Fainelli says:

====================
net: dsa: bcm_sf2: misc bugfixes

This patch series contains two bug fixes:

- first patch fixes an issue on the error path of the driver where we could
  have left some of our registers mapped

- second patch enforces the use of a software reset of the switch to guarantee
  the HW is in a consistent state prior to software initialization
====================

Signed-off-by: David S. Miller <[email protected]>
10 years agonet: dsa: bcm_sf2: reset switch prior to initialization
Florian Fainelli [Wed, 26 Nov 2014 02:08:49 +0000 (18:08 -0800)]
net: dsa: bcm_sf2: reset switch prior to initialization

Our boot agent may have left the switch in an certain configuration
state, make sure we issue a software reset prior to configuring the
switch in order to ensure the HW is in a consistent state, in particular
transmit queues and internal buffers.

Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet: dsa: bcm_sf2: fix unmapping registers in case of errors
Florian Fainelli [Wed, 26 Nov 2014 02:08:48 +0000 (18:08 -0800)]
net: dsa: bcm_sf2: fix unmapping registers in case of errors

In case we fail to ioremap() one of our registers, we would be leaking
existing mappings, unwind those accordingly on errors.

Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agokvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
Ard Biesheuvel [Mon, 10 Nov 2014 08:33:56 +0000 (09:33 +0100)]
kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()

This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.

The problem being addressed by the patch above was that some ARM code
based the memory mapping attributes of a pfn on the return value of
kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
be mapped as device memory.

However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
and the existing non-ARM users were already using it in a way which
suggests that its name should probably have been 'kvm_is_reserved_pfn'
from the beginning, e.g., whether or not to call get_page/put_page on
it etc. This means that returning false for the zero page is a mistake
and the patch above should be reverted.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
10 years agoarm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()
Ard Biesheuvel [Mon, 10 Nov 2014 08:33:55 +0000 (09:33 +0100)]
arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()

Instead of using kvm_is_mmio_pfn() to decide whether a host region
should be stage 2 mapped with device attributes, add a new static
function kvm_is_device_pfn() that disregards RAM pages with the
reserved bit set, as those should usually not be mapped as device
memory.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
10 years agoarm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()
Christoffer Dall [Thu, 6 Nov 2014 11:47:39 +0000 (11:47 +0000)]
arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()

If we detect another vCPU is running we just exit and return 0 as if we
succesfully created the VGIC, but the VGIC wouldn't actual be created.

This shouldn't break in-kernel behavior because the kernel will not
observe the failed the attempt to create the VGIC, but userspace could
be rightfully confused.

Cc: Andre Przywara <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
10 years agoarm64: KVM: Handle traps of ICC_SRE_EL1 as RAZ/WI
Christoffer Dall [Wed, 19 Nov 2014 11:23:54 +0000 (11:23 +0000)]
arm64: KVM: Handle traps of ICC_SRE_EL1 as RAZ/WI

When running on a system with a GICv3, we currenly don't allow the guest
to access the system register interface of the GICv3.  We do this by
clearing the ICC_SRE_EL2.Enable, which causes all guest accesses to
ICC_SRE_EL1 to trap to EL2 and causes all guest accesses to other ICC_
registers to cause an undefined exception in the guest.

However, we currently don't handle the trap of guest accesses to
ICC_SRE_EL1 and will spill out a warning.  The trap just needs to handle
the access as RAZ/WI, and a guest that tries to prod this register and
set ICC_SRE_EL1.SRE=1, must read back the value (which Linux already
does) to see if it succeeded, and will thus observe that ICC_SRE_EL1.SRE
was not set.

Add the simple trap handler in the sorted table of the system registers.

Signed-off-by: Christoffer Dall <[email protected]>
[ardb: added cp15 handling]
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
10 years agoarm64: KVM: fix unmapping with 48-bit VAs
Mark Rutland [Tue, 28 Oct 2014 19:36:45 +0000 (19:36 +0000)]
arm64: KVM: fix unmapping with 48-bit VAs

Currently if using a 48-bit VA, tearing down the hyp page tables (which
can happen in the absence of a GICH or GICV resource) results in the
rather nasty splat below, evidently becasue we access a table that
doesn't actually exist.

Commit 38f791a4e499792e (arm64: KVM: Implement 48 VA support for KVM EL2
and Stage-2) added a pgd_none check to __create_hyp_mappings to account
for the additional level of tables, but didn't add a corresponding check
to unmap_range, and this seems to be the source of the problem.

This patch adds the missing pgd_none check, ensuring we don't try to
access tables that don't exist.

Original splat below:

kvm [1]: Using HYP init bounce page @83fe94a000
kvm [1]: Cannot obtain GICH resource
Unable to handle kernel paging request at virtual address ffff7f7fff000000
pgd = ffff800000770000
[ffff7f7fff000000] *pgd=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89
task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000
PC is at unmap_range+0x120/0x580
LR is at free_hyp_pgds+0xac/0xe4
pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045
sp : ffff8003eb45fbf0
x29: ffff8003eb45fbf0 x28: ffff800000736000
x27: ffff800000735000 x26: ffff7f7fff000000
x25: 0000000040000000 x24: ffff8000006f5000
x23: 0000000000000000 x22: 0000007fffffffff
x21: 0000800000000000 x20: 0000008000000000
x19: 0000000000000000 x18: ffff800000648000
x17: ffff800000537228 x16: 0000000000000000
x15: 000000000000001f x14: 0000000000000000
x13: 0000000000000001 x12: 0000000000000020
x11: 0000000000000062 x10: 0000000000000006
x9 : 0000000000000000 x8 : 0000000000000063
x7 : 0000000000000018 x6 : 00000003ff000000
x5 : ffff800000744188 x4 : 0000000000000001
x3 : 0000000040000000 x2 : ffff800000000000
x1 : 0000007fffffffff x0 : 000000003fffffff

Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058)
Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000)
fbe0:                                     eb45fcb0 ffff8003 0009cad8 ffff8000
fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80
fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000
fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080
fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003
fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003
fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000
fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000
fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000
fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000
fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002
fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000
fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000
fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000
fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000
fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000
fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000
fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000
fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000
fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000
fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000
ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call trace:
[<ffff80000009b768>] unmap_range+0x120/0x580
[<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4
[<ffff80000009b374>] kvm_arch_init+0x268/0x44c
[<ffff80000009606c>] kvm_init+0x24/0x260
[<ffff80000009a338>] arm_init+0x18/0x24
[<ffff8000000814bc>] do_one_initcall+0x88/0x1a0
[<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8
[<ffff8000004f8ee4>] kernel_init+0x10/0xd4
Code: 8b000263 92628479 d1000720 eb01001f (f9400340)
---[ end trace 3bc230562e926fa4 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jungseok Lee <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
10 years agoufs: fix NULL dereference when no regulators are defined
Akinobu Mita [Tue, 18 Nov 2014 14:02:46 +0000 (23:02 +0900)]
ufs: fix NULL dereference when no regulators are defined

If no voltage supply regulators are defined for the UFS devices (assumed
they are always-on), ufshcd_config_vreg_load() can be called on
suspend/resume paths with vreg == NULL as hba->vreg_info.vcc* equal to
NULL, and it causes NULL pointer dereference.

This fixes it by making ufshcd_config_vreg_{h,l}pm noop when no regulators
are defined.

Signed-off-by: Akinobu Mita <[email protected]>
Reviewed-by: Subhash Jadavani <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
10 years agoufs: ensure clk gating work is finished before module unloading
Akinobu Mita [Mon, 24 Nov 2014 05:24:18 +0000 (14:24 +0900)]
ufs: ensure clk gating work is finished before module unloading

When dynamic clk gating feature is enabled, delayed workqueue machanism
is used in order to detect certain period of inactivity.  But there is no
guarantee that scheduled gating work is completed before module unloading.
So it can cause kernel crash by accessing memory after it was freed.

Fix it by cancelling clk gating and ungating works and ensure that its
execution is finished before module unloading.

Signed-off-by: Akinobu Mita <[email protected]>
Reviewed-by: Subhash Jadavani <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
10 years agoirqchip: brcmstb-l2: Fix error handling of irq_of_parse_and_map
Dmitry Torokhov [Fri, 14 Nov 2014 22:16:42 +0000 (14:16 -0800)]
irqchip: brcmstb-l2: Fix error handling of irq_of_parse_and_map

Return value of irq_of_parse_and_map() is unsigned int, with 0
indicating failure, so testing for negative result never works.

Signed-off-by: Dmitry Torokhov <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Tested-by: Kevin Cernekee <[email protected]>
Link: https://lkml.kernel.org/r/20141114221642.GA37468@dtor-ws
Signed-off-by: Jason Cooper <[email protected]>
10 years agoirqchip: bcm7120-l2: Fix error handling of irq_of_parse_and_map
Dmitry Torokhov [Fri, 14 Nov 2014 22:16:14 +0000 (14:16 -0800)]
irqchip: bcm7120-l2: Fix error handling of irq_of_parse_and_map

Return value of irq_of_parse_and_map() is unsigned int, with 0
indicating failure, so testing for negative result never works.

Signed-off-by: Dmitry Torokhov <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Tested-by: Kevin Cernekee <[email protected]>
Link: https://lkml.kernel.org/r/20141114221614.GA37395@dtor-ws
Signed-off-by: Jason Cooper <[email protected]>
10 years agoMerge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Wed, 26 Nov 2014 03:05:41 +0000 (19:05 -0800)]
Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from Bruce Fields:
 "These fix one mishandling of the case when security labels are
  configured out, and two races in the 4.1 backchannel code"

* 'for-3.18' of git://linux-nfs.org/~bfields/linux:
  nfsd: Fix slot wake up race in the nfsv4.1 callback code
  SUNRPC: Fix locking around callback channel reply receive
  nfsd: correctly define v4.2 support attributes

10 years agoMerge git://git.kvack.org/~bcrl/aio-fixes
Linus Torvalds [Wed, 26 Nov 2014 02:55:44 +0000 (18:55 -0800)]
Merge git://git.kvack.org/~bcrl/aio-fixes

Pull aio fix from Ben LaHaise:
 "Dirty page accounting fix for aio"

* git://git.kvack.org/~bcrl/aio-fixes:
  aio: fix uncorrent dirty pages accouting when truncating AIO ring buffer

10 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Wed, 26 Nov 2014 02:43:21 +0000 (18:43 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
 "This series fix a nasty issue with radeon adapters on powerpc servers,
  it's all CC'ed stable and has the relevant maintainers ack's/reviews.

  Basically, some (radeon) adapters have issues with MSI addresses above
  1T (only support 40-bits).  We had powerpc specific quirk but it only
  listed a specific revision of an adapter that we shipped with our
  machines and didn't properly handle the audio function which some
  distros enable nowadays.

  So we made the quirk generic and fixed both the graphic and audio
  drivers properly to use it.

  Without that, ppc64 server machines will crash at boot with a radeon
  adapter.

  Note: This has been brewing for a while, it just needed a last respin
  which got delayed due to us moving ozlabs to a new location in town
  and other such things taking priority"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Remove unused force_32bit_msi quirk
  powerpc/pseries: Honor the generic "no_64bit_msi" flag
  powerpc/powernv: Honor the generic "no_64bit_msi" flag
  sound/radeon: Move 64-bit MSI quirk from arch to driver
  gpu/radeon: Set flag to indicate broken 64-bit MSI
  PCI/MSI: Add device flag indicating that 64-bit MSIs don't work
  ALSA: hda - Limit 40bit DMA for AMD HDMI controllers

10 years agoMerge tag 'hwmon-for-linus-v3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 26 Nov 2014 02:11:15 +0000 (18:11 -0800)]
Merge tag 'hwmon-for-linus-v3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull a hwmon fix from Guenter Roeck:
 "Fix hwmon registration problem in g762 driver"

* tag 'hwmon-for-linus-v3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (g762) fix call to devm_hwmon_device_register_with_groups()

10 years agoMerge tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux
Linus Torvalds [Wed, 26 Nov 2014 01:52:56 +0000 (17:52 -0800)]
Merge tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux

Pull clock fixes from Mike Turquette:
 "The fixes for the clock framework are all regressions in drivers, plus
  a single fix in one of the basic clock templates.  No fixes to the
  core this time around.

  As with most clock driver fixes these run the gamut from fixing a
  build warning to fixing wrecked memory timings, with a little USB
  tossed in for fun"

* tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux:
  clk: pxa: fix pxa27x CCCR bit usage
  clk-divider: Fix READ_ONLY when divider > 1
  clk: qcom: Fix duplicate rbcpr clock name
  clk: at91: usb: fix at91sam9x5 recalc, round and set rate
  clk: at91: usb: fix at91rm9200 round and set rate

10 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
David S. Miller [Wed, 26 Nov 2014 01:02:51 +0000 (20:02 -0500)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

More work from Al Viro to move away from modifying iovecs
by using iov_iter instead.

Signed-off-by: David S. Miller <[email protected]>
10 years agonet: Hyper-V: Deletion of an unnecessary check before the function call "vfree"
Markus Elfring [Tue, 25 Nov 2014 21:33:45 +0000 (22:33 +0100)]
net: Hyper-V: Deletion of an unnecessary check before the function call "vfree"

The vfree() function performs also input parameter validation.
Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoRevert "serial: of-serial: add PM suspend/resume support"
Greg Kroah-Hartman [Tue, 25 Nov 2014 20:46:39 +0000 (12:46 -0800)]
Revert "serial: of-serial: add PM suspend/resume support"

This reverts commit 2dea53bf57783f243c892e99c10c6921e956aa7e.

Turns out to be broken :(

Cc: Jingchang Lu <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
10 years agotg3: fix ring init when there are more TX than RX channels
Thadeu Lima de Souza Cascardo [Tue, 25 Nov 2014 16:21:11 +0000 (14:21 -0200)]
tg3: fix ring init when there are more TX than RX channels

If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.

This fix only initializes the RX ring if it has been allocated.

Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agotcp: fix possible NULL dereference in tcp_vX_send_reset()
Eric Dumazet [Tue, 25 Nov 2014 15:40:04 +0000 (07:40 -0800)]
tcp: fix possible NULL dereference in tcp_vX_send_reset()

After commit ca777eff51f7 ("tcp: remove dst refcount false sharing for
prequeue mode") we have to relax check against skb dst in
tcp_v[46]_send_reset() if prequeue dropped the dst.

If a socket is provided, a full lookup was done to find this socket,
so the dst test can be skipped.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88191
Reported-by: Jaša Bartelj <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Daniel Borkmann <[email protected]>
Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode")
Signed-off-by: David S. Miller <[email protected]>
10 years agortlwifi: Change order in device startup
Larry Finger [Tue, 25 Nov 2014 16:32:07 +0000 (10:32 -0600)]
rtlwifi: Change order in device startup

The existing order of steps when starting the PCI devices works for
2.4G devices, but fails to initialize the 5G section of the RTL8821AE
hardware.

This patch is needed to fix the regression reported in Bug #88811
(https://bugzilla.kernel.org/show_bug.cgi?id=88811).

Reported-by: Valerio Passini <[email protected]>
Tested-by: Valerio Passini <[email protected]>
Signed-off-by: Larry Finger <[email protected]>
Cc: Valerio Passini <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
10 years agortlwifi: rtl8821ae: Fix 5G detection problem
Larry Finger [Tue, 25 Nov 2014 16:32:06 +0000 (10:32 -0600)]
rtlwifi: rtl8821ae: Fix 5G detection problem

The changes associated with moving this driver from staging to the regular
tree missed one section setting the allowable rates for the 5GHz band.

This patch is needed to fix the regression reported in Bug #88811
(https://bugzilla.kernel.org/show_bug.cgi?id=88811).

Reported-by: Valerio Passini <[email protected]>
Tested-by: Valerio Passini <[email protected]>
Signed-off-by: Larry Finger <[email protected]>
Cc: Valerio Passini <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
10 years agoRevert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse"
Pablo Neira [Tue, 25 Nov 2014 18:54:47 +0000 (19:54 +0100)]
Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse"

This reverts commit 5195c14c8b27cc0b18220ddbf0e5ad3328a04187.

If the conntrack clashes with an existing one, it is left out of
the unconfirmed list, thus, crashing when dropping the packet and
releasing the conntrack since golden rule is that conntracks are
always placed in any of the existing lists for traceability reasons.

Reported-by: Daniel Borkmann <[email protected]>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=88841
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoMerge branch 'ipv6_vxlan_outer_udp_csum'
David S. Miller [Tue, 25 Nov 2014 19:12:36 +0000 (14:12 -0500)]
Merge branch 'ipv6_vxlan_outer_udp_csum'

Alexander Duyck says:

====================
Fix outer UDP checksums for IPv6 VXLAN tunnels

In testing against an older kernel I found a couple issues in the IPv6
VXLAN tunnel checksum logic for the outer UDP checksum.

First the default transitioned from using an outer checksum to not using
one.  Second, sometime after that the checksum inputs were changed
resulting the checksum not being correct if it were computed.

These two issues prevented a ping from the newer kernel to the older one.
With these two changes applied I verified I was able to send traffic over
the VXLAN tunnel to a link partner on an older kernel.

The boolean flip fix can be submitted for 3.17 stable as well since the
patch that introduced the issue was included in that kernel.
====================

Signed-off-by: David S. Miller <[email protected]>
10 years agovxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX]
Alexander Duyck [Tue, 25 Nov 2014 04:08:38 +0000 (20:08 -0800)]
vxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX]

In "vxlan: Call udp_sock_create" there was a logic error that resulted in
the default for IPv6 VXLAN tunnels going from using checksums to not using
checksums.  Since there is currently no support in iproute2 for setting
these values it means that a kernel after the change cannot talk over a IPv6
VXLAN tunnel to a kernel prior the change.

Fixes: 3ee64f3 ("vxlan: Call udp_sock_create")
Cc: Tom Herbert <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Acked-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoip6_udp_tunnel: Fix checksum calculation
Alexander Duyck [Tue, 25 Nov 2014 04:08:32 +0000 (20:08 -0800)]
ip6_udp_tunnel: Fix checksum calculation

The UDP checksum calculation for VXLAN tunnels is currently using the
socket addresses instead of the actual packet source and destination
addresses.  As a result the checksum calculated is incorrect in some
cases.

Also uh->check was being set twice, first it was set to 0, and then it is
set again in udp6_set_csum.  This change removes the redundant assignment
to 0.

Fixes: acbf74a7 ("vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.")
Cc: Andy Zhou <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agocxgb4/cxgb4vf/csiostor: Add T4/T5 PCI ID Table
Hariprasad Shenai [Tue, 25 Nov 2014 03:03:58 +0000 (08:33 +0530)]
cxgb4/cxgb4vf/csiostor: Add T4/T5 PCI ID Table

Add a new file t4_pci_id_tbl.h that contains T4/T5 PCI ID Table so that for all
drivers that uses T4/T5 PCI functions changes can be done in one place.

checkpatch.pl script reports following error, which if tried to fix ends up in
compilation error.

ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END \
+ { 0, } \
+ }

WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
new file mode 100644

ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_ID_TABLE_FENTRY(devid) \
+ CH_PCI_ID_TABLE_ENTRY((devid) | \
+       ((CH_PCI_DEVICE_ID_FUNCTION) << 8)), \
+ CH_PCI_ID_TABLE_ENTRY((devid) | \
+       ((CH_PCI_DEVICE_ID_FUNCTION2) << 8))

ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END { 0, } }

ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END { 0, } }

Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet-timestamp: Fix a documentation typo
Andrew Lutomirski [Mon, 24 Nov 2014 20:02:29 +0000 (12:02 -0800)]
net-timestamp: Fix a documentation typo

SOF_TIMESTAMPING_OPT_ID puts the id in ee_data, not ee_info.

Cc: Willem de Bruijn <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoInput: xpad - use proper endpoint type
Greg Kroah-Hartman [Tue, 25 Nov 2014 08:38:17 +0000 (00:38 -0800)]
Input: xpad - use proper endpoint type

The xpad wireless endpoint is not a bulk endpoint on my devices, but
rather an interrupt one, so the USB core complains when it is submitted.
I'm guessing that the author really did mean that this should be an
interrupt urb, but as there are a zillion different xpad devices out
there, let's cover out bases and handle both bulk and interrupt
endpoints just as easily.

Signed-off-by: "Pierre-Loup A. Griffais" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
10 years agoInput: elantech - trust firmware about trackpoint presence
Dmitry Torokhov [Thu, 20 Nov 2014 07:33:07 +0000 (23:33 -0800)]
Input: elantech - trust firmware about trackpoint presence

Only try to parse data as coming from trackpoint if firmware told us that
trackpoint is present.

Fixes commit caeb0d37fa3e387eb0dd22e5d497523c002033d1

Reported-and-tested-by: Marcus Overhagen <[email protected]>
Reported-and-tested-by: Anders Kaseorg <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
10 years agousb-quirks: Add reset-resume quirk for MS Wireless Laser Mouse 6000
Hans de Goede [Mon, 24 Nov 2014 10:22:38 +0000 (11:22 +0100)]
usb-quirks: Add reset-resume quirk for MS Wireless Laser Mouse 6000

This wireless mouse receiver needs a reset-resume quirk to properly come
out of reset.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1165206
Signed-off-by: Hans de Goede <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
10 years agonet/ping: handle protocol mismatching scenario
Jane Zhou [Mon, 24 Nov 2014 19:44:08 +0000 (11:44 -0800)]
net/ping: handle protocol mismatching scenario

ping_lookup() may return a wrong sock if sk_buff's and sock's protocols
dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's
sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong
sock will be returned.
the fix is to "continue" the searching, if no matching, return NULL.

Cc: "David S. Miller" <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Jane Zhou <[email protected]>
Signed-off-by: Yiwei Zhao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet/smsc911x: Add minimal runtime PM support
Geert Uytterhoeven [Mon, 24 Nov 2014 18:58:17 +0000 (19:58 +0100)]
net/smsc911x: Add minimal runtime PM support

Add minimal runtime PM support (enable on probe, disable on remove), to
ensure proper operation with a parent device that uses runtime PM.

This is needed on systems where the external bus controller module of
the SoC is contained in a PM domain and/or has a gateable functional
clock. In such cases, before accessing any device connected to the
external bus, the PM domain must be powered up, and/or the functional
clock must be enabled, which is typically handled through runtime PM by
the bus controller driver.

An example of this is the kzm9g development board, where an smsc9220
Ethernet controller is connected to the Bus State Controller (BSC) of a
Renesas sh73a0 SoC.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agotipc: add tipc_netlink.h to uapi Kbuild
Richard Alpe [Mon, 24 Nov 2014 13:24:54 +0000 (14:24 +0100)]
tipc: add tipc_netlink.h to uapi Kbuild

tipc_netlink.h is the user-space header for the new netlink api. It
was accidentally left out of the uapi Kbuild list when the api was
added.

Signed-off-by: Richard Alpe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agorhashtable: Check for count mismatch while iterating in selftest
Thomas Graf [Mon, 24 Nov 2014 11:37:58 +0000 (12:37 +0100)]
rhashtable: Check for count mismatch while iterating in selftest

Verify whether both the lock and RCU protected iterators see all
test entries before and after expanding and shrinking has been
performed. Also verify whether the number of entries in the hashtable
remains stable during expansion and shrinking.

Signed-off-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoaf_packet: fix sparse warning
Michael S. Tsirkin [Mon, 24 Nov 2014 11:32:16 +0000 (13:32 +0200)]
af_packet: fix sparse warning

af_packet produces lots of these:
net/packet/af_packet.c:384:39: warning: incorrect type in return expression (different modifiers)
net/packet/af_packet.c:384:39:    expected struct page [pure] *
net/packet/af_packet.c:384:39:    got struct page *

this seems to be because sparse does not realize that _pure
refers to function, not the returned pointer.

Tweak code slightly to avoid the warning.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoxen-netback: do not report success if backend_create_xenvif() fails
Alexey Khoroshilov [Mon, 24 Nov 2014 10:58:00 +0000 (13:58 +0300)]
xen-netback: do not report success if backend_create_xenvif() fails

If xenvif_alloc() or xenbus_scanf() fail in backend_create_xenvif(),
xenbus is left in offline mode but netback_probe() reports success.

The patch implements propagation of error code for backend_create_xenvif().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <[email protected]>
Acked-by: Wei Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agotc_vlan: fix type of tcfv_push_vid
Jiri Pirko [Mon, 24 Nov 2014 10:30:26 +0000 (11:30 +0100)]
tc_vlan: fix type of tcfv_push_vid

Should be u16. So fix it to kill the sparse warning.

Fixes: c7e2b9689ef8136 "sched: introduce vlan action"
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoipv6: gre: fix wrong skb->protocol in WCCP
Yuri Chislov [Mon, 24 Nov 2014 10:25:15 +0000 (11:25 +0100)]
ipv6: gre: fix wrong skb->protocol in WCCP

When using GRE redirection in WCCP, it sets the wrong skb->protocol,
that is, ETH_P_IP instead of ETH_P_IPV6 for the encapuslated traffic.

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Cc: Dmitry Kozlov <[email protected]>
Signed-off-by: Yuri Chislov <[email protected]>
Tested-by: Yuri Chislov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agotipc: fix sparse warnings in new nl api
Richard Alpe [Mon, 24 Nov 2014 10:10:29 +0000 (11:10 +0100)]
tipc: fix sparse warnings in new nl api

Fix sparse warnings about non-static declaration of static functions
in the new tipc netlink API.

Signed-off-by: Richard Alpe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 24 Nov 2014 21:00:58 +0000 (16:00 -0500)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
netfilter/ipvs updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, this includes the NAT redirection support for nf_tables, the
cgroup support for nft meta and conntrack zone support for the connlimit
match. Coming after those, a bunch of sparse warning fixes, missing
netns bits and cleanups. More specifically, they are:

1) Prepare IPv4 and IPv6 NAT redirect code to use it from nf_tables,
   patches from Arturo Borrero.

2) Introduce the nf_tables redir expression, from Arturo Borrero.

3) Remove an unnecessary assignment in ip_vs_xmit/__ip_vs_get_out_rt().
   Patch from Alex Gartrell.

4) Add nft_log_dereference() macro to the nf_log infrastructure, patch
   from Marcelo Leitner.

5) Add some extra validation when registering logger families, also
   from Marcelo.

6) Some spelling cleanups from stephen hemminger.

7) Fix sparse warning in nf_logger_find_get().

8) Add cgroup support to nf_tables meta, patch from Ana Rey.

9) A Kconfig fix for the new redir expression and fix sparse warnings in
   the new redir expression.

10) Fix several sparse warnings in the netfilter tree, from
    Florian Westphal.

11) Reduce verbosity when OOM in nfnetlink_log. User can basically do
    nothing when this situation occurs.

12) Add conntrack zone support to xt_connlimit, again from Florian.

13) Add netnamespace support to the h323 conntrack helper, contributed
    by Vasily Averin.

14) Remove unnecessary nul-pointer checks before free_percpu() and
    module_put(), from Markus Elfring.

15) Use pr_fmt in nfnetlink_log, again patch from Marcelo Leitner.
====================

Signed-off-by: David S. Miller <[email protected]>
10 years agoipvlan: Initial check-in of the IPVLAN driver.
Mahesh Bandewar [Mon, 24 Nov 2014 07:07:46 +0000 (23:07 -0800)]
ipvlan: Initial check-in of the IPVLAN driver.

This driver is very similar to the macvlan driver except that it
uses L3 on the frame to determine the logical interface while
functioning as packet dispatcher. It inherits L2 of the master
device hence the packets on wire will have the same L2 for all
the packets originating from all virtual devices off of the same
master device.

This driver was developed keeping the namespace use-case in
mind. Hence most of the examples given here take that as the
base setup where main-device belongs to the default-ns and
virtual devices are assigned to the additional namespaces.

The device operates in two different modes and the difference
in these two modes in primarily in the TX side.

(a) L2 mode : In this mode, the device behaves as a L2 device.
TX processing upto L2 happens on the stack of the virtual device
associated with (namespace). Packets are switched after that
into the main device (default-ns) and queued for xmit.

RX processing is simple and all multicast, broadcast (if
applicable), and unicast belonging to the address(es) are
delivered to the virtual devices.

(b) L3 mode : In this mode, the device behaves like a L3 device.
TX processing upto L3 happens on the stack of the virtual device
associated with (namespace). Packets are switched to the
main-device (default-ns) for the L2 processing. Hence the routing
table of the default-ns will be used in this mode.

RX processins is somewhat similar to the L2 mode except that in
this mode only Unicast packets are delivered to the virtual device
while main-dev will handle all other packets.

The devices can be added using the "ip" command from the iproute2
package -

ip link add link <master> <virtual> type ipvlan mode [ l2 | l3 ]

Signed-off-by: Mahesh Bandewar <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Maciej Żenczykowski <[email protected]>
Cc: Laurent Chavey <[email protected]>
Cc: Tim Hockin <[email protected]>
Cc: Brandon Philips <[email protected]>
Cc: Pavel Emelianov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years ago8139too: The maximum MTU should allow for VLAN headers
Alban Bedel [Sun, 23 Nov 2014 12:07:54 +0000 (13:07 +0100)]
8139too: The maximum MTU should allow for VLAN headers

As pointed out by Ben Hutchings drivers that allow using VLAN have to
provide enough headroom for the VLAN tags.

Signed-off-by: Alban Bedel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agonet: fec: init maximum receive buffer size for ring1 and ring2
Nimrod Andy [Sun, 23 Nov 2014 09:23:06 +0000 (17:23 +0800)]
net: fec: init maximum receive buffer size for ring1 and ring2

i.MX6SX fec support three rx ring1, the current driver lost to init
ring1 and ring2 maximum receive buffer size, that cause receving
frame date length error. The driver reports "rcv is not +last" error
log in user case.

Signed-off-by: Fugang Duan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoMerge tag 'iwlwifi-for-john-2014-11-23' of git://git.kernel.org/pub/scm/linux/kernel...
John W. Linville [Mon, 24 Nov 2014 18:53:41 +0000 (13:53 -0500)]
Merge tag 'iwlwifi-for-john-2014-11-23' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

Emmanuel Grumbach <[email protected]> says:

"Not all the firmware know how to handle the HOT_SPOT_CMD.
Make sure that the firmware will know this command before
sending it. This avoids a firmware crash."

Signed-off-by: John W. Linville <[email protected]>
10 years agords: switch rds_message_copy_from_user() to iov_iter
Al Viro [Thu, 20 Nov 2014 14:31:08 +0000 (09:31 -0500)]
rds: switch rds_message_copy_from_user() to iov_iter

Signed-off-by: Al Viro <[email protected]>
10 years agords: switch ->inc_copy_to_user() to passing iov_iter
Al Viro [Thu, 20 Nov 2014 14:21:14 +0000 (09:21 -0500)]
rds: switch ->inc_copy_to_user() to passing iov_iter

instances get considerably simpler from that...

Signed-off-by: Al Viro <[email protected]>
10 years ago[atm] switch vcc_sendmsg() to copy_from_iter()
Al Viro [Thu, 20 Nov 2014 12:01:29 +0000 (07:01 -0500)]
[atm] switch vcc_sendmsg() to copy_from_iter()

... and make it handle multi-segment iovecs - deals with that
"fix this later" issue for free.  A bit of shame, really - it
had been there since 2.3.15pre3 when the whole thing went into the
tree, practically a historical artefact by now...

Signed-off-by: Al Viro <[email protected]>
10 years agovmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
Al Viro [Thu, 20 Nov 2014 09:05:34 +0000 (04:05 -0500)]
vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr

Signed-off-by: Al Viro <[email protected]>
10 years agotipc_msg_build(): pass msghdr instead of its ->msg_iov
Al Viro [Sat, 15 Nov 2014 06:16:27 +0000 (01:16 -0500)]
tipc_msg_build(): pass msghdr instead of its ->msg_iov

Signed-off-by: Al Viro <[email protected]>
10 years agotipc_sendmsg(): pass msghdr instead of its ->msg_iov
Al Viro [Sat, 15 Nov 2014 06:13:43 +0000 (01:13 -0500)]
tipc_sendmsg(): pass msghdr instead of its ->msg_iov

Signed-off-by: Al Viro <[email protected]>
10 years agoswitch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
Al Viro [Sat, 15 Nov 2014 06:11:23 +0000 (01:11 -0500)]
switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter

Signed-off-by: Al Viro <[email protected]>
10 years agoswitch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
Al Viro [Thu, 6 Nov 2014 06:10:59 +0000 (01:10 -0500)]
switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()

... and kill skb_copy_datagram_iovec()

Signed-off-by: Al Viro <[email protected]>
10 years agokill zerocopy_sg_from_iovec()
Al Viro [Thu, 6 Nov 2014 05:56:48 +0000 (00:56 -0500)]
kill zerocopy_sg_from_iovec()

no users left

Signed-off-by: Al Viro <[email protected]>
10 years ago{macvtap,tun}_get_user(): switch to iov_iter
Al Viro [Thu, 19 Jun 2014 19:36:49 +0000 (15:36 -0400)]
{macvtap,tun}_get_user(): switch to iov_iter

allows to switch macvtap and tun from ->aio_write() to ->write_iter()

Signed-off-by: Al Viro <[email protected]>
10 years agonew helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
Al Viro [Thu, 19 Jun 2014 18:15:22 +0000 (14:15 -0400)]
new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()

Signed-off-by: Al Viro <[email protected]>
10 years agoswitch macvtap to ->read_iter()
Al Viro [Fri, 7 Nov 2014 19:13:53 +0000 (14:13 -0500)]
switch macvtap to ->read_iter()

Signed-off-by: Al Viro <[email protected]>
10 years agoswitch drivers/net/tun.c to ->read_iter()
Al Viro [Fri, 7 Nov 2014 18:52:07 +0000 (13:52 -0500)]
switch drivers/net/tun.c to ->read_iter()

Signed-off-by: Al Viro <[email protected]>
10 years agonew helper: memcpy_to_msg()
Al Viro [Mon, 7 Apr 2014 01:51:23 +0000 (21:51 -0400)]
new helper: memcpy_to_msg()

Signed-off-by: Al Viro <[email protected]>
10 years agoswitch ipxrtr_route_packet() from iovec to msghdr
Al Viro [Mon, 7 Apr 2014 01:28:01 +0000 (21:28 -0400)]
switch ipxrtr_route_packet() from iovec to msghdr

Signed-off-by: Al Viro <[email protected]>
10 years agonew helper: memcpy_from_msg()
Al Viro [Mon, 7 Apr 2014 01:25:44 +0000 (21:25 -0400)]
new helper: memcpy_from_msg()

Signed-off-by: Al Viro <[email protected]>
10 years agonew helper: skb_copy_and_csum_datagram_msg()
Al Viro [Sun, 6 Apr 2014 22:47:38 +0000 (18:47 -0400)]
new helper: skb_copy_and_csum_datagram_msg()

Signed-off-by: Al Viro <[email protected]>
10 years agoMIPS: Fix address type used for early memory detection.
Steven J. Hill [Thu, 13 Nov 2014 15:52:00 +0000 (09:52 -0600)]
MIPS: Fix address type used for early memory detection.

In 'early_parse_mem' the data type used for the start
and size of a memory region specified on the command line
is incorrect. If 64-bit addressing is used, the value
gets truncated.

Signed-off-by: Steven J. Hill <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8456/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: Kconfig: Don't allow both microMIPS and SmartMIPS to be selected.
Markos Chandras [Mon, 21 Jul 2014 07:46:14 +0000 (08:46 +0100)]
MIPS: Kconfig: Don't allow both microMIPS and SmartMIPS to be selected.

microMIPS and SmartMIPS can't be used together. This fixes the
following build problem:

Warning: the 32-bit microMIPS architecture does not support the `smartmips' extension
arch/mips/kernel/entry.S:90: Error: unrecognized opcode `mtlhx $24'
[...]
arch/mips/kernel/entry.S:109: Error: unrecognized opcode `mtlhx $24'

Signed-off-by: Markos Chandras <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7421/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: kernel: cps-vec: Set ISA level to mips32r2 for the MIPS MT ASE
Markos Chandras [Wed, 16 Jul 2014 07:53:39 +0000 (08:53 +0100)]
MIPS: kernel: cps-vec: Set ISA level to mips32r2 for the MIPS MT ASE

Fixes the following build warnings:
arch/mips/kernel/cps-vec.S: Assembler messages:
arch/mips/kernel/cps-vec.S:228: Warning: the `mt' extension requires
MIPS32 revision 2 or greater
[...]
arch/mips/kernel/cps-vec.S: Assembler messages:
arch/mips/kernel/cps-vec.S:345: Warning: the `mt' extension requires
MIPS32 revision 2 or greater

Signed-off-by: Markos Chandras <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7355/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: Netlogic: handle modular AHCI builds
Florian Fainelli [Wed, 24 Sep 2014 17:55:11 +0000 (10:55 -0700)]
MIPS: Netlogic: handle modular AHCI builds

Commits a951440971d0 ("MIPS: Netlogic: Support for XLP3XX on-chip SATA")
and fedfcb1137d2 ("MIPS: Netlogic: XLP9XX on-chip SATA support") added
ahci-init and ahci-init-xlp2 as objects to build when CONFIG_SATA_AHCI
is enabled.

If CONFIG_SATA_AHCI is made modular, these two files will also get built
as modules (obj-m), which will result in the following linking failure:

ERROR: "nlm_set_pic_extra_ack" [arch/mips/netlogic/xlp/ahci-init.ko]
undefined!
ERROR: "nlm_io_base" [arch/mips/netlogic/xlp/ahci-init.ko] undefined!
ERROR: "nlm_nodes" [arch/mips/netlogic/xlp/ahci-init-xlp2.ko] undefined!
ERROR: "nlm_set_pic_extra_ack"
[arch/mips/netlogic/xlp/ahci-init-xlp2.ko] undefined!
ERROR: "xlp_socdev_to_node" [arch/mips/netlogic/xlp/ahci-init-xlp2.ko]
undefined!
ERROR: "nlm_io_base" [arch/mips/netlogic/xlp/ahci-init-xlp2.ko]
undefined!

Just check whether CONFIG_SATA_AHCI is defined for this build, and if
that is the case, add these objects to the list of built-in object
files.

Signed-off-by: Florian Fainelli <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7855/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: Netlogic: handle modular USB case
Florian Fainelli [Wed, 24 Sep 2014 17:55:10 +0000 (10:55 -0700)]
MIPS: Netlogic: handle modular USB case

Commit 1004165f346a ("MIPS: Netlogic: USB support for XLP") and then
commit 9eac3591e78b ("MIPS: Netlogic: Add support for USB on XLP2xx")
added usb-init and usb-init-xlp2 as objects to build when CONFIG_USB is
enabled.

If CONFIG_USB is made modular, these two files will also get built as
modules (obj-m), which will result in the following linking failure:

ERROR: "nlm_io_base" [arch/mips/netlogic/xlp/usb-init.ko] undefined!
ERROR: "nlm_nodes" [arch/mips/netlogic/xlp/usb-init-xlp2.ko] undefined!
ERROR: "nlm_set_pic_extra_ack" [arch/mips/netlogic/xlp/usb-init-xlp2.ko]
undefined!
ERROR: "xlp_socdev_to_node" [arch/mips/netlogic/xlp/usb-init-xlp2.ko]
undefined!
ERROR: "nlm_io_base" [arch/mips/netlogic/xlp/usb-init-xlp2.ko]
undefined!

Just check whether CONFIG_USB is defined for this build, and if that is
the case, add these objects to the list of built-in object files.

Signed-off-by: Florian Fainelli <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7854/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: Loongson: Make platform serial setup always built-in.
Aaro Koskinen [Wed, 19 Nov 2014 23:05:38 +0000 (01:05 +0200)]
MIPS: Loongson: Make platform serial setup always built-in.

If SERIAL_8250 is compiled as a module, the platform specific setup
for Loongson will be a module too, and it will not work very well.
At least on Loongson 3 it will trigger a build failure,
since loongson_sysconf is not exported to modules.

Fix by making the platform specific serial code always built-in.

Signed-off-by: Aaro Koskinen <[email protected]>
Reported-by: Ralf Baechle <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Markos Chandras <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/8533/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: fix EVA & non-SMP non-FPU FP context signal handling
Paul Burton [Tue, 28 Oct 2014 11:25:51 +0000 (11:25 +0000)]
MIPS: fix EVA & non-SMP non-FPU FP context signal handling

The save_fp_context & restore_fp_context pointers were being assigned
to the wrong variables if either:

  - The kernel is configured for UP & runs on a system without an FPU,
    since b2ead5282885 "MIPS: Move & rename
    fpu_emulator_{save,restore}_context".

  - The kernel is configured for EVA, since ca750649e08c "MIPS: kernel:
    signal: Prevent save/restore FPU context in user memory".

This would lead to FP context being clobbered incorrectly when setting
up a sigcontext, then the garbage values being saved uselessly when
returning from the signal.

Fix by swapping the pointer assignments appropriately.

Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected] # v3.15+
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8230/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: cpu-probe: Set the FTLB probability bit on supported cores
Markos Chandras [Mon, 10 Nov 2014 12:25:34 +0000 (12:25 +0000)]
MIPS: cpu-probe: Set the FTLB probability bit on supported cores

Make use of the Config6/FLTBP bit to set the probability of a TLBWR
instruction to hit the FTLB or the VTLB. A value of 0 (which may be
the default value on certain cores, such as proAptiv or P5600)
means that a TLBWR instruction will never hit the VTLB which
leads to performance limitations since it effectively decreases
the number of available TLB slots.

Signed-off-by: Markos Chandras <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cc: <[email protected]> # v3.15+
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8368/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: BMIPS: Fix ".previous without corresponding .section" warnings
Kevin Cernekee [Tue, 21 Oct 2014 04:27:51 +0000 (21:27 -0700)]
MIPS: BMIPS: Fix ".previous without corresponding .section" warnings

Commit 078a55fc824c1 ("Delete __cpuinit/__CPUINIT usage from MIPS code")
removed our __CPUINIT directives, so now the ".previous" directives
are superfluous.  Remove them.

Signed-off-by: Kevin Cernekee <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8156/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: uaccess.h: Fix strnlen_user comment.
Ralf Baechle [Tue, 4 Nov 2014 01:23:45 +0000 (02:23 +0100)]
MIPS: uaccess.h: Fix strnlen_user comment.

Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: r4kcache: Add EVA case for protected_writeback_dcache_line
Markos Chandras [Wed, 5 Nov 2014 08:25:37 +0000 (08:25 +0000)]
MIPS: r4kcache: Add EVA case for protected_writeback_dcache_line

Commit de8974e3f76c0 ("MIPS: asm: r4kcache: Add EVA cache flushing
functions") added cache function for EVA using the cachee instruction.
However, it didn't add a case for the protected_writeback_dcache_line.
mips_dsemul() calls r4k_flush_cache_sigtramp() which in turn uses
the protected_writeback_dcache_line() to flush the trampoline code
back to memory. This used the wrong "cache" instruction leading to
random userland crashes on non-FPU cores.

Signed-off-by: Markos Chandras <[email protected]>
Cc: <[email protected]> # v3.15+
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8331/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: Fix info about plat_setup in arch_mem_init comment
Rafał Miłecki [Wed, 3 Sep 2014 05:36:51 +0000 (07:36 +0200)]
MIPS: Fix info about plat_setup in arch_mem_init comment

Signed-off-by: Rafał Miłecki <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7607/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: rtlx: Remove KERN_DEBUG from pr_debug() arguments in rtlx.c
Masanari Iida [Tue, 7 Oct 2014 03:57:45 +0000 (12:57 +0900)]
MIPS: rtlx: Remove KERN_DEBUG from pr_debug() arguments in rtlx.c

Signed-off-by: Masanari Iida <[email protected]>
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/7938/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agoMIPS: SEAD3: Fix LED device registration.
Ralf Baechle [Thu, 23 Oct 2014 22:41:45 +0000 (00:41 +0200)]
MIPS: SEAD3: Fix LED device registration.

This isn't a module and shouldn't be one.

Signed-off-by: Ralf Baechle <[email protected]>
Cc: Markos Chandras <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/8202/

10 years agoMIPS: Fix a copy & paste error in unistd.h
Huacai Chen [Wed, 12 Nov 2014 05:47:14 +0000 (13:47 +0800)]
MIPS: Fix a copy & paste error in unistd.h

Commit 5df4c8dbbc (MIPS: Wire up bpf syscall.) break the N32 build
because of a copy & paste error.

Signed-off-by: Huacai Chen <[email protected]>
Cc: John Crispin <[email protected]>
Cc: Steven J. Hill <[email protected]>
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/8390/
Signed-off-by: Ralf Baechle <[email protected]>
10 years agopowerpc/pci: Remove unused force_32bit_msi quirk
Benjamin Herrenschmidt [Tue, 7 Oct 2014 05:13:34 +0000 (16:13 +1100)]
powerpc/pci: Remove unused force_32bit_msi quirk

This is now fully replaced with the generic "no_64bit_msi" one
that is set by the respective drivers directly.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
10 years agopowerpc/pseries: Honor the generic "no_64bit_msi" flag
Benjamin Herrenschmidt [Tue, 7 Oct 2014 05:12:55 +0000 (16:12 +1100)]
powerpc/pseries: Honor the generic "no_64bit_msi" flag

Instead of the arch specific quirk which we are deprecating

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]>
10 years agopowerpc/powernv: Honor the generic "no_64bit_msi" flag
Benjamin Herrenschmidt [Tue, 7 Oct 2014 05:12:36 +0000 (16:12 +1100)]
powerpc/powernv: Honor the generic "no_64bit_msi" flag

Instead of the arch specific quirk which we are deprecating
and that drivers don't understand.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]>
10 years agosound/radeon: Move 64-bit MSI quirk from arch to driver
Benjamin Herrenschmidt [Mon, 24 Nov 2014 03:17:08 +0000 (14:17 +1100)]
sound/radeon: Move 64-bit MSI quirk from arch to driver

A number of radeon cards have a HW limitation causing them to be
unable to generate the full 64-bit of address bits for MSIs. This
breaks MSIs on some platforms such as POWER machines.

We used to have a powerpc specific quirk to address that on a
single card, but this doesn't scale very well, this is better
put under control of the drivers who know precisely what a given
HW revision can do.

We now have a generic quirk in the PCI code. We should set it
appropriately for all radeon's from the audio driver.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: Takashi Iwai <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
CC: <[email protected]>
10 years agogpu/radeon: Set flag to indicate broken 64-bit MSI
Benjamin Herrenschmidt [Fri, 3 Oct 2014 05:18:59 +0000 (15:18 +1000)]
gpu/radeon: Set flag to indicate broken 64-bit MSI

Some radeon ASICs don't support all 64 address bits of MSIs despite
advertising support for 64-bit MSIs in their configuration space.

This breaks on systems such as IBM POWER7/8, where 64-bit MSIs can
be assigned with some of the high address bits set.

This makes use of the newly introduced "no_64bit_msi" flag in structure
pci_dev to allow the MSI allocation code to fallback to 32-bit MSIs
on those adapters.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
CC: <[email protected]>
---

Adding Alex's review tag. Patch to the driver is identical to the
reviewed one, I dropped the arch/powerpc hunk rewrote the subject
and cset comment.

10 years agoPCI/MSI: Add device flag indicating that 64-bit MSIs don't work
Benjamin Herrenschmidt [Fri, 3 Oct 2014 05:13:24 +0000 (15:13 +1000)]
PCI/MSI: Add device flag indicating that 64-bit MSIs don't work

This can be set by quirks/drivers to be used by the architecture code
that assigns the MSI addresses.

We additionally add verification in the core MSI code that the values
assigned by the architecture do satisfy the limitation in order to fail
gracefully if they don't (ie. the arch hasn't been updated to deal with
that quirk yet).

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
10 years agoALSA: hda - Limit 40bit DMA for AMD HDMI controllers
Takashi Iwai [Wed, 1 Oct 2014 08:30:53 +0000 (10:30 +0200)]
ALSA: hda - Limit 40bit DMA for AMD HDMI controllers

AMD/ATI HDMI controller chip models, we already have a filter to lower
to 32bit DMA, but the rest are supposed to be working with 64bit
although the hardware doesn't really work with 63bit but only with 40
or 48bit DMA.  In this patch, we take 40bit DMA for safety for the
AMD/ATI controllers as the graphics drivers does.

Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]>
10 years agoip_tunnel: the lack of vti_link_ops' dellink() cause kernel panic
lucien [Sun, 23 Nov 2014 07:04:11 +0000 (15:04 +0800)]
ip_tunnel: the lack of vti_link_ops' dellink() cause kernel panic

Now the vti_link_ops do not point the .dellink, for fb tunnel device
(ip_vti0), the net_device will be removed as the default .dellink is
unregister_netdevice_queue,but the tunnel still in the tunnel list,
then if we add a new vti tunnel, in ip_tunnel_find():

        hlist_for_each_entry_rcu(t, head, hash_node) {
                if (local == t->parms.iph.saddr &&
                    remote == t->parms.iph.daddr &&
                    link == t->parms.link &&
==>                 type == t->dev->type &&
                    ip_tunnel_key_match(&t->parms, flags, key))
                        break;
        }

the panic will happen, cause dev of ip_tunnel *t is null:
[ 3835.072977] IP: [<ffffffffa04103fd>] ip_tunnel_find+0x9d/0xc0 [ip_tunnel]
[ 3835.073008] PGD b2c21067 PUD b7277067 PMD 0
[ 3835.073008] Oops: 0000 [#1] SMP
.....
[ 3835.073008] Stack:
[ 3835.073008]  ffff8800b72d77f0 ffffffffa0411924 ffff8800bb956000 ffff8800b72d78e0
[ 3835.073008]  ffff8800b72d78a0 0000000000000000 ffffffffa040d100 ffff8800b72d7858
[ 3835.073008]  ffffffffa040b2e3 0000000000000000 0000000000000000 0000000000000000
[ 3835.073008] Call Trace:
[ 3835.073008]  [<ffffffffa0411924>] ip_tunnel_newlink+0x64/0x160 [ip_tunnel]
[ 3835.073008]  [<ffffffffa040b2e3>] vti_newlink+0x43/0x70 [ip_vti]
[ 3835.073008]  [<ffffffff8150d4da>] rtnl_newlink+0x4fa/0x5f0
[ 3835.073008]  [<ffffffff812f68bb>] ? nla_strlcpy+0x5b/0x70
[ 3835.073008]  [<ffffffff81508fb0>] ? rtnl_link_ops_get+0x40/0x60
[ 3835.073008]  [<ffffffff8150d11f>] ? rtnl_newlink+0x13f/0x5f0
[ 3835.073008]  [<ffffffff81509cf4>] rtnetlink_rcv_msg+0xa4/0x270
[ 3835.073008]  [<ffffffff8126adf5>] ? sock_has_perm+0x75/0x90
[ 3835.073008]  [<ffffffff81509c50>] ? rtnetlink_rcv+0x30/0x30
[ 3835.073008]  [<ffffffff81529e39>] netlink_rcv_skb+0xa9/0xc0
[ 3835.073008]  [<ffffffff81509c48>] rtnetlink_rcv+0x28/0x30
....

modprobe ip_vti
ip link del ip_vti0 type vti
ip link add ip_vti0 type vti
rmmod ip_vti

do that one or more times, kernel will panic.

fix it by assigning ip_tunnel_dellink to vti_link_ops' dellink, in
which we skip the unregister of fb tunnel device. do the same on ip6_vti.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoenic: use netdev_rss_key_fill() helper
Eric Dumazet [Sun, 23 Nov 2014 20:27:41 +0000 (12:27 -0800)]
enic: use netdev_rss_key_fill() helper

Use of well known RSS key might increase attack surface.

Switch to a random one, using generic helper so that all
ports share a common key.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Christian Benvenuti <[email protected]>
Cc: Govindarajulu Varadarajan <[email protected]>
Cc: Sujith Sankar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoipv6: coding style improvements (remove assignment in if statements)
Ian Morris [Sun, 23 Nov 2014 21:28:43 +0000 (21:28 +0000)]
ipv6: coding style improvements (remove assignment in if statements)

This change has no functional impact and simply addresses some coding
style issues detected by checkpatch. Specifically this change
adjusts "if" statements which also include the assignment of a
variable.

No changes to the resultant object files result as determined by objdiff.

Signed-off-by: Ian Morris <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
10 years agoLinux 3.18-rc6 v3.18-rc6
Linus Torvalds [Sun, 23 Nov 2014 23:25:20 +0000 (15:25 -0800)]
Linux 3.18-rc6

10 years agouprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME
Andy Lutomirski [Fri, 21 Nov 2014 21:26:07 +0000 (13:26 -0800)]
uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME

x86 call do_notify_resume on paranoid returns if TIF_UPROBE is set but
not on non-paranoid returns.  I suspect that this is a mistake and that
the code only works because int3 is paranoid.

Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
for the x86 bug.  With that bug fixed, we can remove _TIF_NOTIFY_RESUME
from the uprobes code.

Reported-by: Oleg Nesterov <[email protected]>
Acked-by: Srikar Dronamraju <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
10 years agosched: Provide update_curr callbacks for stop/idle scheduling classes
Thomas Gleixner [Sun, 23 Nov 2014 22:04:52 +0000 (23:04 +0100)]
sched: Provide update_curr callbacks for stop/idle scheduling classes

Chris bisected a NULL pointer deference in task_sched_runtime() to
commit 6e998916dfe3 'sched/cputime: Fix clock_nanosleep()/clock_gettime()
inconsistency'.

Chris observed crashes in atop or other /proc walking programs when he
started fork bombs on his machine.  He assumed that this is a new exit
race, but that does not make any sense when looking at that commit.

What's interesting is that, the commit provides update_curr callbacks
for all scheduling classes except stop_task and idle_task.

While nothing can ever hit that via the clock_nanosleep() and
clock_gettime() interfaces, which have been the target of the commit in
question, the author obviously forgot that there are other code paths
which invoke task_sched_runtime()

do_task_stat(()
 thread_group_cputime_adjusted()
   thread_group_cputime()
     task_cputime()
       task_sched_runtime()
        if (task_current(rq, p) && task_on_rq_queued(p)) {
          update_rq_clock(rq);
          up->sched_class->update_curr(rq);
        }

If the stats are read for a stomp machine task, aka 'migration/N' and
that task is current on its cpu, this will happily call the NULL pointer
of stop_task->update_curr.  Ooops.

Chris observation that this happens faster when he runs the fork bomb
makes sense as the fork bomb will kick migration threads more often so
the probability to hit the issue will increase.

Add the missing update_curr callbacks to the scheduler classes stop_task
and idle_task.  While idle tasks cannot be monitored via /proc we have
other means to hit the idle case.

Fixes: 6e998916dfe3 'sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency'
Reported-by: Chris Mason <[email protected]>
Reported-and-tested-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Stanislaw Gruszka <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
10 years agoMerge branch 'x86-traps' (trap handling from Andy Lutomirski)
Linus Torvalds [Sun, 23 Nov 2014 21:56:55 +0000 (13:56 -0800)]
Merge branch 'x86-traps' (trap handling from Andy Lutomirski)

Merge x86-64 iret fixes from Andy Lutomirski:
 "This addresses the following issues:

   - an unrecoverable double-fault triggerable with modify_ldt.
   - invalid stack usage in espfix64 failed IRET recovery from IST
     context.
   - invalid stack usage in non-espfix64 failed IRET recovery from IST
     context.

  It also makes a good but IMO scary change: non-espfix64 failed IRET
  will now report the correct error.  Hopefully nothing depended on the
  old incorrect behavior, but maybe Wine will get confused in some
  obscure corner case"

* emailed patches from Andy Lutomirski <[email protected]>:
  x86_64, traps: Rework bad_iret
  x86_64, traps: Stop using IST for #SS
  x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C

10 years agox86_64, traps: Rework bad_iret
Andy Lutomirski [Sun, 23 Nov 2014 02:00:33 +0000 (18:00 -0800)]
x86_64, traps: Rework bad_iret

It's possible for iretq to userspace to fail.  This can happen because
of a bad CS, SS, or RIP.

Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace.  To make this work, there's
an extra fixup to fudge the gs base into a usable state.

This is suboptimal because it loses the original exception.  It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with.  For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.

This patch throws out bad_iret entirely.  As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C.  It's should be clearer and more correct.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
10 years agox86_64, traps: Stop using IST for #SS
Andy Lutomirski [Sun, 23 Nov 2014 02:00:32 +0000 (18:00 -0800)]
x86_64, traps: Stop using IST for #SS

On a 32-bit kernel, this has no effect, since there are no IST stacks.

On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code.  The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.

This fixes a bug in which the espfix64 code mishandles a stack segment
violation.

This saves 4k of memory per CPU and a tiny bit of code.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
10 years agox86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
Andy Lutomirski [Sun, 23 Nov 2014 02:00:31 +0000 (18:00 -0800)]
x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C

There's nothing special enough about the espfix64 double fault fixup to
justify writing it in assembly.  Move it to C.

This also fixes a bug: if the double fault came from an IST stack, the
old asm code would return to a partially uninitialized stack frame.

Fixes: 3891a04aafd668686239349ea58f3314ea2af86b
Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
This page took 0.14683 seconds and 4 git commands to generate.