Salva Peiró [Tue, 17 Dec 2013 09:06:30 +0000 (10:06 +0100)]
hamradio/yam: fix info leak in ioctl
The yam_ioctl() code fails to initialise the cmd field
of the struct yamdrv_ioctl_cfg. Add an explicit memset(0)
before filling the structure to avoid the 4-byte info leak.
Signed-off-by: Salva Peiró <speiro@ai2.upv.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wenliang Fan [Tue, 17 Dec 2013 03:25:28 +0000 (11:25 +0800)]
drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()
The local variable 'bi' comes from userspace. If userspace passed a
large number to 'bi.data.calibrate', there would be an integer overflow
in the following line:
s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;
Signed-off-by: Wenliang Fan <fanwlexca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 17 Dec 2013 02:42:09 +0000 (10:42 +0800)]
xen-netback: fix some error return code
'err' is overwrited to 0 after maybe_pull_tail() call, so the error
code was not set if skb_partial_csum_set() call failed. Fix to return
error -EPROTO from those error handling case instead of 0.
Fixes: d52eb0d46f36 ('xen-netback: make sure skb linear area covers checksum field')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 16 Dec 2013 23:38:39 +0000 (00:38 +0100)]
net: inet_diag: zero out uninitialized idiag_{src,dst} fields
Jakub reported while working with nlmon netlink sniffer that parts of
the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6.
That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3].
In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab]
memory through this. At least, in udp_dump_one(), we allocate a skb in ...
rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL);
... and then pass that to inet_sk_diag_fill() that puts the whole struct
inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0],
r->id.idiag_dst[0] and leave the rest untouched:
r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;
struct inet_diag_msg embeds struct inet_diag_sockid that is correctly /
fully filled out in IPv6 case, but for IPv4 not.
So just zero them out by using plain memset (for this little amount of
bytes it's probably not worth the extra check for idiag_family == AF_INET).
Similarly, fix also other places where we fill that out.
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Len Brown [Wed, 18 Dec 2013 21:44:57 +0000 (16:44 -0500)]
x86 idle: Repair large-server 50-watt idle-power regression
Linux 3.10 changed the timing of how thread_info->flags is touched:
x86: Use generic idle loop
(
7d1a941731fabf27e5fb6edbebb79fe856edb4e5)
This caused Intel NHM-EX and WSM-EX servers to experience a large number
of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote
from deep C-states to shallow C-states, which caused these platforms
to experience a significant increase in idle power.
Note that this issue was already present before the commit above,
however, it wasn't seen often enough to be noticed in power measurements.
Here we extend an errata workaround from the Core2 EX "Dunnington"
to extend to NHM-EX and WSM-EX, to prevent these immediate
returns from MWAIT, reducing idle power on these platforms.
While only acpi_idle ran on Dunnington, intel_idle
may also run on these two newer systems.
As of today, there are no other models that are known
to need this tweak.
Link: http://lkml.kernel.org/r/CAJvTdK=%2BaNN66mYpCGgbHGCHhYQAKx-vB0kJSWjVpsNb_hOAtQ@mail.gmail.com
Signed-off-by: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com
Cc: <stable@vger.kernel.org> # 3.12.x, 3.11.x, 3.10.x
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Tejun Heo [Wed, 18 Dec 2013 12:07:32 +0000 (07:07 -0500)]
libata, freezer: avoid block device removal while system is frozen
Freezable kthreads and workqueues are fundamentally problematic in
that they effectively introduce a big kernel lock widely used in the
kernel and have already been the culprit of several deadlock
scenarios. This is the latest occurrence.
During resume, libata rescans all the ports and revalidates all
pre-existing devices. If it determines that a device has gone
missing, the device is removed from the system which involves
invalidating block device and flushing bdi while holding driver core
layer locks. Unfortunately, this can race with the rest of device
resume. Because freezable kthreads and workqueues are thawed after
device resume is complete and block device removal depends on
freezable workqueues and kthreads (e.g. bdi_wq, jbd2) to make
progress, this can lead to deadlock - block device removal can't
proceed because kthreads are frozen and kthreads can't be thawed
because device resume is blocked behind block device removal.
839a8e8660b6 ("writeback: replace custom worker pool implementation
with unbound workqueue") made this particular deadlock scenario more
visible but the underlying problem has always been there - the
original forker task and jbd2 are freezable too. In fact, this is
highly likely just one of many possible deadlock scenarios given that
freezer behaves as a big kernel lock and we don't have any debug
mechanism around it.
I believe the right thing to do is getting rid of freezable kthreads
and workqueues. This is something fundamentally broken. For now,
implement a funny workaround in libata - just avoid doing block device
hot[un]plug while the system is frozen. Kernel engineering at its
finest. :(
v2: Add EXPORT_SYMBOL_GPL(pm_freezing) for cases where libata is built
as a module.
v3: Comment updated and polling interval changed to 10ms as suggested
by Rafael.
v4: Add #ifdef CONFIG_FREEZER around the hack as pm_freezing is not
defined when FREEZER is not configured thus breaking build.
Reported by kbuild test robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tomaž Šolc <tomaz.solc@tablix.org>
Reviewed-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=62801
Link: http://lkml.kernel.org/r/20131213174932.GA27070@htj.dyndns.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Cc: kbuild test robot <fengguang.wu@intel.com>
Will Deacon [Tue, 17 Dec 2013 17:09:08 +0000 (17:09 +0000)]
arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
Commit
8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for
disabled breakpoints") fixed an issue with GDB trying to zero breakpoint
control registers. The problem there is that the arch hw_breakpoint code
will attempt to create a (disabled), execute breakpoint of length 0.
This will fail validation and report unexpected failure to GDB. To avoid
this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that
seems to have broken with recent kernels, causing watchpoints to be
treated as TYPE_INST in the core code and returning ENOSPC for any
further breakpoints.
This patch fixes the problem by prioritising the `enable' field of the
breakpoint: if it is cleared, we simply update the perf_event_attr to
indicate that the thing is disabled and don't bother changing either the
type or the length. This reinforces the behaviour that the breakpoint
control register is essentially read-only apart from the enable bit
when disabling a breakpoint.
Cc: <stable@vger.kernel.org>
Reported-by: Aaron Liu <liucy214@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Thu, 19 Dec 2013 17:11:22 +0000 (09:11 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"An RT group-scheduling fix and the sched-domains topology setup fix
from Mel"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities
sched: Assign correct scheduling domain to 'sd_llc'
Linus Torvalds [Thu, 19 Dec 2013 17:10:46 +0000 (09:10 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"An ABI documentation fix, and a mixed-PMU perf-info-corruption fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Document the new transaction sample type
perf: Disable all pmus on unthrottling and rescheduling
Linus Torvalds [Thu, 19 Dec 2013 17:10:10 +0000 (09:10 -0800)]
Merge tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have a bit more changes than usual in ASoC here, as it was slipped
from the previous update. There are one minr ASoC PCM code fix and
ASoC dmaengine fix, in addition of a collection of small ASoC driver
fixes. The rest are a couple of HD-audio stable fixups, and a
long-standing fix for the paused stream handling.
So, all commits look not scary (and hopefully won't give you
disastrous holiday season)"
* tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add Dell headset detection quirk for one more laptop model
ASoC: wm8904: fix DSP mode B configuration
ASoC: wm_adsp: Add small delay while polling DSP RAM start
ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function
ASoC: kirkwood: Fix the CPU DAI rates
ASoC: wm5110: Correct HPOUT3 DAPM route typo
ALSA: hda - Add Dell headset detection quirk for three laptop models
ALSA: hda - Add enable_msi=0 workaround for four HP machines
ASoC: don't leak on error in snd_dmaengine_pcm_register
ASoC: fsl: imx-wm8962: Don't update bias_level in machine driver
ASoC: tegra: fix uninitialized variables in set_fmt
ASoC: wm8962: Enable SYSCLK provisonally before fetching generated DSPCLK_DIV
ASoC: sam9x5_wm8731: change to work in DSP A mode
ASoC: atmel_ssc_dai: add dai trigger ops
ASoC: soc-pcm: Use valid condition for snd_soc_dai_digital_mute() in hw_free()
Matias Bjorling [Wed, 18 Dec 2013 12:41:44 +0000 (13:41 +0100)]
null_blk: warning on ignored submit_queues param
Let the user know when the number of submission queues are being
ignored.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Matias Bjorling [Wed, 18 Dec 2013 12:41:43 +0000 (13:41 +0100)]
null_blk: refactor init and init errors code paths
Simplify the initialization logic of the three block-layers.
- The queue initialization is split into two parts. This allows reuse of
code when initializing the sq-, bio- and mq-based layers.
- Set submit_queues default value to 0 and always set it at init time.
- Simplify the init error code paths.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Matias Bjorling [Wed, 18 Dec 2013 12:41:42 +0000 (13:41 +0100)]
null_blk: documentation
Add description of module and its parameters.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Matias Bjorling [Tue, 10 Dec 2013 15:50:38 +0000 (16:50 +0100)]
null_blk: mem garbage on NUMA systems during init
For NUMA systems, initializing the blk-mq layer and using per node hctx.
We initialize submit queues to 1, while blk-mq nr_hw_queues is
initialized to the number of NUMA nodes.
This makes the null_init_hctx function overwrite memory outside of what
it allocated. In my case it lead to writing garbage into struct
request_queue's mq_map.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rashika Kheria [Thu, 19 Dec 2013 09:32:22 +0000 (15:02 +0530)]
drivers: block: Mark the functions as static in skd_main.c
Mark functions skd_skmsg_state_to_str() and skd_skreq_state_to_str() as
static in skd_main.c because they are not used outside this file.
This eliminates the following warnings in skd_main.c:
drivers/block/skd_main.c:5272:13: warning: no previous prototype for ‘skd_skmsg_state_to_str’ [-Wmissing-prototypes]
drivers/block/skd_main.c:5284:13: warning: no previous prototype for ‘skd_skreq_state_to_str’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vineet Gupta [Thu, 19 Dec 2013 13:25:58 +0000 (18:55 +0530)]
ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h
Commit
97bc386fc12d "ARC: Add guard macro to uapi/asm/unistd.h"
inhibited multiple inclusion of ARCH unistd.h. This however hosed the system
since Generic syscall table generator relies on it being included twice,
and in lack-of an empty table was emitted by C preprocessor.
Fix that by allowing one exception to rule for the special case (just
like Xtensa)
Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Takashi Iwai [Thu, 19 Dec 2013 11:22:11 +0000 (12:22 +0100)]
Merge tag 'asoc-v3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.13
The fixes here are all driver specific ones, none of which particularly
stand out but all of which are useful to users of those drivers.
Mark Brown [Thu, 19 Dec 2013 10:25:27 +0000 (10:25 +0000)]
Merge remote-tracking branches 'asoc/fix/adsp', 'asoc/fix/arizona', 'asoc/fix/atmel', 'asoc/fix/fsl', 'asoc/fix/kirkwood', 'asoc/fix/tegra', 'asoc/fix/wm8904' and 'asoc/fix/wm8962' into asoc-linus
Mark Brown [Thu, 19 Dec 2013 10:25:27 +0000 (10:25 +0000)]
Merge remote-tracking branch 'asoc/fix/dma' into asoc-linus
Mark Brown [Thu, 19 Dec 2013 10:25:26 +0000 (10:25 +0000)]
Merge remote-tracking branch 'asoc/fix/core' into asoc-linus
Ben Dooks [Mon, 16 Dec 2013 12:38:48 +0000 (12:38 +0000)]
ARM: shmobile: r8a7790: fix shdi resource sizes
The r8a7790.dtsi file has four sdhi nodes which the first two have the wrong
resource size for their register block. This causes the sh_modbile_sdhi driver
to fail to communicate with card at-all.
Change sdhi{0,1} node size from 0x100 to 0x200 to correct these nodes
as per Kuninori Morimoto's response to the original patch where all four
nodes where changed. sdhi{2,3} are the correct size.
This bug has been present since sdhi resources were added to the r8a7790 by
8c9b1aa41853272a ("ARM: shmobile: r8a7790: add MMCIF and SDHI DT
templates") in v3.11-rc2.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Tested-by: William Towle <william.towle@codethink.co.uk>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Kuninori Morimoto [Mon, 16 Dec 2013 08:16:52 +0000 (00:16 -0800)]
ARM: shmobile: bockw: fixup DMA mask
4dcfa60071b3d23f0181f27d8519f12e37cefbb9
(ARM: DMA-API: better handing of DMA masks for coherent allocations)
exchanged DMA mask check method.
Below warning will appear without this patch
asoc-simple-card asoc-simple-card.0: \
Coherent DMA mask 0xffffffffffffffff is larger than dma_addr_t allows
asoc-simple-card asoc-simple-card.0: \
Driver did not use or check the return value from dma_set_coherent_mask()?
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Nicholas Bellinger [Thu, 12 Dec 2013 20:24:11 +0000 (12:24 -0800)]
target/file: Update hw_max_sectors based on current block_size
This patch allows FILEIO to update hw_max_sectors based on the current
max_bytes_per_io. This is required because vfs_[writev,readv]() can accept
a maximum of 2048 iovecs per call, so the enforced hw_max_sectors really
needs to be calculated based on block_size.
This addresses a >= v3.5 bug where block_size=512 was rejecting > 1M
sized I/O requests, because FD_MAX_SECTORS was hardcoded to 2048 for
the block_size=4096 case.
(v2: Use max_bytes_per_io instead of ->update_hw_max_sectors)
Reported-by: Henrik Goldman <hg@x-formation.com>
Cc: <stable@vger.kernel.org> #3.5+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 12 Dec 2013 00:20:13 +0000 (16:20 -0800)]
iser-target: Move INIT_WORK setup into isert_create_device_ib_res
This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().
This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Wed, 11 Dec 2013 23:45:32 +0000 (15:45 -0800)]
iscsi-target: Fix incorrect np->np_thread NULL assignment
When shutting down a target there is a race condition between
iscsit_del_np() and __iscsi_target_login_thread().
The latter sets the thread pointer to NULL, and the former
tries to issue kthread_stop() on that pointer without any
synchronization.
This patch moves the np->np_thread NULL assignment into
iscsit_del_np(), after kthread_stop() has completed. It also
removes the signal_pending() + np_state check, and only
exits when kthread_should_stop() is true.
Reported-by: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Laurent Pinchart [Wed, 11 Dec 2013 02:48:15 +0000 (03:48 +0100)]
ARM: shmobile: armadillo: Add PWM backlight power supply
Commit
22ceeee16eb8f0d04de3ef43a5174fb30ec18af9 ("pwm-backlight: Add
power supply support") added a mandatory power supply for the PWM
backlight. Add a fixed 5V regulator to board code with a consumer supply
entry for the backlight device.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
(cherry picked from commit
ad11cb9a5cf96346f1240995c672cdbb5501785c)
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Linus Torvalds [Thu, 19 Dec 2013 03:05:00 +0000 (19:05 -0800)]
Merge branch 'akpm' (incoming from Andrew)
Merge patches from Andrew Morton:
"23 fixes and a MAINTAINERS update"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
mm/hugetlb: check for pte NULL pointer in __page_check_address()
fix build with make 3.80
mm/mempolicy: fix !vma in new_vma_page()
MAINTAINERS: add Davidlohr as GPT maintainer
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
mm/compaction: respect ignore_skip_hint in update_pageblock_skip
mm/mempolicy: correct putback method for isolate pages if failed
mm: add missing dependency in Kconfig
sh: always link in helper functions extracted from libgcc
mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
mm: numa: defer TLB flush for THP migration as long as possible
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
mm: fix TLB flush race between migration, and change_protection_range
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
mm: numa: clear numa hinting information on mprotect
sched: numa: skip inaccessible VMAs
mm: numa: avoid unnecessary work on the failure path
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
mm: numa: do not clear PTE for pte_numa update
mm: numa: do not clear PMD during PTE update scan
...
Jianguo Wu [Thu, 19 Dec 2013 01:08:59 +0000 (17:08 -0800)]
mm/hugetlb: check for pte NULL pointer in __page_check_address()
In __page_check_address(), if address's pud is not present,
huge_pte_offset() will return NULL, we should check the return value.
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: qiuxishi <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Beulich [Thu, 19 Dec 2013 01:08:57 +0000 (17:08 -0800)]
fix build with make 3.80
According to Documentation/Changes, make 3.80 is still being supported
for building the kernel, hence make files must not make (unconditional)
use of features introduced only in newer versions.
Commit
1bf49dd4be0b ("./Makefile: export initial ramdisk compression
config option") however introduced "else ifeq" constructs which make
3.80 doesn't understand. Replace the logic there with more conventional
(in the kernel build infrastructure) list constructs (except that the
list here is intentionally limited to exactly one element).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: P J P <ppandit@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanpeng Li [Thu, 19 Dec 2013 01:08:56 +0000 (17:08 -0800)]
mm/mempolicy: fix !vma in new_vma_page()
BUG_ON(!vma) assumption is introduced by commit
0bf598d863e3 ("mbind:
add BUG_ON(!vma) in new_vma_page()"), however, even if
address = __vma_address(page, vma);
and
vma->start < address < vma->end
page_address_in_vma() may still return -EFAULT because of many other
conditions in it. As a result the while loop in new_vma_page() may end
with vma=NULL.
This patch revert the commit and also fix the potential dereference NULL
pointer reported by Dan.
http://marc.info/?l=linux-mm&m=
137689530323257&w=2
kernel BUG at mm/mempolicy.c:1204!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 3 PID: 7056 Comm: trinity-child3 Not tainted 3.13.0-rc3+ #2
task:
ffff8801ca5295d0 ti:
ffff88005ab20000 task.ti:
ffff88005ab20000
RIP: new_vma_page+0x70/0x90
RSP: 0000:
ffff88005ab21db0 EFLAGS:
00010246
RAX:
fffffffffffffff2 RBX:
0000000000000000 RCX:
0000000000000000
RDX:
0000000008040075 RSI:
ffff8801c3d74600 RDI:
ffffea00079a8b80
RBP:
ffff88005ab21dc8 R08:
0000000000000004 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
fffffffffffffff2
R13:
ffffea00079a8b80 R14:
0000000000400000 R15:
0000000000400000
FS:
00007ff49c6f4740(0000) GS:
ffff880244e00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007ff49c68f994 CR3:
000000005a205000 CR4:
00000000001407e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Stack:
ffffea00079a8b80 ffffea00079a8bc0 ffffea00079a8ba0 ffff88005ab21e50
ffffffff811adc7a 0000000000000000 ffff8801ca5295d0 0000000464e224f8
0000000000000000 0000000000000002 0000000000000000 ffff88020ce75c00
Call Trace:
migrate_pages+0x12a/0x850
SYSC_mbind+0x513/0x6a0
SyS_mbind+0xe/0x10
ia32_do_call+0x13/0x13
Code: 85 c0 75 2f 4c 89 e1 48 89 da 31 f6 bf da 00 02 00 65 44 8b 04 25 08 f7 1c 00 e8 ec fd ff ff 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 4c 89 e6 48 89 df ba 01 00 00 00 e8 48
RIP [<
ffffffff8119f200>] new_vma_page+0x70/0x90
RSP <
ffff88005ab21db0>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Thu, 19 Dec 2013 01:08:55 +0000 (17:08 -0800)]
MAINTAINERS: add Davidlohr as GPT maintainer
Add a new entry for the GPT standard. Any future changes will now be
routed through linux-efi.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jianguo Wu [Thu, 19 Dec 2013 01:08:54 +0000 (17:08 -0800)]
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
After a successful hugetlb page migration by soft offline, the source
page will either be freed into hugepage_freelists or buddy(over-commit
page). If page is in buddy, page_hstate(page) will be NULL. It will
hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page().
BUG: unable to handle kernel NULL pointer dereference at
0000000000000058
IP: [<
ffffffff81163761>] dequeue_hwpoisoned_huge_page+0x131/0x1d0
PGD
c23762067 PUD
c24be2067 PMD 0
Oops: 0000 [#1] SMP
So check PageHuge(page) after call migrate_pages() successfully.
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 19 Dec 2013 01:08:52 +0000 (17:08 -0800)]
mm/compaction: respect ignore_skip_hint in update_pageblock_skip
update_pageblock_skip() only fits to compaction which tries to isolate
by pageblock unit. If isolate_migratepages_range() is called by CMA, it
try to isolate regardless of pageblock unit and it don't reference
get_pageblock_skip() by ignore_skip_hint. We should also respect it on
update_pageblock_skip() to prevent from setting the wrong information.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Thu, 19 Dec 2013 01:08:51 +0000 (17:08 -0800)]
mm/mempolicy: correct putback method for isolate pages if failed
queue_pages_range() isolates hugetlbfs pages and putback_lru_pages()
can't handle these. We should change it to putback_movable_pages().
Naoya said that it is worth going into stable, because it can break
in-use hugepage list.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.12.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sima Baymani [Thu, 19 Dec 2013 01:08:49 +0000 (17:08 -0800)]
mm: add missing dependency in Kconfig
Eliminate the following (rand)config warning by adding missing PROC_FS
dependency:
warning: (HWPOISON_INJECT && MEM_SOFT_DIRTY) selects PROC_PAGE_MONITOR which has unmet direct dependencies (PROC_FS && MMU)
Signed-off-by: Sima Baymani <sima.baymani@gmail.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Thu, 19 Dec 2013 01:08:48 +0000 (17:08 -0800)]
sh: always link in helper functions extracted from libgcc
E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m:
ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined!
For "lib-y", if no symbols in a compilation unit are referenced by other
units, the compilation unit will not be included in vmlinux. This
breaks modules that do reference those symbols.
Use "obj-y" instead to fix this.
http://kisskb.ellerman.id.au/kisskb/buildresult/
8838077/
This doesn't fix all cases. There are others, e.g. udivsi3.
This is also not limited to sh, many architectures handle this in the
same way.
A simple solution is to unconditionally include all helper functions.
A more complex solution is to make the choice of "lib-y" or "obj-y" depend
on CONFIG_MODULES:
obj-$(CONFIG_MODULES) += ...
lib-y($CONFIG_MODULES) += ...
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Reviewed-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 19 Dec 2013 01:08:47 +0000 (17:08 -0800)]
mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
Dave Hansen noted a regression in a microbenchmark that loops around
open() and close() on an 8-node NUMA machine and bisected it down to
commit
81c0a2bb515f ("mm: page_alloc: fair zone allocator policy").
That change forces the slab allocations of the file descriptor to spread
out to all 8 nodes, causing remote references in the page allocator and
slab.
The round-robin policy is only there to provide fairness among memory
allocations that are reclaimed involuntarily based on pressure in each
zone. It does not make sense to apply it to unreclaimable kernel
allocations that are freed manually, in this case instantly after the
allocation, and incur the remote reference costs twice for no reason.
Only round-robin allocations that are usually freed through page reclaim
or slab shrinking.
Bisected by Dave Hansen.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:46 +0000 (17:08 -0800)]
mm: numa: defer TLB flush for THP migration as long as possible
THP migration can fail for a variety of reasons. Avoid flushing the TLB
to deal with THP migration races until the copy is ready to start.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:45 +0000 (17:08 -0800)]
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update. As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Thu, 19 Dec 2013 01:08:44 +0000 (17:08 -0800)]
mm: fix TLB flush race between migration, and change_protection_range
There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.
The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.
During that time, a CPU may continue writing to the page.
This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.
When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.
This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).
The basic race looks like this:
CPU A CPU B CPU C
load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data
[*] the old page may belong to a new user at this point!
The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.
This should fix both NUMA migration and compaction.
[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:42 +0000 (17:08 -0800)]
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:41 +0000 (17:08 -0800)]
mm: numa: clear numa hinting information on mprotect
On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:40 +0000 (17:08 -0800)]
sched: numa: skip inaccessible VMAs
Inaccessible VMA should not be trapping NUMA hint faults. Skip them.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:39 +0000 (17:08 -0800)]
mm: numa: avoid unnecessary work on the failure path
If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:38 +0000 (17:08 -0800)]
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:37 +0000 (17:08 -0800)]
mm: numa: do not clear PTE for pte_numa update
The TLB must be flushed if the PTE is updated but change_pte_range is
clearing the PTE while marking PTEs pte_numa without necessarily
flushing the TLB if it reinserts the same entry. Without the flush,
it's conceivable that two processors have different TLBs for the same
virtual address and at the very least it would generate spurious faults.
This patch only unmaps the pages in change_pte_range for a full
protection change.
[riel@redhat.com: write pte_numa pte back to the page tables]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:36 +0000 (17:08 -0800)]
mm: numa: do not clear PMD during PTE update scan
If the PMD is flushed then a parallel fault in handle_mm_fault() will
enter the pmd_none and do_huge_pmd_anonymous_page() path where it'll
attempt to insert a huge zero page. This is wasteful so the patch
avoids clearing the PMD when setting pmd_numa.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:34 +0000 (17:08 -0800)]
mm: clear pmd_numa before invalidating
On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults. The following two page table protection
bits are what defines them
_PAGE_NUMA:set _PAGE_PRESENT:clear
A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present. The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD. This patch keeps the state consistent
when calling pmdp_invalidate.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:33 +0000 (17:08 -0800)]
mm: numa: call MMU notifiers on THP migration
MMU notifiers must be called on THP page migration or secondary MMUs
will get very confused.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:32 +0000 (17:08 -0800)]
mm: numa: serialise parallel get_user_page against THP migration
Base pages are unmapped and flushed from cache and TLB during normal
page migration and replaced with a migration entry that causes any
parallel NUMA hinting fault or gup to block until migration completes.
THP does not unmap pages due to a lack of support for migration entries
at a PMD level. This allows races with get_user_pages and
get_user_pages_fast which commit
3f926ab945b6 ("mm: Close races between
THP migration and PMD numa clearing") made worse by introducing a
pmd_clear_flush().
This patch forces get_user_page (fast and normal) on a pmd_numa page to
go through the slow get_user_page path where it will serialise against
THP migration and properly account for the NUMA hinting fault. On the
migration side the page table lock is taken for each PTE update.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Thu, 19 Dec 2013 01:08:31 +0000 (17:08 -0800)]
kexec: migrate to reboot cpu
Commit
1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
kernel") moved reboot= handling to generic code. In the process it also
removed the code in native_machine_shutdown() which are moving reboot
process to reboot_cpu/cpu0.
I guess that thought must have been that all reboot paths are calling
migrate_to_reboot_cpu(), so we don't need this special handling. But
kexec reboot path (kernel_kexec()) is not calling
migrate_to_reboot_cpu() so above change broke kexec. Now reboot can
happen on non-boot cpu and when INIT is sent in second kerneo to bring
up BP, it brings down the machine.
So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
this problem.
Bisected by WANG Chao.
Reported-by: Matthew Whitehead <mwhitehe@redhat.com>
Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Mon, 16 Dec 2013 14:31:23 +0000 (06:31 -0800)]
ipv6: sit: update mtu check to take care of gso packets
While testing my changes for TSO support in SIT devices,
I was using sit0 tunnel which appears to include nopmtudisc flag.
But using :
ip tun add sittun mode sit remote $REMOTE_IPV4 local $LOCAL_IPV4 \
dev $IFACE
We get a tunnel which rejects too long packets because of the mtu check
which is not yet GSO aware.
erd:~# ip tunnel
sittun: ipv6/ip remote 10.246.17.84 local 10.246.17.83 ttl inherit 6rd-prefix 2002::/16
sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16
This patch is based on an excellent report from
Michal Shmidt.
In the future, we probably want to extend the MTU check to do the
right thing for GSO packets...
Fixes: ("61c1db7fae21 ipv6: sit: add GSO/TSO support")
Reported-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Mon, 16 Dec 2013 11:36:44 +0000 (12:36 +0100)]
ipv6: pmtudisc setting not respected with UFO/CORK
Sockets marked with IPV6_PMTUDISC_PROBE (or later IPV6_PMTUDISC_INTERFACE)
don't respect this setting when the outgoing interface supports UFO.
We had the same problem in IPv4, which was fixed in commit
daba287b299ec7a2c61ae3a714920e90e8396ad5 ("ipv4: fix DO and PROBE pmtu
mode regarding local fragmentation with UFO/CORK").
Also IPV6_DONTFRAG mode did not care about already corked data, thus
it may generate a fragmented frame even if this socket option was
specified. It also did not care about the length of the ipv6 header and
possible options.
In the error path allow the user to receive the pmtu notifications via
both, rxpmtu method or error queue. The user may opted in for both,
so deliver the notification to both error handlers (the handlers check
if the error needs to be enqueued).
Also report back consistent pmtu values when sending on an already
cork-appended socket.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Korsgaard [Mon, 16 Dec 2013 10:35:35 +0000 (11:35 +0100)]
dm9601: work around tx fifo sync issue on dm962x
Certain dm962x revisions contain an bug, where if a USB bulk transfer retry
(E.G. if bulk crc mismatch) happens right after a transfer with odd or
maxpacket length, the internal tx hardware fifo gets out of sync causing
the interface to stop working.
Work around it by adding up to 3 bytes of padding to ensure this situation
cannot trigger.
This workaround also means we never pass multiple-of-maxpacket size skb's
to usbnet, so the length adjustment to handle usbnet's padding of those can
be removed.
Cc: <stable@vger.kernel.org>
Reported-by: Joseph Chang <joseph_chang@davicom.com.tw>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Korsgaard [Mon, 16 Dec 2013 10:35:34 +0000 (11:35 +0100)]
dm9601: make it clear that dm9620/dm9621a are also supported
The driver nowadays also support dm9620/dm9621a based USB 2.0 ethernet
adapters, so adjust module/driver description and Kconfig help text to
match.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Korsgaard [Mon, 16 Dec 2013 10:35:33 +0000 (11:35 +0100)]
dm9601: fix reception of full size ethernet frames on dm9620/dm9621a
dm9620/dm9621a require room for 4 byte padding even in dm9601 (3 byte
header) mode.
Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Korsgaard [Mon, 16 Dec 2013 10:35:32 +0000 (11:35 +0100)]
dm9601: add support for dm9621a based dongle
dm9621a is functionally identical to dm9620, so the existing handling can
directly be used.
Thanks to Davicom for sending me a dongle.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teräs [Mon, 16 Dec 2013 09:02:09 +0000 (11:02 +0200)]
ip_gre: fix msg_name parsing for recvfrom/recvmsg
ipgre_header_parse() needs to parse the tunnel's ip header and it
uses mac_header to locate the iphdr. This got broken when gre tunneling
was refactored as mac_header is no longer updated to point to iphdr.
Introduce skb_pop_mac_header() helper to do the mac_header assignment
and use it in ipgre_rcv() to fix msg_name parsing.
Bug introduced in commit
c54419321455 (GRE: Refactor GRE tunneling code.)
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 18 Dec 2013 22:35:00 +0000 (14:35 -0800)]
Merge tag 'usb-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few USB fixes for things that have people have reported
issues with recently"
* tag 'usb-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: ohci-at91: fix irq and iomem resource retrieval
usb: phy: fix driver dependencies
phy: kconfig: add depends on "USB_PHY" to OMAP_USB2 and TWL4030_USB
drivers: phy: tweaks to phy_create()
drivers: phy: Fix memory leak
xhci: Limit the spurious wakeup fix only to HP machines
usb: chipidea: fix nobody cared IRQ when booting with host role
usb: chipidea: host: Only disable the vbus regulator if it is not NULL
usb: serial: zte_ev: move support for ZTE AC2726 from zte_ev back to option
usb: cdc-wdm: manage_power should always set needs_remote_wakeup
usb: phy-tegra-usb.c: wrong pointer check for remap UTMI
usb: phy: twl6030-usb: signedness bug in twl6030_readb()
usb: dwc3: power off usb phy in error path
usb: dwc3: invoke phy_resume after phy_init
Linus Torvalds [Wed, 18 Dec 2013 22:34:27 +0000 (14:34 -0800)]
Merge tag 'tty-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are a few fixes for 3.13-rc5 that resolve a number of reported
tty and serial driver issues"
* tag 'tty-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: xuartps: Properly guard sysrq specific code
n_tty: Fix apparent order of echoed output
serial: 8250_dw: add new ACPI IDs
serial: 8250_dw: Fix LCR workaround regression
tty: Fix hang at ldsem_down_read()
Linus Torvalds [Wed, 18 Dec 2013 22:33:57 +0000 (14:33 -0800)]
Merge tag 'staging-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a number of staging, and iio, fixes for 3.13-rc5 that resolve
some reported issues"
* tag 'staging-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
imx-drm: imx-drm-core: improve safety of imx_drm_add_crtc()
imx-drm: imx-drm-core: make imx_drm_crtc_register() safer
imx-drm: imx-drm-core: use defined constant for number of CRTCs.
imx-drm: imx-tve: don't call sleeping functions beneath enable_lock spinlock
imx-drm: ipu-v3: fix potential CRTC device registration race
imx-drm: imx-drm-core: fix DRM cleanup paths
imx-drm: imx-drm-core: fix error cleanup path for imx_drm_add_crtc()
staging: comedi: drivers: fix return value of comedi_load_firmware()
staging: comedi: 8255_pci: fix for newer PCI-DIO48H
iio:adc:ad7887 Fix channel reported endianness from cpu to big endian
iio:imu:adis16400 fix pressure channel scan type
staging:iio:mag:hmc5843 fix incorrect endianness of channel as a result of missuse of the IIO_ST macro.
iio: cm36651: Changed return value of read function
Linus Torvalds [Wed, 18 Dec 2013 22:33:23 +0000 (14:33 -0800)]
Merge tag 'driver-core-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here's a single sysfs fix for 3.13-rc5 that resolves a lockdep issue
in sysfs that has been reported"
* tag 'driver-core-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: give different locking key to regular and bin files
Linus Torvalds [Wed, 18 Dec 2013 22:09:08 +0000 (14:09 -0800)]
Merge branch 'keys-devel' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull crypto key patches from David Howells:
"There are four items:
- A patch to fix X.509 certificate gathering. The problem was that I
was coming up with a different path for signing_key.x509 in the
build directory if it didn't exist to if it did exist. This meant
that the X.509 cert container object file would be rebuilt on the
second rebuild in a build directory and the kernel would get
relinked.
- Unconditionally remove files generated by SYSTEM_TRUSTED_KEYRING=y
when doing make mrproper.
- Actually initialise the persistent-keyring semaphore for
init_user_ns. I have no idea why this works at all for users in
the base user namespace unless it's something to do with systemd
containerising the system.
- Documentation for module signing"
* 'keys-devel' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
Add Documentation/module-signing.txt file
KEYS: fix uninitialized persistent_keyring_register_sem
KEYS: Remove files generated when SYSTEM_TRUSTED_KEYRING=y
X.509: Fix certificate gathering
Mugunthan V N [Fri, 13 Dec 2013 13:12:55 +0000 (18:42 +0530)]
drivers: net cpsw: Enable In Band mode in cpsw for 10 mbps
This patch adds support for enabling In Band mode in 10 mbps speed.
RGMII supports 1 Gig and 100 mbps mode for Forced mode of operation.
For 10mbps mode it should be configured to in band mode so that link
status, duplexity and speed are determined from the RGMII input data
stream
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Dec 2013 21:52:34 +0000 (16:52 -0500)]
Merge branch 'bond_locking'
Ding Tianhong says:
====================
Jay Vosburgh said that the bond_3ad_adapter_speed_changed and
bond_3ad_adapter_duplex_changed is called with RTNL only, and
the functions will modify the port's information with no further
locking, they will not mutex against bond state machine and
incoming LACPDU which do not hold RTNL, So I add port lock to
protect the port information.
But they are not critical bugs, they exist since day one, and till
now they have never been hit and reported, because change for speed
and duplex is very rare, and will not occur critical problem.
The comments in the function is very old, cleanup the comments together.
====================
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 13 Dec 2013 09:29:29 +0000 (17:29 +0800)]
bonding: protect port for bond_3ad_handle_link_change()
The bond_3ad_handle_link_change is called with RTNL only,
and the function will modify the port's information with
no further locking, it will not mutex against bond state
machine and incoming LACPDU which do not hold RTNL, So I
add __get_state_machine_lock to protect the port.
But it is not a critical bug, it exist since day one, and till
now it has never been hit and reported, because changes to
speed is very rare, and will not occur critical problem.
The comments in the function is very old, cleanup it and
add a new pr_debug to debug the port message.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 13 Dec 2013 09:29:24 +0000 (17:29 +0800)]
bonding: protect port for bond_3ad_adapter_duplex_changed()
Jay Vosburgh said that the bond_3ad_adapter_duplex_changed is
called with RTNL only, and the function will modify the port's
information with no further locking, it will not mutex against
bond state machine and incoming LACPDU which do not hold RTNL,
So I add __get_state_machine_lock to protect the port.
But it is not a critical bug, it exist since day one, and till
now it has never been hit and reported, because changes to
speed is very rare, and will not occur critical problem.
The comments in the function is very old, cleanup it.
Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Fri, 13 Dec 2013 09:29:19 +0000 (17:29 +0800)]
bonding: protect port for bond_3ad_adapter_speed_changed()
Jay Vosburgh said that the bond_3ad_adapter_speed_changed is
called with RTNL only, and the function will modify the port's
information with no further locking, it will not mutex against
bond state machine and incoming LACPDU which do not hold RTNL,
So I add __get_state_machine_lock to protect the port.
But it is not a critical bug, it exist since day one, and till
now it has never been hit and reported, because changes to
speed is very rare, and will not occur critical problem.
The comment in the function is very old, cleanup it.
Suggested-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Tue, 17 Dec 2013 18:09:32 +0000 (10:09 -0800)]
net_dma: mark broken
net_dma can cause data to be copied to a stale mapping if a
copy-on-write fault occurs during dma. The application sees missing
data.
The following trace is triggered by modifying the kernel to WARN if it
ever triggers copy-on-write on a page that is undergoing dma:
WARNING: CPU: 24 PID: 2529 at lib/dma-debug.c:485 debug_dma_assert_idle+0xd2/0x120()
ioatdma 0000:00:04.0: DMA-API: cpu touching an active dma mapped page [pfn=0x16bcd9]
Modules linked in: iTCO_wdt iTCO_vendor_support ioatdma lpc_ich pcspkr dca
CPU: 24 PID: 2529 Comm: linbug Tainted: G W 3.13.0-rc1+ #353
00000000000001e5 ffff88016f45f688 ffffffff81751041 ffff88017ab0ef70
ffff88016f45f6d8 ffff88016f45f6c8 ffffffff8104ed9c ffffffff810f3646
ffff8801768f4840 0000000000000282 ffff88016f6cca10 00007fa2bb699349
Call Trace:
[<
ffffffff81751041>] dump_stack+0x46/0x58
[<
ffffffff8104ed9c>] warn_slowpath_common+0x8c/0xc0
[<
ffffffff810f3646>] ? ftrace_pid_func+0x26/0x30
[<
ffffffff8104ee86>] warn_slowpath_fmt+0x46/0x50
[<
ffffffff8139c062>] debug_dma_assert_idle+0xd2/0x120
[<
ffffffff81154a40>] do_wp_page+0xd0/0x790
[<
ffffffff811582ac>] handle_mm_fault+0x51c/0xde0
[<
ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
[<
ffffffff8175fc2c>] __do_page_fault+0x19c/0x530
[<
ffffffff8175c196>] ? _raw_spin_lock_bh+0x16/0x40
[<
ffffffff810f3539>] ? trace_clock_local+0x9/0x10
[<
ffffffff810fa1f4>] ? rb_reserve_next_event+0x64/0x310
[<
ffffffffa0014c00>] ? ioat2_dma_prep_memcpy_lock+0x60/0x130 [ioatdma]
[<
ffffffff8175ffce>] do_page_fault+0xe/0x10
[<
ffffffff8175c862>] page_fault+0x22/0x30
[<
ffffffff81643991>] ? __kfree_skb+0x51/0xd0
[<
ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
[<
ffffffff81388ea2>] ? memcpy_toiovec+0x52/0xa0
[<
ffffffff8164770f>] skb_copy_datagram_iovec+0x5f/0x2a0
[<
ffffffff8169d0f4>] tcp_rcv_established+0x674/0x7f0
[<
ffffffff816a68c5>] tcp_v4_do_rcv+0x2e5/0x4a0
[..]
---[ end trace
e30e3b01191b7617 ]---
Mapped at:
[<
ffffffff8139c169>] debug_dma_map_page+0xb9/0x160
[<
ffffffff8142bf47>] dma_async_memcpy_pg_to_pg+0x127/0x210
[<
ffffffff8142cce9>] dma_memcpy_pg_to_iovec+0x119/0x1f0
[<
ffffffff81669d3c>] dma_skb_copy_datagram_iovec+0x11c/0x2b0
[<
ffffffff8169d1ca>] tcp_rcv_established+0x74a/0x7f0:
...the problem is that the receive path falls back to cpu-copy in
several locations and this trace is just one of the areas. A few
options were considered to fix this:
1/ sync all dma whenever a cpu copy branch is taken
2/ modify the page fault handler to hold off while dma is in-flight
Option 1 adds yet more cpu overhead to an "offload" that struggles to compete
with cpu-copy. Option 2 adds checks for behavior that is already documented as
broken when using get_user_pages(). At a minimum a debug mode is warranted to
catch and flag these violations of the dma-api vs get_user_pages().
Thanks to David for his reproducer.
Cc: <stable@vger.kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Reported-by: David Whipple <whipple@securedatainnovations.ch>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Gerhard Sittig [Tue, 10 Dec 2013 09:51:08 +0000 (10:51 +0100)]
powerpc/512x: dts: remove misplaced IRQ spec from 'soc' node (5125)
the 'soc' node in the MPC5125 "tower" board .dts has an '#interrupt-cells'
property although this node is not an interrupt controller
remove this erroneously placed property because starting with v3.13-rc1
lookup and resolution of 'interrupts' specs for peripherals gets misled
(tries to use the 'soc' as the interrupt parent which fails), emits
'no irq domain found' WARN() messages and breaks the boot process
[ best viewed with 'git diff -U5' to have DT node names in the context ]
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: devicetree@vger.kernel.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
John W. Linville [Wed, 18 Dec 2013 18:46:08 +0000 (13:46 -0500)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Will Deacon [Mon, 2 Dec 2013 18:01:30 +0000 (18:01 +0000)]
dma: pl330: ensure DMA descriptors are zero-initialised
I see the following splat with 3.13-rc1 when attempting to perform DMA:
[ 253.004516] Alignment trap: not handling instruction
e1902f9f at [<
c0204b40>]
[ 253.004583] Unhandled fault: alignment exception (0x221) at 0xdfdfdfd7
[ 253.004646] Internal error: : 221 [#1] PREEMPT SMP ARM
[ 253.004691] Modules linked in: dmatest(+) [last unloaded: dmatest]
[ 253.004798] CPU: 0 PID: 671 Comm: kthreadd Not tainted 3.13.0-rc1+ #2
[ 253.004864] task:
df9b0900 ti:
df03e000 task.ti:
df03e000
[ 253.004937] PC is at dmaengine_unmap_put+0x14/0x34
[ 253.005010] LR is at pl330_tasklet+0x3c8/0x550
[ 253.005087] pc : [<
c0204b44>] lr : [<
c0207478>] psr:
a00e0193
[ 253.005087] sp :
df03fe48 ip :
00000000 fp :
df03bf18
[ 253.005178] r10:
bf00e108 r9 :
00000001 r8 :
00000000
[ 253.005245] r7 :
df837040 r6 :
dfb41800 r5 :
df837048 r4 :
df837000
[ 253.005316] r3 :
dfdfdfcf r2 :
dfb41f80 r1 :
df837048 r0 :
dfdfdfd7
[ 253.005384] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 253.005459] Control:
30c5387d Table:
9fb9ba80 DAC:
fffffffd
[ 253.005520] Process kthreadd (pid: 671, stack limit = 0xdf03e248)
This is due to desc->txd.unmap containing garbage (uninitialised memory).
Rather than add another dummy initialisation to _init_desc, instead
ensure that the descriptors are zero-initialised during allocation and
remove the dummy, per-field initialisation.
Cc: Andriy Shevchenko <andriy.shevchenko@intel.com>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Martin Schwidefsky [Wed, 18 Dec 2013 13:36:18 +0000 (14:36 +0100)]
s390/3270: fix allocation of tty3270_screen structure
The tty3270_alloc_screen function is called from tty3270_install with
swapped arguments, the number of columns instead of rows and vice versa.
The number of rows is typically smaller than the number of columns which
makes the screen array too big but the individual cell arrays for the
lines too small. Creating lines longer than the number of rows will
clobber the memory after the end of the cell array.
The fix is simple, call tty3270_alloc_screen with the correct argument
order.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 16 Dec 2013 13:31:26 +0000 (14:31 +0100)]
s390/smp: improve setup of possible cpu mask
Since under z/VM we cannot have more than 64 cpus, make sure the
cpu_possible_mask does not contain more bits.
This avoids wasting memory for dynamic per-cpu allocations if
CONFIG_NR_CPUS is larger than 64.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hui Wang [Wed, 18 Dec 2013 10:09:56 +0000 (18:09 +0800)]
ALSA: hda - Add Dell headset detection quirk for one more laptop model
On the Dell machines with codec whose Subsystem Id is 0x10280640,
no external microphone can be detected when plugging a 3-ring headset.
Using ALC255_FIXUP_DELL1_MIC_NO_PRESENCE can fix this problem.
The codec (Vendor ID: 0x10ec0255) on the machine belongs to alc_269
family.
BugLink: https://bugs.launchpad.net/bugs/1260303
Cc: David Henningsson <david.henningsson@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Bo Shen [Wed, 18 Dec 2013 03:26:23 +0000 (11:26 +0800)]
ASoC: wm8904: fix DSP mode B configuration
When wm8904 work in DSP mode B, we still need to configure it to
work in DSP mode. Or else, it will work in Right Justified mode.
Signed-off-by: Bo Shen <voice.shen@atmel.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
Charles Keepax [Wed, 18 Dec 2013 09:25:49 +0000 (09:25 +0000)]
ASoC: wm_adsp: Add small delay while polling DSP RAM start
Some devices are getting very close to the limit whilst polling the RAM
start, this patch adds a small delay to this loop to give a longer
startup timeout.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
Paul Mackerras [Mon, 16 Dec 2013 02:31:46 +0000 (13:31 +1100)]
KVM: PPC: Book3S HV: Don't drop low-order page address bits
Commit
caaa4c804fae ("KVM: PPC: Book3S HV: Fix physical address
calculations") unfortunately resulted in some low-order address bits
getting dropped in the case where the guest is creating a 4k HPTE
and the host page size is 64k. By getting the low-order bits from
hva rather than gpa we miss out on bits 12 - 15 in this case, since
hva is at page granularity. This puts the missing bits back in.
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Aneesh Kumar K.V [Mon, 11 Nov 2013 13:59:47 +0000 (19:29 +0530)]
powerpc: book3s: kvm: Don't abuse host r2 in exit path
We don't use PACATOC for PR. Avoid updating HOST_R2 with PR
KVM mode when both HV and PR are enabled in the kernel. Without this we
get the below crash
(qemu)
Unable to handle kernel paging request for data at address 0xffffffffffff8310
Faulting instruction address: 0xc00000000001d5a4
cpu 0x2: Vector: 300 (Data Access) at [
c0000001dc53aef0]
pc:
c00000000001d5a4: .vtime_delta.isra.1+0x34/0x1d0
lr:
c00000000001d760: .vtime_account_system+0x20/0x60
sp:
c0000001dc53b170
msr:
8000000000009032
dar:
ffffffffffff8310
dsisr:
40000000
current = 0xc0000001d76c62d0
paca = 0xc00000000fef1100 softe: 0 irq_happened: 0x01
pid = 4472, comm = qemu-system-ppc
enter ? for help
[
c0000001dc53b200]
c00000000001d760 .vtime_account_system+0x20/0x60
[
c0000001dc53b290]
c00000000008d050 .kvmppc_handle_exit_pr+0x60/0xa50
[
c0000001dc53b340]
c00000000008f51c kvm_start_lightweight+0xb4/0xc4
[
c0000001dc53b510]
c00000000008cdf0 .kvmppc_vcpu_run_pr+0x150/0x2e0
[
c0000001dc53b9e0]
c00000000008341c .kvmppc_vcpu_run+0x2c/0x40
[
c0000001dc53ba50]
c000000000080af4 .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[
c0000001dc53bae0]
c00000000007b4c8 .kvm_vcpu_ioctl+0x478/0x730
[
c0000001dc53bca0]
c0000000002140cc .do_vfs_ioctl+0x4ac/0x770
[
c0000001dc53bd80]
c0000000002143e8 .SyS_ioctl+0x58/0xb0
[
c0000001dc53be30]
c000000000009e58 syscall_exit+0x0/0x98
Signed-off-by: Alexander Graf <agraf@suse.de>
Don Skidmore [Fri, 22 Nov 2013 04:27:23 +0000 (04:27 +0000)]
ixgbe: fix for unused variable warning with certain config
If CONFIG_PCI_IOV isn't defined we get an "unused variable" warining so
now wrap the variable declaration like it's usage already was.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ertman [Tue, 17 Dec 2013 04:42:42 +0000 (04:42 +0000)]
e1000e: Fix a compile flag mis-match for suspend/resume
This patch addresses a mis-match between the declaration and usage of
the e1000_suspend and e1000_resume functions. Previously, these
functions were declared in a CONFIG_PM_SLEEP wrapper, and then utilized
within a CONFIG_PM wrapper. Both the declaration and usage will now be
contained within CONFIG_PM wrappers.
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ertman [Sat, 14 Dec 2013 07:30:39 +0000 (07:30 +0000)]
e1000e: fix compiler warning (maybe-unitialized variable)
This patch is to fix a compiler warning of maybe-uininitialized-variable
that is generated from gcc when the -O3 flag is used. In the function
e1000_reset_hw_80003es2lan(), the variable krmn_reg_data is first given
a value by being passed to a register read function as a
pass-by-reference parameter. But, the return value of that read
function was never checked to see if the read failed and the variable
not given an initial value. The compiler was smart enough to spot
this. This patch is to check the return value for that read function
and return it, if an error occurs, without trying to utilize the value
in kmrn_reg_data.
Signed-off-by: David Ertman <davidx.m.ertman@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David Ertman [Sat, 14 Dec 2013 07:18:18 +0000 (07:18 +0000)]
e1000e: fix compiler warnings
This patch is to fix a compiler warning of __bad_udelay due to a value
of >999 being passed as a parameter to udelay() in the function
e1000e_phy_has_link_generic(). This affects the gcc compiler when
it is given a flag of -O3 and the icc compiler.
This patch is also making the change from mdelay() to msleep() in the
same function, since it was determined though code inspection that this
function is never called in atomic context.
Signed-off-by: David Ertman <davidx.m.ertman@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jan Kara [Wed, 18 Dec 2013 05:44:44 +0000 (00:44 -0500)]
ext4: fix deadlock when writing in ENOSPC conditions
Akira-san has been reporting rare deadlocks of his machine when running
xfstests test 269 on ext4 filesystem. The problem turned out to be in
ext4_da_reserve_metadata() and ext4_da_reserve_space() which called
ext4_should_retry_alloc() while holding i_data_sem. Since
ext4_should_retry_alloc() can force a transaction commit, this is a
lock ordering violation and leads to deadlocks.
Fix the problem by just removing the retry loops. These functions should
just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that
function must take care of retrying after dropping all necessary locks.
Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
John Fastabend [Tue, 26 Nov 2013 06:33:52 +0000 (06:33 +0000)]
net: allow netdev_all_upper_get_next_dev_rcu with rtnl lock held
It is useful to be able to walk all upper devices when bringing
a device online where the RTNL lock is held. In this case it
is safe to walk the all_adj_list because the RTNL lock is used
to protect the write side as well.
This patch adds a check to see if the rtnl lock is held before
throwing a warning in netdev_all_upper_get_next_dev_rcu().
Also because we now have a call site for lockdep_rtnl_is_held()
outside COFIG_LOCK_PROVING an inline definition returning 1 is
needed. Similar to the rcu_read_lock_is_held().
Fixes: 2a47fa45d4df ("ixgbe: enable l2 forwarding acceleration for macvlans")
CC: Veaceslav Falico <vfalico@redhat.com>
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Russell King [Mon, 16 Dec 2013 12:39:31 +0000 (12:39 +0000)]
imx-drm: imx-drm-core: improve safety of imx_drm_add_crtc()
We must not add more CRTCs than we have declared to the vblank
helpers, otherwise we overflow their arrays. Force failure if we
exceed the number of CRTCs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 12:39:11 +0000 (12:39 +0000)]
imx-drm: imx-drm-core: make imx_drm_crtc_register() safer
imx_drm_crtc_register() doesn't clean up the CRTC upon failure, which
leaves the CRTC attached to the DRM device. Also, it does setup after
attaching the CRTC to the DRM device.
Fix this by reordering the function such that we do the setup before
drm_crtc_init(): this fixes both issues.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 12:38:50 +0000 (12:38 +0000)]
imx-drm: imx-drm-core: use defined constant for number of CRTCs.
We have this definition, there's no reason not to use it.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 12:38:30 +0000 (12:38 +0000)]
imx-drm: imx-tve: don't call sleeping functions beneath enable_lock spinlock
Enable lock claims that it is serializing tve_enable/disable calls.
However, DRM already serialises mode sets with a mutex, which prevents
encoder/connector functions being called concurrently. Secondly,
holding a spinlock while calling clk_prepare_enable() is wrong; it
will cause a might_sleep() warning should that debugging be enabled.
So, let's just get rid of the enable_lock.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 11:34:25 +0000 (11:34 +0000)]
imx-drm: ipu-v3: fix potential CRTC device registration race
Clean up the IPUv3 CRTC device registration; we don't need a separate
function just to call platform_device_register_data(), and we don't
need the return value converted at all.
Update the IPU client id under a mutex, so that parallel probing
doesn't race.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 11:33:44 +0000 (11:33 +0000)]
imx-drm: imx-drm-core: fix DRM cleanup paths
We must call drm_vblank_cleanup() on the error cleanup and unload paths
after we've had a successful call to drm_vblank_init(). Ensure that
the calls are in the reverse order to the initialisation order.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Mon, 16 Dec 2013 11:33:24 +0000 (11:33 +0000)]
imx-drm: imx-drm-core: fix error cleanup path for imx_drm_add_crtc()
imx_drm_add_crtc() was kfree'ing the imx_drm_crtc structure while
leaving it on the list of CRTCs. Delete it from the list first.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Wed, 18 Dec 2013 00:59:59 +0000 (16:59 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Definitely seems quieter this week,
Radeon, intel, intel broadwell, vmwgfx, ttm, armada, and a couple of
core fixes, one revert in radeon
Most of these are either going to stable or fixes for things
introduced in the merge window"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
drm/edid: add quirk for BPC in Samsung NP700G7A-S01PL notebook
drm/ttm: Fix accesses through vmas with only partial coverage
drm/nouveau: only runtime suspend by default in optimus configuration
drm: don't double-free on driver load error
Revert "drm/radeon: Implement radeon_pci_shutdown"
drm/radeon: add missing display tiling setup for oland
drm/radeon: fix typo in cik_copy_dma
drm/radeon/cik: plug in missing blit callback
drm/radeon/dpm: Fix hwmon crash
drm/radeon: Fix sideport problems on certain RS690 boards
drm/i915: don't update the dri1 breadcrumb with modesetting
DRM: Armada: prime refcounting bug fix
DRM: Armada: fix printing of phys_addr_t/dma_addr_t
DRM: Armada: destroy framebuffer after helper
DRM: Armada: implement lastclose() for fbhelper
drm/i915: Repeat eviction search after idling the GPU
drm/vmwgfx: Add max surface memory param
drm/i915: Fix use-after-free in do_switch
drm/i915: fix pm init ordering
drm/i915: Hold mutex across i915_gem_release
...
Tomi Valkeinen [Mon, 16 Dec 2013 07:14:48 +0000 (09:14 +0200)]
Revert "ARM: OMAP2+: Remove legacy mux code for display.c"
Commit
e30b06f4d5f000c31a7747a7e7ada78a5fd419a1 (ARM: OMAP2+: Remove
legacy mux code for display.c) removed non-DT DSI and HDMI pinmuxing.
However, DSI pinmuxing is still needed, and removing that caused DSI
displays not to work.
This reverts the DSI parts of the commit.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Soren Brinkmann [Mon, 2 Dec 2013 19:38:38 +0000 (11:38 -0800)]
tty: xuartps: Properly guard sysrq specific code
Commit 'tty: xuartps: Implement BREAK detection, add SYSRQ support'
(
0c0c47bc40a2e358d593b2d7fb93b50027fbfc0c) introduced sysrq support
without properly guarding sysrq specific code which results in build
errors when sysrq is disabled:
DNAME=KBUILD_STR(xilinx_uartps)" -c -o
drivers/tty/serial/.tmp_xilinx_uartps.o
drivers/tty/serial/xilinx_uartps.c
drivers/tty/serial/xilinx_uartps.c: In function 'xuartps_isr':
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
drivers/tty/serial/xilinx_uartps.c:247:5: error: 'struct uart_port'
has no member named 'sysrq'
make[3]: *** [drivers/tty/serial/xilinx_uartps.o] Error 1
Reported-by: Masanari Iida <standby24x7@gmail.com>
Cc: Vlad Lungu <vlad.lungu@windriver.com>
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Tue, 17 Dec 2013 23:53:24 +0000 (15:53 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"A quick batch of fixes, including the annoying bad lock stack problem
introduced by udp_sk_rx_dst_set() locking change:
1) Use xchg() instead of sk_dst_lock() in udp_sk_rx_dst_set(), from
Eric Dumazet.
2) qlcnic bug fixes from Himanshu Madhani and Manish Chopra.
3) Update IPSEC MAINTAINERS entry, from Steffen Klassert.
4) Administrative neigh entry changes should generate netlink
notifications the same as event generated ones. From Bob
Gilligan.
5) Netfilter SYNPROXY fixes from Patrick McHardy.
6) Netfilter nft_reject endianness fixes from Eric Leblond"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
qlcnic: Dump mailbox registers when mailbox command times out.
qlcnic: Fix mailbox processing during diagnostic test
qlcnic: Allow firmware dump collection when auto firmware recovery is disabled
qlcnic: Fix memory allocation
qlcnic: Fix TSS/RSS validation for 83xx/84xx series adapter.
qlcnic: Fix TSS/RSS ring validation logic.
qlcnic: Fix diagnostic test for all adapters.
qlcnic: Fix usage of netif_tx_{wake, stop} api during link change.
xen-netback: fix fragments error handling in checksum_setup_ip()
neigh: Netlink notification for administrative NUD state change
ipv4: improve documentation of ip_no_pmtu_disc
net: unix: allow bind to fail on mutex lock
MAINTAINERS: Update the IPsec maintainer entry
udp: ipv4: do not use sk_dst_lock from softirq context
netvsc: don't flush peers notifying work during setting mtu
can: peak_usb: fix mem leak in pcan_usb_pro_init()
can: ems_usb: fix urb leaks on failure paths
sctp: loading sctp when load sctp_probe
netfilter: nft_reject: fix endianness in dump function
netfilter: SYNPROXY target: restrict to INPUT/FORWARD
David S. Miller [Tue, 17 Dec 2013 22:21:30 +0000 (17:21 -0500)]
Merge branch 'fixes-for-3.13' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
this is a pull request with two fixes for net/master, the current release
cycle.
It consists of a patch by Alexey Khoroshilov from the Linux Driver Verification
project, which fixes a memory leak in ems_usb's failure patch. And a patch by
me which fixes a memory leak in the peak usb driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gao feng [Mon, 16 Dec 2013 06:59:22 +0000 (14:59 +0800)]
netfilter: nfnetlink_log: unset nf_loggers for netns when unloading module
Steven Rostedt and Arnaldo Carvalho de Melo reported a panic
when access the files /proc/sys/net/netfilter/nf_log/*.
This problem will occur when we do:
echo nfnetlink_log > /proc/sys/net/netfilter/nf_log/any_file
rmmod nfnetlink_log
and then access the files.
Since the nf_loggers of netns hasn't been unset, it will point
to the memory that has been freed.
This bug is introduced by commit
9368a53c ("netfilter: nfnetlink_log:
add net namespace support for nfnetlink_log").
[17261.822047] BUG: unable to handle kernel paging request at
ffffffffa0d49090
[17261.822056] IP: [<
ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0
[...]
[17261.822226] Call Trace:
[17261.822235] [<
ffffffff81297b98>] ? security_capable+0x18/0x20
[17261.822240] [<
ffffffff8106fa09>] ? ns_capable+0x29/0x50
[17261.822247] [<
ffffffff8163d25f>] ? net_ctl_permissions+0x1f/0x90
[17261.822254] [<
ffffffff81216613>] proc_sys_call_handler+0xb3/0xc0
[17261.822258] [<
ffffffff81216651>] proc_sys_read+0x11/0x20
[17261.822265] [<
ffffffff811a80de>] vfs_read+0x9e/0x170
[17261.822270] [<
ffffffff811a8c09>] SyS_read+0x49/0xa0
[17261.822276] [<
ffffffff810e6496>] ? __audit_syscall_exit+0x1f6/0x2a0
[17261.822283] [<
ffffffff81656e99>] system_call_fastpath+0x16/0x1b
[17261.822285] Code: cc 81 4d 63 e4 4c 89 45 88 48 89 4d 90 e8 19 03 0d 00 4b 8b 84 e5 28 08 00 00 48 8b 4d 90 4c 8b 45 88 48 85 c0 0f 84 a8 00 00 00 <48> 8b 40 10 48 89 43 08 48 89 df 4c 89 f2 31 f6 e8 4b 35 af ff
[17261.822329] RIP [<
ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0
[17261.822334] RSP <
ffff880274d3fe28>
[17261.822336] CR2:
ffffffffa0d49090
[17261.822340] ---[ end trace
a14ce54c0897a90d ]---
Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Tue, 17 Dec 2013 21:25:24 +0000 (16:25 -0500)]
Merge branch 'qlcnic'
Himanshu Madhani says:
====================
qlcnic: Bug fixes.
This series contains bug fixes for mailbox handling and multi Tx queue support
for all supported adapters.
changes from v1 -> v2
o updated patch to fix usage of netif_tx_{wake,stop} api during link change
as per David Miller's suggestion.
o Dropped patch to use spinklock per tx queue for more work.
o Added reworked patch for memory allocation failures.
o Added patch to allow capturing of dump, when auto recovery is disabled in firmware.
o Added patches for mailbox interrupt handling and debugging data for mailbox failure.
Please apply to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish chopra [Mon, 16 Dec 2013 20:37:03 +0000 (15:37 -0500)]
qlcnic: Dump mailbox registers when mailbox command times out.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This page took 0.146658 seconds and 4 git commands to generate.