Kent Overstreet [Tue, 11 Feb 2014 01:45:50 +0000 (17:45 -0800)]
block: Fix cloning of discard/write same bios
Immutable biovecs changed the way bio segments are treated in such a way that
bio_for_each_segment() cannot now do what we want for discard/write same bios,
since bi_size means something completely different for them.
Fortunately discard and write same bios never have more than a single biovec, so
bio_for_each_segment() is unnecessary and not terribly meaningful for them, but
we still have to special case them in a few places.
Paul Bolle [Sat, 8 Feb 2014 22:35:43 +0000 (23:35 +0100)]
ia64/xen: Remove Xen support for ia64 even more
Commit d52eefb47d4e ("ia64/xen: Remove Xen support for ia64") removed
the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending
on that symbol. Remove that code now.
The kernel-side VCPU binding was not being correctly set for newly
allocated or bound interdomain events. In ARM guests where 2-level
events were used, this would result in no interdomain events being
handled because the kernel-side VCPU masks would all be clear.
x86 guests would work because the irq affinity was set during irq
setup and this would set the correct kernel-side VCPU binding.
Fix this by properly initializing the kernel-side VCPU binding in
bind_evtchn_to_irq().
Tomi Valkeinen [Tue, 28 Jan 2014 06:50:47 +0000 (08:50 +0200)]
OMAPDSS: fix fck field types
'fck' field in dpi and sdi clock calculation struct is 'unsigned long
long', even though it should be 'unsigned long'. This hasn't caused any
issues so far.
Tomi Valkeinen [Mon, 27 Jan 2014 09:29:53 +0000 (11:29 +0200)]
OMAPDSS: DISPC: decimation rounding fix
The driver uses DIV_ROUND_UP when calculating decimated width & height.
For example, when decimating with 3, the width is calculated as:
width = DIV_ROUND_UP(width, decim_x);
This yields bad results for some values. For example, 800/3=266.666...,
which is rounded to 267. When the input width is set to 267, and pixel
increment is set to 3, this causes the dispc to read a line of 801
pixels, i.e. it reads a wrong pixel at the end of the line.
Even more pressing, the above rounding causes a BUG() in pixinc(), as
the value of 801 is used to calculate row increment, leading to a bad
value being passed to pixinc().
This patch fixes the decimation by removing the DIV_ROUND_UP()s when
calculating width and height for decimation.
Paul Gortmaker [Mon, 10 Feb 2014 18:39:53 +0000 (13:39 -0500)]
genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
In allmodconfig builds for sparc and any other arch which does
not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:
CC [M] lib/cpu-notifier-error-inject.o
CC [M] lib/pm-notifier-error-inject.o
ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
make[2]: *** [__modpost] Error 1
This happens because commit 3911ff30f5 ("genirq: export
handle_edge_irq() and irq_to_desc()") added one export for it, but
there were actually two instances of it, in an if/else clause for
CONFIG_SPARSE_IRQ. Add the second one.
powerpc/powernv: Add iommu DMA bypass support for IODA2
This patch adds the support for to create a direct iommu "bypass"
window on IODA2 bridges (such as Power8) allowing to bypass iommu
page translation completely for 64-bit DMA capable devices, thus
significantly improving DMA performances.
Additionally, this adds a hook to the struct iommu_table so that
the IOMMU API / VFIO can disable the bypass when external ownership
is requested, since in that case, the device will be used by an
environment such as userspace or a KVM guest which must not be
allowed to bypass translations.
Dave Airlie [Tue, 11 Feb 2014 02:57:27 +0000 (12:57 +1000)]
Merge tag 'drm-intel-fixes-2014-02-06' of ssh://git.freedesktop.org/git/drm-intel into drm-next
Just minor stuff really, on vlv dp fix and two patches to tune down some
opregion sanity check. Plus MAINTAINERS update for the new git repo, which
is the only reason I've really bothered with this pull request.
* tag 'drm-intel-fixes-2014-02-06' of ssh://git.freedesktop.org/git/drm-intel:
drm/i915: demote opregion excessive timeout WARN_ONCE to DRM_INFO_ONCE
drm: add DRM_INFO_ONCE() to print a one-time DRM_INFO() message
MAINTAINERS: Update drm/i915 git repo
drm/i915: vlv: fix DP PHY lockup due to invalid PP sequencer setup
Dave Airlie [Tue, 11 Feb 2014 02:56:57 +0000 (12:56 +1000)]
Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
This pull request fixes memory leak issue in exynos_drm_open() and
multiplatform breakage for ipp/gsc. And also including some cleanups.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: Convert to use the standard hdmi.h header
drm/exynos: Fix trivial typo
drm/exynos: Remove unnecessary semicolon
drm/exynos: Fix multiplatform breakage for ipp/gsc
drm/exynos: Fix freeing issues in exynos_drm_drv.c
Dave Airlie [Tue, 11 Feb 2014 02:56:17 +0000 (12:56 +1000)]
Merge branch 'msm-next' of git://people.freedesktop.org/~robclark/linux into drm-next
Compared to original fixes pull req that I sent yesterday, this adds
one more fix that I found for a synchronization issue which starts to
crop up when we use XA in DDX for 2d accel on 3d core. In particular,
accelerating presentation blit triggers this problem.
* 'msm-next' of git://people.freedesktop.org/~robclark/linux:
drm/msm: bigger synchronization hammer
drm/msm: fix deadlock in bo create fail path
drm/msm/mdp4: cursor fixes
drm/msm/mdp4: pageflip fixes
drm/msm/mdp5: fix ref leaks in error paths
drm/msm: fix inconsequential typo
Eric Dumazet [Mon, 10 Feb 2014 19:42:35 +0000 (11:42 -0800)]
6lowpan: fix lockdep splats
When a device ndo_start_xmit() calls again dev_queue_xmit(),
lockdep can complain because dev_queue_xmit() is re-entered and the
spinlocks protecting tx queues share a common lockdep class.
Same issue was fixed for bonding/l2tp/ppp in commits
0daa2303028a6 ("[PATCH] bonding: lockdep annotation") 49ee49202b4ac ("bonding: set qdisc_tx_busylock to avoid LOCKDEP splat") 23d3b8bfb8eb2 ("net: qdisc busylock needs lockdep annotations ") 303c07db487be ("ppp: set qdisc_tx_busylock to avoid LOCKDEP splat ")
John Greene [Mon, 10 Feb 2014 19:34:04 +0000 (14:34 -0500)]
alx: add missing stats_lock spinlock init
Trivial fix for init time stack trace occuring in
alx_get_stats64 upon start up. Should have been part of
commit adding the spinlock: f1b6b106 alx: add alx_get_stats64 operation
Richard Yao [Sun, 9 Feb 2014 00:32:01 +0000 (19:32 -0500)]
9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
The 9p-virtio transport does zero copy on things larger than 1024 bytes
in size. It accomplishes this by returning the physical addresses of
pages to the virtio-pci device. At present, the translation is usually a
bit shift.
That approach produces an invalid page address when we read/write to
vmalloc buffers, such as those used for Linux kernel modules. Any
attempt to load a Linux kernel module from 9p-virtio produces the
following stack.
qemu-system-x86_64: virtio: trying to map MMIO memory
This patch enables 9p-virtio to correctly handle this case. This not
only enables us to load Linux kernel modules off virtfs, but also
enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
Special thanks to both Avi Kivity and Alexander Graf for their
interpretation of QEMU backtraces. Without their guidence, tracking down
this bug would have taken much longer. Also, special thanks to Linus
Torvalds for his insightful explanation of why this should use
is_vmalloc_addr() instead of is_vmalloc_or_module_addr():
dingtianhong [Mon, 10 Feb 2014 08:33:59 +0000 (16:33 +0800)]
bonding: remove unwanted bond lock for enslave processing
The bond enslave processing don't hold bond->lock anymore,
so release an unlocked rw lock will cause warning message,
remove the unwanted read_unlock(&bond->lock).
Kevin Hao [Wed, 29 Jan 2014 10:24:54 +0000 (18:24 +0800)]
powerpc/ppc32: Fix the bug in the init of non-base exception stack for UP
We would allocate one specific exception stack for each kind of
non-base exceptions for every CPU. For ppc32 the CPU hard ID is
used as the subscript to get the specific exception stack for
one CPU. But for an UP kernel, there is only one element in the
each kind of exception stack array. We would get stuck if the
CPU hard ID is not equal to '0'. So in this case we should use the
subscript '0' no matter what the CPU hard ID is.
Michael Ellerman [Mon, 23 Dec 2013 12:46:06 +0000 (23:46 +1100)]
powerpc/xmon: Don't signal we've entered until we're finished printing
Currently we set our cpu's bit in cpus_in_xmon, and then we take the
output lock and print the exception information.
This can race with the master cpu entering the command loop and printing
the backtrace. The result is that the backtrace gets garbled with
another cpu's exception print out.
Fix it by delaying the set of cpus_in_xmon until we are finished
printing.
Michael Ellerman [Mon, 23 Dec 2013 12:46:05 +0000 (23:46 +1100)]
powerpc/xmon: Fix timeout loop in get_output_lock()
As far as I can tell, our 70s era timeout loop in get_output_lock() is
generating no code.
This leads to the hostile takeover happening more or less simultaneously
on all cpus. The result is "interesting", some example output that is
more readable than most:
Fix it by using udelay() in the timeout loop. The wait time and check
frequency are arbitrary, but seem to work OK. We already rely on
udelay() working so this is not a new dependency.
Michael Ellerman [Mon, 23 Dec 2013 12:46:04 +0000 (23:46 +1100)]
powerpc/xmon: Don't loop forever in get_output_lock()
If we enter with xmon_speaker != 0 we skip the first cmpxchg(), we also
skip the while loop because xmon_speaker != last_speaker (0) - meaning we
skip the second cmpxchg() also.
Following that code path the compiler sees no memory barriers and so is
within its rights to never reload xmon_speaker. The end result is we loop
forever.
This manifests as all cpus being in xmon ('c' command), but they refuse
to take control when you switch to them ('c x' for cpu # x).
I have seen this deadlock in practice and also checked the generated code to
confirm this is what's happening.
The simplest fix is just to always try the cmpxchg().
powerpc/perf: Configure BHRB filter before enabling PMU interrupts
Right now the config_bhrb() PMU specific call happens after
write_mmcr0(), which actually enables the PMU for event counting and
interrupts. So there is a small window of time where the PMU and BHRB
runs without the required HW branch filter (if any) enabled in BHRB.
This can cause some of the branch samples to be collected through BHRB
without any filter applied and hence affects the correctness of
the results. This patch moves the BHRB config function call before
enabling interrupts.
Here are some data points captured via trace prints which depicts how we
could get PMU interrupts with BHRB filter NOT enabled with a standard
perf record command line (asking for branch record information as well).
All the PMU interrupts have the requested HW BHRB branch filter
enabled in MMCRA.
Signed-off-by: Anshuman Khandual <[email protected]>
[mpe: Fixed up whitespace and cleaned up changelog] Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Nathan Fontenot [Wed, 29 Jan 2014 16:34:56 +0000 (10:34 -0600)]
crypto/nx/nx-842: Fix handling of vmalloc addresses
The powerpc specific nx-842 compression driver does not currently
handle translating a vmalloc address to a physical address.
The current driver uses __pa() for all addresses which does not
properly handle vmalloc addresses and thus causes a failure since
we do not pass a proper physical address to the hypervisor.
This patch adds a routine to convert an address to a physical
address by checking for vmalloc addresses and handling them properly.
Laurent Dufour [Thu, 30 Jan 2014 15:58:42 +0000 (16:58 +0100)]
powerpc/relocate fix relocate processing in LE mode
Relocation's code is not working in little endian mode because the r_info
field, which is a 64 bits value, should be read from the right offset.
The current code is optimized to read the r_info field as a 32 bits value
starting at the middle of the double word (offset 12). When running in LE
mode, the read value is not correct since only the MSB is read.
This patch removes this optimization which consist to deal with a 32 bits
value instead of a 64 bits one. This way it works in big and little endian
mode.
powerpc: Fix kdump hang issue on p8 with relocation on exception enabled.
On p8 systems, with relocation on exception feature enabled we are seeing
kdump kernel hang at interrupt vector 0xc*4400. The reason is, with this
feature enabled, exception are raised with MMU (IR=DR=1) ON with the
default offset of 0xc*4000. Since exception is raised in virtual mode it
requires the vector region to be executable without which it fails to
fetch and execute instruction at 0xc*4xxx. For default kernel since kernel
is loaded at real 0, the htab mappings sets the entire kernel text region
executable. But for relocatable kernel (e.g. kdump case) we only copy
interrupt vectors down to real 0 and never marked that region as
executable because in p7 and below we always get exception in real mode.
This patch fixes this issue by marking htab mapping range as executable
that overlaps with the interrupt vector region for relocatable kernel.
Thanks to Ben who helped me to debug this issue and find the root cause.
powerpc/pseries: Disable relocation on exception while going down during crash.
Disable relocation on exception while going down even in kdump case. This
is because we are about clear htab mappings while kexec-ing into kdump
kernel and we may run into issues if we still have AIL ON.
powerpc/eeh: Drop taken reference to driver on eeh_rmv_device
Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 ("powerpc/eeh: Use
partial hotplug for EEH unaware drivers") introduces eeh_rmv_device,
which may grab a reference to a driver, but not release it.
That prevents a driver from being removed after it has gone through EEH
recovery.
CC arch/powerpc/sysdev/mpic.o
arch/powerpc/sysdev/mpic.c: In function 'mpic_set_irq_type':
arch/powerpc/sysdev/mpic.c:886:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:890:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:894:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:898:9: error: case label does not reduce to an integer constant
Looking at the cpp output (gcc 4.7.3), I see:
case mpic->hw_set[MPIC_IDX_VECPRI_SENSE_EDGE] |
mpic->hw_set[MPIC_IDX_VECPRI_POLARITY_POSITIVE]:
The pointer into an array appears because CONFIG_MPIC_WEIRD=y is set
for this platform, thus enabling the following:
Here we convert the case section to if/else if, and also add
the equivalent of a default case to warn about unknown types.
Boot tested on sbc8548, build tested on all defconfigs.
Linus Torvalds [Tue, 11 Feb 2014 00:03:16 +0000 (16:03 -0800)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge misc fixes from Andrew Morton:
"A bunch of fixes"
* emailed patches fron Andrew Morton <[email protected]>:
ocfs2: check existence of old dentry in ocfs2_link()
ocfs2: update inode size after zeroing the hole
ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size
mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED
smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls
mm: fix page leak at nfs_symlink()
slub: do not assert not having lock in removing freed partial
gitignore: add all.config
ocfs2: fix ocfs2_sync_file() if filesystem is readonly
drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zero
fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem
xen: properly account for _PAGE_NUMA during xen pte translations
mm/slub.c: list_lock may not be held in some circumstances
drivers/md/bcache/extents.c: use %zi to format size_t
vmcore: prevent PT_NOTE p_memsz overflow during header update
drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS)
Documentation/: update 00-INDEX files
checkpatch: fix detection of git repository
get_maintainer: fix detection of git repository
drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context()
Xue jiufei [Mon, 10 Feb 2014 22:25:54 +0000 (14:25 -0800)]
ocfs2: check existence of old dentry in ocfs2_link()
System call linkat first calls user_path_at(), check the existence of
old dentry, and then calls vfs_link()->ocfs2_link() to do the actual
work. There may exist a race when Node A create a hard link for file
while node B rm it.
Node A Node B
user_path_at()
->ocfs2_lookup(),
find old dentry exist
rm file, add inode say inodeA
to orphan_dir
call ocfs2_link(),create a
hard link for inodeA.
rm the link, add inodeA to orphan_dir
again
When orphan_scan work start, it calls ocfs2_queue_orphans() to do the
main work. It first tranverses entrys in orphan_dir, linking all inodes
in this orphan_dir to a list look like this:
inodeA->inodeB->...->inodeA
When tranvering this list, it will fall into loop, calling iput() again
and again. And finally trigger BUG_ON(inode->i_state & I_CLEAR).
Junxiao Bi [Mon, 10 Feb 2014 22:25:53 +0000 (14:25 -0800)]
ocfs2: update inode size after zeroing the hole
fs-writeback will release the dirty pages without page lock whose offset
are over inode size, the release happens at
block_write_full_page_endio(). If not update, dirty pages in file holes
may be released before flushed to the disk, then file holes will contain
some non-zero data, this will cause sparse file md5sum error.
To reproduce the bug, find a big sparse file with many holes, like vm
image file, its actual size should be bigger than available mem size to
make writeback work more frequently, tar it with -S option, then keep
untar it and check its md5sum again and again until you get a wrong
md5sum.
Younger Liu [Mon, 10 Feb 2014 22:25:51 +0000 (14:25 -0800)]
ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size
The issue scenario is as following:
- Create a small file and fallocate a large disk space for a file with
FALLOC_FL_KEEP_SIZE option.
- ftruncate the file back to the original size again. but the disk free
space is not changed back. This is a real bug that be fixed in this
patch.
In order to solve the issue above, we modified ocfs2_setattr(), if
attr->ia_size != i_size_read(inode), It calls ocfs2_truncate_file(), and
truncate disk space to attr->ia_size.
Naoya Horiguchi [Mon, 10 Feb 2014 22:25:50 +0000 (14:25 -0800)]
mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED
mce-test detected a test failure when injecting error to a thp tail
page. This is because we take page refcount of the tail page in
madvise_hwpoison() while the fix in commit a3e0f9e47d5e
("mm/memory-failure.c: transfer page count from head page to tail page
after split thp") assumes that we always take refcount on the head page.
When a real memory error happens we take refcount on the head page where
memory_failure() is called without MF_COUNT_INCREASED set, so it seems
to me that testing memory error on thp tail page using madvise makes
little sense.
This patch cancels moving refcount in !MF_COUNT_INCREASED for valid
testing.
Paul Gortmaker [Mon, 10 Feb 2014 22:25:49 +0000 (14:25 -0800)]
smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls
Use what we already do for arch_disable_smp_support() to fix these:
arch/x86/kernel/smpboot.c:1155:6: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
arch/x86/kernel/smpboot.c:1160:6: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
kernel/cpu.c:512:13: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
kernel/cpu.c:516:13: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
Rafael Aquini [Mon, 10 Feb 2014 22:25:48 +0000 (14:25 -0800)]
mm: fix page leak at nfs_symlink()
Changes in commit a0b8cab3b9b2 ("mm: remove lru parameter from
__pagevec_lru_add and remove parts of pagevec API") have introduced a
call to add_to_page_cache_lru() which causes a leak in nfs_symlink() as
now the page gets an extra refcount that is not dropped.
Jan Stancek observed and reported the leak effect while running test8
from Connectathon Testsuite. After several iterations over the test
case, which creates several symlinks on a NFS mountpoint, the test
system was quickly getting into an out-of-memory scenario.
This patch fixes the page leak by dropping that extra refcount
add_to_page_cache_lru() is grabbing.
Steven Rostedt [Mon, 10 Feb 2014 22:25:46 +0000 (14:25 -0800)]
slub: do not assert not having lock in removing freed partial
Vladimir reported the following issue:
Commit c65c1877bd68 ("slub: use lockdep_assert_held") requires
remove_partial() to be called with n->list_lock held, but free_partial()
called from kmem_cache_close() on cache destruction does not follow this
rule, leading to a warning:
His solution was to add a spinlock in order to quiet lockdep. Although
there would be no contention to adding the lock, that lock also requires
disabling of interrupts which will have a larger impact on the system.
Instead of adding a spinlock to a location where it is not needed for
lockdep, make a __remove_partial() function that does not test if the
list_lock is held, as no one should have it due to it being freed.
Also added a __add_partial() function that does not do the lock
validation either, as it is not needed for the creation of the cache.
fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem
Recently due to a spike in connections per second memcached on 3
separate boxes triggered the OOM killer from accept. At the time the
OOM killer was triggered there was 4GB out of 36GB free in zone 1. The
problem was that alloc_fdtable was allocating an order 3 page (32KiB) to
hold a bitmap, and there was sufficient fragmentation that the largest
page available was 8KiB.
I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious
but I do agree that order 3 allocations are very likely to succeed.
There are always pathologies where order > 0 allocations can fail when
there are copious amounts of free memory available. Using the pigeon
hole principle it is easy to show that it requires 1 page more than 50%
of the pages being free to guarantee an order 1 (8KiB) allocation will
succeed, 1 page more than 75% of the pages being free to guarantee an
order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of
the pages being free to guarantee an order 3 allocate will succeed.
A server churning memory with a lot of small requests and replies like
memcached is a common case that if anything can will skew the odds
against large pages being available.
Therefore let's not give external applications a practical way to kill
linux server applications, and specify __GFP_NORETRY to the kmalloc in
alloc_fdmem. Unless I am misreading the code and by the time the code
reaches should_alloc_retry in __alloc_pages_slowpath (where
__GFP_NORETRY becomes signification). We have already tried everything
reasonable to allocate a page and the only thing left to do is wait. So
not waiting and falling back to vmalloc immediately seems like the
reasonable thing to do even if there wasn't a chance of triggering the
OOM killer.
Mel Gorman [Mon, 10 Feb 2014 22:25:40 +0000 (14:25 -0800)]
xen: properly account for _PAGE_NUMA during xen pte translations
Steven Noonan forwarded a users report where they had a problem starting
vsftpd on a Xen paravirtualized guest, with this in dmesg:
BUG: Bad page map in process vsftpd pte:8000000493b88165 pmd:e9cc01067
page:ffffea00124ee200 count:0 mapcount:-1 mapping: (null) index:0x0
page flags: 0x2ffc0000000014(referenced|dirty)
addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping: (null) index:7f97eea74
CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1
Call Trace:
dump_stack+0x45/0x56
print_bad_pte+0x22e/0x250
unmap_single_vma+0x583/0x890
unmap_vmas+0x65/0x90
exit_mmap+0xc5/0x170
mmput+0x65/0x100
do_exit+0x393/0x9e0
do_group_exit+0xcc/0x140
SyS_exit_group+0x14/0x20
system_call_fastpath+0x1a/0x1f
Disabling lock debugging due to kernel taint
BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1
BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1
The issue could not be reproduced under an HVM instance with the same
kernel, so it appears to be exclusive to paravirtual Xen guests. He
bisected the problem to commit 1667918b6483 ("mm: numa: clear numa
hinting information on mprotect") that was also included in 3.12-stable.
The problem was related to how xen translates ptes because it was not
accounting for the _PAGE_NUMA bit. This patch splits pte_present to add
a pteval_present helper for use by xen so both bare metal and xen use
the same code when checking if a PTE is present.
David Rientjes [Mon, 10 Feb 2014 22:25:39 +0000 (14:25 -0800)]
mm/slub.c: list_lock may not be held in some circumstances
Commit c65c1877bd68 ("slub: use lockdep_assert_held") incorrectly
required that add_full() and remove_full() hold n->list_lock. The lock
is only taken when kmem_cache_debug(s), since that's the only time it
actually does anything.
Require that the lock only be taken under such a condition.
drivers/md/bcache/extents.c: use %zi to format size_t
drivers/md/bcache/extents.c: In function `btree_ptr_bad_expensive':
drivers/md/bcache/extents.c:196: warning: format `%li' expects type `long int', but argument 4 has type `size_t'
Greg Pearson [Mon, 10 Feb 2014 22:25:36 +0000 (14:25 -0800)]
vmcore: prevent PT_NOTE p_memsz overflow during header update
Currently, update_note_header_size_elf64() and
update_note_header_size_elf32() will add the size of a PT_NOTE entry to
real_sz even if that causes real_sz to exceeds max_sz. This patch
corrects the while loop logic in those routines to ensure that does not
happen and prints a warning if a PT_NOTE entry is dropped. If zero
PT_NOTE entries are found or this condition is encountered because the
only entry was dropped, a warning is printed and an error is returned.
One possible negative side effect of exceeding the max_sz limit is an
allocation failure in merge_note_headers_elf64() or
merge_note_headers_elf32() which would produce console output such as
the following while booting the crash kernel.
kdump: dump target is /dev/sda4
kdump: saving to /sysroot//var/crash/127.0.0.1-2014.01.28-13:58:52/
kdump: saving vmcore-dmesg.txt
Cannot open /proc/vmcore: No such file or directory
kdump: saving vmcore-dmesg.txt failed
kdump: saving vmcore
kdump: saving vmcore failed
This type of failure has been seen on a four socket prototype system
with certain memory configurations. Most PT_NOTE sections have a single
entry similar to:
n_namesz = 0x5
n_descsz = 0x150
n_type = 0x1
Occasionally, a second entry is encountered with very large n_namesz and
n_descsz sizes:
drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS)
i2o_cfg_compat_ioctl(I2OGETIOPS) locks i2o_cfg_mutex and then calls
i2o_cfg_ioctl(I2OGETIOPS) that locks i2o_cfg_mutex as well. A deadlock
is guaranteed.
Found by Linux Driver Verification project (linuxtesting.org).
Henrik Austad [Mon, 10 Feb 2014 22:25:33 +0000 (14:25 -0800)]
Documentation/: update 00-INDEX files
Some of the 00-INDEX files are somewhat outdated and some folders does
not contain 00-INDEX at all. Only outdated (with the notably exception
of spi) indexes are touched here, the 169 folders without 00-INDEX has
not been touched.
New 00-INDEX
- spi/* was added in a series of commits dating back to 2006
Added files (missing in (*/)00-INDEX)
- dmatest.txt was added by commit 851b7e16a07d ("dmatest: run test via
debugfs")
- this_cpu_ops.txt was added by commit a1b2a555d637 ("percpu: add
documentation on this_cpu operations")
- ww-mutex-design.txt was added by commit 040a0a371005 ("mutex: Add
support for wound/wait style locks")
- bcache.txt was added by commit cafe56359144 ("bcache: A block layer
cache")
- kernel-per-CPU-kthreads.txt was added by commit 49717cb40410
("kthread: Document ways of reducing OS jitter due to per-CPU
kthreads")
- phy.txt was added by commit ff764963479a ("drivers: phy: add generic
PHY framework")
- block/null_blk was added by commit 12f8f4fc0314 ("null_blk:
documentation")
- module-signing.txt was added by commit 3cafea307642 ("Add
Documentation/module-signing.txt file")
- assoc_array.txt was added by commit 3cb989501c26 ("Add a generic
associative array implementation.")
- arm/IXP4xx was part of the initial repo
- arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28e839
("ARM: mcpm: introduce helpers for platform coherency exit/setup")
- arm/firmware.txt was added by commit 7366b92a77fc ("ARM: Add
interface for registering and calling firmware-specific operations")
- arm/kernel_mode_neon.txt was added by commit 2afd0a05241d ("ARM:
7825/1: document the use of NEON in kernel mode")
- arm/tcm.txt was added by commit bc581770cfdd ("ARM: 5580/2: ARM TCM
(Tightly-Coupled Memory) support v3")
- arm/vlocks.txt was added by commit 9762f12d3e05 ("ARM: mcpm: Add
baremetal voting mutexes")
- blackfin/gptimers-example.c, Makefile was added by commit 4b60779d5ea7 ("Blackfin: add an example showing how to use the
gptimers API")
- devicetree/usage-model.txt was added by commit 31134efc681a ("dt:
Linux DT usage model documentation")
- fb/api.txt was added by commit fb21c2f42879 ("fbdev: Add FOURCC-based
format configuration API")
- fb/sm501.txt was added by commit e6a049807105 ("video, sm501: add
edid and commandline support")
- fb/udlfb.txt was added by commit 96f8d864afd6 ("fbdev: move udlfb out
of staging.")
- filesystems/Makefile was added by commit 1e0051ae48a2
("Documentation/fs/: split txt and source files")
- filesystems/nfs/nfsd-admin-interfaces.txt was added by commit 8a4c6e19cfed ("nfsd: document kernel interfaces for nfsd
configuration")
- ide/warm-plug-howto.txt was added by commit f74c91413ec6 ("ide: add
warm-plug support for IDE devices (take 2)")
- laptops/Makefile was added by commit d49129accc21
("Documentation/laptop/: split txt and source files")
- leds/leds-blinkm.txt was added by commit b54cf35a7f65 ("LEDS: add
BlinkM RGB LED driver, documentation and update MAINTAINERS")
- leds/ledtrig-oneshot.txt was added by commit 5e417281cde2 ("leds: add
oneshot trigger")
- leds/ledtrig-transient.txt was added by commit 44e1e9f8e705 ("leds:
add new transient trigger for one shot timer activation")
- m68k/README.buddha was part of the initial repo
- networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits 40839129f779, c4e84bde1d59, 5a4faa873782
- networking/Makefile was added by commit 3794f3e812ef ("docsrc: build
Documentation/ sources")
- networking/i40evf.txt was added by commit 105bf2fe6b32 ("i40evf: add
driver to kernel build system")
- networking/ipsec.txt was added by commit b3c6efbc36e2 ("xfrm: Add
file to document IPsec corner case")
- networking/mac80211-auth-assoc-deauth.txt was added by commit 3cd7920a2be8 ("mac80211: add auth/assoc/deauth flow diagram")
- networking/netlink_mmap.txt was added by commit 5683264c3981
("netlink: add documentation for memory mapped I/O")
- networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e1597f
("netfilter: doc: add nf_conntrack sysctl api documentation") lan)
- networking/team.txt was added by commit 3d249d4ca7d0 ("net: introduce
ethernet teaming device")
- networking/vxlan.txt was added by commit d342894c5d2f ("vxlan:
virtual extensible lan")
- power/runtime_pm.txt was added by commit 5e928f77a09a ("PM: Introduce
core framework for run-time PM of I/O devices (rev. 17)")
- power/charger-manager.txt was added by commit 3bb3dbbd56ea
("power_supply: Add initial Charger-Manager driver")
- RCU/lockdep-splat.txt was added by commit d7bd2d68aa2e ("rcu:
Document interpretation of RCU-lockdep splats")
- s390/kvm.txt was added by 5ecee4b (KVM: s390: API documentation)
- s390/qeth.txt was added by commit b4d72c08b358 ("qeth: bridgeport
support - basic control")
- scheduler/sched-bwc.txt was added by commit 88ebc08ea9f7 ("sched: Add
documentation for bandwidth control")
- scsi/advansys.txt was added by commit 4bd6d7f35661 ("[SCSI] advansys:
Move documentation to Documentation/scsi")
- scsi/bfa.txt was added by commit 1ec90174bdb4 ("[SCSI] bfa: add
readme file")
- scsi/bnx2fc.txt was added by commit 12b8fc10eaf4 ("[SCSI] bnx2fc: Add
driver documentation")
- scsi/cxgb3i.txt was added by commit c3673464ebc0 ("[SCSI] cxgb3i: Add
cxgb3i iSCSI driver.")
- scsi/hpsa.txt was added by commit 992ebcf14f3c ("[SCSI] hpsa: Add
hpsa.txt to Documentation/scsi")
- scsi/link_power_management_policy.txt was added by commit ca77329fb713 ("[libata] Link power management infrastructure")
- scsi/osd.txt was added by commit 78e0c621deca ("[SCSI] osd:
Documentation for OSD library")
- scsi/scsi-parameter.txt was created/moved by commit 163475fb111c
("Documentation: move SCSI parameters to their own text file")
- serial/driver was part of the initial repo
- serial/n_gsm.txt was added by commit 323e84122ec6 ("n_gsm: add a
documentation")
- timers/Makefile was added by commit 3794f3e812ef ("docsrc: build
Documentation/ sources")
- virt/kvm/s390.txt was added by commit d9101fca3d57 ("KVM: s390:
diagnose call documentation")
- vm/split_page_table_lock was added by commit 49076ec2ccaf ("mm:
dynamically allocate page->ptl if it cannot be embedded to struct
page")
- w1/slaves/w1_ds28e04 was added by commit fbf7f7b4e2ae ("w1: Add
1-wire slave device driver for DS28E04-100")
- w1/masters/omap-hdq was added by commit e0a29382c6f5 ("hdq:
documentation for OMAP HDQ")
- x86/early-microcode.txt was added by commit 0d91ea86a895 ("x86, doc:
Documentation for early microcode loading")
- x86/earlyprintk.txt was added by commit a1aade478862 ("x86/doc:
mini-howto for using earlyprintk=dbgp")
- x86/entry_64.txt was added by commit 8b4777a4b50c ("x86-64: Document
some of entry_64.S")
- x86/pat.txt was added by commit d27554d874c7 ("x86: PAT
documentation")
Moved files
- arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
commit 37b8304642c7 ("ARM: kuser: move interface documentation out of
the source code")
- efi-stub.txt was moved out of x86/ and down into Documentation/ in
commit 4172fe2f8a47 ("EFI stub documentation updates")
- laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit efcfed9bad88 ("Move hp_accel to drivers/platform/x86")
- commit 5616c23ad9cd ("x86: doc: move x86-generic documentation from
Doc/x86/i386"):
* x86/usb-legacy-support.txt
* x86/boot.txt
* x86/zero_page.txt
- power/video_extension.txt was moved to acpi in commit 70e66e4df191
("ACPI / video: move video_extension.txt to Documentation/acpi")
Removed files (left in 00-INDEX)
- memory.txt was removed by commit 00ea8990aadf ("memory.txt: remove
stray information")
- gpio.txt was moved to gpio/ in commit fd8e198cfcaa ("Documentation:
gpiolib: document new interface")
- networking/DLINK.txt was removed by commit 168e06ae26dd
("drivers/net: delete old parallel port de600/de620 drivers")
- serial/hayes-esp.txt was removed by commit f53a2ade0bb9 ("tty: esp:
remove broken driver")
- s390/TAPE was removed by commit 9e280f669308 ("[S390] remove tape
block docu")
- vm/locking was removed by commit 57ea8171d2bc ("mm: documentation:
remove hopelessly out-of-date locking doc")
- laptops/acer-wmi.txt was remvoed by commit 020036678e81 ("acer-wmi:
Delete out-of-date documentation")
Typos/misc issues
- rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication.")
- commit b88cf73d9278 ("net: add missing entries to
Documentation/networking/00-INDEX")
* generic-hdlc.txt was added as generic_hdlc.txt
* spider_net.txt was added as spider-net.txt
- w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139f74c ("w1: add
1-wire master driver for i.MX27 / i.MX31")
- s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a407e
("[S390] Add Documentation/s390/00-INDEX.")
Richard Genoud [Mon, 10 Feb 2014 22:25:32 +0000 (14:25 -0800)]
checkpatch: fix detection of git repository
Since git v1.7.7, the .git directory can be a file when, for example,
the kernel is a submodule of another git super project. So, the check
"-d .git" is not working anymore in this case. Using a more generic
check like "-e .git" corrects this behaviour.
Richard Genoud [Mon, 10 Feb 2014 22:25:31 +0000 (14:25 -0800)]
get_maintainer: fix detection of git repository
Since git v1.7.7, the .git directory can be a file when, for example,
the kernel is a submodule of another git super project. So, the check
"-d .git" is not working anymore in this case. Using a more generic
check like "-e .git" corrects this behaviour.
Dan Carpenter [Mon, 10 Feb 2014 22:25:30 +0000 (14:25 -0800)]
drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context()
I was reviewing this and noticed that unlocking should be conditional on
the error path. I've changed it to unlock and return directly since we
only do it once and it seems unlikely to change in the near future.
John Ogness [Mon, 10 Feb 2014 02:40:11 +0000 (18:40 -0800)]
tcp: tsq: fix nonagle handling
Commit 46d3ceabd8d9 ("tcp: TCP Small Queues") introduced a possible
regression for applications using TCP_NODELAY.
If TCP session is throttled because of tsq, we should consult
tp->nonagle when TX completion is done and allow us to send additional
segment, especially if this segment is not a full MSS.
Otherwise this segment is sent after an RTO.
[edumazet] : Cooked the changelog, added another fix about testing
sk_wmem_alloc twice because TX completion can happen right before
setting TSQ_THROTTLED bit.
This problem is particularly visible with recent auto corking,
but might also be triggered with low tcp_limit_output_bytes
values or NIC drivers delaying TX completion by hundred of usec,
and very low rtt.
Thomas Glanzmann for example reported an iscsi regression, caused
by tcp auto corking making this bug quite visible.
David S. Miller [Mon, 10 Feb 2014 22:35:11 +0000 (14:35 -0800)]
Merge branch 'bridge'
Toshiaki Makita says:
====================
bridge: Fix corner case problems around local fdb entries
There are so many corner cases that are not handled properly around local
fdb entries.
- We might fail to delete the old entry and might delete an arbitrary local
entry when changing mac address of a bridge port.
- We always fail to delete the old entry when changing mac address of the
bridge device.
- We might incorrectly delete a necessary entry when detaching a bridge port.
- We might incorrectly delete a necessary entry when deleting a vlan.
and so on.
This is a patch series to fix these issues.
v3:
- Handle NTF_USE case in patch 1/9, commented by Vlad Yasevich.
- Tested port detach/attach and didn't find any problem with patch 5/9,
suggested by Stephen Hemminger.
- Add comments about possible inconsistent state in current implementation
into commit log of patch 5/9, found by the above test.
- Reword unintensive changelog of patch 7/9, commented by Vlad Yasevich.
v2:
- Change the way to find the old entry in br_fdb_changeaddr() from memorizing
previous port address to introducing a new flag indicating whether a fdb
entry is added by user or not, commented by Stephen Hemminger.
- Add a fix for the way to insert a new address in br_fdb_changeaddr().
- Prevent creating an entry such that its dst is NULL in br_add_if() to
preserve old behavior, commented by Vlad Yasevich.
- Add more comments about slight behavior change, where the bridge device
come to be able to receive traffic to an address it has during short
window, to changelogs, commented by Vlad Yasevich.
- Add a fix for possible race in br_fdb_change_mac_address().
====================
Toshiaki Makita [Fri, 7 Feb 2014 07:48:25 +0000 (16:48 +0900)]
bridge: Properly check if local fdb entry can be deleted when deleting vlan
Vlan codes unconditionally delete local fdb entries.
We should consider the possibility that other ports have the same
address and vlan.
Example of problematic case:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
bridge vlan add dev eth0 vid 10
bridge vlan add dev eth1 vid 10
bridge vlan add dev br0 vid 10 self
We will have fdb entry such that f->dst == eth0, f->vlan_id == 10 and
f->addr == 12:34:56:78:90:ab at this time.
Next, delete eth0 vlan 10.
bridge vlan del dev eth0 vid 10
In this case, we still need the entry for br0, but it will be deleted.
Note that br0 needs the entry even though its mac address is not set
manually. To delete the entry with proper condition checking,
fdb_delete_local() is suitable to use.
Toshiaki Makita [Fri, 7 Feb 2014 07:48:24 +0000 (16:48 +0900)]
bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port
br_fdb_delete_by_port() doesn't care about vlan and mac address of the
bridge device.
As the check is almost the same as mac address changing, slightly modify
fdb_delete_local() and use it.
Note that we can always set added_by_user to 0 in fdb_delete_local() because
- br_fdb_delete_by_port() calls fdb_delete_local() for local entries
regardless of its added_by_user. In this case, we have to check if another
port has the same address and vlan, and if found, we have to create the
entry (by changing dst). This is kernel-added entry, not user-added.
- br_fdb_changeaddr() doesn't call fdb_delete_local() for user-added entry.
Toshiaki Makita [Fri, 7 Feb 2014 07:48:23 +0000 (16:48 +0900)]
bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address
br_fdb_change_mac_address() doesn't check if the local entry has the
same address as any of bridge ports.
Although I'm not sure when it is beneficial, current implementation allow
the bridge device to receive any mac address of its ports.
To preserve this behavior, we have to check if the mac address of the
entry being deleted is identical to that of any port.
As this check is almost the same as that in br_fdb_changeaddr(), create
a common function fdb_delete_local() and call it from
br_fdb_changeadddr() and br_fdb_change_mac_address().
Toshiaki Makita [Fri, 7 Feb 2014 07:48:22 +0000 (16:48 +0900)]
bridge: Fix the way to check if a local fdb entry can be deleted
We should take into account the followings when deleting a local fdb
entry.
- nbp_vlan_find() can be used only when vid != 0 to check if an entry is
deletable, because a fdb entry with vid 0 can exist at any time while
nbp_vlan_find() always return false with vid 0.
Example of problematic case:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address 12:34:56:78:90:ab
brctl addif br0 eth0
brctl addif br0 eth1
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the fdb entry 12:34:56:78:90:ab will be deleted even though the
bridge port eth1 still has that address.
- The port to which the bridge device is attached might needs a local entry
if its mac address is set manually.
Example of problematic case:
ip link set eth0 address 12:34:56:78:90:ab
brctl addif br0 eth0
ip link set br0 address 12:34:56:78:90:ab
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the fdb still must have the entry 12:34:56:78:90:ab, but it will be
deleted.
We can use br->dev->addr_assign_type to check if the address is manually
set or not, but I propose another approach.
Since we delete and insert local entries whenever changing mac address
of the bridge device, we can change dst of the entry to NULL regardless of
addr_assign_type when deleting an entry associated with a certain port,
and if it is found to be unnecessary later, then delete it.
That is, if changing mac address of a port, the entry might be changed
to its dst being NULL first, but is eventually deleted when recalculating
and changing bridge id.
This approach is especially useful when we want to share the code with
deleting vlan in which the bridge device might want such an entry regardless
of addr_assign_type, and makes things easy because we don't have to consider
if mac address of the bridge device will be changed or not at the time we
delete a local entry of a port, which means fdb code will not be bothered
even if the bridge id calculating logic is changed in the future.
Also, this change reduces inconsistent state, where frames whose dst is the
mac address of the bridge, can't reach the bridge because of premature fdb
entry deletion. This change reduces the possibility that the bridge device
replies unreachable mac address to arp requests, which could occur during
the short window between calling del_nbp() and br_stp_recalculate_bridge_id()
in br_del_if(). This will effective after br_fdb_delete_by_port() starts to
use the same code by following patch.
Toshiaki Makita [Fri, 7 Feb 2014 07:48:21 +0000 (16:48 +0900)]
bridge: Change local fdb entries whenever mac address of bridge device changes
Vlan code may need fdb change when changing mac address of bridge device
even if it is caused by the mac address changing of a bridge port.
Example configuration:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
bridge vlan add dev br0 vid 10 self
bridge vlan add dev eth0 vid 10
We will have fdb entry such that f->dst == NULL, f->vlan_id == 10 and
f->addr == 12:34:56:78:90:ab at this time.
Next, change the mac address of eth0 to greater value.
ip link set eth0 address ee:ff:12:34:56:78
Then, mac address of br0 will be recalculated and set to aa:bb:cc:dd:ee:ff.
However, an entry aa:bb:cc:dd:ee:ff will not be created and we will be not
able to communicate using br0 on vlan 10.
Address this issue by deleting and adding local entries whenever
changing the mac address of the bridge device.
If there already exists an entry that has the same address, for example,
in case that br_fdb_changeaddr() has already inserted it,
br_fdb_change_mac_address() will simply fail to insert it and no
duplicated entry will be made, as it was.
This approach also needs br_add_if() to call br_fdb_insert() before
br_stp_recalculate_bridge_id() so that we don't create an entry whose
dst == NULL in this function to preserve previous behavior.
Note that this is a slight change in behavior where the bridge device can
receive the traffic to the new address before calling
br_stp_recalculate_bridge_id() in br_add_if().
However, it is not a problem because we have already the address on the
new port and such a way to insert new one before recalculating bridge id
is taken in br_device_event() as well.
Toshiaki Makita [Fri, 7 Feb 2014 07:48:20 +0000 (16:48 +0900)]
bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address
We have been always failed to delete the old entry at
br_fdb_change_mac_address() because br_set_mac_address() updates
dev->dev_addr before calling br_fdb_change_mac_address() and
br_fdb_change_mac_address() uses dev->dev_addr to find the old entry.
That update of dev_addr is completely unnecessary because the same work
is done in br_stp_change_bridge_id() which is called right away after
calling br_fdb_change_mac_address().
Toshiaki Makita [Fri, 7 Feb 2014 07:48:19 +0000 (16:48 +0900)]
bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr
Since commit bc9a25d21ef8 ("bridge: Add vlan support for local fdb entries"),
br_fdb_changeaddr() has inserted a new local fdb entry only if it can
find old one. But if we have two ports where they have the same address
or user has deleted a local entry, there will be no entry for one of the
ports.
Example of problematic case:
ip link set eth0 address aa:bb:cc:dd:ee:ff
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
brctl addif br0 eth1 # eth1 will not have a local entry due to dup.
ip link set eth1 address 12:34:56:78:90:ab
Then, the new entry for the address 12:34:56:78:90:ab will not be
created, and the bridge device will not be able to communicate.
Insert new entries regardless of whether we can find old entries or not.
Toshiaki Makita [Fri, 7 Feb 2014 07:48:18 +0000 (16:48 +0900)]
bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
br_fdb_changeaddr() assumes that there is at most one local entry per port
per vlan. It used to be true, but since commit 36fd2b63e3b4 ("bridge: allow
creating/deleting fdb entries via netlink"), it has not been so.
Therefore, the function might fail to search a correct previous address
to be deleted and delete an arbitrary local entry if user has added local
entries manually.
Example of problematic case:
ip link set eth0 address ee:ff:12:34:56:78
brctl addif br0 eth0
bridge fdb add 12:34:56:78:90:ab dev eth0 master
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the address 12:34:56:78:90:ab might be deleted instead of
ee:ff:12:34:56:78, the original mac address of eth0.
Address this issue by introducing a new flag, added_by_user, to struct
net_bridge_fdb_entry.
Note that br_fdb_delete_by_port() has to set added_by_user to 0 in cases
like:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
bridge fdb add aa:bb:cc:dd:ee:ff dev eth0 master
brctl addif br0 eth1
brctl delif br0 eth0
In this case, kernel should delete the user-added entry aa:bb:cc:dd:ee:ff,
but it also should have been added by "brctl addif br0 eth1" originally,
so we don't delete it and treat it a new kernel-created entry.
which is based off v3.13-rc6. If you would like me to rebase it on
a different branch/tag I would be more than happy to do so.
The patches are all bug-fixes and hopefully can go in 3.14.
They deal with xen-blkback shutdown and cause memory leaks
as well as shutdown races. They should go to stable tree and if you
are OK with I will ask them to backport those fixes.
There is also a fix to xen-blkfront to deal with unexpected state
transition. And lastly a fix to the header where it was using the
__aligned__ unnecessarily.
Michal Simek [Fri, 31 Jan 2014 11:55:06 +0000 (12:55 +0100)]
ARM: zynq: Reserve not DMAable space in front of the kernel
Reserve space from 0x0 - __pa(swapper_pg_dir),
if kernel is loaded from 0, which is not DMAable.
It is causing problem with MMC driver and others
which want to add dma buffers to this space.
Kevin Hilman [Mon, 10 Feb 2014 18:39:27 +0000 (10:39 -0800)]
Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
From Nicolas Ferre:
First series of AT91 fixes for 3.14.
All of them are DT-related.
- fixes for typos in i2c and ohci clocks
- addition of a USB host node for at91sam9n12ek
- 2 DT documentation updates that have been sent a long time ago
- a new board based on the sama5d36 SoC
* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
ARM: at91: add Atmel's SAMA5D3 Xplained board
spi/atmel: document clock properties
mmc: atmel-mci: document clock properties
ARM: at91: enable USB host on at91sam9n12ek board
ARM: at91/dt: fix sama5d3 ohci hclk clock reference
ARM: at91/dt: sam9263: fix compatibility string for the I2C
Nishanth Menon [Thu, 6 Feb 2014 15:29:31 +0000 (09:29 -0600)]
ARM: multi_v7_defconfig: Select CONFIG_SOC_DRA7XX
Select CONFIG_SOC_DRA7XX so that we can boot dra7-evm.
DRA7 family are A15 based processors that supports LPAE and an
evolutionary update to the OMAP5 generation of processors.
Philipp Zabel [Wed, 29 Jan 2014 16:10:04 +0000 (17:10 +0100)]
ARM: imx6: Initialize low-power mode early again
Since commit 9e8147bb5ec5d1dda2141da70f96b98985a306cb
"ARM: imx6q: move low-power code out of clock driver"
the kernel fails to boot on i.MX6Q/D if preemption is
enabled (CONFIG_PREEMPT=y). The kernel just hangs
before the console comes up.
The above commit moved the initalization of the low-power
mode setting (enabling clocked WAIT states), which was
introduced in commit 83ae20981ae924c37d02a42c829155fc3851260c
"ARM: imx: correct low-power mode setting", from
imx6q_clks_init to imx6q_pm_init. Now it is called
much later, after all cores are enabled.
This patch moves the low-power mode initialization back
to imx6q_clks_init again (and to imx6sl_clks_init).
Linus Torvalds [Mon, 10 Feb 2014 18:33:50 +0000 (10:33 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"Small fix from Jeff for writepages leak, and some fixes for ACLs and
xattrs when SMB2 enabled.
Am expecting another fix from Jeff and at least one more fix (for
mounting SMB2 with cifsacl) in the next week"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] clean up page array when uncached write send fails
cifs: use a flexarray in cifs_writedata
retrieving CIFS ACLs when mounted with SMB2 fails dropping session
Add protocol specific operation for CIFS xattrs
In file included from sound/soc/pxa/spitz.c:28:0:
sound/soc/pxa/spitz.c: In function ‘spitz_ext_control’:
arch/arm/mach-pxa/include/mach/spitz.h:111:30: error:
‘PXA_NR_BUILTIN_GPIO’ undeclared (first use in this function)
#define SPITZ_SCP_GPIO_BASE (PXA_NR_BUILTIN_GPIO)
(etc.)
This is caused by implicit inclusion of <mach/irqs.h> from
various board-specific headers under <mach/*> in the PXA
platform. So we take a sweep over these, and for every such
header that uses PXA_NR_BUILTIN_GPIO or PXA_GPIO_TO_IRQ()
we explicitly #include "irqs.h" so that we satisfy the
dependency in the board include file alone.
Linus Walleij [Tue, 4 Feb 2014 12:30:05 +0000 (13:30 +0100)]
ARM: pxa: fix compilation problem on AM300EPD board
This board fails compilation like this:
arch/arm/mach-pxa/am300epd.c: In function ‘am300_cleanup’:
arch/arm/mach-pxa/am300epd.c:179:2: error: implicit declaration
of function ‘PXA_GPIO_TO_IRQ’ [-Werror=implicit-function-declaration]
free_irq(PXA_GPIO_TO_IRQ(RDY_GPIO_PIN), par);
This is because it was previously getting the macro PXA_GPIO_TO_IRQ
implicitly from <linux/gpio.h> which in turn implicitly included
<mach/gpio.h> which in turn included <mach/irqs.h>.
Add the missing include so that the board compiles again.
Kevin Hilman [Mon, 10 Feb 2014 17:44:34 +0000 (09:44 -0800)]
Merge tag 'mvebu-phy_ata-fixes-3.14' of git://git.infradead.org/linux-mvebu into fixes
From Jason Cooper:
mvebu phy and ata fixes for v3.14
- phy
- add support for optional phys via NULL
- ata
- fix boot hang due to probe failure of optional phys
NOTE: Series has been Ack'd by both the phy maintainer and the ata maintainer
for going through arm-soc
* tag 'mvebu-phy_ata-fixes-3.14' of git://git.infradead.org/linux-mvebu:
ata: sata_mv: Fix probe failures with optional phys
drivers: phy: Add support for optional phys
drivers: phy: Make NULL a valid phy reference
Masanari Iida [Mon, 10 Feb 2014 17:39:18 +0000 (10:39 -0700)]
block: Fix type mismatch in ssize_t_blk_mq_tag_sysfs_show
cppcheck detected following format string mismatch.
[blk-mq-tag.c:201]: (warning) %u in format string (no. 1) requires
'unsigned int' but the argument type is 'int'.
Change "cpu" from int to unsigned int, because the cpu
never become minus value.
Takashi Iwai [Mon, 10 Feb 2014 17:23:57 +0000 (18:23 +0100)]
ALSA: hda - Make snd_hda_gen_spec_free() static
The last user of snd_hda_gen_spec_free() is patch_via.c, and we can
rewrite it safely with snd_hda_gen_free(), so that
snd_hda_gen_spec_free() can be a local function in hda_generic.c.
Takashi Iwai [Mon, 10 Feb 2014 17:18:17 +0000 (18:18 +0100)]
ALSA: hda - Disable static quirks for C-Media codecs
According to alsa-info.sh outputs, all three entries with static
quirks have the correct pin configs, so it's safe to remove static
quirks. For now, turn the static quirks off via ifdef. The dead
codes will be removed in later release.
Witch to using a preallocated flush_rq for blk-mq similar to what's done
with the old request path. This allows us to set up the request properly
with a tag from the actually allowed range and ->rq_disk as needed by
some drivers. To make life easier we also switch to dynamic allocation
of ->flush_rq for the old path.
Rework I/O completions to work more like the old code path. blk_mq_end_io
now stays out of the business of deferring completions to others CPUs
and calling blk_mark_rq_complete. The latter is very important to allow
completing requests that have timed out and thus are already marked completed,
the former allows using the IPI callout even for driver specific completions
instead of having to reimplement them.
Adam Thomson [Thu, 6 Feb 2014 18:03:07 +0000 (18:03 +0000)]
ASoC: da9055: Fix device registration of PMIC and CODEC devices
Currently the I2C device Ids conflict for the MFD and CODEC so
cannot be both instantiated on one platform. This patch updates
the Ids and names to make them unique from each other.
It should be noted that the I2C addresses for both PMIC and CODEC
are modifiable so instantiation of the two are kept as separate
devices, rather than instantiating the CODEC from the MFD code.
Shawn Guo [Sat, 8 Feb 2014 05:20:35 +0000 (13:20 +0800)]
ASoC: fsl: fix pm support of machine drivers
The commit 1abe729 (ASoC: fsl: Add missing pm to current machine
drivers) enables pm support for a few IMX machine drivers. But it does
not update dev drvdata to be the pointer to 'card'. This causes the
kernel dump below in system suspend, because snd_soc_suspend() expects
that the dev drvdata points to 'card', while it still points to the
private data of machine driver.
This patch fixes imx-sgtl5000 and imx-wm8962 by attaching 'card' to dev
drvdata and private data to card drvdata. For imx-mc13783, I simply
revert the pm change because it must be broken for the same reason and
I don't have hardware to test pm enabling code.
ACPI / dock: Use acpi_device_enumerated() to check if dock is present
After commit 202317a573b2 (ACPI / scan: Add acpi_device objects for
all device nodes in the namespace) acpi_bus_get_device() will always
return 0 for dock devices in dock_notify(), so the dock station
docking code under ACPI_NOTIFY_DEVICE_CHECK will never be executed
and docking will not work as a result of that.
Fix the problem by making dock_notify() use acpi_device_enumerated()
to check the presence of the device instead of checking the return
value of acpi_bus_get_device().
Fixes: 202317a573b2 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace) Signed-off-by: Rafael J. Wysocki <[email protected]>
Takashi Iwai [Mon, 10 Feb 2014 10:43:41 +0000 (11:43 +0100)]
ALSA: hda - Fix undefined symbol due to builtin/module mixup
Even after the fix for leftover kconfig handling (commit f8f1becf),
the current code still doesn't handle properly the builtin/module
mixup case between the core snd-hda-codec and other codec drivers.
For example, when CONFIG_SND_HDA_INTEL=y and
CONFIG_SND_HDA_CODEC_HDMI=m, it'll end up with an unresolved symbol
snd_hda_parse_hdmi_codec. This patch fixes the issue.
Now codec->parser points to the parser object *only* when a module
(either generic or HDMI parser) is loaded and bound. When a builtin
symbol is used, codec->parser still points to NULL. This is the
difference from the previous versions.
Takashi Iwai [Mon, 10 Feb 2014 08:48:47 +0000 (09:48 +0100)]
ALSA: Replace with IS_ENABLED()
Replace the lengthy #if defined(XXX) || defined(XXX_MODULE) with the
new IS_ENABLED() macro.
The patch still doesn't cover all ifdefs. For example, the dependency
on CONFIG_GAMEPORT is still open-coded because this also has an extra
dependency on MODULE. Similarly, an open-coded ifdef in pcm_oss.c and
some sequencer-related stuff are left untouched.
Masanari Iida [Sat, 8 Feb 2014 15:47:36 +0000 (00:47 +0900)]
ALSA: Fix typos in alsa-driver-api.xml
This patch fixed 2 typos in DocBook/alsa-driver-api.xml.
It is because this file is generated by make xmldocs,
I have to fix typos within source files.
Michal Simek [Thu, 30 Jan 2014 14:25:37 +0000 (15:25 +0100)]
microblaze: Fix missing HZ macro
Add missing param.h header because of HZ macro.
It is causing compilation failure:
In file included from include/linux/delay.h:14:0,
from drivers/clk/qcom/reset.c:18:
drivers/clk/qcom/reset.c: In function 'qcom_reset':
arch/microblaze/include/asm/delay.h:39:35: error: 'HZ'
undeclared (first use in this function)
Jesper Juhl [Sun, 9 Feb 2014 22:30:32 +0000 (23:30 +0100)]
tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min
As far as I can tell we have used a default of 60 seconds for
FIN_WAIT2 timeout for ages (since 2.x times??).
In any case, the timeout these days is 60 seconds, so the 3 min
comment is wrong (and cost me a few minutes of my life when I was
debugging a FIN_WAIT2 related problem in a userspace application and
checked the kernel source for details).