I got 1. It should be 0, the reason is copy_workqueue_attrs() called
in apply_workqueue_attrs() doesn't copy no_numa field.
Fix it by making copy_workqueue_attrs() copy ->no_numa too. This
would also make get_unbound_pool() set a pool's ->no_numa attribute
according to the workqueue attributes used when the pool was created.
While harmelss, as ->no_numa isn't a pool attribute, this is a bit
confusing. Clear it explicitly.
Takashi Iwai [Thu, 1 Aug 2013 09:12:10 +0000 (11:12 +0200)]
Merge tag 'asoc-v3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.11
A fix to make sure userspace knows when control writes have caused a
change in value, fixing some UIs, plus a few few driver fixes mainly
cleaning up issues from recent refactorings on less mainstream platforms.
Aaro Koskinen [Sun, 21 Jul 2013 00:30:11 +0000 (03:30 +0300)]
powerpc/windfarm: Fix noisy slots-fan on Xserve (rm31)
slots-fan on G5 Xserve is always running at full speed with windfarm_rm31
driver, resulting in a very high acoustic noise level. It seems the fan
parameters are incorrect, and have been copied from the Drive Bay fan
(RPM, not present on rm31) of the legacy therm_pm72 driver. This patch
changes the parameters to match the Slots fan (PWM) of therm_pm72. With
the patch, slots-fan speed drops from 99% to 19% during normal use,
and slots-temp settle to ~42'C.
Robert Jennings [Thu, 25 Jul 2013 01:13:21 +0000 (20:13 -0500)]
powerpc: VPHN topology change updates all siblings
When an associativity level change is found for one thread, the
siblings threads need to be updated as well. This is done today
for PRRN in stage_topology_update() but is missing for VPHN in
update_cpu_associativity_changes_mask(). This patch will correctly
update all thread siblings during a topology change.
Without this patch a topology update can result in a CPU in
init_sched_groups_power() getting stuck indefinitely in a loop.
This loop is built in build_sched_groups(). As a result of the thread
moving to a node separate from its siblings the struct sched_group will
have its next pointer set to point to itself rather than the sched_group
struct of the next thread. This happens because we have a domain without
the SD_OVERLAP flag, which is correct, and a topology that doesn't conform
with reality (threads on the same core assigned to different numa nodes).
When this list is traversed by init_sched_groups_power() it will reach
the thread's sched_group structure and loop indefinitely; the cpu will
be stuck at this point.
The bug was exposed when VPHN was enabled in commit b7abef0 (v3.9).
Michael Ellerman [Tue, 23 Jul 2013 08:07:45 +0000 (18:07 +1000)]
powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspace
We use bit 63 of the event code for userspace to request that the event
be counted using EBB (Event Based Branches). Export this value, making
it part of the API - though only on processors that support EBB.
Back in commit 89713ed "Add timer, performance monitor and machine check
counts to /proc/interrupts" we added a count of PMU interrupts to the
output of /proc/interrupts.
At the time we named them "CNT" to match x86.
However in commit 89ccf46 "Rename 'performance counter interrupt'", the
x86 guys renamed theirs from "CNT" to "PMI".
Arguably changing the name could break someone's script, but I think the
chance of that is minimal, and it's preferable to have a name that 1) is
somewhat meaningful, and 2) matches x86.
tracing/kprobes: Fail to unregister if probe event files are in use
When a probe is being removed, it cleans up the event files that correspond
to the probe. But there is a race between writing to one of these files
and deleting the probe. This is especially true for the "enable" file.
CPU 0 CPU 1
----- -----
fd = open("enable",O_WRONLY);
probes_open()
release_all_trace_probes()
unregister_trace_probe()
if (trace_probe_is_enabled(tp))
return -EBUSY
A test program was written that used two threads to simulate the
above scenario adding a nanosleep() interval to change the timings
and after several thousand runs, it was able to trigger this bug
and crash:
The unregister_trace_probe() must be done first, and if it fails it must
fail the removal of the kprobe.
Several changes have already been made by Oleg Nesterov and Masami Hiramatsu
to allow moving the unregister_probe_event() before the removal of
the probe and exit the function if it fails. This prevents the tp
structure from being used after it is freed.
Linus Torvalds [Thu, 1 Aug 2013 00:55:12 +0000 (17:55 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Radeon, nouveau, exynos, intel, mgag200..
Not all strictly regressions but there was probably only one patch I'd
have really left out and it didn't seem worth respinning exynos to
avoid it, the line change count is quite low.
radeon: regressions + more dynamic powermanagement fixes, since DPM
is a new feature, and off by default I'd prefer to keep merging
fixes since it has a large userbase already and I'd like to keep
them on mainline
nouveau: is mostly regression fixes
i915: is a regression fix since Daniel is on holidays I've merged it.
mgag200: I've picked a bunch of targetted fixes from a big bunch of
distro patches,
exynos: build fixes mostly, one regression fix
I expect things will slow right down now, I may send on the intel
early quirk from Jesse separatly, since I think the x86 maintainers
acked it"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (37 commits)
drm/i915: fix missed hunk after GT access breakage
drm/radeon/dpm: re-enable cac control on SI
drm/radeon/dpm: fix calculations in si_calculate_leakage_for_v_and_t_formula
drm: fix 64 bit drm fixed point helpers
drm/radeon/atom: initialize more atom interpretor elements to 0
drm/nouveau: fix semaphore dmabuf obj
drm/nouveau/vm: make vm refcount into a kref
drm/nv31/mpeg: don't recognize nv3x cards as having nv44 graph class
drm/nv40/mpeg: write magic value to channel object to make it work
drm/nouveau: fix size check for cards without vm
drm/nv50-/disp: remove dcb_outp_match call, and related variables
drm/nva3-/disp: fix hda eld writing, needs to be padded
drm/nv31/mpeg: fix mpeg engine initialization
drm/nv50/mc: include vp in the fb error reporting mask
drm/nouveau: fix null pointer dereference in poll_changed
drm/nv50/gpio: post-nv92 cards have 32 interrupt lines
drm/nvc0/fb: take lock in nvc0_ram_put()
drm/nouveau/core: xtensa firmware size needs to be 0x40000 no matter what
drm/mgag200: Fix LUT programming for 16bpp
drm/mgag200: Fix framebuffer pitch calculation
...
Linus Torvalds [Thu, 1 Aug 2013 00:53:38 +0000 (17:53 -0700)]
Merge tag 'vfio-v3.11-rc4' of git://github.com/awilliam/linux-vfio
Pull vfio fixes from Alex Williamson:
"misc fixes around overreacting to bus notifier events and a locking
fix for a corner case blocked remove"
* tag 'vfio-v3.11-rc4' of git://github.com/awilliam/linux-vfio:
vfio-pci: Avoid deadlock on remove
vfio: Ignore sprurious notifies
vfio: Don't overreact to DEL_DEVICE
Linus Torvalds [Thu, 1 Aug 2013 00:52:04 +0000 (17:52 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge more patches from Andrew Morton:
"A bunch of fixes.
Plus Joe's printk move and rework. It's not a -rc3 thing but now
would be a nice time to offload it, while things are quiet. I've been
sitting on it all for a couple of weeks, no issues"
* emailed patches from Andrew Morton <[email protected]>:
vmpressure: make sure there are no events queued after memcg is offlined
vmpressure: do not check for pending work to prevent from new work
vmpressure: change vmpressure::sr_lock to spinlock
printk: rename struct log to struct printk_log
printk: use pointer for console_cmdline indexing
printk: move braille console support into separate braille.[ch] files
printk: add console_cmdline.h
printk: move to separate directory for easier modification
drivers/rtc/rtc-twl.c: fix: rtcX/wakealarm attribute isn't created
mm: zbud: fix condition check on allocation size
thp, mm: avoid PageUnevictable on active/inactive lru lists
mm/swap.c: clear PageActive before adding pages onto unevictable list
arch/x86/platform/ce4100/ce4100.c: include reboot.h
mm: sched: numa: fix NUMA balancing when !SCHED_DEBUG
rapidio: fix use after free in rio_unregister_scan()
.gitignore: ignore *.lz4 files
MAINTAINERS: dynamic debug: Jason's not there...
dmi_scan: add comments on dmi_present() and the loop in dmi_scan_machine()
ocfs2/refcounttree: add the missing NULL check of the return value of find_or_create_page()
mm: mempolicy: fix mbind_range() && vma_adjust() interaction
If there is no querier on a link then we won't get periodic reports and
therefore won't be able to learn about multicast listeners behind ports,
potentially leading to lost multicast packets, especially for multicast
listeners that joined before the creation of the bridge.
These lost multicast packets can appear since c5c23260594
("bridge: Add multicast_querier toggle and disable queries by default")
in particular.
With this patch we are flooding multicast packets if our querier is
disabled and if we didn't detect any other querier.
A grace period of the Maximum Response Delay of the querier is added to
give multicast responses enough time to arrive and to be learned from
before disabling the flooding behaviour again.
Neil Horman [Wed, 31 Jul 2013 13:03:56 +0000 (09:03 -0400)]
8139cp: Add dma_mapping_error checking
Self explanitory dma_mapping_error addition to the 8139 driver, based on this:
https://bugzilla.redhat.com/show_bug.cgi?id=947250
It showed several backtraces arising for dma_map_* usage without checking the
return code on the mapping. Add the check and abort the rx/tx operation if its
failed. Untested as I have no hardware and the reporter has wandered off, but
seems pretty straightforward.
net/usb/r815x: avoid to call mdio functions for runtime-suspended device
Don't replace the usb_control_msg() with usbnet_{read,write}_cmd()
which couldn't be called inside suspend/resume callback. Keep the
basic functions unlimited. Instead, using usb_autopm_get_interface()
and usb_autopm_put_interface() in r815x_mdio_{read,write}().
parisc: Fix interrupt routing for C8000 serial ports
We can't use dev->mod_index for selecting the interrupt routing entry,
because it's not an index into interrupt routing table. It will be even
wrong on a machine with 2 CPUs (4 cores). But all needed information is
contained in the PAT entries for the serial ports. mod[0] contains the
iosapic address and mod_info has some indications for the interrupt
input (at least it looks like it). This patch implements the searching
for the right iosapic and uses this interrupt input information.
The system doesn't more fail to bind the memory, and hence not falling
back to the PCI mode (if other failures aren't detected).
This is just a simple write down from the following patches:
agp/amd-k7: Allow binding user memory to the AGP GART
agp/hp-agp: Allow binding user memory to the AGP GART
parisc: Fix cache routines to ignore vma's with an invalid pfn
The parisc architecture does not have a pte special bit. As a result,
special mappings are handled with the VM_PFNMAP and VM_MIXEDMAP flags.
VM_MIXEDMAP mappings may or may not have a "struct page" backing. When
pfn_valid() is false, there is no "struct page" backing. Otherwise, they
are treated as normal pages.
The FireGL driver uses the VM_MIXEDMAP without a backing "struct page".
This treatment caused a panic due to a TLB data miss in
update_mmu_cache. This appeared to be in the code generated for
page_address(). We were in fact using a very circular bit of code to
determine the physical address of the PFN in various cache routines.
This wasn't valid when there was no "struct page" backing. The needed
address can in fact be determined simply from the PFN itself without
using the "struct page".
The attached patch updates update_mmu_cache(), flush_cache_mm(),
flush_cache_range() and flush_cache_page() to check pfn_valid() and to
directly compute the PFN physical and virtual addresses.
Michal Hocko [Wed, 31 Jul 2013 20:53:51 +0000 (13:53 -0700)]
vmpressure: make sure there are no events queued after memcg is offlined
vmpressure is called synchronously from reclaim where the target_memcg
is guaranteed to be alive but the eventfd is signaled from the work
queue context. This means that memcg (along with vmpressure structure
which is embedded into it) might go away while the work item is pending
which would result in use-after-release bug.
We have two possible ways how to fix this. Either vmpressure pins memcg
before it schedules vmpr->work and unpin it in vmpressure_work_fn or
explicitely flush the work item from the css_offline context (as
suggested by Tejun).
This patch implements the later one and it introduces vmpressure_cleanup
which flushes the vmpressure work queue item item. It hooks into
mem_cgroup_css_offline after the memcg itself is cleaned up.
Michal Hocko [Wed, 31 Jul 2013 20:53:48 +0000 (13:53 -0700)]
vmpressure: change vmpressure::sr_lock to spinlock
There is nothing that can sleep inside critical sections protected by
this lock and those sections are really small so there doesn't make much
sense to use mutex for them. Change the log to a spinlock
drivers/rtc/rtc-twl.c: fix: rtcX/wakealarm attribute isn't created
The device_init_wakeup() should be called before rtc_device_register().
Otherwise, sysfs "sys/class/rtc/rtcX/wakealarm" attribute will not be seen
from User space.
zbud_alloc() incorrectly verifies the size of allocation limit. It
should deny the allocation request greater than (PAGE_SIZE -
ZHDR_SIZE_ALIGNED - CHUNK_SIZE), not (PAGE_SIZE - ZHDR_SIZE_ALIGNED)
which has no remaining spaces for its buddy. There is no point in
spending the entire zbud page storing only a single page, since we don't
have any benefits.
thp, mm: avoid PageUnevictable on active/inactive lru lists
active/inactive lru lists can contain unevicable pages (i.e. ramfs pages
that have been placed on the LRU lists when first allocated), but these
pages must not have PageUnevictable set - otherwise shrink_[in]active_list
goes crazy:
kernel BUG at /home/space/kas/git/public/linux-next/mm/vmscan.c:1122!
__isolate_lru_page() returns EINVAL for PageUnevictable(page).
For lru_add_page_tail(), it means we should not set PageUnevictable()
for tail pages unless we're sure that it will go to LRU_UNEVICTABLE.
Let's just copy PG_active and PG_unevictable from head page in
__split_huge_page_refcount(), it will simplify lru_add_page_tail().
This will fix one more bug in lru_add_page_tail(): if
page_evictable(page_tail) is false and PageLRU(page) is true, page_tail
will go to the same lru as page, but nobody cares to sync page_tail
active/inactive state with page. So we can end up with inactive page on
active lru. The patch will fix it as well since we copy PG_active from
head page.
mm/swap.c: clear PageActive before adding pages onto unevictable list
As a result of commit 13f7f78981e4 ("mm: pagevec: defer deciding which
LRU to add a page to until pagevec drain time"), pages on unevictable
lists can have both of PageActive and PageUnevictable set. This is not
only confusing, but also corrupts page migration and
shrink_[in]active_list.
This patch fixes the problem by adding ClearPageActive before adding
pages into unevictable list. It also cleans up VM_BUG_ONs.
Andrew Morton [Wed, 31 Jul 2013 20:53:36 +0000 (13:53 -0700)]
arch/x86/platform/ce4100/ce4100.c: include reboot.h
Fix the build:
arch/x86/platform/ce4100/ce4100.c: In function 'x86_ce4100_early_setup':
arch/x86/platform/ce4100/ce4100.c:165:2: error: 'reboot_type' undeclared (first use in this function)
Dave Kleikamp [Wed, 31 Jul 2013 20:53:35 +0000 (13:53 -0700)]
mm: sched: numa: fix NUMA balancing when !SCHED_DEBUG
Commit 3105b86a9fee ("mm: sched: numa: Control enabling and disabling of
NUMA balancing if !SCHED_DEBUG") defined numabalancing_enabled to
control the enabling and disabling of automatic NUMA balancing, but it
is never used.
I believe the intention was to use this in place of sched_feat_numa(NUMA).
Currently, if SCHED_DEBUG is not defined, sched_feat_numa(NUMA) will
never be changed from the initial "false".
Ben Hutchings [Wed, 31 Jul 2013 20:53:30 +0000 (13:53 -0700)]
dmi_scan: add comments on dmi_present() and the loop in dmi_scan_machine()
My previous refactoring in commit 79bae42d51a5 ("dmi_scan: refactor
dmi_scan_machine(), {smbios,dmi}_present()") resulted in slightly tricky
code (though I think it's more elegant). Explain what it's doing.
IPoIB: Fix pkey change flow for virtualization environments
IPoIB's required behaviour w.r.t to the pkey used by the device is the following:
- For "parent" interfaces (e.g ib0, ib1, etc) who are created
automatically as a result of hot-plug events from the IB core, the
driver needs to take whatever pkey vlaue it finds in index 0, and
stick to that index.
- For child interfaces (e.g ib0.8001, etc) created by admin directive,
the driver needs to use and stick to the value provided during its
creation.
In SR-IOV environment its possible for the VF probe to take place
before the cloud management software provisions the suitable pkey for
the VF in the paravirtualed PKEY table index 0. When this is the case,
the VF IB stack will find in index 0 an invalide pkey, which is all
zeros.
Moreover, the cloud managment can assign the pkey value at index 0 at
any time of the guest life cycle.
The correct behavior for IPoIB to address these requirements for
parent interfaces is to use PKEY_CHANGE event as trigger to optionally
re-init the device pkey value and re-create all the relevant resources
accordingly, if the value of the pkey in index 0 has changed (from
invalid to valid or from valid value X to invalid value Y).
This patch enhances the heavy flushing code which is triggered by pkey
change event, to behave correctly for parent devices. For child
devices, the code remains the same, namely chases pkey value and not
index.
Jack Morgenstein [Thu, 18 Jul 2013 11:02:29 +0000 (14:02 +0300)]
IB/core: Create QP1 using the pkey index which contains the default pkey
Currently, QP1 is created using pkey index 0. This patch simply looks
for the index containing the default pkey, rather than hard-coding
pkey index 0.
This change will have no effect in native mode, since QP0 and QP1 are
created before the SM configures the port, so pkey table will still be
the default table defined by the IB Spec, in C10-123: "If non-volatile
storage is not used to hold P_Key Table contents, then if a PM
(Partition Manager) is not present, and prior to PM initialization of
the P_Key Table, the P_Key Table must act as if it contains a single
valid entry, at P_Key_ix = 0, containing the default partition
key. All other entries in the P_Key Table must be invalid."
Thus, in the native mode case, the driver will find the default pkey
at index 0 (so it will be no different than the hard-coding).
However, in SR-IOV mode, for VFs, the pkey table may be
paravirtualized, so that the VF's pkey index zero may not necessarily
be mapped to the real pkey index 0. For VFs, therefore, it is
important to find the virtual index which maps to the real default
pkey.
This commit does the following for QP1 creation:
1. Find the pkey index containing the default pkey, and use that index
if found. ib_find_pkey() returns the index of the
limited-membership default pkey (0x7FFF) if the full-member default
pkey is not in the table.
2. If neither form of the default pkey is found, use pkey index 0
(previous behavior).
Eli Cohen [Thu, 18 Jul 2013 12:31:08 +0000 (15:31 +0300)]
mlx5_core: Implement new initialization sequence
Introduce enbale_hca and disable_hca commands to signify when the
driver starts or ceases to operate on the device.
In addition the driver will use boot and init pages count; boot pages
is required to allow firmware to complete boot commands and the other
to complete init hca. Command interface revision is bumped to 4 to
enforce using supported firmware.
This patch breaks compatibility with old versions of firmware (< 4);
however, the first GA firmware we will publish will support version 4
so this should not be a problem.
Russell King [Tue, 23 Jul 2013 17:37:00 +0000 (18:37 +0100)]
ARM: allow kuser helpers to be removed from the vector page
Provide a kernel configuration option to allow the kernel user helpers
to be removed from the vector page, thereby preventing their use with
ROP (return orientated programming) attacks. This option is only
visible for CPU architectures which natively support all the operations
which kernel user helpers would normally provide, and must be enabled
with caution.
Russell King [Thu, 4 Jul 2013 11:03:31 +0000 (12:03 +0100)]
ARM: use linker magic for vectors and vector stubs
Use linker magic to create the vectors and vector stubs: we can tell the
linker to place them at an appropriate VMA, but keep the LMA within the
kernel. This gets rid of some unnecessary symbol manipulation, and
have the linker calculate the relocations appropriately.
Russell King [Thu, 4 Jul 2013 10:40:32 +0000 (11:40 +0100)]
ARM: move vector stubs
Move the machine vector stubs into the page above the vector page,
which we can prevent from being visible to userspace. Also move
the reset stub, and place the swi vector at a location that the
'ldr' can get to it.
This hides pointers into the kernel which could give valuable
information to attackers, and reduces the number of exploitable
instructions at a fixed address.
Russell King [Thu, 4 Jul 2013 10:00:23 +0000 (11:00 +0100)]
ARM: poison the vectors page
Fill the empty regions of the vectors page with an exception generating
instruction. This ensures that any inappropriate branch to the vector
page is appropriately trapped, rather than just encountering some code
to execute. (The vectors page was filled with zero before, which
corresponds with the "andeq r0, r0, r0" instruction - a no-op.)
1) Fix association failures not triggering a connect-failure event in
cfg80211, from Johannes Berg.
2) Eliminate a potential NULL deref with older iptables tools when
configuring xt_socket rules, from Eric Dumazet.
3) Missing RTNL locking in wireless regulatory code, from Johannes
Berg.
4) Fix OOPS caused by firmware loading races in ath9k_htc, from Alexey
Khoroshilov.
5) Fix usb URB leak in usb_8dev CAN driver, also from Alexey
Khoroshilov.
6) VXLAN namespace teardown fails to unregister devices, from Stephen
Hemminger.
7) Fix multicast settings getting dropped by firmware in qlcnic driver,
from Sucheta Chakraborty.
8) Add sysctl range enforcement for tcp_syn_retries, from Michal Tesar.
9) Fix a nasty bug in bridging where an active timer would get
reinitialized with a setup_timer() call. From Eric Dumazet.
10) Fix use after free in new mlx5 driver, from Dan Carpenter.
11) Fix freed pointer reference in ipv6 multicast routing on namespace
cleanup, from Hannes Frederic Sowa.
12) Some usbnet drivers report TSO and SG in their feature set, but the
usbnet layer doesn't really support them. From Eric Dumazet.
13) Fix crash on EEH errors in tg3 driver, from Gavin Shan.
14) Drop cb_lock when requesting modules in genetlink, from Stanislaw
Gruszka.
15) Kernel stack leaks in cbq scheduler and af_key pfkey messages, from
Dan Carpenter.
16) FEC driver erroneously signals NETDEV_TX_BUSY on transmit leading to
endless loops, from Uwe Kleine-König.
17) Fix hangs from loading mvneta driver, from Arnaud Patard.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (84 commits)
mlx5: fix error return code in mlx5_alloc_uuars()
mvneta: Try to fix mvneta when compiled as module
mvneta: Fix hang when loading the mvneta driver
atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE
af_key: more info leaks in pfkey messages
net/fec: Don't let ndo_start_xmit return NETDEV_TX_BUSY without link
net_sched: Fix stack info leak in cbq_dump_wrr().
igb: fix vlan filtering in promisc mode when not in VT mode
ixgbe: Fix Tx Hang issue with lldpad on 82598EB
genetlink: release cb_lock before requesting additional module
net: fec: workaround stop tx during errata ERR006358
qlcnic: Fix diagnostic interrupt test for 83xx adapters.
qlcnic: Fix setting Guest VLAN
qlcnic: Fix operation type and command type.
qlcnic: Fix initialization of work function.
Revert "atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring"
atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
net/tg3: Fix warning from pci_disable_device()
net/tg3: Fix kernel crash
...
Jack Morgenstein [Thu, 18 Jul 2013 11:02:30 +0000 (14:02 +0300)]
IB/mlx4: Use default pkey when creating tunnel QPs
When creating tunnel QPs for special QP tunneling, look for the
default pkey in the slave's virtual pkey table. If it is present, use
the real pkey index where the default pkey is located.
If the default pkey is not found in the pkey table, use the real pkey
index which is stored at index 0 in the slave's virtual pkey table
(this is the current behavior).
This change is required to support cloud computing, where the
paravirtualized index of the default pkey is moved to index 1 or
higher. The pkey at paravirtualized index 0 is used for the default
IPoIB interface created by the VF.
Its possible for the pkey value at paravirtualized index 0 to be
invalid (zero) at VF probe time (pkey index 0 is mapped to real pkey
index 127, which contains pkey = 0).
At some point after the VF probe, the cloud computing interface at the
hypervisor maps virtual index 0 for the VF to the pkey index
containing the pkey that IPoIB will use in its operation. However,
when the tunnel QP is created, the pkey at the slave's virtual index 0
is still mapped to the invalid pkey index, so tunnel QP creation
fails.
This commit causes the hypervisor to search for the default pkey in
the slave's pkey table -- and this pkey is present in the table (at
index > 0) at tunnel QP creation time, so that the tunnel QP creation
will succeed.
'This is the second NFC fixes pull request for 3.11.
We have:
- A build failure fix for the NCI SPI transport layer due to a
missing CRC_CCITT Kconfig dependency.
- A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
merge window but the typical terminology for loading a firmware to a
target is firmware download rather than upload. In order to avoid any
confusion in a file exported to userspace, we rename this command into
CMD_FW_DOWNLOAD."
mwifiex: check for bss_role instead of bss_mode for STA operations
This patch fixes an issue wherein association would fail on P2P
interfaces. This happened because we are checking priv->mode
against NL80211_IFTYPE_STATION. While this check is correct for
infrastructure stations, it would fail P2P clients for which mode
is NL80211_IFTYPE_P2P_CLIENT.
Better check would be bss_role which has only 2 values: STA/AP.
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
- BMIPS SMP fixes
- a build fix necessary for older compilers
- two more bugs found my Chandras' testing
- and one more build fix
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: BMIPS: fix slave CPU booting when physical CPU is not 0
MIPS: BMIPS: do not change interrupt routing depending on boot CPU
MIPS: powertv: Fix arguments for free_reserved_area()
MIPS: Set default CPU type for BCM47XX platforms
MIPS: uapi/asm/siginfo.h: Fix GCC 4.1.2 compilation
MIPS: Fix multiple definitions of UNCAC_BASE.
Merge tag 'stable/for-linus-3.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from Konrad Rzeszutek Wilk:
- Three fixes for ARM/ARM64 to either compile or not certain generic
drivers
- Fix for avoiding a potential deadlock when an user space event
channel is destroyed.
- Fix a workqueue resuming multiple times.
* tag 'stable/for-linus-3.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/tmem: do not allow XEN_TMEM on ARM64
xen/evtchn: avoid a deadlock when unbinding an event channel
xen/arm: enable PV control for ARM
xen/arm64: Don't compile cpu hotplug
xenbus: frontend resume cleanup
Merge tag 'xen-arm-3.11-rc2-warn-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen
Pull Xen ARM fix from Stefano Stabellini.
Update xen_restart to new calling convention.
* tag 'xen-arm-3.11-rc2-warn-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen:
xen/arm,arm64: update xen_restart after ff701306cd49 and 7b6d864b48d9
Merge tag 'usb-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some tiny USB fixes for 3.11-rc4
Nothing major, some gadget fixes, some new device ids, a new tiny
driver for the ANT+ USB device, and a number of fixes for the mos7840
driver that were much needed"
* tag 'usb-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: ftdi_sio: add more RT Systems ftdi devices
usb: chipidea: fix the build error with randconfig
usb: chipidea: cast PORTSC_PTS and DEVLC_PTS macros
usb: gadget: udc-core: fix the typo of udc state attribute
usb: gadget: f_phonet: remove unused preprocessor conditional
usb: gadget: multi: fix error return code in cdc_do_config()
USB: mos7840: fix pointer casts
USB: mos7840: fix race in led handling
USB: mos7840: fix device-type detection
USB: mos7840: fix race in register handling
USB: serial: add driver for Suunto ANT+ USB device
usb: gadget: free opts struct on error recovery
usb: gadget: ether: put_usb_function on unbind
usb: musb: fix resource passed from glue layer to musb
Merge tag 'tty-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are 4 tiny tty and serial driver fixes for 3.11-rc4.
Nothing big, a refcount leak, a module alias fix, and two fixes to the
mxs-auart serial driver"
* tag 'tty-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: arc_uart: Fix module alias
tty_port: Fix refcounting leak in tty_port_tty_hangup()
serial/mxs-auart: increase time to wait for transmitter to become idle
serial/mxs-auart: fix race condition in interrupt handler
Mutex can not be released unless all hid_device members are properly
initialized. Otherwise it would result in a race condition that can
cause NULL pointer kernel panic issue in hidraw_open where it uses
uninitialized 'list' member in list_add_tail().
tracing: Add comment to describe special break case in probe_remove_event_call()
The "break" used in the do_for_each_event_file() is used as an optimization
as the loop is really a double loop. The loop searches all event files
for each trace_array. There's only one matching event file per trace_array
and after we find the event file for the trace_array, the break is used
to jump to the next trace_array and start the search there.
As this is not a standard way of using "break" in C code, it requires
a comment right before the break to let people know what is going on.
tracing: trace_remove_event_call() should fail if call/file is in use
Change trace_remove_event_call(call) to return the error if this
call is active. This is what the callers assume but can't verify
outside of the tracing locks. Both trace_kprobe.c/trace_uprobe.c
need the additional changes, unregister_trace_probe() should abort
if trace_remove_event_call() fails.
The caller is going to free this call/file so we must ensure that
nobody can use them after trace_remove_event_call() succeeds.
debugfs should be fine after the previous changes and event_remove()
does TRACE_REG_UNREGISTER, but still there are 2 reasons why we need
the additional checks:
- There could be a perf_event(s) attached to this tp_event, so the
patch checks ->perf_refcount.
- TRACE_REG_UNREGISTER can be suppressed by FTRACE_EVENT_FL_SOFT_MODE,
so we simply check FTRACE_EVENT_FL_ENABLED protected by event_mutex.
debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
debugfs_remove_recursive() is wrong,
1. it wrongly assumes that !list_empty(d_subdirs) means that this
dir should be removed.
This is not that bad by itself, but:
2. if d_subdirs does not becomes empty after __debugfs_remove()
it gives up and silently fails, it doesn't even try to remove
other entries.
However ->d_subdirs can be non-empty because it still has the
already deleted !debugfs_positive() entries.
3. simple_release_fs() is called even if __debugfs_remove() fails.
Suppose we have
dir1/
dir2/
file2
file1
and someone opens dir1/dir2/file2.
Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes
away.
But debugfs_remove_recursive(dir1) silently fails and doesn't remove
this directory. Because it tries to delete (the already deleted)
dir1/dir2/file2 again and then fails due to "Avoid infinite loop"
logic.
And after that it is not possible to create another probe entry.
With this patch debugfs_remove_recursive() skips !debugfs_positive()
files although this is not strictly needed. The most important change
is that it does not try to make ->d_subdirs empty, it simply scans
the whole list(s) recursively and removes as much as possible.
If kzalloc() fails for `img' then we are going to leak the memory
for `out'. We are freeing the memory of all the tx/rx transfers
but the tx/rx buf pointers will be NULL if we drop out earlier.
Paul Walmsley [Tue, 30 Jul 2013 11:38:45 +0000 (12:38 +0100)]
ARM: 7801/1: v6: prevent gcc 4.5 from reordering extended CP15 reads above is_smp() test
Commit 621a0147d5c921f4cc33636ccd0602ad5d7cbfbc ("ARM: 7757/1: mm:
don't flush icache in switch_mm with hardware broadcasting") breaks
the boot on OMAP2430SDP with omap2plus_defconfig. Tracked to an
undefined instruction abort from the CP15 read in
cache_ops_need_broadcast(). It turns out that gcc 4.5 reorders the
extended CP15 read above the is_smp() test. This breaks ARM1136 r0
cores, since they don't support several CP15 registers that later ARM
cores do. ARM1136JF-S TRM section 3.2.1 "Register allocation" has the
details.
So mark the extended CP15 read as clobbering memory, which prevents
the compiler from reordering it before the is_smp() test. Russell
states that the code generated from this approach is preferable to
marking the inline asm as volatile. Remove the existing condition
code clobber as it's obsolete, per Nico's post:
CC sound/soc/au1x/ac97c.o
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: expected identifier or ‘(’ before ‘&’ token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: pasting "__initcall_" and "&" does not give a valid preprocessing token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘&’ token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: expected identifier or ‘(’ before ‘&’ token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: pasting "__exitcall_" and "&" does not give a valid preprocessing token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:344:1: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘&’ token
/home/ralf/src/linux/upstream-sfr/sound/soc/au1x/ac97c.c:334:31: warning: ‘au1xac97c_driver’ defined but not used [-Wunused-variable]
make[5]: *** [sound/soc/au1x/ac97c.o] Error 1
make[4]: *** [sound/soc/au1x] Error 2
make[3]: *** [sound/soc] Error 2
Sean Hefty [Wed, 24 Jul 2013 22:06:08 +0000 (15:06 -0700)]
RDMA/cma: Fix accessing invalid private data for UD
If a application is using AF_IB with a UD QP, but does not provide any
private data, we will end up accessing invalid memory. Check for this
case and handle it appropriately.
When the mvneta driver is compiled as module, the clock is disabled before
it's loading. This will reset the registers values and all configuration
made by the bootloader.
This patch sets the "sgmii serdes configuration" register to a magical value
found in:
https://github.com/yellowback/ubuntu-precise-armadaxp/blob/master/arch/arm/mach-armadaxp/armada_xp_family/ctrlEnv/mvCtrlEnvLib.c
With this change, the interrupts are working/generated and ethernet is
working.
When the mvneta driver is compiled, it'll be loaded with clocks disabled.
This implies that the clocks should be enabled again before any register
access or it'll hang.
To fix it:
- enable clock earlier
- move timer callback after setting timer.data
Eric Dumazet [Mon, 29 Jul 2013 17:24:04 +0000 (10:24 -0700)]
atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
On Mon, 2013-07-29 at 08:30 -0700, Eric Dumazet wrote:
> On Mon, 2013-07-29 at 13:09 +0100, Luis Henriques wrote:
>
> >
> > I confirm that I can't reproduce the issue using this patch.
> >
>
> Thanks, I'll send a polished patch, as this one had an error if
> build_skb() returns NULL (in case sk_buff allocation fails)
Please try the following patch : It should use 2K frags instead of 4K
for normal 1500 mtu
Thanks !
[PATCH] atl1c: use custom skb allocator
We had reports ( https://bugzilla.kernel.org/show_bug.cgi?id=54021 )
that using high order pages for skb allocations is problematic for atl1c
We do not know exactly what the problem is, but we suspect that crossing
4K pages is not well supported by this hardware.
Use a custom allocator, using page allocator and 2K fragments for
optimal stack behavior. We might make this allocator generic
in future kernels.
It was finally narrowed down to unloading a module that was being traced.
It was actually more than that. When functions are being traced, there's
a table of all functions that have a ref count of the number of active
tracers attached to that function. When a function trace callback is
registered to a function, the function's record ref count is incremented.
When it is unregistered, the function's record ref count is decremented.
If an inconsistency is detected (ref count goes below zero) the above
warning is shown and the function tracing is permanently disabled until
reboot.
The ftrace callback ops holds a hash of functions that it filters on
(and/or filters off). If the hash is empty, the default means to filter
all functions (for the filter_hash) or to disable no functions (for the
notrace_hash).
When a module is unloaded, it frees the function records that represent
the module functions. These records exist on their own pages, that is
function records for one module will not exist on the same page as
function records for other modules or even the core kernel.
Now when a module unloads, the records that represents its functions are
freed. When the module is loaded again, the records are recreated with
a default ref count of zero (unless there's a callback that traces all
functions, then they will also be traced, and the ref count will be
incremented).
The problem is that if an ftrace callback hash includes functions of the
module being unloaded, those hash entries will not be removed. If the
module is reloaded in the same location, the hash entries still point
to the functions of the module but the module's ref counts do not reflect
that.
With the help of Steve and Joern, we found a reproducer:
Using uinput module and uinput_release function.
cd /sys/kernel/debug/tracing
modprobe uinput
echo uinput_release > set_ftrace_filter
echo function > current_tracer
rmmod uinput
modprobe uinput
# check /proc/modules to see if loaded in same addr, otherwise try again
echo nop > current_tracer
[BOOM]
The above loads the uinput module, which creates a table of functions that
can be traced within the module.
We add uinput_release to the filter_hash to trace just that function.
Enable function tracincg, which increments the ref count of the record
associated to uinput_release.
Remove uinput, which frees the records including the one that represents
uinput_release.
Load the uinput module again (and make sure it's at the same address).
This recreates the function records all with a ref count of zero,
including uinput_release.
Disable function tracing, which will decrement the ref count for uinput_release
which is now zero because of the module removal and reload, and we have
a mismatch (below zero ref count).
The solution is to check all currently tracing ftrace callbacks to see if any
are tracing any of the module's functions when a module is loaded (it already does
that with callbacks that trace all functions). If a callback happens to have
a module function being traced, it increments that records ref count and starts
tracing that function.
There may be a strange side effect with this, where tracing module functions
on unload and then reloading a new module may have that new module's functions
being traced. This may be something that confuses the user, but it's not
a big deal. Another approach is to disable all callback hashes on module unload,
but this leaves some ftrace callbacks that may not be registered, but can
still have hashes tracing the module's function where ftrace doesn't know about
it. That situation can cause the same bug. This solution solves that case too.
Another benefit of this solution, is it is possible to trace a module's
function on unload and load.
Ben Widawsky [Tue, 30 Jul 2013 23:27:57 +0000 (16:27 -0700)]
drm/i915: fix missed hunk after GT access breakage
Upon some code refactoring, a hunk was missed. This was fixed for
next, but missed the current trees, and hasn't yet been merged by Dave
Airlie. It is fixed in:
commit 907b28c56ea40629aa6595ddfa414ec2fc7da41c
Author: Chris Wilson <[email protected]>
Date: Fri Jul 19 20:36:52 2013 +0100
drm/i915: Colocate all GT access routines in the same file
Pablo Neira [Mon, 29 Jul 2013 10:30:04 +0000 (12:30 +0200)]
genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE
Currently, it is not possible to use neither NLM_F_EXCL nor
NLM_F_REPLACE from genetlink. This is due to this checking in
genl_family_rcv_msg:
if (nlh->nlmsg_flags & NLM_F_DUMP)
NLM_F_DUMP is NLM_F_MATCH|NLM_F_ROOT. Thus, if NLM_F_EXCL or
NLM_F_REPLACE flag is set, genetlink believes that you're
requesting a dump and it calls the .dumpit callback.
The solution that I propose is to refine this checking to
make it stricter:
if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)
And given the combination NLM_F_REPLACE and NLM_F_EXCL does
not make sense to me, it removes the ambiguity.
There was a patch that tried to fix this some time ago (0ab03c2
netlink: test for all flags of the NLM_F_DUMP composite) but it
tried to resolve this ambiguity in *all* existing netlink subsystems,
not only genetlink. That patch was reverted since it broke iproute2,
which is using NLM_F_ROOT to request the dump of the routing cache.
Dan Carpenter [Sun, 28 Jul 2013 20:04:45 +0000 (23:04 +0300)]
af_key: more info leaks in pfkey messages
This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages". There are some struct members which don't get initialized
and could disclose small amounts of private information.
Samuel Ortiz [Tue, 30 Jul 2013 23:19:43 +0000 (01:19 +0200)]
NFC: netlink: Rename CMD_FW_UPLOAD to CMD_FW_DOWNLOAD
Loading a firmware into a target is typically called firmware
download, not firmware upload. So we rename the netlink API to
NFC_CMD_FW_DOWNLOAD in order to avoid any terminology confusion from
userspace.
Paul Bolle [Wed, 24 Jul 2013 22:06:07 +0000 (15:06 -0700)]
RDMA/cma: Fix gcc warning
Building cma.o triggers this gcc warning:
drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here
This is a false positive, as "port" will always be initialized if we're
at "found". But if we assign to "id_priv->id.port_num" directly, we can
drop "port". That will, obviously, silence gcc.
net/fec: Don't let ndo_start_xmit return NETDEV_TX_BUSY without link
Don't test for having link and let hardware deal with this situation.
Without this patch I see a machine running an -rt patched Linux being
stuck in sch_direct_xmit when it looses link while there is still a
packet to be sent. In this case the fec_enet_start_xmit routine returned
NETDEV_TX_BUSY which makes the network stack reschedule the packet and
so sch_direct_xmit calls fec_enet_start_xmit again.
I failed to reproduce a complete hang without -rt, but I think the
problem exists there, too.