Heiko Carstens [Mon, 23 May 2011 08:24:32 +0000 (10:24 +0200)]
[S390] percpu: implement arch specific irqsafe_cpu_ops
Implement arch specific irqsafe_cpu ops. The arch specific ops do not
disable/enable interrupts since that is an expensive operation. Instead
we disable preemption and perform a compare and swap loop.
Since on server distros (the ones we care about) preemption is disabled
the preempt_disable()/preempt_enable() pair is a nop.
In the end this code should be faster than the generic one.
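The compare-and-swap idea described above can be sketched in userspace C11. This is only an illustration and not the s390 implementation: demo_counter and demo_cpu_add() are hypothetical names, and the kernel's preempt_disable()/preempt_enable() pair has no equivalent here, so it is only mentioned in a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one CPU's slot of a per-cpu counter. */
static _Atomic long demo_counter;

/*
 * Add 'val' with a compare-and-swap loop instead of masking interrupts.
 * If something else (in the kernel: an interrupt handler) modifies the
 * counter between the load and the swap, the swap fails, 'old' is
 * refreshed with the current value and the loop retries.
 * The kernel code would additionally wrap this in
 * preempt_disable()/preempt_enable(); that part is omitted here.
 */
static long demo_cpu_add(_Atomic long *ptr, long val)
{
	long old;

	assert(ptr != NULL);
	old = atomic_load_explicit(ptr, memory_order_relaxed);
	while (!atomic_compare_exchange_weak_explicit(ptr, &old, old + val,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;	/* 'old' now holds the fresh value; try again */
	return old + val;
}
```

The retry loop is what makes the update safe without disabling interrupts: an interrupted update is never lost, it is simply redone.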
The concepts of VDSO and gcov-based profiling don't mix: the former
includes kernel-provided code running in userspace, the latter adds
instructions that modify counters in kernel data segments. On s390
this has not been a problem so far due to VDSO code being written in
all-assembler which is exempt from gcov-based profiling. This could
change in the future, so disable profiling explicitly for VDSO code.
Heiko Carstens [Mon, 23 May 2011 08:24:29 +0000 (10:24 +0200)]
[S390] extmem: get rid of compile warning
Get rid of these:
arch/s390/mm/extmem.c: In function 'segment_modify_shared':
arch/s390/mm/extmem.c:622:3: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized]
arch/s390/mm/extmem.c:627:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized]
arch/s390/mm/extmem.c: In function 'segment_load':
arch/s390/mm/extmem.c:481:11: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized]
arch/s390/mm/extmem.c:480:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized]
Heiko Carstens [Mon, 23 May 2011 08:24:27 +0000 (10:24 +0200)]
[S390] monwriter: fix return code handling
Fix return code handling within monwrite_new_hdr(). Return code handling
is implemented everywhere; the return code of the diagnose function was
just not passed on.
Heiko Carstens [Mon, 23 May 2011 08:24:24 +0000 (10:24 +0200)]
[S390] Remove tape block device driver.
Remove the tape block device driver. It's not of real use but has
already created some confusion when users wanted to access tape devices
and used the block device nodes instead of the character device nodes.
Also remove the whole tape documentation since it's completely outdated
and we have the device drivers book which is the place where everything
is properly documented.
The noexec support on s390 does not rely on a bit in the page table
entry but utilizes the secondary space mode to distinguish between
memory accesses for instructions vs. data. The noexec code relies
on the assumption that the cpu will always use the secondary space
page table for data accesses while it is running in the secondary
space mode. Up to the z9-109 class machines this has been the case.
Unfortunately this is not true anymore with z10 and later machines.
The load-relative-long instructions lrl, lgrl and lgfrl access the
memory operand using the same addressing-space mode that has been
used to fetch the instruction.
This breaks the noexec mode for all user space binaries compiled
with march=z10 or later. The only option is to remove the current
noexec support.
Sebastian Ott [Mon, 23 May 2011 08:23:32 +0000 (10:23 +0200)]
[S390] cio: fix unreg race in set_online path
In ccw_device_set_online we basically start path verification and
wait for the device to reach a final state. If it turns out that the
device has no usable path we schedule the deregistration of the
device (which is still in a non-final state) and wake up the waiting
process. The deregistration process will set a final state, but if
the wake up happens to be prior to this, the device will hang forever
in ccw_device_set_online.
To fix this just set the final NOT_OPER state prior to the scheduled
deregistration of the device.
Paul Mundt [Mon, 23 May 2011 08:09:30 +0000 (17:09 +0900)]
sh: Ignore R_SH_NONE module relocations.
Some modules may end up with R_SH_NONE relocs with the right combination
of compiler/kernel config (specifically dwarf unwinder), so simply trap
and ignore them instead of letting them get down to the error path.
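A minimal sketch of the "trap and ignore" approach, assuming a simplified relocation handler. The DEMO_* constants and demo_apply_reloc() are hypothetical and only model the control flow; they are not the sh module loader code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation type numbers, for illustration only. */
enum {
	DEMO_R_NONE  = 0,	/* no-op relocation */
	DEMO_R_DIR32 = 1,	/* add a 32-bit value at the target */
};

/* Returns 0 on success, -1 for an unknown relocation type. */
static int demo_apply_reloc(unsigned int type, uint32_t *where, uint32_t value)
{
	assert(where != NULL);

	switch (type) {
	case DEMO_R_NONE:
		/* Nothing to patch: handle it here so it never reaches
		 * the error path below. */
		return 0;
	case DEMO_R_DIR32:
		*where += value;
		return 0;
	default:
		return -1;	/* genuinely unknown relocation */
	}
}
```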
Linus Torvalds [Mon, 23 May 2011 05:43:01 +0000 (22:43 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: use mark_buffer_dirty to mark btnode or meta data dirty
nilfs2: always set back pointer to host inode in mapping->host
nilfs2: get rid of NILFS_I_NILFS
nilfs2: use list_first_entry
nilfs2: use empty_aops for gc-inodes
nilfs2: implement resize ioctl
nilfs2: add truncation routine of segment usage file
nilfs2: add routine to move secondary super block
nilfs2: add ioctl which limits range of segment to be allocated
nilfs2: zero fill unused portion of super root block
nilfs2: super root size should change depending on inode size
nilfs2: get rid of private page allocator
nilfs2: merge list_del()/list_add_tail() to list_move_tail()
This patch updates the clocksource part of the TMU driver
to make use of the __clocksource_updatefreq_hz() function.
Without this patch the old code uses clocksource_register()
together with a hack that assumes a never changing clock rate
(see clk_enable(), clk_get_rate() and clk_disable()).
The patch uses clocksource_register_hz() with 1 Hz as initial
value, then lets the ->enable() callback update the value
with __clocksource_updatefreq_hz() once the struct clk has
been enabled and the frequency is stable.
This patch updates the clocksource part of the CMT driver
to make use of the __clocksource_updatefreq_hz() function.
Without this patch the old code uses clocksource_register()
together with a hack that assumes a never changing clock rate
(see clk_enable(), clk_get_rate() and clk_disable()).
The patch uses clocksource_register_hz() with 1 Hz as initial
value, then lets the ->enable() callback update the value
with __clocksource_updatefreq_hz() once the struct clk has
been enabled and the frequency is stable.
Artem Bityutskiy [Tue, 17 May 2011 12:15:30 +0000 (15:15 +0300)]
UBIFS: switch to dynamic printks
Switch to debugging using dynamic printk (pr_debug()). There is no good reason
to carry custom debugging prints when there is such a cool and powerful generic
dynamic printk infrastructure, see Documentation/dynamic-debug-howto.txt. With
dynamic printks we can switch individual prints on/off per file, per function
and per format message. This means that instead of doing the old-fashioned
echo 1 > /sys/module/ubifs/parameters/debug_msgs
to enable general messages, we can do:
echo 'format "UBIFS DBG gen" +ptlf' > control
to enable general messages and additionally ask the dynamic printk
infrastructure to print process ID, line number and function name. So there is
no reason to keep UBIFS-specific crud when a more powerful generic mechanism exists.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
ide/ide-scan-pci.c: Use for_each_pci_dev().
ide: Use linux/mutex.h
IDE: ide-floppy, remove unnecessary NULL check
drivers/ide/pmac.c: Remove unnecessary casts of pci_get_drvdata
ide: fix use after free in ide-acpi
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (28 commits)
sparc32: fix build, fix missing cpu_relax declaration
SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI
sparc32,leon: Remove unnecessary page_address calls in LEON DMA API.
sparc: convert old cpumask API into new one
sparc32, sun4d: Implemented SMP IPIs support for SUN4D machines
sparc32, sun4m: Implemented SMP IPIs support for SUN4M machines
sparc32,leon: Implemented SMP IPIs for LEON CPU
sparc32: implement SMP IPIs using the generic functions
sparc32,leon: SMP power down implementation
sparc32,leon: added some SMP comments
sparc: add {read,write}*_be routines
sparc32,leon: don't rely on bootloader to mask IRQs
sparc32,leon: operate on boot-cpu IRQ controller registers
sparc32: always define boot_cpu_id
sparc32: removed unused code, implemented by generic code
sparc32: avoid build warning at mm/percpu.c:1647
sparc32: always register a PROM based early console
sparc32: probe for cpu info only during startup
sparc: consolidate show_cpuinfo in cpu.c
sparc32,leon: implement genirq CPU affinity
...
Linus Torvalds [Mon, 23 May 2011 05:03:03 +0000 (22:03 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: allow resync_start to be set while an array is active.
md/raid10: reformat some loops with less indenting.
md/raid10: remove unused variable.
md/raid10: make more use of 'slot' in raid10d.
md/raid10: some tidying up in fix_read_error
md/raid1: improve handling of pages allocated for write-behind.
md/raid1: try fix_sync_read_error before process_checks.
md/raid1: tidy up new functions: process_checks and fix_sync_read_error.
md/raid1: split out two sub-functions from sync_request_write
md: make error_handler functions more uniform and correct.
md/multipath: discard ->working_disks in favour of ->degraded
md/raid1: clean up read_balance.
md: simplify raid10 read_balance
md/bitmap: fix saving of events_cleared and other state.
md: reject a re-add request that cannot be honoured.
md: Fix race when creating a new md device.
Randy Dunlap [Mon, 23 May 2011 00:22:45 +0000 (17:22 -0700)]
wireless: fix fatal kernel-doc error + warning in mac80211.h
Fix new kernel-doc Error and Warning in <net/mac80211.h>:
Error(linux-2.6.39-git5/include/net/mac80211.h:550): cannot understand prototype: 'struct ieee80211_sched_scan_ies '
Warning(linux-2.6.39-git5/include/net/mac80211.h:2289): No description found for parameter 'sta'
Linus Torvalds [Mon, 23 May 2011 04:37:01 +0000 (21:37 -0700)]
x86: setup_smep needs to be __cpuinit
The setup_smep function gets called at resume time too, and is thus not a
pure __init function. When marked as __init, it gets thrown out after
the kernel has initialized, and when the kernel is suspended and
resumed, the code will no longer be around, and we'll get a nice "kernel
tried to execute NX-protected page" oops because the page is no longer
marked executable.
Linus Torvalds [Sun, 22 May 2011 23:51:43 +0000 (16:51 -0700)]
Remove prefetch() from <linux/skbuff.h> and "netlabel_addrlist.h"
Commit e66eed651fd1 ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h. The skbuff
list traversal still had them.
Quoth David Miller:
"Please just remove the prefetches.
Those are modelled after list.h as I intend to eventually convert
SKB list handling to "struct list_head" but we're not there yet.
Therefore if we kill prefetches from list.h we should kill it from
these things in skbuff.h too."
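To illustrate what "removing the prefetches" means for a list walk, here is a small userspace sketch. struct demo_node and both functions are hypothetical; __builtin_prefetch() is the GCC/Clang builtin and merely stands in for the kernel's prefetch().

```c
#include <stddef.h>

/* Hypothetical singly linked list node, for illustration only. */
struct demo_node {
	struct demo_node *next;
	int value;
};

/* Old style: hint the next node to the CPU before using the current one. */
static long demo_sum_prefetch(const struct demo_node *head)
{
	long sum = 0;

	for (const struct demo_node *n = head; n; n = n->next) {
		__builtin_prefetch(n->next);	/* a hint only; never faults */
		sum += n->value;
	}
	return sum;
}

/* New style: a plain walk that leaves prefetching to the hardware. */
static long demo_sum_plain(const struct demo_node *head)
{
	long sum = 0;

	for (const struct demo_node *n = head; n; n = n->next)
		sum += n->value;
	return sum;
}
```

Both functions compute the same result; the only difference is the explicit hint, which is what the commit drops from the skbuff iterators.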
Paul Gortmaker [Sun, 22 May 2011 20:47:17 +0000 (16:47 -0400)]
Add appropriate <linux/prefetch.h> include for prefetch users
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout. Give
them an explicit include via the following $0.02 script.
=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
	grep -q '<linux/prefetch.h>' $i
	if [ $? = 0 ] ; then
		continue
	fi
	(	echo '?^#include <linux/?a'
		echo '#include <linux/prefetch.h>'
		echo .
		echo w
		echo q
	) | ed -s $i > /dev/null 2>&1
	if [ $? != 0 ]; then
		echo $i needs manual fixup
		MANUAL="$i $MANUAL"
	fi
done
echo ------------------- 8\<----------------------
echo vi $MANUAL
=========================================
Signed-off-by: Paul <[email protected]>
[ Fixed up some incorrect #include placements, and added some
non-network drivers and the fib_trie.c case - Linus ]
Signed-off-by: Linus Torvalds <[email protected]>
CC arch/sparc/kernel/asm-offsets.s
In file included from include/linux/seqlock.h:29:0,
from include/linux/time.h:8,
from include/linux/timex.h:56,
from include/linux/sched.h:57,
from arch/sparc/kernel/asm-offsets.c:13:
include/linux/spinlock.h: In function 'spin_unlock_wait':
include/linux/spinlock.h:360:2: error: implicit declaration of function 'cpu_relax'
Most likely caused by commit e66eed651fd1 ("list: remove
prefetching from regular list iterators") due to include
changes.
dmaengine: shdma: synchronize RCU before freeing, simplify spinlock
List elements, deleted using list_del_rcu(), cannot be freed without
synchronising RCU. Further, the spinlock, used to protect the RCU
writer, is called in process context, so, we don't have to save flags.
dmaengine: shdma: add runtime- and system-level power management
This patch extends and fixes runtime power management in the shdma
driver to support powering down the DMA controller and adds support
for system-level suspend and resume.
Magnus Damm [Thu, 21 Apr 2011 13:08:46 +0000 (13:08 +0000)]
serial: sh-sci: suspend/resume wakeup support V2
This patch adds wakeup support to the sh-sci driver. The serial
core deals with all details but defaults to wakeup disabled. So
to make use of this feature enable wakeup in sysfs:
Eric Dumazet [Thu, 19 May 2011 23:42:09 +0000 (23:42 +0000)]
net: avoid synchronize_rcu() in dev_deactivate_many
dev_deactivate_many() issues one synchronize_rcu() call after the qdiscs are
set to noop_qdisc.
This call is here to make sure there are no outstanding qdisc-less
dev_queue_xmit calls before returning to the caller.
But in the dismantle phase we don't have to wait, because we won't activate
the device again, and we are going to wait one rcu grace period later in
rollback_registered_many().
After this patch, device dismantle uses one synchronize_net() and one
rcu_barrier() call only, so we have a ~30% speedup and a smaller RTNL
latency.
Amerigo Wang [Thu, 19 May 2011 21:39:10 +0000 (21:39 +0000)]
netpoll: disable netpoll when enslave a device
V3: rename NETDEV_ENSLAVE to NETDEV_JOIN
Currently we do nothing when we enslave a net device which is running netconsole.
Neil pointed out that we may get weird results in such a case, so let's disable
netpoll on the device being enslaved. I think it is too harsh to prevent
the device from being enslaved if it is running netconsole.
By the way, this patch also removes NETDEV_GOING_DOWN from the netconsole
netdev notifier, because netpoll will check whether the device is running or not
and we don't handle NETDEV_PRE_UP either.
David Ward [Thu, 19 May 2011 02:53:20 +0000 (02:53 +0000)]
macvlan: Forward unicast frames in bridge mode to lowerdev
Unicast frames between macvlan interfaces in bridge mode are not otherwise
sent to network taps on the lowerdev (as all other macvlan frames are), so
forward the frames to the receive queue of the lowerdev first.
Paul Gortmaker [Sun, 22 May 2011 11:02:08 +0000 (11:02 +0000)]
drivers/net: add prefetch header for prefetch users
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout. Give
them an explicit include via the following $0.02 script.
=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
	grep -q '<linux/prefetch.h>' $i
	if [ $? = 0 ] ; then
		continue
	fi
	(	echo '?^#include <linux/?a'
		echo '#include <linux/prefetch.h>'
		echo .
		echo w
		echo q
	) | ed -s $i > /dev/null 2>&1
	if [ $? != 0 ]; then
		echo $i needs manual fixup
		MANUAL="$i $MANUAL"
	fi
done
echo ------------------- 8\<----------------------
echo vi $MANUAL
=========================================
In case of checksum error, the framing layer returns -EILSEQ, but
does not free the packet. Plug this hole by freeing the packet if
-EILSEQ is returned.
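The leak and its fix can be modelled in a few lines of userspace C. Everything here (struct demo_pkt, the demo_* functions and the demo_frees counter) is hypothetical and only mirrors the ownership rule described above: the framing step reports -EILSEQ without freeing, so the caller has to free.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet type, for illustration only. */
struct demo_pkt {
	unsigned char *data;
	size_t len;
};

static int demo_frees;	/* counts frees so the behaviour can be observed */

static struct demo_pkt *demo_pkt_new(const unsigned char *bytes, size_t len)
{
	struct demo_pkt *pkt = malloc(sizeof(*pkt));

	if (!pkt)
		return NULL;
	pkt->data = malloc(len ? len : 1);
	if (!pkt->data) {
		free(pkt);
		return NULL;
	}
	memcpy(pkt->data, bytes, len);
	pkt->len = len;
	return pkt;
}

static void demo_pkt_free(struct demo_pkt *pkt)
{
	free(pkt->data);
	free(pkt);
	demo_frees++;
}

/* Framing step: returns -EILSEQ on checksum mismatch, does NOT free. */
static int demo_frame_check(const struct demo_pkt *pkt, unsigned char expected)
{
	unsigned char sum = 0;

	for (size_t i = 0; i < pkt->len; i++)
		sum += pkt->data[i];
	return sum == expected ? 0 : -EILSEQ;
}

/* Caller: plugs the leak by freeing the packet on -EILSEQ. */
static int demo_receive(struct demo_pkt *pkt, unsigned char expected)
{
	int err = demo_frame_check(pkt, expected);

	if (err == -EILSEQ)
		demo_pkt_free(pkt);
	return err;
}
```

On success the packet stays owned by the caller's caller, so demo_receive() frees it only on the checksum error path.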
caif: Update documentation of CAIF transmit and receive functions.
Trivial patch updating documentation in header files only.
Error handling of CAIF transmit errors was changed by commit:
caif: Don't resend if dev_queue_xmit fails.
This patch updates the documentation accordingly.
CAIF Socket layer - caif_socket.c:
- Plug mem-leak at reconnect.
- Always call disconnect to cleanup CAIF stack.
- Disconnect will always report success.
CAIF configuration layer - cfcnfg.c
- Disconnect must dismantle the caif stack correctly
- Protect against faulty removals (check on id zero)
CAIF mux layer - cfmuxl.c
- When inserting new service layer in the MUX remove
any old entries with the same ID.
- When removing CAIF Link layer, remove the associated
service layers before notifying service layers.
Linus Torvalds [Sun, 22 May 2011 21:30:36 +0000 (14:30 -0700)]
Give up on pushing CC_OPTIMIZE_FOR_SIZE
I still happen to believe that I$ miss costs are a major thing, but
sadly, -Os doesn't seem to be the solution. With or without it, gcc
will miss some obvious code size improvements, and with it enabled gcc
will sometimes make choices that aren't good even with high I$ miss
ratios.
For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy
into a "rep movsl". While I sincerely hope that x86 CPU's will some day
do a good job at that, they certainly don't do it yet, and the cost is
higher than a L1 I$ miss would be.
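For readers who want to look at this themselves, a 20-byte constant memcpy of the kind mentioned above might look like the sketch below. struct demo_hdr and demo_copy_hdr() are hypothetical names. Compiling the file once with gcc -O2 -S and once with gcc -Os -S and comparing the assembly is one way to check; the generated code depends on the compiler version and target, so no particular output is claimed here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 20-byte structure: five 32-bit words. */
struct demo_hdr {
	uint32_t words[5];
};

_Static_assert(sizeof(struct demo_hdr) == 20, "demo_hdr must be 20 bytes");

/* The size is a compile-time constant, so the compiler is free to expand
 * the copy inline (a few moves) or to use a string instruction. */
static void demo_copy_hdr(struct demo_hdr *dst, const struct demo_hdr *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```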
Linus Torvalds [Sun, 22 May 2011 19:39:58 +0000 (12:39 -0700)]
Merge branch 'viafb-next' of git://github.com/schandinat/linux-2.6
* 'viafb-next' of git://github.com/schandinat/linux-2.6: (24 commits)
viafb: Automatic OLPC XO-1.5 configuration
viafb: remove unused CEA mode
viafb: try to map less memory in case of failure
viafb: use write combining for video ram
viafb: add X server compatibility mode
viafb: reduce OLPC refresh a bit
viafb: fix OLPC XO 1.5 device connection
viafb: fix OLPC DCON refresh rate
viafb: delete clock and PLL initialization
viafb: replace custom return values
viafb: some small cleanup for global variables
viafb: gather common good, old VGA initialization in one place
viafb: add engine clock support
viafb: add VIA slapping capability
viafb: split clock and PLL code to an extra file
viafb: add primary/secondary clock on/off switches
viafb: add clock source selection and PLL power management support
viafb: prepare for PLL separation
viafb: call viafb_get_clk_value only in viafb_set_vclock
viafb: remove unused max_hres/vres
...
Linus Torvalds [Sun, 22 May 2011 19:38:40 +0000 (12:38 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] wire up syncfs syscall
[PARISC] wire up the fhandle syscalls
[PARISC] wire up clock_adjtime syscall
[PARISC] wire up fanotify syscalls
[PARISC] prevent speculative re-read on cache flush
[PARISC] only make executable areas executable
[PARISC] fix pacache .size with new binutils
Heiko Carstens [Sun, 22 May 2011 16:54:21 +0000 (18:54 +0200)]
fs: add missing prefetch.h include
Fixes this build error on s390 and probably other archs as well:
fs/inode.c: In function 'new_inode':
fs/inode.c:894:2: error: implicit declaration of function 'spin_lock_prefetch'
Signed-off-by: Heiko Carstens <[email protected]>
[ Happens on architectures that don't define their own prefetch
functions in <asm/processor.h>, and instead rely on the default
ones in <linux/prefetch.h> - Linus]
Signed-off-by: Linus Torvalds <[email protected]>
Heiko Carstens [Sun, 22 May 2011 16:55:10 +0000 (18:55 +0200)]
net: add missing prefetch.h include
Fixes build errors on s390 and probably other archs as well:
In file included from net/ipv4/ip_forward.c:32:0:
include/net/udp.h: In function 'udp_csum_outgoing':
include/net/udp.h:141:2: error: implicit declaration of function 'prefetch'
Gleb Natapov [Wed, 4 May 2011 13:31:04 +0000 (16:31 +0300)]
KVM: make guest mode entry to be rcu quiescent state
KVM does not hold any references to rcu protected data when it switches
a CPU into guest mode. In fact, switching to guest mode is very similar
to exiting to userspace from the rcu point of view. In addition, a CPU may stay
in guest mode for quite a long time (up to one time slice). Let's treat
guest mode as a quiescent state, just like we do with user-mode execution.
* commit '29ce831000081dd757d3116bf774aafffc4b6b20': (34 commits)
rcu: provide rcu_virt_note_context_switch() function.
rcu: get rid of signed overflow in check_cpu_stall()
rcu: optimize rcutiny
rcu: prevent call_rcu() from diving into rcu core if irqs disabled
rcu: further lower priority in rcu_yield()
rcu: introduce kfree_rcu()
rcu: fix spelling
rcu: call __rcu_read_unlock() in exit_rcu for tree RCU
rcu: Converge TINY_RCU expedited and normal boosting
rcu: remove useless ->boosted_this_gp field
rcu: code cleanups in TINY_RCU priority boosting.
rcu: Switch to this_cpu() primitives
rcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings
rcu: mark rcutorture boosting callback as being on-stack
rcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment
rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT
rcu: Add forward-progress diagnostic for per-CPU kthreads
rcu: add grace-period age and more kthread state to tracing
rcu: fix tracing bug thinko on boost-balk attribution
rcu: update tracing documentation for new rcutorture and rcuboost
...
Pulling in rcu_virt_note_context_switch().
Signed-off-by: Avi Kivity <[email protected]>
KVM: Validate userspace_addr of memslot when registered
This way, we can avoid checking the user space address many times when
we read the guest memory.
Although we can do the same for writes if we check which slots are
writable, we do not care right now: reading the guest memory happens
more often than writing.
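The "validate once at registration" idea can be sketched as follows. This is a userspace model with hypothetical names (struct demo_memslot, demo_register_slot(), demo_read_guest()); the address check is a simple stand-in and not the check KVM performs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical memory slot: maps a guest range onto host memory. */
struct demo_memslot {
	uint64_t guest_base;
	unsigned char *host_addr;
	size_t size;
};

/* Validate the host address range once, when the slot is registered. */
static int demo_register_slot(struct demo_memslot *slot, uint64_t guest_base,
			      unsigned char *host_addr, size_t size)
{
	if (host_addr == NULL || size == 0 ||
	    (uintptr_t)host_addr > UINTPTR_MAX - size)
		return -EINVAL;

	slot->guest_base = guest_base;
	slot->host_addr = host_addr;
	slot->size = size;
	return 0;
}

/* Reads only do a bounds check against the slot; the host address itself
 * is trusted because it was checked at registration time. */
static int demo_read_guest(const struct demo_memslot *slot, uint64_t gpa,
			   void *dst, size_t len)
{
	uint64_t off;

	if (gpa < slot->guest_base)
		return -EFAULT;
	off = gpa - slot->guest_base;
	if (off > slot->size || len > slot->size - off)
		return -EFAULT;
	memcpy(dst, slot->host_addr + off, len);
	return 0;
}
```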
Scott Wood [Tue, 29 Mar 2011 21:49:10 +0000 (16:49 -0500)]
KVM: PPC: e500: emulate SVR
Return the actual host SVR for now, as we already do for PVR. Eventually
we may support Qemu overriding PVR/SVR if the situation is appropriate,
once we implement KVM_SET_SREGS on e500.
Avi Kivity [Wed, 27 Apr 2011 16:42:18 +0000 (19:42 +0300)]
KVM: VMX: Cache vmcs segment fields
Since the emulator now checks segment limits and access rights, it
generates a lot more accesses to the vmcs segment fields. Undo some
of the performance hit by caching those fields in a read-only cache
(the entire cache is invalidated on any write, or on guest exit).
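A read-only cache with whole-cache invalidation can be modelled like this. The names (struct demo_seg_cache, demo_hw[], demo_seg_read(), demo_seg_write()) are hypothetical, and the array demo_hw[] merely stands in for the expensive vmcs field reads; this is a sketch of the technique, not the VMX code.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_NR_FIELDS = 4 };

/* One valid bit per cached field plus the cached values. */
struct demo_seg_cache {
	uint32_t valid;
	uint32_t value[DEMO_NR_FIELDS];
};

static uint32_t demo_hw[DEMO_NR_FIELDS];	/* stands in for the vmcs */
static unsigned int demo_hw_reads;		/* counts the slow reads */

static uint32_t demo_hw_read(unsigned int field)
{
	demo_hw_reads++;
	return demo_hw[field];
}

/* Read through the cache: only the first read of a field is slow. */
static uint32_t demo_seg_read(struct demo_seg_cache *c, unsigned int field)
{
	assert(field < DEMO_NR_FIELDS);
	if (!(c->valid & (1u << field))) {
		c->value[field] = demo_hw_read(field);
		c->valid |= 1u << field;
	}
	return c->value[field];
}

/* Writes bypass the cache and invalidate all of it. */
static void demo_seg_write(struct demo_seg_cache *c, unsigned int field,
			   uint32_t v)
{
	assert(field < DEMO_NR_FIELDS);
	demo_hw[field] = v;
	c->valid = 0;
}
```

Invalidating everything on any write keeps the bookkeeping trivial, at the cost of a few extra slow reads after each write; a guest exit would clear the valid mask in the same way.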
The CPUIDs for Centaur are added, and then the features of
PadLock hardware engine on VIA CPU, such as "ace", "ace_en"
and so on, can be passed into the kvm guest.