Git Repo - linux.git/log

cgroups: add a read-only "procs" file similar to "tasks" that shows only unique tgids

struct cgroup used to have a bunch of fields for keeping track of the
pidlist for the tasks file. Those are now separated into a new struct
cgroup_pidlist, of which two are had, one for procs and one for tasks.
The way the seq_file operations are set up is changed so that just the
pidlist struct gets passed around as the private data.

Interface example: Suppose a multithreaded process has pid 1000 and other
threads with ids 1001, 1002, 1003:
$ cat tasks
1000
1001
1002
1003
$ cat cgroup.procs
1000
$

Signed-off-by: Ben Blum <[email protected]>
Signed-off-by: Paul Menage <[email protected]>
Acked-by: Li Zefan <[email protected]>
Cc: Matt Helsley <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: revert "cgroups: fix pid namespace bug"

The following series adds a "cgroup.procs" file to each cgroup that
reports unique tgids rather than pids, and allows all threads in a
threadgroup to be atomically moved to a new cgroup.

The subsystem "attach" interface is modified to support attaching whole
threadgroups at a time, which could introduce potential problems if any
subsystem were to need to access the old cgroup of every thread being
moved.  The attach interface may need to be revised if this becomes the
case.

Also added is functionality for read/write locking all CLONE_THREAD
fork()ing within a threadgroup, by means of an rwsem that lives in the
sighand_struct, for per-threadgroup-ness and also for sharing a cacheline
with the sighand's atomic count.  This scheme should introduce no extra
overhead in the fork path when there's no contention.

The final patch reveals potential for a race when forking before a
subsystem's attach function is called - one potential solution in case any
subsystem has this problem is to hang on to the group's fork mutex through
the attach() calls, though no subsystem yet demonstrates need for an
extended critical section.

This patch:

Revert

commit 096b7fe012d66ed55e98bc8022405ede0cc80e96
Author:     Li Zefan <[email protected]>
AuthorDate: Wed Jul 29 15:04:04 2009 -0700
Commit:     Linus Torvalds <[email protected]>
CommitDate: Wed Jul 29 19:10:35 2009 -0700

    cgroups: fix pid namespace bug

This is in preparation for some clashing cgroups changes that subsume the
original commit's functionaliy.

The original commit fixed a pid namespace bug which Ben Blum fixed
independently (in the same way, but with different code) as part of a
series of patches.  I played around with trying to reconcile Ben's patch
series with Li's patch, but concluded that it was simpler to just revert
Li's, given that Ben's patch series contained essentially the same fix.

Signed-off-by: Paul Menage <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Matt Helsley <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: allow cgroup hierarchies to be created with no bound subsystems

This patch removes the restriction that a cgroup hierarchy must have at
least one bound subsystem. The mount option "none" is treated as an
explicit request for no bound subsystems.

A hierarchy with no subsystems can be useful for plain task tracking, and
is also a step towards the support for multiply-bindable subsystems.

As part of this change, the hierarchy id is no longer calculated from the
bitmask of subsystems in the hierarchy (since this is not guaranteed to be
unique) but is allocated via an ida. Reference counts on cgroups from
css_set objects are now taken explicitly one per hierarchy, rather than
one per subsystem.

Example usage:

mount -t cgroup -o none,name=foo cgroup /mnt/cgroup

Based on the "no-op"/"none" subsystem concept proposed by
[email protected]

Signed-off-by: Paul Menage <[email protected]>
Reviewed-by: Li Zefan <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: add a back-pointer from struct cg_cgroup_link to struct cgroup

Currently the cgroups code makes the assumption that the subsystem
pointers in a struct css_set uniquely identify the hierarchy->cgroup
mappings associated with the css_set; and there's no way to directly
identify the associated set of cgroups other than by indirecting through
the appropriate subsystem state pointers.

This patch removes the need for that assumption by adding a back-pointer
from struct cg_cgroup_link object to its associated cgroup; this allows
the set of cgroups to be determined by traversing the cg_links list in
the struct css_set.

Signed-off-by: Paul Menage <[email protected]>
Reviewed-by: Li Zefan <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: move the cgroup debug subsys into cgroup.c to access internal state

While it's architecturally clean to have the cgroup debug subsystem be
completely independent of the cgroups framework, it limits its usefulness
for debugging the contents of internal data structures. Move the debug
subsystem code into the scope of all the cgroups data structures to make
more detailed debugging possible.

Signed-off-by: Paul Menage <[email protected]>
Reviewed-by: Li Zefan <[email protected]>
Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: support named cgroups hierarchies

To simplify referring to cgroup hierarchies in mount statements, and to
allow disambiguation in the presence of empty hierarchies and
multiply-bindable subsystems this patch adds support for naming a new
cgroup hierarchy via the "name=" mount option

A pre-existing hierarchy may be specified by either name or by subsystems;
a hierarchy's name cannot be changed by a remount operation.

Example usage:

# To create a hierarchy called "foo" containing the "cpu" subsystem
mount -t cgroup -oname=foo,cpu cgroup /mnt/cgroup1

# To mount the "foo" hierarchy on a second location
mount -t cgroup -oname=foo cgroup /mnt/cgroup2

Signed-off-by: Paul Menage <[email protected]>
Reviewed-by: Li Zefan <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cgroups: make unlock sequence in cgroup_get_sb consistent

Make the last unlock sequence consistent with previous unlock sequeue.

Acked-by: Balbir Singh <[email protected]>
Acked-by: Paul Menage <[email protected]>
Signed-off-by: Xiaotian Feng <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs: fix various Documentation/ paths in header files

Fix various Documentation/ paths in include/linux/.

Signed-off-by: Randy Dunlap <[email protected]>
Reviewed-by: Jesper Juhl <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

page-types: add feature for walking process address space

Introduce "-p|--pid <pid>" for walking the process address space.  The
default action is to walk raw memory PFNs.

Both the virtual address and physical address of each present pages will
be listed:

# ./tools/vm/page-types -lp $$ | head -3
voffset offset  len     flags
400     11bebe  1       __RU_lA____M______________________
402     11bebc  1       __RU_lA____M______________________

Note that voffset/offset/len are now showed as hex numbers.

[[email protected]: coding-style fixes]
Cc: Andi Kleen <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Documentation/vm/.gitignore: add page-types

Signed-off-by: Josh Triplett <[email protected]>
Cc: Wu Fengguang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

includecheck fix: Documentation, cfag12864b-example.c

fix the following 'make includecheck' warning:

Documentation/auxdisplay/cfag12864b-example.c: string.h is included more than once.

Signed-off-by: Jaswinder Singh Rajput <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Documentation: update stale definition of file-nr in fs.txt

In "documentation: update Documentation/filesystem/proc.txt and
Documentation/sysctls" (commit 760df93ec) we merged /proc/sys/fs
documentation in Documentation/sysctl/fs.txt and
Documentation/filesystem/proc.txt, but stale file-nr definition
remained.

This patch adds back the right fs-nr definition for 2.6 kernel.

Signed-off-by: Xiaotian Feng<[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

doc/filesystems: more mount cleanups

Documentation/filesystems/sharedsubtree.txt needs updating because the
mount command in util-linux package is well aware of shared subtree
features now. The patch also fixes two typos in sharedsubtree.txt.

Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

doc/filesystems: remove smount program

mount(8) handles shared subtrees just fine, so remove the smount program
from Documentation/filesystems/sharedsubtree.txt.

Fix annoying "Lets" -> "Let's".
Insert space between '#' prompt and "mount" command.

Signed-off-by: Randy Dunlap <[email protected]>
Acked-by: Miklos Szeredi <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

time: add function to convert between calendar time and broken-down time for universal use

There are many similar code in kernel for one object: convert time between
calendar time and broken-down time.

Here is some source I found:
  fs/ncpfs/dir.c
  fs/smbfs/proc.c
  fs/fat/misc.c
  fs/udf/udftime.c
  fs/cifs/netmisc.c
  net/netfilter/xt_time.c
  drivers/scsi/ips.c
  drivers/input/misc/hp_sdc_rtc.c
  drivers/rtc/rtc-lib.c
  arch/ia64/hp/sim/boot/fw-emu.c
  arch/m68k/mac/misc.c
  arch/powerpc/kernel/time.c
  arch/parisc/include/asm/rtc.h
  ...

We can make a common function for this type of conversion, At least we
can get following benefit:

1: Make kernel simple and unify
2: Easy to fix bug in converting code
3: Reduce clone of code in future
   For example, I'm trying to make ftrace display walltime,
   this patch will make me easy.

This code is based on code from glibc-2.6

Signed-off-by: Zhao Lei <[email protected]>
Cc: OGAWA Hirofumi <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hugetlbfs: do not call user_shm_lock() for MAP_HUGETLB fix

Commit 6bfde05bf5c ("hugetlbfs: allow the creation of files suitable for
MAP_PRIVATE on the vfs internal mount") altered can_do_hugetlb_shm() to
check if a file is being created for shared memory or mmap(). If this
returns false, we then unconditionally call user_shm_lock() triggering a
warning. This block should never be entered for MAP_HUGETLB. This
patch partially reverts the problem and fixes the check.

Signed-off-by: Eric B Munson <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Adam Litke <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Lee Schermerhorn <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ksm: change default values to better fit into mainline kernel

Now that ksm is in mainline it is better to change the default values to
better fit to most of the users.

This patch change the ksm default values to be:

ksm_thread_pages_to_scan = 100 (instead of 200)
ksm_thread_sleep_millisecs = 20 (like before)
ksm_run = KSM_RUN_STOP (instead of KSM_RUN_MERGE - meaning ksm is
disabled by default)
ksm_max_kernel_pages = nr_free_buffer_pages / 4 (instead of 2046)

The important aspect of this patch is: it disables ksm by default, and sets
the number of the kernel_pages that can be allocated to be a reasonable
number.

Signed-off-by: Izik Eidus <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

input: fix build failures caused by Kconfig Winbond WPCD376I Consumer IR hardware driver Kconfig entry

Fix these warnings:

  drivers/built-in.o: In function `apanel_remove':
  apanel.c:(.text+0x56e852): undefined reference to `led_classdev_unregister'
  drivers/built-in.o: In function `apanel_probe':
  apanel.c:(.text+0x56eae3): undefined reference to `led_classdev_register'
  drivers/built-in.o: In function `acpi_fujitsu_hotkey_add':
  fujitsu-laptop.c:(.text+0x5d7647): undefined reference to `led_classdev_register'
  fujitsu-laptop.c:(.text+0x5d76b5): undefined reference to `led_classdev_register'
  drivers/built-in.o: In function `wbcir_probe':
  winbond-cir.c:(.devinit.text+0x5f375): undefined reference to `led_classdev_register'
  winbond-cir.c:(.devinit.text+0x5f663): undefined reference to `led_classdev_unregister'
  drivers/built-in.o: In function `wbcir_remove':
  winbond-cir.c:(.devexit.text+0x7f23): undefined reference to `led_classdev_unregister'
  drivers/built-in.o: In function `fujitsu_cleanup':
  fujitsu-laptop.c:(.exit.text+0xbe37): undefined reference to `led_classdev_unregister'
  fujitsu-laptop.c:(.exit.text+0xbe53): undefined reference to `led_classdev_unregister'

It happens because the new INPUT_WINBOND_CIR driver relies on new-leds
infrastructure - but does not select it in drivers/input/misc/Kconfig.
But it selects LEDS_CLASS, which confuses a number of other drivers into
thinking that all the leds infrastructure is in place.

Fix this by selecting NEW_LEDS as well, like similar drivers do.

Eventually, this whole leds infrastructure complexity should be
cleaned up, it's been going on for years.

Signed-off-by: Ingo Molnar <[email protected]>
Cc: Jesse Barnes <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: David Härdeman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable into for-linus

Conflicts:
fs/btrfs/super.c

Btrfs: hash the btree inode during fill_super

The snapshot deletion patches dropped this line, but the inode
needs to be hashed.

Signed-off-by: Chris Mason <[email protected]>

Btrfs: relocate file extents in clusters

The extent relocation code copy file extents one by one when
relocating data block group. This is inefficient if file
extents are small. This patch makes the relocation code copy
file extents in clusters. So we can can make better use of
read-ahead.

Signed-off-by: Yan Zheng <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

Btrfs: don't rename file into dummy directory

A recent change enforces only one access point to each subvolume. The first
directory entry (the one added when the subvolume/snapshot was created) is
treated as valid access point, all other subvolume links are linked to dummy
empty directories. The dummy directories are temporary inodes that only in
memory, so we can not rename file into them.

Signed-off-by: Yan Zheng <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

Btrfs: check size of inode backref before adding hardlink

For every hardlink in btrfs, there is a corresponding inode back
reference. All inode back references for hardlinks in a given
directory are stored in single b-tree item. The size of b-tree item
is limited by the size of b-tree leaf, so we can only create limited
number of hardlinks to a given file in a directory.

The original code lacks of the check, it oops if the number of
hardlinks goes over the limit. This patch fixes the issue by adding
check to btrfs_link and btrfs_rename.

Signed-off-by: Yan Zheng <[email protected]>
Signed-off-by: Chris Mason <[email protected]>

truncate: use new helpers

Update some fs code to make use of new helper functions introduced
in the previous patch. Should be no significant change in behaviour
(except CIFS now calls send_sig under i_lock, via inode_newsize_ok).

Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Miklos Szeredi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Nick Piggin <[email protected]>
Signed-off-by: Al Viro <[email protected]>

truncate: new helpers

Introduce new truncate helpers truncate_pagecache and inode_newsize_ok.
vmtruncate is also consolidated from mm/memory.c and mm/nommu.c and
into mm/truncate.c.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Nick Piggin <[email protected]>
Signed-off-by: Al Viro <[email protected]>

fs: fix overflow in sys_mount() for in-kernel calls

sys_mount() reads/copies a whole page for its "type" parameter.  When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).

(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code.  Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc.  checks.)

But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte.  Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?

[[email protected]: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Vegard Nossum <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: al <[email protected]>

fs: Make unload_nls() NULL pointer safe

Most call sites of unload_nls() do:
if (nls)
unload_nls(nls);

Check the pointer inside unload_nls() like we do in kfree() and
simplify the call sites.

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Steve French <[email protected]>
Cc: OGAWA Hirofumi <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: Petr Vandrovec <[email protected]>
Cc: Anton Altaparmakov <[email protected]>
Signed-off-by: Al Viro <[email protected]>

freeze_bdev: grab active reference to frozen superblocks

Currently we held s_umount while a filesystem is frozen, despite that we
might return to userspace and unlock it from a different process. Instead
grab an active reference to keep the file system busy and add an explicit
check for frozen filesystems in remount and reject the remount instead
of blocking on s_umount.

Add a new get_active_super helper to super.c for use by freeze_bdev that
grabs an active reference to a superblock from a given block device.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>

freeze_bdev: kill bd_mount_sem

Now that we have the freeze count there is not much reason for bd_mount_sem
anymore.  The actual freeze/thaw operations are serialized using the
bd_fsfreeze_mutex, and the only other place we take bd_mount_sem is
get_sb_bdev which tries to prevent mounting a filesystem while the block
device is frozen.  Instead of add a check for bd_fsfreeze_count and
return -EBUSY if a filesystem is frozen.  While that is a change in user
visible behaviour a failing mount is much better for this case rather
than having the mount process stuck uninterruptible for a long time.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>

exofs: remove BKL from super operations

the two places inside exofs that where taking the BKL were:
exofs_put_super() - .put_super
and
exofs_sync_fs() - which is .sync_fs and is also called from
.write_super.

Now exofs_sync_fs() is protected from itself by also taking
the sb_lock.

exofs_put_super() directly calls exofs_sync_fs() so there is no
danger between these two either.

In anyway there is absolutely nothing dangerous been done
inside exofs_sync_fs().

Unless there is some subtle race with the actual lifetime of
the super_block in regard to .put_super and some other parts
of the VFS. Which is highly unlikely.

Signed-off-by: Boaz Harrosh <[email protected]>
Signed-off-by: Al Viro <[email protected]>

fs/romfs: correct error-handling code

romfs_fill_super() assumes that romfs_iget() returns NULL when
it fails. romfs_iget() actually returns ERR_PTR(-ve) in that
case...

Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: seq_file: add helpers for data filling

Add two helpers that allow access to the seq_file's own buffer, but
hide the internal details of seq_files.

This allows easier implementation of special purpose filling
functions. It also cleans up some existing functions which duplicated
the seq_file logic.

Make these inline functions in seq_file.h, as suggested by Al.

Signed-off-by: Miklos Szeredi <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: remove redundant position check in do_sendfile

As Johannes Weiner pointed out, one of the range checks in do_sendfile
is redundant and is already checked in rw_verify_area.

Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: Johannes Weiner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Robert Love <[email protected]>
Cc: Mandeep Singh Baines <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: change sb->s_maxbytes to a loff_t

sb->s_maxbytes is supposed to indicate the maximum size of a file that can
exist on the filesystem.  It's declared as an unsigned long long.

Even if a filesystem has no inherent limit that prevents it from using
every bit in that unsigned long long, it's still problematic to set it to
anything larger than MAX_LFS_FILESIZE.  There are places in the kernel
that cast s_maxbytes to a signed value.  If it's set too large then this
cast makes it a negative number and generally breaks the comparison.

Change s_maxbytes to be loff_t instead.  That should help eliminate the
temptation to set it too large by making it a signed value.

Also, add a warning for couple of releases to help catch filesystems that
set s_maxbytes too large.  Eventually we can either convert this to a
BUG() or just remove it and in the hope that no one will get it wrong now
that it's a signed value.

Signed-off-by: Jeff Layton <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Robert Love <[email protected]>
Cc: Mandeep Singh Baines <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: explicitly cast s_maxbytes in fiemap_check_ranges

If fiemap_check_ranges is passed a large enough value, then it's
possible that the value would be cast to a signed value for comparison
against s_maxbytes when we change it to loff_t. Make sure that doesn't
happen by explicitly casting s_maxbytes to an unsigned value for the
purposes of comparison.

Signed-off-by: Jeff Layton <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Robert Love <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Mandeep Singh Baines <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

libfs: return error code on failed attr set

Currently all simple_attr.set handlers return 0 on success and negative
codes on error. Fix simple_attr_write() to return these error codes.

Signed-off-by: Wu Fengguang <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Nick Piggin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

seq_file: return a negative error code when seq_path_root() fails.

seq_path_root() is returning a return value of successful __d_path()
instead of returning a negative value when mangle_path() failed.

This is not a bug so far because nobody is using return value of
seq_path_root().

Signed-off-by: Tetsuo Handa <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: optimize touch_time() too

Do a similar optimization as earlier for touch_atime. Getting the lock in
mnt_get_write is relatively costly, so try all avenues to avoid it first.

This patch is careful to still only update inode fields inside the lock
region.

This didn't show up in benchmarks, but it's easy enough to do.

[[email protected]: fix typo in comment]
[[email protected]: fix inverted test of mnt_want_write_file()]
Signed-off-by: Andi Kleen <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Valerie Aurora <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Dave Hansen <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: optimization for touch_atime()

Some benchmark testing shows touch_atime to be high up in profile logs for
IO intensive workloads.  Most likely that's due to the lock in
mnt_want_write().  Unfortunately touch_atime first takes the lock, and
then does all the other tests that could avoid atime updates (like noatime
or relatime).

Do it the other way round -- first try to avoid the update and only then
if that didn't succeed take the lock.  That works because none of the
atime avoidance tests rely on locking.

This also eliminates a goto.

Signed-off-by: Andi Kleen <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Reviewed-by: Valerie Aurora <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Dave Hansen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

vfs: split generic_forget_inode() so that hugetlbfs does not have to copy it

Hugetlbfs needs to do special things instead of truncate_inode_pages().
Currently, it copied generic_forget_inode() except for
truncate_inode_pages() call which is asking for trouble (the code there
isn't trivial). So create a separate function generic_detach_inode()
which does all the list magic done in generic_forget_inode() and call
it from hugetlbfs_forget_inode().

Signed-off-by: Jan Kara <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

fs/inode.c: add dev-id and inode number for debugging in init_special_inode()

Add device-id and inode number for better debugging. This was suggested
by Andreas in one of the threads
http://article.gmane.org/gmane.comp.file-systems.ext4/12062 .

"If anyone has a chance, fixing this error message to be not-useless would
be good... Including the device name and the inode number would help
track down the source of the problem."

Signed-off-by: Manish Katiyar <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

libfs: make simple_read_from_buffer conventional

Impact: have simple_read_from_buffer conform to standards

It was brought to my attention by Andrew Morton, Theodore Tso, and H.
Peter Anvin that a read from userspace should only return -EFAULT if
nothing was actually read.

Looking at the simple_read_from_buffer I noticed that this function does
not conform to that rule. This patch fixes that function.

[[email protected]: simplification suggested by hpa]
[[email protected]: fix count==0 handling]
Signed-off-by: Steven Rostedt <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

microblaze: Disable heartbeat/enable emaclite in defconfigs

I need to disable heartbeat function because this features
breaks testing in Qemu.

Signed-off-by: Michal Simek <[email protected]>

microblaze: Support simpleImage.dts make target

Instead of remembering to specify DTB= on the make commandline, this commit
allows the much friendlier make simpleImage.<dts>
where <dts>.dts is expected to be found in arch/microblaze/boot/dts/
The resulting vmlinux, with the compiled DTS linked in, will be copied to
boot/simpleImage.<dts>

This mirrors the same functionality as on PowerPC,
albeit achieving it in a slightly different way.

+ strip simpleImage file
The size of output file is very similar to linux.bin.

vmlinux - full elf without fdt blob
simpleImage.<dtb name>.unstrip - full elf with fdt blob
simpleImage.<dtb name> - stripped elf with fdt blob

Add symlink to generic system.dts in platform folder

Signed-off-by: John Williams <[email protected]>
Signed-off-by: Michal Simek <[email protected]>

[PATCH] Fix idle time field in /proc/uptime

Git commit 79741dd changes idle cputime accounting, but unfortunately
the /proc/uptime file hasn't caught up. Here the idle time calculation
from /proc/stat is copied over.

Signed-off-by: Michael Abbott <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

lsm: Use a compressed IPv6 string format in audit events

Currently the audit subsystem prints uncompressed IPv6 addresses which not
only differs from common usage but also results in ridiculously large audit
strings which is not a good thing.  This patch fixes this by simply converting
audit to always print compressed IPv6 addresses.

Old message example:

audit(1253576792.161:30): avc:  denied  { ingress } for
  saddr=0000:0000:0000:0000:0000:0000:0000:0001 src=5000
  daddr=0000:0000:0000:0000:0000:0000:0000:0001 dest=35502 netif=lo
  scontext=system_u:object_r:unlabeled_t:s15:c0.c1023
  tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif

New message example:

audit(1253576792.161:30): avc:  denied  { ingress } for
  saddr=::1 src=5000 daddr=::1 dest=35502 netif=lo
  scontext=system_u:object_r:unlabeled_t:s15:c0.c1023
  tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif

Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Eric Paris <[email protected]>
Signed-off-by: Al Viro <[email protected]>

Audit: send signal info if selinux is disabled

Audit will not respond to signal requests if selinux is disabled since it is
unable to translate the 0 sid from the sending process to a context. This
patch just doesn't send the context info if there isn't any.

Signed-off-by: Eric Paris <[email protected]>
Signed-off-by: Al Viro <[email protected]>

Audit: rearrange audit_context to save 16 bytes per struct

pahole pointed out that on x86_64 struct audit_context can be rearrainged
to save 16 bytes per struct. Since we have an audit_context per task this
can acually be a pretty significant gain.

Signed-off-by: Eric Paris <[email protected]>
Signed-off-by: Al Viro <[email protected]>

Audit: reorganize struct audit_watch to save 8 bytes

pahole showed that struct audit_watch had two holes:

struct audit_watch {
        atomic_t                   count;                /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        char *                     path;                 /*     8     8 */
        dev_t                      dev;                  /*    16     4 */

        /* XXX 4 bytes hole, try to pack */

        long unsigned int          ino;                  /*    24     8 */
        struct audit_parent *      parent;               /*    32     8 */
        struct list_head           wlist;                /*    40    16 */
        struct list_head           rules;                /*    56    16 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */

        /* size: 72, cachelines: 2, members: 7 */
        /* sum members: 64, holes: 2, sum holes: 8 */
        /* last cacheline: 8 bytes */
};      /* definitions: 1 */

by moving dev after count we save 8 bytes,  actually improving cacheline
usage.  There are typically very few of these in the kernel so it won't be
a large savings, but it's a good thing no matter what.

Signed-off-by: Eric Paris <[email protected]>
Signed-off-by: Al Viro <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
  cpumask: Move deprecated functions to end of header.
  cpumask: remove unused deprecated functions, avoid accusations of insanity
  cpumask: use new-style cpumask ops in mm/quicklist.
  cpumask: use mm_cpumask() wrapper: x86
  cpumask: use mm_cpumask() wrapper: um
  cpumask: use mm_cpumask() wrapper: mips
  cpumask: use mm_cpumask() wrapper: mn10300
  cpumask: use mm_cpumask() wrapper: m32r
  cpumask: use mm_cpumask() wrapper: arm
  cpumask: Use accessors for cpu_*_mask: um
  cpumask: Use accessors for cpu_*_mask: powerpc
  cpumask: Use accessors for cpu_*_mask: mips
  cpumask: Use accessors for cpu_*_mask: m32r
  cpumask: remove arch_send_call_function_ipi
  cpumask: arch_send_call_function_ipi_mask: s390
  cpumask: arch_send_call_function_ipi_mask: powerpc
  cpumask: arch_send_call_function_ipi_mask: mips
  cpumask: arch_send_call_function_ipi_mask: m32r
  cpumask: arch_send_call_function_ipi_mask: alpha
  cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
  ...

headers: utsname.h redux

* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it

NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Revert "kmod: fix race in usermodehelper code"

This reverts commit c02e3f361c7 ("kmod: fix race in usermodehelper code")

The patch is wrong.  UMH_WAIT_EXEC is called with VFORK what ensures
that the child finishes prior returing back to the parent.  No race.

In fact, the patch makes it even worse because it does the thing it
claims not do:

- It calls ->complete() on UMH_WAIT_EXEC

- the complete() callback may de-allocated subinfo as seen in the
   following call chain:

    [<c009f904>] (__link_path_walk+0x20/0xeb4) from [<c00a094c>] (path_walk+0x48/0x94)
    [<c00a094c>] (path_walk+0x48/0x94) from [<c00a0a34>] (do_path_lookup+0x24/0x4c)
    [<c00a0a34>] (do_path_lookup+0x24/0x4c) from [<c00a158c>] (do_filp_open+0xa4/0x83c)
    [<c00a158c>] (do_filp_open+0xa4/0x83c) from [<c009ba90>] (open_exec+0x24/0xe0)
    [<c009ba90>] (open_exec+0x24/0xe0) from [<c009bfa8>] (do_execve+0x7c/0x2e4)
    [<c009bfa8>] (do_execve+0x7c/0x2e4) from [<c0026a80>] (kernel_execve+0x34/0x80)
    [<c0026a80>] (kernel_execve+0x34/0x80) from [<c004b514>] (____call_usermodehelper+0x130/0x148)
    [<c004b514>] (____call_usermodehelper+0x130/0x148) from [<c0024858>] (kernel_thread_exit+0x0/0x8)

   and the path pointer was NULL.  Good that ARM's kernel_execve()
   doesn't check the pointer for NULL or else I wouldn't notice it.

The only race there might be is with UMH_NO_WAIT but it is too late for
me to investigate it now.  UMH_WAIT_PROC could probably also use VFORK
and we could save one exec.  So the only race I see is with UMH_NO_WAIT
and recent scheduler changes where the child does not always run first
might have trigger here something but as I said, it is late....

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Btrfs: fix releasepage to avoid unlocking extents we haven't locked

During releasepage, we try to drop any extent_state structs for the
bye offsets of the page we're releaseing. But the code was incorrectly
telling clear_extent_bit to delete the state struct unconditionallly.

Normally this would be fine because we have the page locked, but other
parts of btrfs will lock down an entire extent, the most common place
being IO completion.

releasepage was deleting the extent state without first locking the extent,
which may result in removing a state struct that another process had
locked down. The fix here is to leave the NODATASUM and EXTENT_LOCKED
bits alone in releasepage.

Signed-off-by: Chris Mason <[email protected]>

Btrfs: Fix test_range_bit for whole file extents

If test_range_bit finds an extent that goes all the way to (u64)-1, it
can incorrectly wrap the u64 instead of treaing it like the end of
the address space.

This just adds a check for the highest possible offset so we don't wrap.

Signed-off-by: Chris Mason <[email protected]>

Btrfs: fix errors handling cached state in set/clear_extent_bit

Both set and clear_extent_bit allow passing a cached
state struct to reduce rbtree search times. clear_extent_bit
was improperly bypassing some of the checks around making sure
the extent state fields were correct for a given operation.

The fix used here (from Yan Zheng) is to use the hit_next
goto target instead of jumping all the way down to start clearing
bits without making sure the cached state was exactly correct
for the operation we were doing.

This also fixes up the setting of the start variable for both
ops in the case where we find an overlapping extent that
begins before the range we want to change. In both cases
we were incorrectly going backwards from the original
requested change.

Signed-off-by: Chris Mason <[email protected]>

cpumask: Move deprecated functions to end of header.

The new ones have pretty kerneldoc. Move the old ones to the end to
avoid confusing people.

Signed-off-by: Rusty Russell <[email protected]>
Cc: [email protected]

cpumask: remove unused deprecated functions, avoid accusations of insanity

We're not forcing removal of the old cpu_ functions, but we might as
well delete the now-unused ones.

Especially CPUMASK_ALLOC and friends. I actually got a phone call (!)
from a hacker who thought I had introduced them as the new cpumask
API. He seemed bewildered that I had lost all taste.

Signed-off-by: Rusty Russell <[email protected]>
Cc: [email protected]

cpumask: use new-style cpumask ops in mm/quicklist.

This slipped past the previous sweeps.

Signed-off-by: Rusty Russell <[email protected]>
Acked-by: Christoph Lameter <[email protected]>

cpumask: use mm_cpumask() wrapper: x86

Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer).

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <[email protected]>

cpumask: use mm_cpumask() wrapper: um

Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <[email protected]>

cpumask: use mm_cpumask() wrapper: mips

Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <[email protected]>

cpumask: use mm_cpumask() wrapper: mn10300

Makes code futureproof against the impending change to mm->cpu_vm_mask
(to be a pointer).

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Also change the actual arg name here to "mm" (which it is), not "task".

Signed-off-by: Rusty Russell <[email protected]>

cpumask: use mm_cpumask() wrapper: m32r

Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <[email protected]>
Acked-by: Hirokazu Takata <[email protected]> (fixes)

cpumask: use mm_cpumask() wrapper: arm

Makes code futureproof against the impending change to mm->cpu_vm_mask.

It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <[email protected]>

cpumask: Use accessors for cpu_*_mask: um

Use the accessors rather than frobbing bits directly (the new versions
are const).

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Mike Travis <[email protected]>

cpumask: Use accessors for cpu_*_mask: powerpc

Use the accessors rather than frobbing bits directly (the new versions
are const).

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Mike Travis <[email protected]>

cpumask: Use accessors for cpu_*_mask: mips

Use the accessors rather than frobbing bits directly (the new versions
are const).

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Mike Travis <[email protected]>

cpumask: Use accessors for cpu_*_mask: m32r

Use the accessors rather than frobbing bits directly (the new versions
are const).

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Mike Travis <[email protected]>

cpumask: remove arch_send_call_function_ipi

Now everyone is converted to arch_send_call_function_ipi_mask, remove
the shim and the #defines.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: arch_send_call_function_ipi_mask: s390

We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask().

Signed-off-by: Rusty Russell <[email protected]>

cpumask: arch_send_call_function_ipi_mask: powerpc

We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask(), and by defining
it, the old arch_send_call_function_ipi is defined by the core code.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: arch_send_call_function_ipi_mask: mips

We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask(), and by defining
it, the old arch_send_call_function_ipi is defined by the core code.

We also take the chance to wean the implementations off the
obsolescent for_each_cpu_mask(): making send_ipi_mask take the pointer
seemed the most natural way to ensure all implementations used
for_each_cpu.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: arch_send_call_function_ipi_mask: m32r

We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask(), and by defining
it, the old arch_send_call_function_ipi is defined by the core code.

We also take the chance to wean the implementations off the
obsolescent for_each_cpu_mask(): making send_ipi_mask take the pointer
seemed the most natural way to ensure all implementations used
for_each_cpu.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: arch_send_call_function_ipi_mask: alpha

We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask().

We also take the chance to wean the send_ipi_message off the
obsolescent for_each_cpu_mask(): making it take a pointer seemed the
most natural way to do this.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64

There were replaced by topology_core_cpumask and topology_thread_cpumask.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: powerpc

There were replaced by topology_core_cpumask and topology_thread_cpumask.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: s390

There were replaced by topology_core_cpumask and topology_thread_cpumask.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: sparc

There were replaced by topology_core_cpumask and topology_thread_cpumask.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: core

There were replaced by topology_core_cpumask and topology_thread_cpumask.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove the deprecated smp_call_function_mask()

Everyone is now using smp_call_function_many().

Signed-off-by: Rusty Russell <[email protected]>

ia64: convert last user of smp_call_function_mask

smp_call_function_many is the new version: it takes a pointer. Also,
use mm accessor macro while we're changing this.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: don't define set_cpus_allowed() if CONFIG_CPUMASK_OFFSTACK=y

You're not supposed to pass cpumasks on the stack in that case.

Signed-off-by: Rusty Russell <[email protected]>

ACPI: remove cpumask_t usage

set_cpus_allowed() is on the way out; replace it with
set_cpus_allowed_ptr().

Reference: http://lkml.org/lkml/2008/11/6/448

Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

cpumask: Remove mask field from comments

By 7be23e278f, mask field was deleted by irqaction. However, it was not
deleted from comment.

Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
CC: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove unused mask field from struct irqaction.

Up until 1.1.83, the primitive human tribes used struct sigaction for
interrupts. The sa_mask field was overloaded to hold a pointer to the
name.

When someone created the new "struct irqaction" they carried across
the "mask" field as a kind of ancestor worship: the fact that it was
unused makes clear its spiritual significance.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove last assignment to mask field of struct irqaction.

This snuck in after the patch which removed all the others.

Signed-off-by: Rusty Russell <[email protected]>
Cc: Ingo Molnar <[email protected]>

cpumask: remove unused cpu_mask_all

It's only defined for NR_CPUS > BITS_PER_LONG; cpu_all_mask is always
defined (and const).

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL.: mips

(Thanks to Al Viro for reminding me of this, via Ingo)

CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so:

#define CPU_MASK_ALL (cpumask_t) { { ... } }

Taking the address of such a temporary is questionable at best,
unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
CPU_MASK_ALL_PTR:

#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)

Which formalizes this practice. One day gcc could bite us over this
usage (though we seem to have gotten away with it so far).

So replace everywhere which used &CPU_MASK_ALL or CPU_MASK_ALL_PTR
with the modern "cpu_all_mask" (a real struct cpumask *), and remove
CPU_MASK_ALL_PTR altogether.

Signed-off-by: Rusty Russell <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Reported-by: Al Viro <[email protected]>
Cc: Mike Travis <[email protected]>

cpumask: remove dangerous CPU_MASK_ALL_PTR

(Thanks to Al Viro for reminding me of this, via Ingo)

CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so:

#define CPU_MASK_ALL (cpumask_t) { { ... } }

Taking the address of such a temporary is questionable at best,
unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
CPU_MASK_ALL_PTR:

#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)

Which formalizes this practice. One day gcc could bite us over this
usage (though we seem to have gotten away with it so far).

Now all callers are removed, we kill it.

Signed-off-by: Rusty Russell <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Reported-by: Al Viro <[email protected]>
Cc: Mike Travis <[email protected]>

cpumask: remove obsolete node_to_cpumask now everyone uses cpumask_of_node

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove the now-obsoleted pcibus_to_cpumask(): powerpc

cpumask_of_pcibus() is the new version.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove the now-obsoleted pcibus_to_cpumask(): mips

cpumask_of_pcibus() is the new version.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: remove the now-obsoleted pcibus_to_cpumask(): alpha

cpumask_of_pcibus() is the new version.

Signed-off-by: Rusty Russell <[email protected]>

cpumask: use zalloc_cpumask_var() where possible

Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().

Signed-off-by: Li Zefan <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: add driver for Atmel AT42QT2160 Sensor Chip
  Input: max7359 - use threaded IRQs
  Input: add driver for Maxim MAX7359 key switch controller
  Input: add driver for ADP5588 QWERTY I2C Keypad
  Input: add touchscreen driver for MELFAS MCS-5000 controller
  Input: add driver for OpenCores Keyboard Controller
  Input: dm355evm_keys - remove dm355evm_keys_hardirq
  Input: synaptics_i2c - switch to using __cancel_delayed_work()
  Input: ad7879 - add support for AD7889
  Input: atkbd - rely on input core to restore state on resume
  Input: add generic suspend and resume for input devices
  Input: libps2 - additional locking for i8042 ports

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next

* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
  Use macros for .data.page_aligned section.
  Use macros for .bss.page_aligned section.
  Use new __init_task_data macro in arch init_task.c files.
  kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
  arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
  kbuild: add static to prototypes
  kbuild: fail build if recordmcount.pl fails
  kbuild: set -fconserve-stack option for gcc 4.5
  kbuild: echo the record_mcount command
  gconfig: disable "typeahead find" search in treeviews
  kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
  checkincludes.pl: add option to remove duplicates in place
  markup_oops: use modinfo to avoid confusion with underscored module names
  checkincludes.pl: provide usage helper
  checkincludes.pl: close file as soon as we're done with it
  ctags: usability fix
  kernel hacking: move STRIP_ASM_SYMS from General
  gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
  kbuild: Check if linker supports the -X option
  kbuild: introduce ld-option
  ...

Fix trivial conflict in scripts/basic/fixdep.c

Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Propagate 'fsc' mount option through automounts
  sunrpc/rpc_pipe: fix kernel-doc notation
  sunrpc: xdr_xcode_hyper helpers cannot presume 64-bit alignment
  NFS: Add nfs_alloc_parsed_mount_data
  NFS/RPC: fix problems with reestablish_timeout and related code.
  NFS: Get rid of the NFS_MOUNT_VER3 and NFS_MOUNT_TCP flags

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: Update documentation to add fscache related bits
  9p: Add fscache support to 9p
  9p: Fix the incorrect update of inode size in v9fs_file_write()
  9p: Use the i_size_[read, write]() macros instead of using inode->i_size directly.

Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (ltc4245) Clear faults at startup
  hwmon: (ltc4215) Clear faults at startup
  hwmon: (coretemp) Add Lynnfield CPU
  hwmon: (coretemp) Add support for Penryn mobile CPUs
  hwmon: (coretemp) Fix Atom CPUs support
  hwmon: Delete deprecated FSC drivers
  hwmon: (adm1031) Add sysfs files for temperature offsets

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  SELinux: do not destroy the avc_cache_nodep
  KEYS: Have the garbage collector set its timer for live expired keys
  tpm-fixup-pcrs-sysfs-file-update
  creds_are_invalid() needs to be exported for use by modules:
  include/linux/cred.h: fix build

Fix trivial BUILD_BUG_ON-induced conflicts in drivers/char/tpm/tpm.c