Dmitry Monakhov [Sat, 27 Mar 2010 12:15:38 +0000 (15:15 +0300)]
quota: optimize mark_dirty logic
- Skip locking if quota is dirty already.
- Return old quota state to help fs-specciffic implementation to optimize
case where quota was dirty already.
Jan Kara [Mon, 29 Mar 2010 11:55:39 +0000 (13:55 +0200)]
ext2: Avoid loading bitmaps for full groups during block allocation
There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).
ext3: Avoid loading bitmaps for full groups during block allocation
There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).
Jan Kara: Added a comment and changed check to use cpu-endian value.
Linus Torvalds [Fri, 21 May 2010 16:48:36 +0000 (09:48 -0700)]
Fix networking tree iscsi_tcp.c mis-merge
The removal of the 'waitqueue_active()' test in commit d7d05548a6
("[SCSI] iscsi_tcp: fix relogin/shutdown hang") got incorrectly resolved
by David when he back-merged the main git tree into the networking tree
in commit 278554bd65 ("Merge branch 'master' of master.kernel.org:...").
There was a content conflict due to 'sock->sk->sk_sleep' being changed
into 'sk_sleep(sock->sk)' in the networking tree, but David didn't pick
up the iscsi change from the main tree.
Chase Douglas [Fri, 21 May 2010 16:41:01 +0000 (18:41 +0200)]
i2c-nforce2: Remove redundant error messages on ACPI conflict
The ACPI subsystem strictly checks for resource conflicts. When there's
a conflict, it outputs a warning message with all the details needed to
properly diagnose the underlying issue. However, the i2c-nforce2 driver
also prints its own message. Not only is the message redundant, it is at
the KERN_ERR level, which overrides some bootsplash screens for no good
reason. This change removes the two lines that print out the error
messages.
Farid Hammane [Fri, 21 May 2010 16:41:00 +0000 (18:41 +0200)]
i2c-algo-pca: Fix coding style issues
Fix up some coding style issues. i2c-algo-pca.c has been built
successfully after applying this patch and the binary object is
still exactly the same. Other issues found by checkpatch.pl were
voluntarily not fixed, either to keep readability, or because of
false positive errors.
Marek Szyprowski [Fri, 21 May 2010 16:40:58 +0000 (18:40 +0200)]
i2c-gpio: Move initialization code to subsys_initcall()
GPIO driven I2C bus can be used for controlling the PMIC chip. The
example of such configuration is Samsung Aquila board.
This patch moves initialization code to subsys_initcall() to ensure
that the i2c bus is available early so the regulators can be quickly
probed and available for other devices on their probe() call.
Such solution has been proposed by Mark Brown to fix the problem of
the regulators not beeing available on the peripheral device probe():
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/011971.html
Jean Delvare [Fri, 21 May 2010 16:40:57 +0000 (18:40 +0200)]
at24: Fall back to byte or word reads if needed
Increase the portability of the at24 driver by letting it read from
EEPROM chips connected to cheap SMBus controllers that support neither
raw I2C messages nor even I2C block reads. All SMBus controllers
should support either word reads or byte reads, so read support
becomes universal, much like with the legacy "eeprom" driver.
Obviously, this only works with EEPROM chips up to AT24C16, that use
8-bit offset addressing. 16-bit offset addressing is almost impossible
to support on SMBus controllers.
I did not add universal support for writes, as I had no immediate need
for this, but it could be added later if needed (with the same
performance issue as byte and word reads have, of course.)
Jean Delvare [Fri, 21 May 2010 16:40:55 +0000 (18:40 +0200)]
i2c-i801: All newer devices have all the optional features
Only the oldest devices lack some of the features supported by this
driver. List them explicitly, and default to all features enabled for
all other chips, including the ones added through sysfs. This will
make future driver maintenance easier.
In the unlikely event of a not yet supported device not implementing
all the features, one can always use the disable_features module
parameter to prevent the driver from attempting to use them.
Jean Delvare [Fri, 21 May 2010 16:40:54 +0000 (18:40 +0200)]
i2c-i801: Let the user disable selected driver features
Let the user disable selected features normally supported by the
device. This makes it possible to work around possible driver or
hardware bugs if the feature in question doesn't work as intended
for whatever reason.
Drivers like the ipw2100 call device_create_group when they
are initialized and device_remove_group when they are shutdown.
Moving them between namespaces deletes their sysfs groups early.
In particular the following call chain results.
netdev_unregister_kobject -> device_del -> kobject_del -> sysfs_remove_dir
With sysfs_remove_dir recursively deleting all of it's subdirectories,
and nothing adding them back.
Ouch!
Therefore we need to call something that ultimate calls sysfs_mv_dir
as that sysfs function can move sysfs directories between namespaces
without deleting their subdirectories or their contents. Allowing
us to avoid placing extra boiler plate into every driver that does
something interesting with sysfs.
Currently the function that provides that capability is device_rename.
That is the code works without nasty side effects as originally written.
So remove the misguided fix for moving devices between namespaces. The
bug in the kobject layer that inspired it has now been recognized and
fixed.
It only makes sense for uevent_helper to get events
in the intial namespaces. It's invocation is not
per namespace and it is not clear how we could make
it's invocation namespace aware.
When netlink sockets are used to convey data that is in a namespace
we need a way to select a subset of the listening sockets to deliver
the packet to. For the network namespace we have been doing this
by only transmitting packets in the correct network namespace.
For data belonging to other namespaces netlink_bradcast_filtered
provides a mechanism that allows us to examine the destination
socket and to decide if we should transmit the specified packet
to it.
net/sysfs: Fix the bitrot in network device kobject namespace support
I had a couple of stupid bugs in:
netns: Teach network device kobjects which namespace they are in.
- I duplicated the Kconfig for the NET_NS
- The build was broken when sysfs was not compiled in
The sysfs breakage is because after I moved the operations
for the sysfs to the kobject layer, to make things cleaner
I forgot to move the ifdefs. Opps.
I'm not quite certain how I got introduced a second NET_NS Kconfig,
but it was probably a 3 way merge somewhere along the way that
did not notice that the NET_NS Kconfig option had mvoed and thout
that was a bug. It probably slipped in because it used to be the
sysfs patches were the first patches in my network namespace patches.
Some things just don't go like you would expect.
Neither of these bugs actually affect anything in the common case
but they should be fixed.
netns: Teach network device kobjects which namespace they are in.
The problem. Network devices show up in sysfs and with the network
namespace active multiple devices with the same name can show up in
the same directory, ouch!
To avoid that problem and allow existing applications in network namespaces
to see the same interface that is currently presented in sysfs, this
patch enables the tagging directory support in sysfs.
By using the network namespace pointers as tags to separate out the
the sysfs directory entries we ensure that we don't have conflicts
in the directories and applications only see a limited set of
the network devices.
Chris Wright [Thu, 13 May 2010 17:43:07 +0000 (10:43 -0700)]
pci: check caps from sysfs file open to read device dependent config space
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
check to verify privileges before allowing a user to read device
dependent config space. This is meant to protect from an unprivileged
user potentially locking up the box.
When assigning a PCI device directly to a guest with libvirt and KVM,
the sysfs config space file is chown'd to the unprivileged user that
the KVM guest will run as. The guest needs to have full access to the
device's config space since it's responsible for driving the device.
However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
check will not allow read access beyond the config header.
With this patch we check privileges against the capabilities used when
openining the sysfs file. The allows a privileged process to open the
file and hand it to an unprivileged process, and the unprivileged process
can still read all of the config space.
The problem. When implementing a network namespace I need to be able
to have multiple network devices with the same name. Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.
What this patch does is to add an additional tag field to the
sysfs dirent structure. For directories that should show different
contents depending on the context such as /sys/class/net/, and
/sys/devices/virtual/net/ this tag field is used to specify the
context in which those directories should be visible. Effectively
this is the same as creating multiple distinct directories with
the same name but internally to sysfs the result is nicer.
I am calling the concept of a single directory that looks like multiple
directories all at the same path in the filesystem tagged directories.
For the networking namespace the set of directories whose contents I need
to filter with tags can depend on the presence or absence of hotplug
hardware or which modules are currently loaded. Which means I need
a simple race free way to setup those directories as tagged.
To achieve a reace free design all tagged directories are created
and managed by sysfs itself.
Users of this interface:
- define a type in the sysfs_tag_type enumeration.
- call sysfs_register_ns_types with the type and it's operations
- sysfs_exit_ns when an individual tag is no longer valid
- Implement mount_ns() which returns the ns of the calling process
so we can attach it to a sysfs superblock.
- Implement ktype.namespace() which returns the ns of a syfs kobject.
Everything else is left up to sysfs and the driver layer.
For the network namespace mount_ns and namespace() are essentially
one line functions, and look to remain that.
Tags are currently represented a const void * pointers as that is
both generic, prevides enough information for equality comparisons,
and is trivial to create for current users, as it is just the
existing namespace pointer.
The work needed in sysfs is more extensive. At each directory
or symlink creating I need to check if the directory it is being
created in is a tagged directory and if so generate the appropriate
tag to place on the sysfs_dirent. Likewise at each symlink or
directory removal I need to check if the sysfs directory it is
being removed from is a tagged directory and if so figure out
which tag goes along with the name I am deleting.
Currently only directories which hold kobjects, and
symlinks are supported. There is not enough information
in the current file attribute interfaces to give us anything
to discriminate on which makes it useless, and there are
no potential users which makes it an uninteresting problem
to solve.
sysfs: Remove usage of S_BIAS to avoid merge conflict with the vfs tree
In Al's latest vfs tree the code is reworked and S_BIAS has been removed.
It turns out that checking to see if a super block is in the
middle of an unmount in sysfs_exit_ns is unnecessary because we
remove the super_block from the s_supers/s_instances list before
struct sysfs_super_info pointed to by sb->s_fs_info is freed.
For now just delete the unnecessary check to see if a superblock is in the
middle of an unmount, it isn't necessary with or without Al's changes
and it just causes a needless conflict.
sysfs: Don't use enums in inline function declaration.
It appears gcc can't cope with using an enum that is only declared in
an inline function declaration, that doesn't even use the variable
that is so declared.
Avoid the silliness and replace the enum with an int, and make gcc
happy.
Serge E. Hallyn [Wed, 5 May 2010 02:45:38 +0000 (21:45 -0500)]
sysfs-namespaces: add a high-level Documentation file
The first three paragraphs are almost verbatim taken from Eric's
commit message on the patch introducing network ns tags. The next
two paragraphs I wrote to be a brief high level overview. The last
section is taken from the commit message on "Implement sysfs tagged
directory support", but updated. Hopefully correctly.
Serge E. Hallyn [Mon, 3 May 2010 21:23:15 +0000 (16:23 -0500)]
sysfs: Comment sysfs directory tagging logic
Add some in-line comments to explain the new infrastructure, which
was introduced to support sysfs directory tagging with namespaces.
I think an overall description someplace might be good too, but it
didn't really seem to fit into Documentation/filesystems/sysfs.txt,
which appears more geared toward users, rather than maintainers, of
sysfs.
(Tejun, please let me know if I can make anything clearer or failed
altogether to comment something that should be commented.)
driver core: Implement ns directory support for device classes.
device_del and device_rename were modified to use
sysfs_delete_link and sysfs_rename_link respectively to ensure
when these operations happen on devices whose classes
are in namespace directories they work properly.
When removing a symlink sysfs_remove_link does not provide
enough information to figure out which tagged directory the symlink
falls in. So I need sysfs_delete_link which is passed the target
of the symlink to delete.
sysfs_rename_link is updated to call sysfs_delete_link instead
of sysfs_remove_link as we have all of the information necessary
and the callers are interesting.
Both of these functions now have enough information to find a symlink
in a tagged directory. The only restriction is that they must be called
before the target kobject is renamed or deleted. If they are called
later I loose track of which tag the target kobject was marked with
and can no longer find the old symlink to remove it.
sysfs: Add support for tagged directories with untagged members.
I had hopped to avoid this but the bonding driver adds a file
to /sys/class/net/ and the easiest way to handle that file is
to make it untagged and to register it only once.
So relax the rules on tagged directories, and make bonding work.
David Zeuthen [Mon, 3 May 2010 12:08:59 +0000 (14:08 +0200)]
generate "change" uevent for loop device
Recent udev versions probe loop devices for filesystems meaning that
the /dev/disk hierarchy may contain useful entries such as
$ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0
Unfortunately, no "change" uevent is generated when the loop device is
detached so the symlink persists. Additionally, no "change" uevent is
guaranteed to be generated when attaching an fd or changing capacity.
For example, user space could open the loop device O_RDONLY (in fact,
recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
not trigger the "change" uevent.
This patch ensures that the "change" uevent is generated in all of
these cases. As a result, the /dev/disk hierarchy works as expected
for loop devices.
Hugh Daschbach [Mon, 22 Mar 2010 17:36:37 +0000 (10:36 -0700)]
Driver core: Protect device shutdown from hot unplug events.
While device_shutdown() walks through devices_kset to shutdown all
devices, device unplug events may race to shutdown individual devices.
Specifically, sd_shutdown(), on behalf of fc_starget_delete(), has
been observed deleting devices during device_shutdown()'s list
traversal. So we factor out list_for_each_entry_safe_reverse(...) in
favor of while (!list_empty(...)).
Johannes Berg [Mon, 29 Mar 2010 15:57:20 +0000 (17:57 +0200)]
firmware class: export nowait to userspace
When we use request_firmware_nowait(), userspace may
not want to answer negatively right away when for
example it is answering from an initrd only, but
with request_firmware() it has to in order to not
delay the kernel boot until the request times out.
This allows userspace to differentiate between the
two in order to be able to reply negatively to async
requests only when all filesystems have been mounted
and have been checked for the requested firmware file.
Peter Zijlstra [Fri, 19 Mar 2010 00:37:42 +0000 (01:37 +0100)]
lockdep: Add novalidate class for dev->mutex conversion
The conversion of device->sem to device->mutex resulted in lockdep
warnings. Create a novalidate class for now until the driver folks
come up with separate classes. That way we have at least the basic
mutex debugging coverage.
Add a checkpatch error so the usage is reserved for device->mutex.
[ tglx: checkpatch and compile fix for LOCKDEP=n ]
Thomas Gleixner [Fri, 29 Jan 2010 20:39:02 +0000 (20:39 +0000)]
drivers/base: Convert dev->sem to mutex
The semaphore is semantically a mutex. Convert it to a real mutex and
fix up a few places where code was relying on semaphore.h to be included
by device.h, as well as the users of the trylock function, as that value
is now reversed.
Kevin Hilman [Wed, 17 Mar 2010 23:18:15 +0000 (16:18 -0700)]
platform_bus: allow custom extensions to system PM methods
When runtime PM for platform_bus was added, it allowed for platforms
to customize the runtime PM methods since they are defined as weak
symbols.
This patch allows platforms to also extend the system PM methods with
custom hooks so runtime PM and system PM extensions can be managed
together by custom platform-specific code.
Alan Stern [Mon, 8 Mar 2010 21:46:19 +0000 (16:46 -0500)]
Driver core: don't initialize wakeup flags
This patch (as1351) removes an unnecessary and unwanted assignment
from device_initialize(). The wakeup flags are set to 0 along with
everything else when the device structure is allocated, so we don't
need to do it again. Furthermore, the subsystem might already have
set these flags to their correct values; we don't want to override it.
Stefani Seibold [Sat, 6 Mar 2010 16:50:14 +0000 (17:50 +0100)]
driver-core: fix potential race condition in drivers/base/dd.c
This patch fix a potential race condition in the driver_bound() function
in the file driver/base/dd.c.
The broadcast of the BUS_NOTIFY_BOUND_DRIVER notifier should be done
after adding the new device to the driver list. Otherwise notifier
listener will fail if they use functions like usb_find_interface().
The patch is against kernel 2.6.33. Please merge it.
Driver core: Reduce the level of request_firmware() messages
The messages from _request_firmware() informing that firmware is
being requested or built-in firmware is going to be used are printed
at KERN_INFO, which produces lots of noise on systems with huge
numbers of AMD CPUs. Reduce the level of these messages to
KERN_DEBUG to get rid of that noise.
fix memory leak introduced by the patch 6e03a201bbe:
firmware: speed up request_firmware()
1. vfree won't release pages there were allocated explicitly and mapped
using vmap. The memory has to be vunmap-ed and the pages needs
to be freed explicitly
2. page array is moved into the 'struct
firmware' so that we can free it from release_firmware()
and not only in fw_dev_release()
Jan Beulich [Tue, 27 Apr 2010 21:01:20 +0000 (14:01 -0700)]
drivers/base/cpu.c: fix the output from /sys/devices/system/cpu/offline
Without CONFIG_CPUMASK_OFFSTACK, simply inverting cpu_online_mask leads
to CPUs beyond nr_cpu_ids to be displayed twice and CPUs not even
possible to be displayed as offline.
Christoph Egger [Mon, 17 May 2010 15:25:54 +0000 (17:25 +0200)]
serial: Tidy REMOTE_DEBUG
REMOTE_DEBUG does already appear in 2.2 kernel sources but didn't
appear as a config Option in the initial git import 2.6.12-rc. It's
currently just used in one single place of the linux kernel and should
probably be dropped totally
Dan Carpenter [Fri, 7 May 2010 08:30:41 +0000 (10:30 +0200)]
serial: isicomm: handle running out of slots
This patch makes it return -ENODEV if we run out of empty slots in the
probe function. It's unlikely to happen, but it makes the static
checkers happy.
Richard Röjfors [Tue, 27 Apr 2010 21:16:34 +0000 (14:16 -0700)]
serial: timbuart: make sure last byte is sent when port is closed
Fix a problem in early versions of the FPGA IP.
In certain situations the IP reports that the FIFO is empty, but a byte is
still clocked out. If a flush is done at that point the currently clocked
byte is canceled.
This causes incompatibilities with the upper layers when a port is closed,
it waits until the FIFO is empty and then closes the port. During close
the FIFO is flushed -> the last byte is not sent properly.
Now the FIFO is only flushed if it is reported to be non-empty. Which
makes the currently clocked out byte to finish.
Randy Dunlap [Mon, 3 May 2010 16:08:38 +0000 (09:08 -0700)]
tty: n_gsm: depends on NET
n_gsm uses skb functions, so it should depend on NET.
n_gsm.c:(.text+0x123d49): undefined reference to `skb_dequeue'
n_gsm.c:(.text+0x123d98): undefined reference to `kfree_skb'
n_gsm.c:(.text+0x123e1e): undefined reference to `skb_pull'
Alan Cox [Fri, 26 Mar 2010 11:32:54 +0000 (11:32 +0000)]
tty: n_gsm line discipline
Add an implementation of GSM 0710 MUX. The implementation currently supports
- Basic and advanced framing (as either end of the link)
- UI or UIH data frames
- Adaption layer 1-4 (1 and 2 via tty, 3 and 4 as skbuff lists)
- Modem and control messages including the correct retry process
- Flow control
and exposes the MUX channels as a set of virtual tty devices including modem
signals. This is an experimental driver.
Sonic Zhang [Tue, 9 Mar 2010 17:25:37 +0000 (12:25 -0500)]
serial: bfin_sport_uart: only enable SPORT TX if data is to be sent
Rather than always turn on the SPORT TX interrupt, only do it when we've
actually queued up data for transmission. This avoids useless interrupt
processing.
Sonic Zhang [Tue, 9 Mar 2010 17:25:29 +0000 (12:25 -0500)]
serial: bfin_sport_uart: shorten the SPORT TX waiting loop
The waiting loop to stop SPORT TX from TX interrupt is too long. This may
block the SPORT RX interrupts and cause the RX FIFO to overflow. So, do
stop sport TX only after the last char in TX FIFO is moved into the shift
register.
Linus Torvalds [Fri, 21 May 2010 14:47:00 +0000 (07:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (23 commits)
nilfs2: disallow remount of snapshot from/to a regular mount
nilfs2: use huge_encode_dev/huge_decode_dev
nilfs2: update comment on deactivate_super at nilfs_get_sb
nilfs2: replace MS_VERBOSE with MS_SILENT
nilfs2: add missing initialization of s_mode
nilfs2: fix misuse of open_bdev_exclusive/close_bdev_exclusive
nilfs2: enlarge s_volume_name member in nilfs_super_block
nilfs2: use checkpoint number instead of timestamp to select super block
nilfs2: add missing endian conversion on super block magic number
nilfs2: make nilfs_sc_*_ops static
nilfs2: add kernel doc comments to persistent object allocator functions
nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info
nilfs2: remove nilfs_segctor_init() in segment.c
nilfs2: insert checkpoint number in segment summary header
nilfs2: add a print message after loading nilfs2
nilfs2: cleanup multi kmem_cache_{create,destroy} code
nilfs2: move out checksum routines to segment buffer code
nilfs2: move pointer to super root block into logs
nilfs2: change default of 'errors' mount option to 'remount-ro' mode
nilfs2: Combine nilfs_btree_release_path() and nilfs_btree_free_path()
...
Linus Torvalds [Fri, 21 May 2010 14:25:43 +0000 (07:25 -0700)]
Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (154 commits)
mtd: cfi_cmdset_0002: use AMD standard command-set with Winbond flash chips
mtd: cfi_cmdset_0002: Fix MODULE_ALIAS and linkage for new 0701 commandset ID
mtd: mxc_nand: Remove duplicate NAND_CMD_RESET case value
mtd: update gfp/slab.h includes
jffs2: Stop triggering block erases from jffs2_write_super()
jffs2: Rename jffs2_erase_pending_trigger() to jffs2_dirty_trigger()
jffs2: Use jffs2_garbage_collect_trigger() to trigger pending erases
jffs2: Require jffs2_garbage_collect_trigger() to be called with lock held
jffs2: Wake GC thread when there are blocks to be erased
jffs2: Erase pending blocks in GC pass, avoid invalid -EIO return
jffs2: Add 'work_done' return value from jffs2_erase_pending_blocks()
mtd: mtdchar: Do not corrupt backing device of device node inode
mtd/maps/pcmciamtd: Fix printk format for ssize_t in debug messages
drivers/mtd: Use kmemdup
mtd: cfi_cmdset_0002: Fix argument order in bootloc warning
mtd: nand: add Toshiba TC58NVG0 device ID
pcmciamtd: add another ID
pcmciamtd: coding style cleanups
pcmciamtd: fixing obvious errors
mtd: chips: add SST39WF160x NOR-flashes
...
Trivial conflicts due to dev_node removal in drivers/mtd/maps/pcmciamtd.c
Linus Torvalds [Fri, 21 May 2010 14:23:12 +0000 (07:23 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: misc comment fixes
UBI: fix s/then/than/ typos
UBI: init even if MTD device cannot be attached, if built into kernel
UBI: remove reboot notifier
Linus Torvalds [Fri, 21 May 2010 14:22:11 +0000 (07:22 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (54 commits)
xfs: mark xfs_iomap_write_ helpers static
xfs: clean up end index calculation in xfs_page_state_convert
xfs: clean up mapping size calculation in __xfs_get_blocks
xfs: clean up xfs_iomap_valid
xfs: move I/O type flags into xfs_aops.c
xfs: kill struct xfs_iomap
xfs: report iomap_bn in block base
xfs: report iomap_offset and iomap_bsize in block base
xfs: remove iomap_delta
xfs: remove iomap_target
xfs: limit xfs_imap_to_bmap to a single mapping
xfs: simplify buffer to transaction matching
xfs: Make fiemap work in query mode.
xfs: kill off l_sectbb_mask
xfs: record log sector size rather than log2(that)
xfs: remove dead XFS_LOUD_RECOVERY code
xfs: removed unused XFS_QMOPT_ flags
xfs: remove a few macro indirections in the quota code
xfs: access quotainfo structure directly
xfs: wait for direct I/O to complete in fsync and write_inode
...
Linus Torvalds [Fri, 21 May 2010 14:20:17 +0000 (07:20 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits)
ocfs2: Silence a gcc warning.
ocfs2: Don't retry xattr set in case value extension fails.
ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break
ocfs2: Reset xattr value size after xa_cleanup_value_truncate().
fs/ocfs2/dlm: Use kstrdup
fs/ocfs2/dlm: Drop memory allocation cast
Ocfs2: Optimize punching-hole code.
Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.
Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.
Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.
ocfs2: Block signals for mkdir/link/symlink/O_CREAT.
ocfs2: Wrap signal blocking in void functions.
ocfs2/dlm: Increase o2dlm lockres hash size
ocfs2: Make ocfs2_extend_trans() really extend.
ocfs2/trivial: Code cleanup for allocation reservation.
ocfs2: make ocfs2_adjust_resv_from_alloc simple.
ocfs2: Make nointr a default mount option
ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE
o2net: log socket state changes
ocfs2: print node # when tcp fails
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (182 commits)
[SCSI] aacraid: add an ifdef'd device delete case instead of taking the device offline
[SCSI] aacraid: prohibit access to array container space
[SCSI] aacraid: add support for handling ATA pass-through commands.
[SCSI] aacraid: expose physical devices for models with newer firmware
[SCSI] aacraid: respond automatically to volumes added by config tool
[SCSI] fcoe: fix fcoe module ref counting
[SCSI] libfcoe: FIP Keep-Alive messages for VPorts are sent with incorrect port_id and wwn
[SCSI] libfcoe: Fix incorrect MAC address clearing
[SCSI] fcoe: fix a circular locking issue with rtnl and sysfs mutex
[SCSI] libfc: Move the port_id into lport
[SCSI] fcoe: move link speed checking into its own routine
[SCSI] libfc: Remove extra pointer check
[SCSI] libfc: Remove unused fc_get_host_port_type
[SCSI] fcoe: fixes wrong error exit in fcoe_create
[SCSI] libfc: set seq_id for incoming sequence
[SCSI] qla2xxx: Updates to ISP82xx support.
[SCSI] qla2xxx: Optionally disable target reset.
[SCSI] qla2xxx: ensure flash operation and host reset via sg_reset are mutually exclusive
[SCSI] qla2xxx: Silence bogus warning by gcc for wrap and did.
[SCSI] qla2xxx: T10 DIF support added.
...
Jiri Kosina [Fri, 21 May 2010 11:15:17 +0000 (13:15 +0200)]
HID: fix up 'EMBEDDED' mess in Kconfig
The whole point of making some of the drivers automatically selected
unless 'EMBEDDED' was to handle quirks transparently after their separation
from the generic core.
Over time, some of the later-added quirks grew into more standalone drivers,
implementing non-trivial features a being larger than a few bytes of code.
In addition to that, some of the standalone drivers don't make sense for
99.9% of the users, as they are very specific to rare devices.
Therefore build by default in only those drivers which
- we historically used to support even before quirk separation from the
core code
- are isolated enough and likely to hit quite large portion of the
users anyway (Microsoft, Logitech)
In 2008, the IOMMU was fixed to use the boundary_mask parameter per
device properly. So 'protect4gb' workaround was removed (the 383af9525bb27f927511874f6306247ec13f1c28). But somehow I messed the
'protect4gb' boot parameter that was used to enable the
workaround.
Mark Nelson [Tue, 18 May 2010 22:51:00 +0000 (22:51 +0000)]
powerpc/pseries: Make request_ras_irqs() available to other pseries code
At the moment only the RAS code uses event-sources interrupts (for EPOW
events and internal errors) so request_ras_irqs() (which actually requests
the event-sources interrupts) is found in ras.c and is static.
We want to be able to use event-sources interrupts in other pseries code,
so let's rename request_ras_irqs() to request_event_sources_irqs() and
move it to event_sources.c.
This will be used in an upcoming patch that adds support for IO Event
interrupts that come through as event sources.
Anton Blanchard [Sun, 16 May 2010 20:28:35 +0000 (20:28 +0000)]
powerpc/numa: Use ibm,architecture-vec-5 to detect form 1 affinity
I've been told that the architected way to determine we are in form 1
affinity mode is by reading the ibm,architecture-vec-5 property which
mirrors the layout of the fifth vector of the ibm,client-architecture
structure.
Eventually we may want to parse the ibm,architecture-vec-5 and create
FW_FEATURE_* bits.
Anton Blanchard [Sun, 16 May 2010 20:19:56 +0000 (20:19 +0000)]
powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
enabled via this:
/*
* If another node is sufficiently far away then it is better
* to reclaim pages in a zone before going off node.
*/
if (distance > RECLAIM_DISTANCE)
zone_reclaim_mode = 1;
Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
RECLAIM_DISTANCE it never kicks in.
The local to remote bandwidth ratios can be quite large on System p
machines so it makes sense for us to reclaim clean pagecache locally before
going off node.
The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
Anton Blanchard [Sun, 16 May 2010 20:02:39 +0000 (20:02 +0000)]
powerpc: Use smt_snooze_delay=-1 to always busy loop
Right now if we want to busy loop and not give up any time to the hypervisor
we put a very large value into smt_snooze_delay. This is sometimes useful
when running a single partition and you want to avoid any latencies due
to the hypervisor or CPU power state transitions. While this works, it's a bit
ugly - how big a number is enough now we have NO_HZ and can be idle for a very
long time.
The patch below makes smt_snooze_delay signed, and a negative value means loop
forever:
Anton Blanchard [Sun, 16 May 2010 20:01:28 +0000 (20:01 +0000)]
powerpc: Remove check of ibm,smt-snooze-delay OF property
I'm not sure why we have code for parsing an ibm,smt-snooze-delay OF
property. Since we have a smt-snooze-delay= boot option and we can
also set it at runtime via sysfs, it should be safe to get rid of
this code.
Michael Neuling [Thu, 13 May 2010 19:40:11 +0000 (19:40 +0000)]
powerpc/kdump: Fix race in kdump shutdown
When we are crashing, the crashing/primary CPU IPIs the secondaries to
turn off IRQs, go into real mode and wait in kexec_wait. While this
is happening, the primary tears down all the MMU maps. Unfortunately
the primary doesn't check to make sure the secondaries have entered
real mode before doing this.
On PHYP machines, the secondaries can take a long time shutting down
the IRQ controller as RTAS calls are need. These RTAS calls need to
be serialised which resilts in the secondaries contending in
lock_rtas() and hence taking a long time to shut down.
We've hit this on large POWER7 machines, where some secondaries are
still waiting in lock_rtas(), when the primary tears down the HPTEs.
This patch makes sure all secondaries are in real mode before the
primary tears down the MMU. It uses the new kexec_state entry in the
paca. It times out if the secondaries don't reach real mode after
10sec.
Michael Neuling [Thu, 13 May 2010 19:40:11 +0000 (19:40 +0000)]
powerpc/kexec: Fix race in kexec shutdown
In kexec_prepare_cpus, the primary CPU IPIs the secondary CPUs to
kexec_smp_down(). kexec_smp_down() calls kexec_smp_wait() which sets
the hw_cpu_id() to -1. The primary does this while leaving IRQs on
which means the primary can take a timer interrupt which can lead to
the IPIing one of the secondary CPUs (say, for a scheduler re-balance)
but since the secondary CPU now has a hw_cpu_id = -1, we IPI CPU
-1... Kaboom!
We are hitting this case regularly on POWER7 machines.
There is also a second race, where the primary will tear down the MMU
mappings before knowing the secondaries have entered real mode.
Also, the secondaries are clearing out any pending IPIs before
guaranteeing that no more will be received.
This changes kexec_prepare_cpus() so that we turn off IRQs in the
primary CPU much earlier. It adds a paca flag to say that the
secondaries have entered the kexec_smp_down() IPI and turned off IRQs,
rather than overloading hw_cpu_id with -1. This new paca flag is
again used to in indicate when the secondaries has entered real mode.
It also ensures that all CPUs have their IRQs off before we clear out
any pending IPI requests (in kexec_cpu_down()) to ensure there are no
trailing IPIs left unacknowledged.