Git Repo - linux.git/log

isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply

commit a144ea4b7a13087081ab5402fa9ad0bcfd249e67 [IPV4]: annotate struct in_ifaddr

Missed this extra byteswap as the isdn inlines hide the htonl inside
put_u32 which causes an extra byteswap on little-endian arches.
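
A minimal userspace sketch of the double swap follows. The put_u32()
here is a hypothetical stand-in for the isdn inline, not the kernel
code, and is_big_endian_encoding() is a helper invented for the
illustration. It assumes a POSIX htonl().

```c
#include <arpa/inet.h>  /* htonl() */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the isdn inline: it converts to network
 * (big-endian) order itself, so callers must pass a host-order value. */
static unsigned char *put_u32(unsigned char *p, uint32_t host_val)
{
	uint32_t be = htonl(host_val);

	memcpy(p, &be, sizeof(be));
	return p + sizeof(be);
}

/* Returns 1 if the four bytes at p spell v in big-endian order. */
static int is_big_endian_encoding(const unsigned char *p, uint32_t v)
{
	return p[0] == (unsigned char)(v >> 24) &&
	       p[1] == (unsigned char)(v >> 16) &&
	       p[2] == (unsigned char)(v >> 8) &&
	       p[3] == (unsigned char)v;
}
```

Calling put_u32(buf, htonl(x)) swaps twice, so on a little-endian host
the bytes land in the buffer in host order instead of network order. On
a big-endian host htonl() is a no-op and the bug stays hidden.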

Signed-off-by: Harvey Harrison <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ematch: simpler tcf_em_unregister()

Simply delete ops from list and let list debugging do the job.

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Cleanup of af_unix

This is a pure cleanup of net/unix/af_unix.c to meet current code
style standards

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: Tidy up setsockopt calls

This splits the setsockopt calls into two groups, depending on whether an
integer argument (val) is required and whether routines being called do
their own locking.

Some options (such as setting the CCID) use u8 rather than int, so that for
these the test with regard to integer-sizeof can not be used.

The second switch-case statement now only has those statements which need
locking and which make use of `val'.

Signed-off-by: Gerrit Renker <[email protected]>
Acked-by: Ian McDonald <[email protected]>
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Eugene Teo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: Deprecate Ack Ratio sysctl

This patch deprecates the Ack Ratio sysctl, since
* Ack Ratio is entirely ignored by CCID-3 and CCID-4,
* Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
* even if it would work in CCID-2, there is no point for a user to change it:
   - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
   - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts
     (since waiting for Acks which will never arrive in this window),
   - cwnd is not a user-configurable value.

The only reasonable place for Ack Ratio is to print it for debugging. It is
planned to do this later on, as part of e.g. dccp_probe.

With this patch Ack Ratio is now under full control of feature negotiation:
* Ack Ratio is resolved as a dependency of the selected CCID;
* if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
   the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
   Ratio 2 for both endpoints";
* what happens then is part of another patch set, since it concerns the
   dynamic update of Ack Ratio while the connection is in full flight.

Thanks to Tomasz Grobelny for discussion leading up to this patch.

Signed-off-by: Gerrit Renker <[email protected]>
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: Feature negotiation for minimum-checksum-coverage

This provides feature negotiation for server minimum checksum coverage
which so far has been missing.

Since sender/receiver coverage values range only from 0...15, their
type has also been reduced in size from u16 to u4.

Feature-negotiation options are now generated for both sender and receiver
coverage, i.e. when the peer has `forgotten' to enable partial coverage
then feature negotiation will automatically enable (negotiate) the partial
coverage value for this connection.

Signed-off-by: Gerrit Renker <[email protected]>
Acked-by: Ian McDonald <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: Deprecate old setsockopt framework

The previous setsockopt interface, which passed socket options via struct
dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
ugly code since the old approach did not distinguish between NN and SP values.

This patch removes the old setsockopt interface and replaces it with two new
functions to register NN/SP values for feature negotiation.
These are essentially wrappers around the internal __feat_register functions,
with checking added to avoid

* wrong usage (type);
* changing values while the connection is in progress.

Signed-off-by: Gerrit Renker <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dccp: Mechanism to resolve CCID dependencies

This adds a hook to resolve features whose value depends on the choice of
CCID. It is done at the server since it can only be done after the CCID
values have been negotiated; i.e. the client will add its CCID preference
list on the Change options sent in the Request, which will be reconciled
with the local preference list of the server.

The concept is documented on
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
implementation_notes.html#ccid_dependencies

Signed-off-by: Gerrit Renker <[email protected]>
Acked-by: Ian McDonald <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

virtio_net: VIRTIO_NET_F_MRG_RXBUF (improve rcv buffer allocation)

If segmentation offload is enabled by the host, we currently allocate
maximum sized packet buffers and pass them to the host. This uses up
20 ring entries, allowing us to supply only 20 packet buffers to the
host with a 256 entry ring. This is a huge overhead when receiving
small packets, and is most keenly felt when receiving MTU sized
packets from off-host.

The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support
using receive buffers which are smaller than the maximum packet size.
In order to transfer large packets to the guest, the host merges
together multiple receive buffers to form a larger logical buffer.
The number of merged buffers is returned to the guest via a field in
the virtio_net_hdr.

Make use of this support by supplying single page receive buffers to
the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
the payload to the skb's linear data buffer and adjust the fragment
offset to point to the remaining data. This ensures proper alignment
and allows us to not use any paged data for small packets. If the
payload occupies multiple pages, we simply append those pages as
fragments and free the associated skbs.
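
The receive-side split described above can be sketched roughly as
below. This is only an illustration of the arithmetic, not the driver
code: struct rx_layout and split_payload() are made-up names, and the
header length is passed in as a parameter instead of being taken from
the real virtio_net_hdr.

```c
#include <stddef.h>

#define COPY_BYTES 128u  /* copy threshold quoted in the text above */

/* Hypothetical summary of where the payload of one buffer ends up. */
struct rx_layout {
	size_t linear;       /* bytes copied into the skb's linear area */
	size_t frag_offset;  /* offset of the remaining data in the first page */
	size_t frag_len;     /* bytes that stay in page fragments */
};

/* hdr_len is the size of the header that precedes the payload in the page. */
static struct rx_layout split_payload(size_t hdr_len, size_t payload_len)
{
	struct rx_layout l;

	l.linear = payload_len < COPY_BYTES ? payload_len : COPY_BYTES;
	l.frag_offset = hdr_len + l.linear;
	l.frag_len = payload_len - l.linear;
	return l;
}
```

A payload of 128 bytes or less ends up with frag_len == 0, which is the
"no paged data for small packets" case.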

This scheme allows us to be efficient in our use of ring entries
while still supporting large packets. Benchmarking using netperf from
an external machine to a guest over a 10Gb/s network shows a 100%
improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark
with GSO disabled on the host side, throughput was seen to increase
from 700Mb/s to 1.7Gb/s.

Based on a patch from Herbert Xu.

Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]> (use netdev_priv)
Signed-off-by: David S. Miller <[email protected]>

virtio_net: hook up the set-tso ethtool op

Seems like an oversight that we have set-tx-csum and set-sg hooked
up, but not set-tso.

Also leads to the strange situation that if you e.g. disable tx-csum,
then tso doesn't get disabled.

Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

virtio_net: Recycle some more rx buffer pages

Each time we re-fill the recv queue with buffers, we allocate
one too many skbs and free it again when adding fails. We should
recycle the pages allocated in this case.

A previous version of this patch made trim_pages() trim trailing
unused pages from skbs with some paged data, but this actually
caused a barely measurable slowdown.

Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]> (use netdev_priv)
Signed-off-by: David S. Miller <[email protected]>

[CIFS] Fix build break

Signed-off-by: Steve French <[email protected]>

net: use %pF for /proc/net/ptype

Technically, patch changes format for modules, but I think nobody cares.

-86dd :ipv6:ipv6_rcv+0x0
+86dd ipv6_rcv+0x0/0x400 [ipv6]

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Phonet: refuse to send bigger than MTU packets

Signed-off-by: Rémi Denis-Courmont <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: make sure struct dst_entry refcount is aligned on 64 bytes

As found in the past (commit f1dd9c379cac7d5a76259e7dffcd5f8edc697d17
[NET]: Fix tbench regression in 2.6.25-rc1), it is really
important that struct dst_entry refcount is aligned on a cache line.

We cannot use __attribute__((aligned)), so manually pad the structure
for 32 and 64 bit arches.

for 32bit : offsetof(struct dst_entry, __refcnt) is 0x80
for 64bit : offsetof(struct dst_entry, __refcnt) is 0xc0

As it is not possible to guess at compile time cache line size,
we use a generic value of 64 bytes, that satisfies many current arches.
(Using 128 bytes alignment on 64bit arches would waste 64 bytes)

Add a BUILD_BUG_ON to make sure future updates to "struct dst_entry"
don't break this alignment.
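
As a rough userspace analogue of the padding plus build-time check,
assuming a C11 compiler: struct demo_entry and its fields are invented
for the sketch and do not match the real struct dst_entry layout, and
static_assert stands in for BUILD_BUG_ON.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */

#define ASSUMED_CACHE_LINE 64  /* generic value, as in the text above */

/* Hypothetical stand-in for struct dst_entry: read-mostly fields first,
 * then explicit padding so the hot refcount starts a new 64-byte line. */
struct demo_entry {
	void *read_mostly[5];
	char  pad[ASSUMED_CACHE_LINE -
		  (5 * sizeof(void *)) % ASSUMED_CACHE_LINE];
	int   refcnt;
	int   use;
};

/* Fails the build if someone adds a field without fixing the padding. */
static_assert(offsetof(struct demo_entry, refcnt) % ASSUMED_CACHE_LINE == 0,
	      "refcnt must start on a 64-byte boundary");
```

The pad size depends on sizeof(void *), so the same source pads
differently on 32 and 64 bit builds while keeping the same offset.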

"tbench 8" is 4.4 % faster on a dual quad core (HP BL460c G1), Intel E5450 @3.00GHz
(2350 MB/s instead of 2250 MB/s)

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rcu: documents rculist_nulls

Adds Documentation/RCU/rculist_nulls.txt file to describe how 'nulls'
end-of-list can help in some RCU algos.

Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls

RCU was added to UDP lookups, using a fast infrastructure :
- sockets kmem_cache use SLAB_DESTROY_BY_RCU and don't pay the
price of call_rcu() at freeing time.
- hlist_nulls permits the use of fewer memory barriers.

This patch uses same infrastructure for TCP/DCCP established
and timewait sockets.

Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications
using short lived TCP connections. A followup patch, converting
rwlocks to spinlocks, will even speed up this case.

__inet_lookup_established() is pretty fast now that we don't have to
dirty a contended cache line (read_lock/read_unlock)

Only established and timewait hashtable are converted to RCU
(bind table and listen table are still using traditional locking)

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

udp: Use hlist_nulls in UDP RCU code

This is a straightforward patch, using hlist_nulls infrastructure.

RCUification already done on UDP two weeks ago.

Using hlist_nulls permits us to avoid some memory barriers, both
at lookup time and delete time.

Patch is large because it adds new macros to include/net/sock.h.
These macros will be used by TCP & DCCP in next patch.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rcu: Introduce hlist_nulls variant of hlist

hlist uses NULL value to finish a chain.

The hlist_nulls variant uses the low order bit set to 1 to signal an end-of-list marker.

This allows storing many different end markers, so that some RCU lockless
algos (used in TCP/UDP stack for example) can save some memory barriers in
fast paths.

Two new files are added :

include/linux/list_nulls.h
  - mimics hlist part of include/linux/list.h, derived to hlist_nulls variant

include/linux/rculist_nulls.h
  - mimics hlist part of include/linux/rculist.h, derived to hlist_nulls variant

   Only four helpers are declared for the moment :

     hlist_nulls_del_init_rcu(), hlist_nulls_del_rcu(),
     hlist_nulls_add_head_rcu() and hlist_nulls_for_each_entry_rcu()

prefetches() were removed, since the end of a list is no longer a NULL value.
prefetches() could trigger useless (and possibly dangerous) memory transactions.

Example of use (extracted from __udp4_lib_lookup())

        struct sock *sk, *result;
        struct hlist_nulls_node *node;
        unsigned short hnum = ntohs(dport);
        unsigned int hash = udp_hashfn(net, hnum);
        struct udp_hslot *hslot = &udptable->hash[hash];
        int score, badness;

        rcu_read_lock();
begin:
        result = NULL;
        badness = -1;
        sk_nulls_for_each_rcu(sk, node, &hslot->head) {
                score = compute_score(sk, net, saddr, hnum, sport,
                                      daddr, dport, dif);
                if (score > badness) {
                        result = sk;
                        badness = score;
                }
        }
        /*
         * if the nulls value we got at the end of this lookup is
         * not the expected one, we must restart lookup.
         * We probably met an item that was moved to another chain.
         */
        if (get_nulls_value(node) != hash)
                goto begin;

        if (result) {
                if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt)))
                        result = NULL;
                else if (unlikely(compute_score(result, net, saddr, hnum, sport,
                                  daddr, dport, dif) < badness)) {
                        sock_put(result);
                        goto begin;
                }
        }
        rcu_read_unlock();
        return result;
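
The end-marker encoding that the lookup above relies on can be sketched
in plain userspace C as follows. This is an illustration under
assumptions, not the contents of list_nulls.h: struct nulls_node,
make_nulls() and chain_find() are made-up names, there is no RCU or
locking here, and only get_nulls_value() is a name taken from the
example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical singly linked node; real nodes are aligned, so the low
 * bit of a genuine node pointer is always 0. */
struct nulls_node {
	struct nulls_node *next;
	int key;
};

/* An end marker is an odd "pointer": low bit set, value in the upper bits. */
static struct nulls_node *make_nulls(uintptr_t value)
{
	return (struct nulls_node *)(uintptr_t)(1u | (value << 1));
}

static int is_a_nulls(const struct nulls_node *p)
{
	return (int)((uintptr_t)p & 1u);
}

static uintptr_t get_nulls_value(const struct nulls_node *p)
{
	return (uintptr_t)p >> 1;
}

/* Walk a chain looking for key and report, through *end_slot, the value
 * stored in the end marker that terminated the walk. */
static const struct nulls_node *chain_find(const struct nulls_node *first,
					   int key, uintptr_t *end_slot)
{
	const struct nulls_node *pos, *found = NULL;

	for (pos = first; !is_a_nulls(pos); pos = pos->next)
		if (pos->key == key)
			found = pos;
	*end_slot = get_nulls_value(pos);
	return found;
}
```

If each chain stores its own hash slot in the end marker, a reader that
finishes on a different value than the slot it started from knows it
wandered onto another chain and has to restart, which is the check done
with get_nulls_value(node) != hash above.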

Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

TPROXY: implemented IP_RECVORIGDSTADDR socket option

In case UDP traffic is redirected to a local UDP socket,
the originally addressed destination address/port
cannot be recovered with the in-kernel tproxy.

This patch adds an IP_RECVORIGDSTADDR sockopt that enables
an IP_ORIGDSTADDR ancillary message in recvmsg(). This
ancillary message contains the original destination address/port
of the packet being received.
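
From userspace, the option could be used roughly as sketched below.
The helper names enable_origdst() and find_origdst() are invented for
the example. The fallback value 20 for IP_ORIGDSTADDR is an assumption
for headers that predate the option, and the ancillary payload is
assumed to be a struct sockaddr_in.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20  /* assumed Linux value for older headers */
#endif
#ifndef IP_RECVORIGDSTADDR
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#endif

/* Ask for the original destination to be attached to received datagrams. */
static int enable_origdst(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IP, IP_RECVORIGDSTADDR, &on, sizeof(on));
}

/* Scan the ancillary data filled in by recvmsg(). Returns 0 and fills
 * *dst if an IP_ORIGDSTADDR message is present, -1 otherwise. */
static int find_origdst(struct msghdr *msg, struct sockaddr_in *dst)
{
	struct cmsghdr *c;

	for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level == IPPROTO_IP &&
		    c->cmsg_type == IP_ORIGDSTADDR) {
			memcpy(dst, CMSG_DATA(c), sizeof(*dst));
			return 0;
		}
	}
	return -1;
}
```

A caller would run enable_origdst() once after socket(), pass a control
buffer in msg_control to recvmsg(), and then call find_origdst() on the
returned msghdr.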

Signed-off-by: Balazs Scheidler <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: Fix ARP behavior with many mac-vlans

Ben Greear wrote:
> I have 500 mac-vlans on a system talking to 500 other
> mac-vlans.  My problem is that the arp-table gets extremely
> huge because every time an arp-request comes in on all mac-vlans,
> a stale arp entry is added for each mac-vlan.  I have filtering
> turned on, but that doesn't help because the neigh_event_ns call
> below will cause a stale neighbor entry to be created regardless
> of whether a reply will be sent or not.
> Maybe the neigh_event code should be below the checks for dont_send,
> and only create check neigh_event_ns if we are !dont_send?

The attached patch makes it work much better for me.  The patch
will cause the code to NOT create a stale neighbor entry if we
are not going to respond to the ARP request.  The old code
*would* create a stale entry even if we are not going to respond.

Signed-off-by: Ben Greear <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cifs: reinstate sharing of tree connections

Use a similar approach to the SMB session sharing. Add a list of tcons
attached to each SMB session. Move the refcount to non-atomic. Protect
all of the above with the cifs_tcp_ses_lock. Add functions to
properly find and put references to the tcons.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Steve French <[email protected]>

e1000e: enable ECC correction on 82571 silicon

This change enables ECC correction for the packet buffer on all 82571
silicon.

Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e1000e: fix IPMI traffic

Some users reported that they have machines with BMCs enabled that cannot
receive IPMI traffic after e1000e is loaded.
http://marc.info/?l=e1000-devel&m=121909039127414&w=2
http://marc.info/?l=e1000-devel&m=121365543823387&w=2

This fixes the issue if they load with the new parameter = 0 by disabling
crc stripping, but leaves the performance feature on for most users.
Based on work done by Hong Zhang.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e1000e: fix warn_on reload after phy_id error

If the driver fails to initialize the first time because the phy_id
check fails, the kernel triggers a warn_on on the second attempt to load
the driver. This happens because the driver did not free the msi/x
resources after the phy_id failure in the first load.

Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phylib: make mdio-gpio work without OF (v4)

Make mdio-gpio work with non-OpenFirmware gpio implementations.

Additional changes to mdio-gpio:
- use gpio_request() and gpio_free()
- place irq[] array in struct mdio_gpio_info
- add module description, author and license
- add note about compiling this driver as module
- rename mdc and mdio function (were ugly names)
- change MII to MDIO in bus name
- add __init __exit to module (un)loading functions
- probe fails if no phys added to the bus
- kzalloc bitbang with sizeof(*bitbang)

Changes since v3:
- keep bus naming "%x" to be compatible with existing drivers.

Changes since v2:
- more #ifdefs reduction
- platform driver will be registered on OF platforms also
- unified platform and OF bus_id to phy%i

Changes since v1:
- removed NO_IRQ
- reduced #ifdefs

Laurent, please test this driver under OF.

Signed-off-by: Paulius Zaleckas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phylib: rename mdio-ofgpio to mdio-gpio

Signed-off-by: Paulius Zaleckas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

uninitialized return value in mm/mlock.c: __mlock_vma_pages_range()

Fix an uninitialized return value when compiling on parisc (with CONFIG_UNEVICTABLE_LRU=y):
mm/mlock.c: In function `__mlock_vma_pages_range':
mm/mlock.c:165: warning: `ret' might be used uninitialized in this function

Signed-off-by: Helge Deller <[email protected]>
[ It isn't ever really used uninitialized, since no caller should ever
  call this function with an empty range.  But the compiler is correct
  that from a local analysis standpoint that is impossible to see, and
  fixing the warning is appropriate.  ]
Signed-off-by: Linus Torvalds <[email protected]>

stop_machine: fix race with return value (fixes Bug #11989)

Bug #11989: Suspend failure on NForce4-based boards due to changes in
stop_machine

We should not access active.fnret outside the lock; in theory the next
stop_machine could overwrite it.

Signed-off-by: Rusty Russell <[email protected]>
Tested-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Fix broken ownership of /proc/sys/ files

D'oh...

Signed-off-by: Al Viro <[email protected]>
Reported-and-tested-by: Peter Palfrader <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

dm9000: Fix build error.

Reported by Stephen Rothwell:

drivers/net/dm9000.c:1450: error: expected ')' before ';' token
drivers/net/dm9000.c:1455: error: expected ';' before '}' token

Signed-off-by: David S. Miller <[email protected]>

mfd: Correct WM8350 I2C return code usage

The vendor BSP used for the WM8350 development provided an I2C driver
which incorrectly returned zero on successful sends rather than the
number of transmitted bytes, an error which was then propagated into the
WM8350 I2C accessors.
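
A small sketch of the convention an accessor can rely on once the send
routine reports the number of transferred bytes. check_transfer() is a
hypothetical helper written for this illustration, not a function from
the WM8350 driver.

```c
#include <errno.h>

/* Map the return value of a send that reports the number of bytes
 * transferred (negative errno on failure) onto 0 / negative errno. */
static int check_transfer(int ret, int expected_len)
{
	if (ret < 0)
		return ret;   /* bus error: pass the errno through */
	if (ret != expected_len)
		return -EIO;  /* short transfer */
	return 0;
}
```

With this check, a driver that returns 0 on success for a non-empty
message is reported as a short transfer instead of being mistaken for a
complete one.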

Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Samuel Ortiz <[email protected]>

mfd: fix event masking for da9030

Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Eric Miao <[email protected]>
Signed-off-by: Samuel Ortiz <[email protected]>

acpi: fix oops in acpi_system_wakeup_device_seq_show

Commit 0794469da3f7b2093575cbdfc1108308dd3641ce: ("ACPI: struct device -
replace bus_id with dev_name(), dev_set_name()") introduced a bug by
testing 'dev_name(ldev)' instead of 'ldev->bus' for NULL when printing
out the bus information.

So if ldev->bus was NULL, we'd oops.
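
The pattern, reduced to a userspace sketch: test the pointer that is
about to be dereferenced. struct demo_bus, struct demo_device and
bus_name_or() are invented for the illustration and are not the ACPI
code.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the bus and device structs. */
struct demo_bus {
	const char *name;
};

struct demo_device {
	const char *name;
	const struct demo_bus *bus;  /* may be NULL */
};

/* Check dev->bus itself before dereferencing it; a non-NULL dev->name
 * says nothing about whether dev->bus is valid. */
static const char *bus_name_or(const struct demo_device *dev,
			       const char *fallback)
{
	return dev->bus ? dev->bus->name : fallback;
}
```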

Reported-and-tested-by: Bruno Prémont <[email protected]>
Cc: Kay Sievers <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

phy: fix phy address bug

PHYID returns 0xffff and not 0xffffffff when not found and in some
cases (at91sam9263) 0x0. Maybe this patch could be useful.

Signed-off-by: Giulio Benetti <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e100: fix dma error in direction for mapping

The e100 driver triggers BUG_ON(buf->direction != dir)
by doing pci_map_single(..., PCI_DMA_BIDIRECTIONAL)
and pci_dma_sync_single_for_device(..., PCI_DMA_TODEVICE).

Changing the DMA direction, especially with dmabounce will result
in unexpected behaviour.

Reported-by: Anders Grafstrom <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

igb: use dev_printk instead of printk

Use dev_printk() instead of printk() to give a little more context
and use consistent format.

Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qla3xxx: Cleanup: Fix link print statements.

Removed debug print statements and improved conditionals around informational statements.

Signed-off-by: Ron Mercer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

igb: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
igb_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e1000: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e1000e: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

via-velocity: enable perfect filtering for multicast packets

Signed-off-by: Joey Zhuo <[email protected]>
Acked-by: Francois Romieu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

pegasus: minor resource shrinkage

Make pegasus driver not allocate a workqueue until the driver
is bound to some device, which will need that workqueue if
the device is brought up. This conserves resources when the
driver is linked but there's no pegasus device connected.

Also shrink the runtime footprint a smidgeon by moving some
init-only code into its proper section, and move an obnoxious
(frequent and meaningless) message to be debug-only.

Signed-off-by: David Brownell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ixgbe: Fix usage of netif_*_all_queues() with netif_carrier_{off|on}()

netif_carrier_off() is sufficient to stop Tx into the driver. Stopping the Tx
queues is redundant and unnecessary. By the same token, netif_carrier_on()
will be sufficient to re-enable Tx, so waking the queues is unnecessary.

Signed-off-by: Peter P Waskiewicz Jr <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

function tracing: fix wrong pos computing when read buffer has been fulfilled

Impact: make output of available_filter_functions complete

phenomenon:

The first value of dyn_ftrace_total_info is not equal to
`cat available_filter_functions | wc -l`, but they should be equal.

root cause:

When t_show prints a function with seq_printf and that record
overflows the read buffer, the function is not passed to user space
through the read buffer; it is simply dropped, so it never shows up
in the output.

So every time the read buffer fills up, the last function written to
it is dropped if it overflowed.

This also applies to set_ftrace_filter if set_ftrace_filter has
more bytes than read buffer.

fix:

By checking the return value of seq_printf: if it is less than 0, we
know this function was not printed. We then decrease the position to
force this function to be printed next time, in the next read buffer.
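
The idea can be sketched in userspace with snprintf() playing the role
of seq_printf. emit() and fill_read() are hypothetical helpers, and the
sketch declines to advance the position on overflow, which has the same
effect as advancing and then decreasing it. It assumes size is at
least 1.

```c
#include <stdio.h>
#include <string.h>

/* Append one record to buf. Returns 0 on success, -1 if it did not fit,
 * in which case the truncated tail is discarded. */
static int emit(char *buf, size_t size, size_t *used, const char *name)
{
	int n = snprintf(buf + *used, size - *used, "%s\n", name);

	if (n < 0 || (size_t)n >= size - *used) {
		buf[*used] = '\0';
		return -1;
	}
	*used += (size_t)n;
	return 0;
}

/* Fill buf starting at record *pos. *pos only moves past records that
 * fit, so an overflowing record is retried by the next call. */
static size_t fill_read(char *buf, size_t size, const char *const *names,
			size_t count, size_t *pos)
{
	size_t used = 0;

	buf[0] = '\0';
	while (*pos < count && emit(buf, size, &used, names[*pos]) == 0)
		(*pos)++;
	return used;
}
```

Calling fill_read() repeatedly with the same pos until it reaches count
returns every record exactly once, however small the buffer is, as long
as each record fits in an empty buffer.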

Another little fix is to show correct allocating pages count.

Signed-off-by: walimis <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

MAINTAINERS: remove me as RAID maintainer

Neil has been the maintainer of the RAID/MD code for a long time,
remove me as a co-maintainer.

Signed-off-by: Ingo Molnar <[email protected]>

sched: fix kernel warning on /proc/sched_debug access

Luis Henriques reported that with CONFIG_PREEMPT=y + CONFIG_PREEMPT_DEBUG=y +
CONFIG_SCHED_DEBUG=y + CONFIG_LATENCYTOP=y enabled, the following warning
triggers when using latencytop:

> [  775.663239] BUG: using smp_processor_id() in preemptible [00000000] code: latencytop/6585
> [  775.663303] caller is native_sched_clock+0x3a/0x80
> [  775.663314] Pid: 6585, comm: latencytop Tainted: G        W 2.6.28-rc4-00355-g9c7c354 #1
> [  775.663322] Call Trace:
> [  775.663343]  [<ffffffff803a94e4>] debug_smp_processor_id+0xe4/0xf0
> [  775.663356]  [<ffffffff80213f7a>] native_sched_clock+0x3a/0x80
> [  775.663368]  [<ffffffff80213e19>] sched_clock+0x9/0x10
> [  775.663381]  [<ffffffff8024550d>] proc_sched_show_task+0x8bd/0x10e0
> [  775.663395]  [<ffffffff8034466e>] sched_show+0x3e/0x80
> [  775.663408]  [<ffffffff8031039b>] seq_read+0xdb/0x350
> [  775.663421]  [<ffffffff80368776>] ? security_file_permission+0x16/0x20
> [  775.663435]  [<ffffffff802f4198>] vfs_read+0xc8/0x170
> [  775.663447]  [<ffffffff802f4335>] sys_read+0x55/0x90
> [  775.663460]  [<ffffffff8020c67a>] system_call_fastpath+0x16/0x1b
> ...

This breakage was caused by me via:

  7cbaef9: sched: optimize sched_clock() a bit

Change the calls to cpu_clock().

Reported-by: Luis Henriques <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: don't grab devices with no input
  HID: fix radio-mr800 hidquirks
  HID: fix kworld fm700 radio hidquirks
  HID: fix start/stop cycle in usbhid driver
  HID: use single threaded work queue for hid_compat
  HID: map macbook keys for "Expose" and "Dashboard"
  HID: support for new unibody macbooks
  HID: fix locking in hidraw_open()

Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
  pcmcia: ensure correct logging in do_io_probe
  pcmcia: add another pata/ide ID
  pcmcia: add braces in error path
  pcmcia: struct device - replace bus_id with dev_name(), dev_set_name()
  pcmcia: setup resource information for pseudo multifunction devices.
  pcmcia: fix indentation & braces disagreement - add braces

phy: Add support for Marvell 88E1118 PHY

This patch will add support for the Marvell 88E1118 PHY which supports gigabit ethernet among other things.

Signed-off-by: Ron Madrid <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlx4_en: Pause parameters per port

Before the change the driver reported the same pause parameters
for all the ports, even if only one of them was modified.

Signed-off-by: Yevgeny Petrilin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Linux 2.6.28-rc5

Fix inotify watch removal/umount races

Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex.  That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount.  We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_unmount_inodes() until
we are done.  Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords.  That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super() yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e.  that
->s_root is non-NULL) we know that we won't race with
inotify_unmount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong.  We had
to drop ih->mutex before we could grab ->s_umount.  So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer.  If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address.  That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that.  Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check.  If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro <[email protected]>
Acked-by: Greg KH <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

LIS3LV02Dx: remove unused #include <version.h>

The file(s) below do not use LINUX_VERSION_CODE nor KERNEL_VERSION.
drivers/hwmon/lis3lv02d.c

This patch removes the said #include <version.h>.

Signed-off-by: Huang Weiyi <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  serial: sh-sci: Reorder the SCxTDR write after the TDxE clear.
  sh: __copy_user function can corrupt the stack in case of exception
  sh: Fixed the TMU0 reload value on resume
  sh: Don't factor in PAGE_OFFSET for valid_phys_addr_range() check.
  sh: early printk port type fix
  i2c: fix i2c-sh_mobile rx underrun
  sh: Provide a sane valid_phys_addr_range() to prevent TLB reset with PMB.
  usb: r8a66597-hcd: fix wrong data access in SuperH on-chip USB
  fix sci type for SH7723
  serial: sh-sci: fix cannot work SH7723 SCIFA
  sh: Handle fixmap TLB eviction more coherently.

Merge branch 'doc-subdirs' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs

* 'doc-subdirs' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs:
Create/use more directory structure in the Documentation/ tree.

Add 'pr_fmt()' format modifier to pr_xyz macros.

A common reason for device drivers to implement their own printk macros
is the lack of a printk prefix with the standard pr_xyz macros.
Introduce a pr_fmt() macro that is applied for every pr_xyz macro to the
format string.

The most common use of the pr_fmt macro would be to add the name of the
device driver to all pr_xyz messages in a source file.
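
As a rough user-space sketch of the idea (not the kernel implementation), the prefix can be pasted onto the format string at compile time. The driver name "mydrv", the pr_info_buf() stand-in, and demo_message() are made up for illustration; the `##__VA_ARGS__` form is a GNU extension accepted by GCC and Clang.

```c
#include <stdio.h>
#include <string.h>

/* Defined once near the top of a source file; "mydrv" is a made-up name. */
#define pr_fmt(fmt) "mydrv: " fmt

/*
 * Hypothetical stand-in for a pr_xyz macro: it applies pr_fmt() to the
 * format string and formats into a caller-supplied buffer instead of
 * calling printk().
 */
#define pr_info_buf(buf, size, fmt, ...) \
    snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller: the message itself carries no prefix. */
static int demo_message(char *buf, size_t size, int irq)
{
    return pr_info_buf(buf, size, "probe failed on irq %d\n", irq);
}
```

Because the prefix is joined by string-literal concatenation, the format argument has to be a literal, which is already the usual way these macros are called.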

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
9p: restrict RDMA usage

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  V4L/DVB (9624): CVE-2008-5033: fix OOPS on tvaudio when controlling bass/treble
  V4L/DVB (9623): tvaudio: Improve debug msg by printing something more human
  V4L/DVB (9622): tvaudio: Improve comments and remove a unneeded prototype
  V4L/DVB (9621): Avoid writing outside shadow.bytes[] array
  V4L/DVB (9620): tvaudio: use a direct reference for chip description
  V4L/DVB (9619): tvaudio: update initial comments
  V4L/DVB (9618): tvaudio: add additional logic to avoid OOPS
  V4L/DVB (9617): tvtime: remove generic_checkmode callback
  V4L/DVB (9616): tvaudio: cleanup - group all callbacks together
  V4L/DVB (9615): tvaudio: instead of using a magic number, use ARRAY_SIZE
  V4L/DVB (9613): tvaudio: fix a memory leak

Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] fix s390x_newuname
  [S390] dasd: log sense for fatal errors
  [S390] cpu topology: fix locking
  [S390] cio: Fix refcount after moving devices.
  [S390] ftrace: fix kernel stack backchain walking
  [S390] ftrace: disable tracing on idle psw
  [S390] lockdep: fix compile bug
  [S390] kvm_s390: Fix oops in virtio device detection with "mem="
  [S390] sclp: emit error message if assign storage fails
  [S390] Fix range for add_active_range() in setup_memory()

Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] dpt_i2o: fix transferred data length for scsi_set_resid()
  [SCSI] scsi_error regression: Fix idempotent command handling
  [SCSI] zfcp: Fix hexdump data in s390dbf traces
  [SCSI] zfcp: fix erp timeout cleanup for port open requests
  [SCSI] zfcp: Wait for port scan to complete when setting adapter online
  [SCSI] zfcp: Fix cast warning
  [SCSI] zfcp: Fix request list handling in error path
  [SCSI] zfcp: fix mempool usage for status_read requests
  [SCSI] zfcp: fix req_list_locking.
  [SCSI] zfcp: Dont clear reference from SCSI device to unit
  [SCSI] qla2xxx: Update version number to 8.02.01-k9.
  [SCSI] qla2xxx: Return a FAILED status when abort mailbox-command fails.
  [SCSI] qla2xxx: Do not honour max_vports from firmware for 2G ISPs and below.
  [SCSI] qla2xxx: Use pci_disable_rom() to manipulate PCI config space.
  [SCSI] qla2xxx: Correct Atmel flash-part handling.
  [SCSI] megaraid: fix mega_internal_command oops

Revert "x86: blacklist DMAR on Intel G31/G33 chipsets"

This reverts commit e51af6630848406fc97adbd71443818cdcda297b, which was
wrongly hoovered up and submitted about a month after a better fix had
already been merged.

The better fix is commit cbda1ba898647aeb4ee770b803c922f595e97731
("PCI/iommu: blacklist DMAR on Intel G31/G33 chipsets"), where we do
this blacklisting based on the DMI identification for the offending
motherboard, since sometimes this chipset (or at least a chipset with
the same PCI ID) apparently _does_ actually have an IOMMU.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: remove unevictable's show_page_path

Hugh Dickins reported that show_page_path() is buggy and unsafe because it

- lacks a dput() to balance d_find_alias()
- doesn't handle vma->vm_mm->owner == NULL
- lacks lock_page()

It was only for debugging, so rather than trying to fix it, just remove
it now.

Reported-by: Hugh Dickins <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: KOSAKI Motohiro <[email protected]>
CC: Lee Schermerhorn <[email protected]>
CC: Rik van Riel <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

misc: C2port needs <linux/sched.h>

m68k allmodconfig:

| drivers/misc/c2port/core.c: In function 'c2port_reset':
| drivers/misc/c2port/core.c:73: error: dereferencing pointer to incomplete type
| drivers/misc/c2port/core.c: In function 'c2port_strobe_ck':
| drivers/misc/c2port/core.c:91: error: dereferencing pointer to incomplete type

Include <linux/sched.h> to fix it, as m68k's local_irq_enable() needs to know
about struct task_struct.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

m68k: Fix off-by-one in m68k_setup_user_interrupt()

commit 69961c375288bdab7604e0bb1c8d22999bb8a347 ("[PATCH] m68k/Atari:
Interrupt updates") added a BUG_ON() with an incorrect upper bound
comparison, which causes an early crash on VME boards, where IRQ_USER is
8, cnt is 192 and NR_IRQS is 200.
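
The arithmetic can be checked with a small sketch. The commit message does not show the comparison itself, so the two predicates below are an assumption about its shape (">=" versus ">"); only the three numbers come from the text above.

```c
#include <stdbool.h>

/* Values quoted in the commit message for VME boards. */
#define IRQ_USER 8u
#define NR_IRQS  200u

/* Assumed buggy form: also rejects a range that ends exactly at NR_IRQS. */
static bool range_rejected_buggy(unsigned int cnt)
{
    return IRQ_USER + cnt >= NR_IRQS;
}

/* Assumed fixed form: rejects only ranges that run past NR_IRQS. */
static bool range_rejected_fixed(unsigned int cnt)
{
    return IRQ_USER + cnt > NR_IRQS;
}
```

With cnt = 192 the range covers interrupts 8 through 199, which fits exactly into 200 slots, so only the ">=" form trips.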

Reported-by: Stephen N Chivers <[email protected]>
Tested-by: Kars de Jong <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: improve phantom device detection

ALSA: hda - Check model type instead of SSID in patch_92hd71bxx()

Check board preset model instead of codec->subsystem_id in
patch_92hd71bxx() so that other hardware configured via the model
option works like the given model.

Signed-off-by: Takashi Iwai <[email protected]>

Move "exit_robust_list" into mm_release()

We don't want to get rid of the futexes just at exit() time, we want to
drop them when doing an execve() too, since that gets rid of the
previous VM image too.

Doing it at mm_release() time means that we automatically always do it
when we disassociate a VM map from the task.

Reported-by: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Brad Spengler <[email protected]>
Cc: Alex Efros <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ALSA: sound/pci/pcxhr/pcxhr.c: introduce missing kfree and pci_disable_device

Error handling code following a kzalloc should free the allocated data.
The error handling code is adjusted to call pci_disable_device(pci); as
well, as is done later in the function.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,l;
position p1,p2;
expression *ptr != NULL;
@@

(
if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S
|
x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
)
<... when != x
when != if (...) { <+...x...+> }
x->f = E
...>
(
return \(0\|<+...x...+>\|ptr\);
|
return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda: STAC_VREF_EVENT value change

Changed value for STAC_VREF_EVENT from 0x40 to 0x00 because the
unsol response value is only 6 bits wide and the former value
was 1<<6, which overflows it.
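
A minimal sketch of the bit arithmetic, assuming only the 6-bit width stated above. The macro and function names are illustrative and are not taken from the driver.

```c
#include <stdint.h>

#define UNSOL_TAG_BITS 6u                              /* width from the commit message */
#define UNSOL_TAG_MASK ((1u << UNSOL_TAG_BITS) - 1u)   /* 0x3f */

/* True if the value can be represented in the 6-bit field. */
static int fits_in_tag(uint32_t value)
{
    return (value & ~UNSOL_TAG_MASK) == 0;
}

/* What survives if an oversized value is stored in the field anyway. */
static uint32_t truncate_to_tag(uint32_t value)
{
    return value & UNSOL_TAG_MASK;
}
```

0x40 has only bit 6 set, so it does not fit, and truncating it to six bits leaves 0.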

Signed-off-by: Matthew Ranostay <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

[SCSI] dpt_i2o: fix transferred data length for scsi_set_resid()

dpt_i2o.c::adpt_i2o_to_scsi() reads the value at (reply+5) which
should contain the length in bytes of the transferred data. This
would be correct if reply was a u32 *. However it is a void * here,
so we need to read the value at (reply+20) instead.
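
The offset difference follows from pointer arithmetic: adding 5 to a u32 pointer advances 5 * 4 = 20 bytes, while adding 5 to a void pointer (a GNU C extension) advances only 5 bytes. The helpers below are a hypothetical user-space sketch of reading word 5 correctly from an untyped buffer, not the driver's code.

```c
#include <stdint.h>
#include <string.h>

/* Byte offset of the 32-bit word with the given index. */
static size_t word_to_byte_offset(size_t word)
{
    return word * sizeof(uint32_t);
}

/* Read the 32-bit word at index `word` from a buffer held as an untyped pointer. */
static uint32_t read_reply_word(const void *reply, size_t word)
{
    uint32_t value;

    /* Step in bytes explicitly, then copy out to avoid alignment assumptions. */
    memcpy(&value, (const unsigned char *)reply + word_to_byte_offset(word),
           sizeof(value));
    return value;
}
```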

The value at (reply+5) is usually 0xff0000, which is apparently
'large enough' and didn't cause any trouble until 2.6.27 where

commit 427e59f09fdba387547106de7bab980b7fff77be
Author: James Bottomley <[email protected]>
Date: Sat Mar 8 18:24:17 2008 -0600

[SCSI] make use of the residue value

caused this to become visible through e.g. iostat -x .

Signed-off-by: Miquel van Smoorenburg <[email protected]>
Cc: Stable Tree <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

rtc: rtc-sun4v fixes, revised

- simplified code
- use platform_driver_probe
- removed locking: it's provided by rtc subsystem

Signed-off-by: Alessandro Zummo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

[CIFS] minor cleanup to cifs_mount

Signed-off-by: Steve French <[email protected]>

cifs: reinstate sharing of SMB sessions sans races

We do this by abandoning the global list of SMB sessions and instead
moving to a per-server list. This entails adding a new list head to the
TCP_Server_Info struct. The refcounting for the cifsSesInfo is moved to
a non-atomic variable. We have to protect it by a lock anyway, so there's
no benefit to making it an atomic. The list and refcount are protected
by the global cifs_tcp_ses_lock.

The patch also adds new routines to find and put SMB sessions that
properly take and put references under the lock.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Steve French <[email protected]>

libata: improve phantom device detection

Currently libata uses four methods to detect device presence.

1. PHY status if available.
2. TF register R/W test (only promotes presence, never demotes)
3. device signature after reset
4. IDENTIFY failure detection in SFF state machine

Combination of the above works well in most cases but recently there
have been a few reports where a phantom device causes unnecessary
delay during probe. In both cases, PHY status wasn't available. In
one case, it passed #2 and #3 and failed IDENTIFY with ATA_ERR which
didn't qualify as #4. The other failed #2 but as it passed #3 and #4,
it still caused failure.

In both cases, phantom device reported diagnostic failure, so these
cases can be safely worked around by considering any !ATA_DRQ IDENTIFY
failure as NODEV_HINT if diagnostic failure is set.

Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

cifs: disable sharing session and tcon and add new TCP sharing code

The code that allows these structs to be shared is extremely racy.
Disable the sharing of SMB and tcon structs for now until we can
come up with a way to do this that's race free.

We want to continue to share TCP sessions, however, since they are
required for multiuser mounts. For that, implement a new (hopefully
race-free) scheme. Add a new global list of TCP sessions, and take
care to get a reference to it whenever we're dealing with one.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

phylib: fix premature freeing of struct mii_bus

Commit 46abc02175b3c246dd5141d878f565a8725060c9 ("phylib: give mdio
buses a device tree presence") added a call to device_unregister() in
a situation where the caller did not intend for the device to be
freed yet, but apart from just unregistering the device from the
system, device_unregister() does an additional put_device() that is
intended to free it.

The right function to use in this situation is device_del(), which
unregisters the device from the system like device_unregister() does,
but without dropping the reference count an additional time.
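
The difference can be modelled with a toy reference count. This is a deliberately simplified sketch: the toy_* names are invented, and the real driver core does considerably more in both functions.

```c
/* Toy model of a refcounted device; not the driver core's struct device. */
struct toy_device {
    int refcount;
    int registered;
    int freed;
};

/* Drop one reference; the object is released when the count reaches zero. */
static void toy_put(struct toy_device *dev)
{
    if (--dev->refcount == 0)
        dev->freed = 1;
}

/* Like device_del(): remove from the system, keep the caller's reference. */
static void toy_del(struct toy_device *dev)
{
    dev->registered = 0;
}

/* Like device_unregister(): remove from the system, then drop a reference. */
static void toy_unregister(struct toy_device *dev)
{
    toy_del(dev);
    toy_put(dev);
}
```

A caller that still needs the structure after removal wants the toy_del() behaviour and drops its reference later, when it is actually done with the object.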

Bug report from Bryan Wu <[email protected]>.

Signed-off-by: Lennert Buytenhek <[email protected]>
Tested-by: Bryan Wu <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

atl1: Do not enumerate options unsupported by chip

Of the various WOL options provided in include/linux/ethtool.h, the
L1 NIC supports only magic packet. Remove all options except magic
packet from the atl1 driver.

Signed-off-by: Jay Cliburn <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

atl1e: fix broken multicast by removing unnecessary crc inversion

Inverting the crc after calling ether_crc_le() is unnecessary and breaks
multicast. Remove it.

Tested-by: David Madore <[email protected]>
Signed-off-by: Jay Cliburn <[email protected]>
Cc: [email protected]
Signed-off-by: Jeff Garzik <[email protected]>

gianfar: Fix DMA unmap invocations

We weren't unmapping DMA memory, which will break when gianfar gets used
on systems with more than 32-bits of memory. Also, it's just plain wrong.

Signed-off-by: Andy Fleming <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

net/ucc_geth: Fix oops in uec_get_ethtool_stats()

p_{tx,rx}_fw_statistics_pram are special: they're available only when
a device is open. If the device is closed, we should just fill the data
with zeroes.
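
A minimal sketch of the zero-fill behaviour described above, with invented names and types; the driver's actual structures and field layout are not shown in this message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy n firmware counters into the ethtool data array. fw_stats is
 * assumed to be NULL while the device is closed, in which case the
 * output is filled with zeroes instead of being read through NULL.
 */
static void copy_fw_stats(uint64_t *data, const uint32_t *fw_stats, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] = fw_stats ? fw_stats[i] : 0;
}
```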

Fixes the following oops:

root@b1:~# ifconfig eth1 down
root@b1:~# ethtool -S eth1
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01e1dcc
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01e1dcc] uec_get_ethtool_stats+0x98/0x124
LR [c0287cc8] ethtool_get_stats+0xfc/0x23c
Call Trace:
[cfaadde0] [c0287ca8] ethtool_get_stats+0xdc/0x23c (unreliable)
[cfaade20] [c0288340] dev_ethtool+0x2fc/0x588
[cfaade50] [c0285648] dev_ioctl+0x290/0x33c
[cfaadea0] [c0272238] sock_ioctl+0x80/0x2ec
[cfaadec0] [c00b5ae4] vfs_ioctl+0x40/0xc0
[cfaadee0] [c00b5fa8] do_vfs_ioctl+0x78/0x20c
[cfaadf10] [c00b617c] sys_ioctl+0x40/0x74
[cfaadf40] [c00142d8] ret_from_syscall+0x0/0x38
[...]
---[ end trace b941007b2dfb9759 ]---
Segmentation fault

p.s. While at it, also remove u64 casts, they aren't needed.

Signed-off-by: Anton Vorontsov <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

scm: fix scm_fp_list->list initialization made in wrong place

This is the next page of the scm recursion story (the commit
f8d570a4 net: Fix recursive descent in __scm_destroy()).

In function scm_fp_dup(), the INIT_LIST_HEAD(&fpl->list) of newly
created fpl is done *before* the subsequent memcpy from the old
structure and thus the freshly initialized list is overwritten.

But that's OK, since this initialization is not required at all,
since the fpl->list is list_add-ed at the destruction time in any
case (and is unused in other code), so I propose to drop both
initializations, rather than moving it after the memcpy.
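
The ordering problem is easy to reproduce outside the kernel. The sketch below uses a cut-down list_head and a hypothetical toy_fp_list; it assumes, as the description implies, that the memcpy covers the embedded list head.

```c
#include <string.h>

/* Cut-down version of the kernel's doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static int list_is_self_linked(const struct list_head *head)
{
    return head->next == head && head->prev == head;
}

/* Hypothetical stand-in for the structure being duplicated. */
struct toy_fp_list {
    struct list_head list;
    int count;
};

/* Same order as described: initialise the list, then copy over it. */
static void dup_init_then_copy(struct toy_fp_list *new_fpl,
                               const struct toy_fp_list *old_fpl)
{
    init_list_head(&new_fpl->list);
    memcpy(new_fpl, old_fpl, sizeof(*new_fpl));
}
```

After the copy, the new list head points at whatever the old one pointed at, so the earlier initialisation has no lasting effect.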

Please, correct me if I miss something significant.

Signed-off-by: Pavel Emelyanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

niu: Bump driver version and release date.

This driver is pretty mature, and the worst of the known
problems has been fixed (the 32-bit failures due to readq
implementation).

So let's finally give it a version of 1.0

Signed-off-by: David S. Miller <[email protected]>

NIU: Add Sun CP3260 ATCA blade support

This patch adds support for the Sun CP3260 ATCA blade which is
a N2 based ATCA blade with 2 NIU ports. The NIU ports do not
have on-board PHY.

Signed-off-by: Santwona Behera <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

9p: restrict RDMA usage

linux-next:

Make 9p's RDMA option depend on INET since it uses Infiniband rdma_*
functions and that code depends on INET. Otherwise 9p can try to
use symbols which don't exist.

ERROR: "rdma_destroy_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_connect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_qp" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_route" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_disconnect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_addr" [net/9p/9pnet_rdma.ko] undefined!

I used an if/endif block so that the menu items would remain
presented together.

Also correct an article adjective.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Eric Van Hensbergen <[email protected]>

Create/use more directory structure in the Documentation/ tree.

Create Documentation/blockdev/ sub-directory and populate it.
Populate the Documentation/serial/ sub-directory.
Move MSI-HOWTO.txt to Documentation/PCI/.
Move ioctl-number.txt to Documentation/ioctl/.
Update all relevant 00-INDEX files.
Update all relevant Kconfig files and source files.

Signed-off-by: Randy Dunlap <[email protected]>

[S390] fix s390x_newuname

The uname system call for 64 bit compares current->personality without
masking the upper 16 bits. If e.g. READ_IMPLIES_EXEC is set the result
of a uname system call will always be s390x even if the process uses
the s390 personality.
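
A small sketch of the comparison problem. The constants are copied here for illustration with the values I understand include/linux/personality.h to use and should be checked against that header; the two helper functions are hypothetical.

```c
/* Illustrative copies of personality constants; verify against the header. */
#define PER_MASK          0x00ffUL
#define PER_LINUX32       0x0008UL
#define READ_IMPLIES_EXEC 0x0400000UL

/* Comparing the raw value fails as soon as any flag bit is set. */
static int is_linux32_unmasked(unsigned long personality)
{
    return personality == PER_LINUX32;
}

/* Masking off the flag bits first compares only the personality type. */
static int is_linux32_masked(unsigned long personality)
{
    return (personality & PER_MASK) == PER_LINUX32;
}
```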

Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] dasd: log sense for fatal errors

The logging of sense data for fatal errors was accidentally removed
during Hyper PAV implementation.

Signed-off-by: Stefan Haberland <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] cpu topology: fix locking

cpu_coregroup_map used to grab a mutex on s390 since it was only
called from process context.
Since c7c22e4d5c1fdebfac4dba76de7d0338c2b0d832 "block: add support
for IO CPU affinity" this is not true anymore.
It now also gets called from softirq context.

To prevent possible deadlocks change this in architecture code and
use a spinlock instead of a mutex.

Cc: [email protected]
Cc: Jens Axboe <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] cio: Fix refcount after moving devices.

In ccw_device_move_to_orphanage(), a replacing ccw_device
is searched via get_{disc,orphaned}_ccwdev_by_dev_id()
which obtain a reference on the returned ccw_device.
This reference must be given up again after the device
has been moved to its new parent.

Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] ftrace: fix kernel stack backchain walking

With CONFIG_IRQSOFF_TRACER the trace_hardirqs_off() function includes
a call to __builtin_return_address(1). But we call trace_hardirqs_off()
from early entry code. There we have just a single stack frame.
So this results in a kernel stack backchain walk that would walk beyond
the kernel stack. Following the NULL terminated backchain this results
in a lowcore read access.

To fix this we simply call trace_hardirqs_off_caller() and pass the
current instruction pointer.

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] ftrace: disable tracing on idle psw

Disable tracing on idle psw. Otherwise it would give us huge
preempt off times for idle, which is rather pointless.

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] lockdep: fix compile bug

arch/s390/kernel/built-in.o: In function `cleanup_io_leave_insn':
mem_detect.c:(.text+0x10592): undefined reference to `lockdep_sys_exit'

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] kvm_s390: Fix oops in virtio device detection with "mem="

The current virtio model on s390 has the descriptor page above the main
memory. The guest virtio detection will oops if the mem= parameter is
used to reduce/change the memory size.
We have to use real_memory_size instead of max_pfn to detect the virtio
descriptor pages.

Signed-off-by: Christian Borntraeger <[email protected]>

[S390] sclp: emit error message if assign storage fails

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

[S390] Fix range for add_active_range() in setup_memory()

add_active_range() expects start_pfn + size as end_pfn value, i.e. not
the pfn of the last page frame but the one behind that.
We used the pfn of the last page frame so far, which can lead to a
BUG_ON in move_freepages(), when the kernelcore parameter is specified
(page_zone(start_page) != page_zone(end_page)).
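
The convention is a half-open range [start_pfn, end_pfn). A sketch with hypothetical helper names shows why passing the last pfn leaves the range one page short:

```c
/* Exclusive end: one past the last page frame, as the callee expects. */
static unsigned long end_pfn_exclusive(unsigned long start_pfn,
                                       unsigned long nr_pages)
{
    return start_pfn + nr_pages;
}

/* Pfn of the last page frame itself (the value that was passed before). */
static unsigned long last_pfn(unsigned long start_pfn, unsigned long nr_pages)
{
    return start_pfn + nr_pages - 1;
}

/* Number of pages in the half-open range [start_pfn, end_pfn). */
static unsigned long pages_covered(unsigned long start_pfn,
                                   unsigned long end_pfn)
{
    return end_pfn - start_pfn;
}
```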

Signed-off-by: Gerald Schaefer <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

V4L/DVB (9624): CVE-2008-5033: fix OOPS on tvaudio when controlling bass/treble

This bug was supposed to be fixed by 5ba2f67afb02c5302b2898949ed6fc3b3d37dcf1,
where a call to NULL happens.

Not all tvaudio chips allow controlling bass/treble. So, the driver
has a table with a flag to indicate if the chip does support it.

Unfortunately, the handling of this logic was broken for a very long
time (probably since the first module version). Due to that, an OOPS
was generated for devices that don't support bass/treble.
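
The general pattern is to test the capability flag (and the pointer) before calling through an optional callback. The sketch below uses invented names and an invented flag value; it is not the driver's table or its actual fix.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_CHIP_HAS_BASSTREBLE 0x2u   /* hypothetical flag bit */

/* Hypothetical per-chip description with an optional bass callback. */
struct toy_chip_desc {
    unsigned int flags;
    int (*bassfunc)(int value);
};

/* Refuse the control instead of calling a NULL function pointer. */
static int toy_set_bass(const struct toy_chip_desc *desc, int value)
{
    if (!(desc->flags & TOY_CHIP_HAS_BASSTREBLE) || desc->bassfunc == NULL)
        return -EINVAL;
    return desc->bassfunc(value);
}

/* Trivial callback used to exercise the supported path. */
static int toy_bass_passthrough(int value)
{
    return value;
}
```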

This was the resulting OOPS message before the patch, with debug messages
enabled:

tvaudio' 1-005b: VIDIOC_S_CTRL
BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<00000000>]
*pde = 22fda067 *pte = 00000000
Oops: 0000 [#1] SMP
Modules linked in: snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_hwdep snd soundcore tuner_simple tuner_types tea5767 tuner
tvaudio bttv bridge bnep rfcomm l2cap bluetooth it87 hwmon_vid hwmon fuse sunrpc ipt_REJECT
nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack
ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_multipath dm_mod configfs videodev v4l1_compat
ir_common 8139cp compat_ioctl32 v4l2_common 8139too videobuf_dma_sg videobuf_core mii btcx_risc tveeprom
i915 button snd_page_alloc serio_raw drm pcspkr i2c_algo_bit i2c_i801 i2c_core iTCO_wdt
iTCO_vendor_support sr_mod cdrom sg ata_generic pata_acpi ata_piix libata sd_mod scsi_mod ext3 jbd mbcache
uhci_hcd ohci_hcd ehci_hcd [last unloaded: soundcore]

Pid: 15413, comm: qv4l2 Not tainted (2.6.25.14-108.fc9.i686 #1)
EIP: 0060:[<00000000>] EFLAGS: 00210246 CPU: 0
EIP is at 0x0
EAX: 00008000 EBX: ebd21600 ECX: e2fd9ec4 EDX: 00200046
ESI: f8c0f0c4 EDI: f8c0f0c4 EBP: e2fd9d50 ESP: e2fd9d2c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process qv4l2 (pid: 15413, ti=e2fd9000 task=ebe44000 task.ti=e2fd9000)
Stack: f8c0c6ae e2ff2a00 00000d00 e2fd9ec4 ebc4e000 e2fd9d5c f8c0c448 00000000
       f899c12a e2fd9d5c f899c154 e2fd9d68 e2fd9d80 c0560185 e2fd9d88 f8f3e1d8
       f8f3e1dc ebc4e034 f8f3e18c e2fd9ec4 00000000 e2fd9d90 f899c286 c008561c
Call Trace:
[<f8c0c6ae>] ? chip_command+0x266/0x4b6 [tvaudio]
[<f8c0c448>] ? chip_command+0x0/0x4b6 [tvaudio]
[<f899c12a>] ? i2c_cmd+0x0/0x2f [i2c_core]
[<f899c154>] ? i2c_cmd+0x2a/0x2f [i2c_core]
[<c0560185>] ? device_for_each_child+0x21/0x49
[<f899c286>] ? i2c_clients_command+0x1c/0x1e [i2c_core]
[<f8f283d8>] ? bttv_call_i2c_clients+0x14/0x16 [bttv]
[<f8f23601>] ? bttv_s_ctrl+0x1bc/0x313 [bttv]
[<f8f23445>] ? bttv_s_ctrl+0x0/0x313 [bttv]
[<f8b6096d>] ? __video_do_ioctl+0x1f84/0x3726 [videodev]
[<c05abb4e>] ? sock_aio_write+0x100/0x10d
[<c041b23e>] ? kmap_atomic_prot+0x1dd/0x1df
[<c043a0c9>] ? enqueue_hrtimer+0xc2/0xcd
[<c04f4fa4>] ? copy_from_user+0x39/0x121
[<f8b622b9>] ? __video_ioctl2+0x1aa/0x24a [videodev]
[<c04054fd>] ? do_notify_resume+0x768/0x795
[<c043c0f7>] ? getnstimeofday+0x34/0xd1
[<c0437b77>] ? autoremove_wake_function+0x0/0x33
[<f8b62368>] ? video_ioctl2+0xf/0x13 [videodev]
[<c048c6f0>] ? vfs_ioctl+0x50/0x69
[<c048c942>] ? do_vfs_ioctl+0x239/0x24c
[<c048c995>] ? sys_ioctl+0x40/0x5b
[<c0405bf2>] ? syscall_call+0x7/0xb
[<c0620000>] ? cpuid4_cache_sysfs_exit+0x3d/0x69
=======================
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 0068:e2fd9d2c

Signed-off-by: Mauro Carvalho Chehab <[email protected]>

V4L/DVB (9623): tvaudio: Improve debug msg by printing something more human

Before the patch, the ioctl in use was printed as a hexadecimal code,
which is hard to understand without knowing how the _IO macros work.
Instead, use the V4L default handler to print such errors in a form
that is easier to understand.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>

V4L/DVB (9622): tvaudio: Improve comments and remove a unneeded prototype

Some comments are not clear enough. Improve them to allow a better
understanding of the driver behavior.

While there, remove an unneeded struct prototype.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>