Git Repo - linux.git/log

]> Git Repo - linux.git/log

Paolo Bonzini [Tue, 14 Oct 2014 23:52:33 +0000 (10:22 +1030)]

virito_scsi: use freezable WQ for events

Michael S. Tsirkin noticed a race condition:
we reset device on freeze, but system WQ is still
running so it might try adding bufs to a VQ meanwhile.

To fix, switch to handling events from the freezable WQ.

Reported-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:32 +0000 (10:22 +1030)]

virtio_net: enable VQs early on restore

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio net violated this
rule by using receive VQs within restore.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:32 +0000 (10:22 +1030)]

virtio_console: enable VQs early on restore

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after resume returns, virtio console violated this
rule by adding inbufs, which causes the VQ to be used directly within
restore.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:32 +0000 (10:22 +1030)]

virtio_scsi: enable VQs early on restore

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio scsi violated
this rule on restore by kicking event vq within restore.

To fix, call virtio_device_ready before using event queue.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:32 +0000 (10:22 +1030)]

virtio_blk: enable VQs early on restore

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio block violated
this rule on restore by restarting queues, which might in theory
cause the VQ to be used directly within restore.

To fix, call virtio_device_ready before using starting queues.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:31 +0000 (10:22 +1030)]

virtio_scsi: move kick event out from virtscsi_init

We currently kick event within virtscsi_init,
before host is fully initialized.

This can in theory confuse guest if device
consumes the buffers immediately.

To fix, move virtscsi_kick_event_all out to scan/restore.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:31 +0000 (10:22 +1030)]

virtio_net: fix use after free on allocation failure

In the extremely unlikely event that driver initialization fails after
RX buffers are added, virtio net frees RX buffers while VQs are
still active, potentially causing device to use a freed buffer.

To fix, reset device first - same as we do on device removal.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:31 +0000 (10:22 +1030)]

9p/trans_virtio: enable VQs early

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, but virtio 9p device
adds self to channel list within probe, at which point VQ can be
used in violation of the spec.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:31 +0000 (10:22 +1030)]

virtio_console: enable VQs early

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio console violated this
rule by adding inbufs, which causes the VQ to be used directly within
probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:30 +0000 (10:22 +1030)]

virtio_blk: enable VQs early

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio block violated this
rule by calling add_disk, which causes the VQ to be used directly within
probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:30 +0000 (10:22 +1030)]

virtio_net: enable VQs early

virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio net violated this
rule by using receive VQs within probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:30 +0000 (10:22 +1030)]

virtio: add API to enable VQs early

virtio spec 0.9.X requires DRIVER_OK to be set before
VQs are used, but some drivers use VQs before probe
function returns.
Since DRIVER_OK is set after probe, this violates the spec.

Even though under virtio 1.0 transitional devices support this
behaviour, we want to make it possible for those early callers to become
spec compliant and eventually support non-transitional devices.

Add API for drivers to call before using VQs.

Sets DRIVER_OK internally.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:30 +0000 (10:22 +1030)]

virtio_net: minor cleanup

goto done;
done:
return;
is ugly, it was put there to make diff review easier.
replace by open-coded return.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:29 +0000 (10:22 +1030)]

virtio-net: drop config_mutex

config_mutex served two purposes: prevent multiple concurrent config
change handlers, and synchronize access to config_enable flag.

Since commit dbf2576e37da0fcc7aacbfbb9fd5d3de7888a3c1
workqueue: make all workqueues non-reentrant
all workqueues are non-reentrant, and config_enable
is now gone.

Get rid of the unnecessary lock.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:29 +0000 (10:22 +1030)]

virtio_net: drop config_enable

Now that virtio core ensures config changes don't arrive during probing,
drop config_enable flag in virtio net.
On removal, flush is now sufficient to guarantee that no change work is
queued.

This help simplify the driver, and will allow setting DRIVER_OK earlier
without losing config change notifications.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:29 +0000 (10:22 +1030)]

virtio-blk: drop config_mutex

config_mutex served two purposes: prevent multiple concurrent config
change handlers, and synchronize access to config_enable flag.

Since commit dbf2576e37da0fcc7aacbfbb9fd5d3de7888a3c1
workqueue: make all workqueues non-reentrant
all workqueues are non-reentrant, and config_enable
is now gone.

Get rid of the unnecessary lock.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:52:26 +0000 (10:22 +1030)]

virtio_blk: drop config_enable

Now that virtio core ensures config changes don't
arrive during probing, drop config_enable flag
in virtio blk.
On removal, flush is now sufficient to guarantee that
no change work is queued.

This help simplify the driver, and will allow
setting DRIVER_OK earlier without losing config
change notifications.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 23:51:55 +0000 (10:21 +1030)]

virtio: defer config changed notifications

Defer config changed notifications that arrive during
probe/scan/freeze/restore.

This will allow drivers to set DRIVER_OK earlier, without worrying about
racing with config change interrupts.

This change will also benefit old hypervisors (before 2009)
that send interrupts without checking DRIVER_OK: previously,
the callback could race with driver-specific initialization.

This will also help simplify drivers.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]> (cosmetic changes)

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 00:10:35 +0000 (10:40 +1030)]

virtio-pci: move freeze/restore to virtio core

This is in preparation to extending config changed event handling
in core.
Wrapping these in an API also seems to make for a cleaner code.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 00:10:34 +0000 (10:40 +1030)]

virtio: unify config_changed handling

Replace duplicated code in all transports with a single wrapper in
virtio.c.

The only functional change is in virtio_mmio.c: if a buggy device sends
us an interrupt before driver is set, we previously returned IRQ_NONE,
now we return IRQ_HANDLED.

As this must not happen in practice, this does not look like a big deal.

See also commit 3fff0179e33cd7d0a688dab65700c46ad089e934
virtio-pci: do not oops on config change if driver not loaded.
for the original motivation behind the driver check.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Michael S. Tsirkin [Tue, 14 Oct 2014 00:10:29 +0000 (10:40 +1030)]

virtio_pci: fix virtio spec compliance on restore

On restore, virtio pci does the following:
+ set features
+ init vqs etc - device can be used at this point!
+ set ACKNOWLEDGE,DRIVER and DRIVER_OK status bits

This is in violation of the virtio spec, which
requires the following order:
- ACKNOWLEDGE
- DRIVER
- init vqs
- DRIVER_OK

This behaviour will break with hypervisors that assume spec compliant
behaviour. It seems like a good idea to have this patch applied to
stable branches to reduce the support butden for the hypervisors.

Cc: [email protected]
Cc: Amit Shah <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

commit | commitdiff | tree

Prarit Bhargava [Mon, 13 Oct 2014 16:21:39 +0000 (02:51 +1030)]

modules, lock around setting of MODULE_STATE_UNFORMED

A panic was seen in the following sitation.

There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
driver module.

When doing this, the following panic occurred:

------------[ cut here ]------------
kernel BUG at kernel/module.c:3739!
invalid opcode: 0000 [#1] SMP
Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
  ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
  ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
  ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
Call Trace:
  [<ffffffff810d666c>] m_show+0x19c/0x1e0
  [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
  [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
  [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
  [<ffffffff811c1a58>] SyS_read+0x58/0xb0
  [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
  RSP <ffff88080fa7fe18>

    Consider the two processes running on the system.

    CPU 0 (/proc/modules reader)
    CPU 1 (loading/unloading module)

    CPU 0 opens /proc/modules, and starts displaying data for each module by
    traversing the modules list via fs/seq_file.c:seq_open() and
    fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
    does

            op->start()  <-- this is a pointer to m_start()
            op->show()   <- this is a pointer to m_show()
            op->stop()   <-- this is a pointer to m_stop()

    The m_start(), m_show(), and m_stop() module functions are defined in
    kernel/module.c. The m_start() and m_stop() functions acquire and release
    the module_mutex respectively.

    ie) When reading /proc/modules, the module_mutex is acquired and released
    for each module.

    m_show() is called with the module_mutex held.  It accesses the module
    struct data and attempts to write out module data.  It is in this code
    path that the above BUG_ON() warning is encountered, specifically m_show()
    calls

    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;

            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
    ...

    The other thread, CPU 1, in unloading the module calls the syscall
    delete_module() defined in kernel/module.c.  The module_mutex is acquired
    for a short time, and then released.  free_module() is called without the
    module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
    also without the module_mutex.  Some additional code is called and then the
    module_mutex is reacquired to remove the module from the modules list:

        /* Now we can delete it from the lists */
        mutex_lock(&module_mutex);
        stop_machine(__unlink_module, mod, NULL);
        mutex_unlock(&module_mutex);

This is the sequence of events that leads to the panic.

CPU 1 is removing dummy_module via delete_module().  It acquires the
module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.

CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module.  The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.

Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.

CPU 0 now calls module_flags() with dummy_module and ...

static char *module_flags(struct module *mod, char *buf)
{
        int bx = 0;

        BUG_ON(mod->state == MODULE_STATE_UNFORMED);

and BOOM.

Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.

Testing: In the unpatched kernel I can panic the system within 1 minute by
doing

while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done

and

while (true) do cat /proc/modules; done

in separate terminals.

In the patched kernel I was able to run just over one hour without seeing
any issues.  I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.

        dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
        dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
        dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)

Signed-off-by: Prarit Bhargava <[email protected]>
Reviewed-by: Oleg Nesterov <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Cc: [email protected]

commit | commitdiff | tree

Eric W. Biederman [Wed, 8 Oct 2014 17:42:27 +0000 (10:42 -0700)]

mnt: Prevent pivot_root from creating a loop in the mount tree

Andy Lutomirski recently demonstrated that when chroot is used to set
the root path below the path for the new ``root'' passed to pivot_root
the pivot_root system call succeeds and leaks mounts.

In examining the code I see that starting with a new root that is
below the current root in the mount tree will result in a loop in the
mount tree after the mounts are detached and then reattached to one
another. Resulting in all kinds of ugliness including a leak of that
mounts involved in the leak of the mount loop.

Prevent this problem by ensuring that the new mount is reachable from
the current root of the mount tree.

[Added stable cc. Fixes CVE-2014-7970. --Andy]

Cc: [email protected]
Reported-by: Andy Lutomirski <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>

commit | commitdiff | tree

Eric Dumazet [Mon, 13 Oct 2014 13:27:47 +0000 (06:27 -0700)]

tcp: TCP Small Queues and strange attractors

TCP Small queues tries to keep number of packets in qdisc
as small as possible, and depends on a tasklet to feed following
packets at TX completion time.
Choice of tasklet was driven by latencies requirements.

Then, TCP stack tries to avoid reorders, by locking flows with
outstanding packets in qdisc in a given TX queue.

What can happen is that many flows get attracted by a low performing
TX queue, and cpu servicing TX completion has to feed packets for all of
them, making this cpu 100% busy in softirq mode.

This became particularly visible with latest skb->xmit_more support

Strategy adopted in this patch is to detect when tcp_wfree() is called
from ksoftirqd and let the outstanding queue for this flow being drained
before feeding additional packets, so that skb->ooo_okay can be set
to allow select_queue() to select the optimal queue :

Incoming ACKS are normally handled by different cpus, so this patch
gives more chance for these cpus to take over the burden of feeding
qdisc with future packets.

Tested:

lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &

lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:18 AM      eth1 595448.00 1190564.00  38381.09 1760253.12      0.00      0.00      1.00
06:16:19 AM      eth1 594858.00 1189686.00  38340.76 1758952.72      0.00      0.00      0.00
06:16:20 AM      eth1 597017.00 1194019.00  38480.79 1765370.29      0.00      0.00      1.00
06:16:21 AM      eth1 595450.00 1190936.00  38380.19 1760805.05      0.00      0.00      0.00
06:16:22 AM      eth1 596385.00 1193096.00  38442.56 1763976.29      0.00      0.00      1.00
06:16:23 AM      eth1 598155.00 1195978.00  38552.97 1768264.60      0.00      0.00      0.00
06:16:24 AM      eth1 594405.00 1188643.00  38312.57 1757414.89      0.00      0.00      1.00
06:16:25 AM      eth1 593366.00 1187154.00  38252.16 1755195.83      0.00      0.00      0.00
06:16:26 AM      eth1 593188.00 1186118.00  38232.88 1753682.57      0.00      0.00      1.00
06:16:27 AM      eth1 596301.00 1192241.00  38440.94 1762733.09      0.00      0.00      0.00
Average:         eth1 595457.30 1190843.50  38381.69 1760664.84      0.00      0.00      0.50
lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
backlog 7606336b 2513p requeues 167982
backlog 224072b 74p requeues 566
backlog 581376b 192p requeues 5598
backlog 181680b 60p requeues 1070
backlog 5305056b 1753p requeues 110166    // Here, this TX queue is attracting flows
backlog 157456b 52p requeues 1758
backlog 672216b 222p requeues 3025
backlog 60560b 20p requeues 24541
backlog 448144b 148p requeues 21258

lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect

Immediate jump to full bandwidth, and traffic is properly
shard on all tx queues.

lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:46 AM      eth1 1397632.00 2795397.00  90081.87 4133031.26      0.00      0.00      1.00
06:16:47 AM      eth1 1396874.00 2793614.00  90032.99 4130385.46      0.00      0.00      0.00
06:16:48 AM      eth1 1395842.00 2791600.00  89966.46 4127409.67      0.00      0.00      1.00
06:16:49 AM      eth1 1395528.00 2791017.00  89946.17 4126551.24      0.00      0.00      0.00
06:16:50 AM      eth1 1397891.00 2795716.00  90098.74 4133497.39      0.00      0.00      1.00
06:16:51 AM      eth1 1394951.00 2789984.00  89908.96 4125022.51      0.00      0.00      0.00
06:16:52 AM      eth1 1394608.00 2789190.00  89886.90 4123851.36      0.00      0.00      1.00
06:16:53 AM      eth1 1395314.00 2790653.00  89934.33 4125983.09      0.00      0.00      0.00
06:16:54 AM      eth1 1396115.00 2792276.00  89984.25 4128411.21      0.00      0.00      1.00
06:16:55 AM      eth1 1396829.00 2793523.00  90030.19 4130250.28      0.00      0.00      0.00
Average:         eth1 1396158.40 2792297.00  89987.09 4128439.35      0.00      0.00      0.50

lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
backlog 7900052b 2609p requeues 173287
backlog 878120b 290p requeues 589
backlog 1068884b 354p requeues 5621
backlog 996212b 329p requeues 1088
backlog 984100b 325p requeues 115316
backlog 956848b 316p requeues 1781
backlog 1080996b 357p requeues 3047
backlog 975016b 322p requeues 24571
backlog 990156b 327p requeues 21274

(All 8 TX queues get a fair share of the traffic)

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Roland Dreier [Tue, 14 Oct 2014 21:09:12 +0000 (14:09 -0700)]

Merge branches 'core', 'cxgb4', 'iser', 'mlx5' and 'ocrdma' into for-next

commit | commitdiff | tree

David S. Miller [Tue, 14 Oct 2014 21:05:23 +0000 (17:05 -0400)]

Merge branch 'qlcnic'

Rajesh Borundia says:

====================
qlcnic: Bug fixes

This series fixes following issues.

* We were programming maximum number of arguments supported by
adapter instead of required in a command.
* Destroy tx command requires three arguments instead of two.

Please apply these patches to net.
====================

Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Rajesh Borundia [Tue, 14 Oct 2014 11:41:46 +0000 (07:41 -0400)]

qlcnic: Fix number of arguments in destroy tx context command

o Number of arguments taken by destroy tx command is three
instead of two.

Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Rajesh Borundia [Tue, 14 Oct 2014 11:41:45 +0000 (07:41 -0400)]

qlcnic: Fix programming number of arguments in a command.

o Initially we were programming maximum number of arguments.
  Instead we should program number of arguments required in
  a command.
o Maximum number of arguments for 82xx adapter is four. Fix it
  for GET_ESWITCH_STATS command.

Signed-off-by: Rajesh Borundia <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Mark Rustad [Tue, 14 Oct 2014 13:28:38 +0000 (06:28 -0700)]

genl_magic: Resolve logical-op warnings

Resolve "logical 'and' applied to non-boolean constant" warnings"
that appear in W=2 builds by adding !! to a bit test.

Signed-off-by: Mark Rustad <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

David S. Miller [Tue, 14 Oct 2014 21:02:37 +0000 (17:02 -0400)]

net: Trap attempts to call sock_kfree_s() with a NULL pointer.

Unlike normal kfree() it is never right to call sock_kfree_s() with
a NULL pointer, because sock_kfree_s() also has the side effect of
discharging the memory from the sockets quota.

Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Cong Wang [Tue, 14 Oct 2014 19:35:08 +0000 (12:35 -0700)]

rds: avoid calling sock_kfree_s() on allocation failure

It is okay to free a NULL pointer but not okay to mischarge the socket optmem
accounting. Compile test only.

Reported-by: [email protected]
Cc: Chien Yen <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Hariprasad Shenai [Tue, 14 Oct 2014 20:24:14 +0000 (01:54 +0530)]

cxgb4: Fix FW flash logic using ethtool

Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since
t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed
enough to require a power cycle of the system.

Based on original work by Casey Leedom <[email protected]>

Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Tue, 14 Oct 2014 20:19:44 +0000 (17:19 -0300)]

perf symbols: Make sym->end be the first address after the symbol range

To follow vm_area_struct->vm_end convention.

By adhering to the convention that ->end is the first address outside
the symbol's range we can do things like:

sym->end = start + len;
len = sym->end - sym->start;

This is also now the convention used for struct map->end, fixing some
off-by-one bugs.

Cc: Adrian Hunter <[email protected]>
Cc: Chuck Ebbert <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Tue, 14 Oct 2014 19:39:27 +0000 (16:39 -0300)]

perf symbols: Fix map->end fixup

When synthesizing maps from files that have incomplete symbol
information, like kallsyms, we need to fixup the end of maps by seting
its end from the ->start of the next map, fix it to set prev_map->end to
curr_map->start, since ->end is the first byte outside prev_map address
range.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Namhyung Kim [Tue, 14 Oct 2014 19:05:38 +0000 (16:05 -0300)]

perf tools: Fixup off-by-one comparision in maps__find

map->end is the first addr _outside_ the a map, following the convention
of vm_area_struct->vm_end.

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Stephane Eranian [Mon, 6 Oct 2014 08:35:32 +0000 (10:35 +0200)]

perf tools: fix off-by-one error in maps

This patch fixes off-by-one errors in the management of maps.

A map is defined by start address and length as implemented by
map__new():

  map__init(map, type, start, start + len, pgoff, dso);

  map->start = addr;
  map->end = end;

Consequently, the actual address range is [start; end[ map->end is the
first byte outside the range.

This patch fixes two bugs where upper bound checking was off-by-one.

In V2, we fix map_groups__fixup_overlappings() some more where
map->start was off-by-one as reported by Jiri.

Signed-off-by: Stephane Eranian <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20141006083532.GA4850@quad
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Tue, 14 Oct 2014 18:07:48 +0000 (15:07 -0300)]

perf machine: Add missing dsos->root rbtree root initialization

A segfault happens on 'perf test hists_link' because we end up using a
struct machines on the stack, and then machines__init() was not
initializing the newly introduced rb_root, just the existing list_head.

When we introduced struct dsos, to group the two ways to store dsos,
i.e. the linked list and the rbtree, we didn't turned the initialization
done in:

machines__init(machines->host) ->
machine__init() ->
INIT_LIST_HEAD

into a dsos__init() to keep on initializing the list_head but _as well_
initializing the rb_root, oops.

All worked because outside perf-test we probably zalloc the whole thing
which ends up initializing it in to NULL.

So the problem looks contained to 'perf test' that uses it on stack,
etc.

Reported-by: Jiri Olsa <[email protected]>
Acked-by: Waiman Long <[email protected]>,
Cc: Adrian Hunter <[email protected]>,
Cc: Don Zickus <[email protected]>
Cc: Douglas Hatch <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Scott J Norton <[email protected]>
Cc: Waiman Long <[email protected]>,
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

David S. Miller [Tue, 14 Oct 2014 20:40:49 +0000 (16:40 -0400)]

Merge branch 'stmmac'

Giuseppe Cavallaro says:

====================
stmmac: review and fix the dwmac-sti glue-logic

This patch is to review the whole glue logic adopted on STi SoCs that
was bugged.
In the old glue-logic there was a lot of confusion when setup the
retiming especially for STiD127 where, for example, the bits 6 and 7
(in the GMAC control register) have a different meaning of what is
used for STiH4xx SoCs. So we cannot adopt the same glue for all these
SoCs.
Moreover, GiGa on STiD127 didn't work and, for all the SoCs, the RGMII
couldn't run when the speed was 10Mbps (because the clock was not properly
managed).
Note that the phy clock needs to be provided by the platform as well as
documented in the related binding file (updated as consequence).

The old code supported too many configurations never adopted and validated.
This made the code very complex to maintain and debug in case of issues.

The patch simplifies all the configurations as commented in the tables
inside the file and obviously it has been tested on all the boards
based on the SoCs mentioned.

With this patch, the dwmac-sti is also ready to support new configurations that
will be available on next SoC generations.
====================

Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:12:56 +0000 (08:12 +0200)]

stmmac: dwmac-sti: review the glue-logic for STi4xx and STiD127 SoCs

This patch is to review the whole glue logic adopted on STi SoCs that
was bugged.

In the old glue-logic there was a lot of confusion when setup the
retiming especially for STiD127 where, for example, the bits 6 and 7
(in the GMAC control register) have a different meaning of what is
used for STiH4xx SoCs. So we cannot adopt the same glue for all these
SoCs.
Moreover, GiGa on STiD127 didn't work and, for all the SoCs, the RGMII
couldn't run when the speed was 10Mbps (because the clock was not properly
managed).
Note that the phy clock needs to be provided by the platform as well as
documented in the related binding file (updated as consequence).

The old code supported too many configurations never adopted and validated.
This made the code very complex to maintain and debug in case of issues.

The patch simplifies all the configurations as commented in the tables
inside the file and obviously it has been tested on all the boards
based on the SoCs mentioned.

With this patch, the dwmac-sti is also ready to support new configurations that
will be available on next SoC generations.

Signed-off-by: Giuseppe Cavallaro <[email protected]>
Cc: Srinivas Kandagatla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:12:55 +0000 (08:12 +0200)]

stmmac: make the STi Layer compatible to STiH407

This adds the missing compatibility to the STiH407 SoC.

Signed-off-by: Giuseppe Cavallaro <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:11:54 +0000 (08:11 +0200)]

stmmac: platform: fix FIXED_PHY support.

On several STi platforms: e.g. stihxxx-b2120 an Ethernet switch is
embedded and connected to the stmmac via RGMII mode. So this is managed
by using the FIXED_PHY. In that case, the support in the platform needs
to be fixed to allow the stmmac to dialog with the switch via fixed-link
by using phy_bus_name property.

Signed-off-by: Giuseppe Cavallaro <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Mon, 13 Oct 2014 16:30:27 +0000 (13:30 -0300)]

perf evsel: Make some exit routines static

Since they are automatically called by other methods used by tools.

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Mon, 13 Oct 2014 13:29:50 +0000 (10:29 -0300)]

perf evsel: Add missing 'target' struct forward declaration

We use it in evsel.h but were getting it indirectly, fix it.

Noticed while working on having evsel.h usable by rasd.c.

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Fri, 10 Oct 2014 18:55:15 +0000 (15:55 -0300)]

perf evlist: Default to syswide target when no thread/cpu maps set

If all a tool wants is to do system wide event monitoring, there is no
more the need to setup thread_map and cpu_map objects, just call
perf_evlist__open() and it will do create one fd per CPU monitoring all
threads.

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Fri, 10 Oct 2014 17:29:49 +0000 (14:29 -0300)]

perf evlist: Check that there is a thread_map when preparing a workload

The perf_evlist__prepare_workload expects a thread map to be in place
so that it can store the pid of the workload being started, so check it
and tell the developer about it instead of segfaulting.

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Fri, 10 Oct 2014 15:03:46 +0000 (12:03 -0300)]

perf thread_map: Create dummy constructor out of open coded equivalent

Create a dummy thread_map, one that has just one entry and it is -1,
meaning 'all threads', as this ends up going down to perf_event_open().

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Thu, 9 Oct 2014 19:16:00 +0000 (16:16 -0300)]

perf tools: Remove hists from evsel

Now tools that deals want to have an hists per evsel need to call
hists__init() before creating any evsels, which can be as early as when
parsing the command line, so do it before calling parse_options().

The current tools using hists/hist_entries are report, top and annotate,
change them to request per evsel hists.

This is in preparation for making evsels usable by 3rd party tools, that
not necessarily live in perf's source code repository.

Acked-by: Borislav Petkov <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Thu, 9 Oct 2014 19:12:24 +0000 (16:12 -0300)]

perf callchain: Move the callchain_param extern to callchain.h

It was lost in hist.h, move it to where it belongs, callchain.h, as
there are places that gets hist.h by means of evsel.h, and since evsel.h
is being untangled from hist.h...

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Arnaldo Carvalho de Melo [Thu, 9 Oct 2014 18:29:51 +0000 (15:29 -0300)]

perf evsel: Subclassing

Provide a method to be called at tool start to config the perf_evsel
instance size, together with optional constructor and destructor.

This will be used so that perf_evsel doesn't always include a struct
hists, tools that works with hists/hist_entries, like report, top and
annotate, will, at start, tell the evsel class the size they need per
instance.

v2: Don't use exit as a name of a member of function parameter, as this
breaks the build on at least fedora14 and rhel6.

Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

commit | commitdiff | tree

Guenter Roeck [Tue, 14 Oct 2014 18:21:04 +0000 (11:21 -0700)]

dsa: mv88e6171: Fix tag_protocol check

tag_protocol is now an enum, so drivers have to check against it.

Cc: Andrew Lunn <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Acked-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Neale Ferguson [Tue, 14 Oct 2014 20:10:48 +0000 (15:10 -0500)]

dlm: fix missing endian conversion of rcom_status flags

The flags are already converted to le when being sent,
but are not being converted back to cpu when received.

Signed-off-by: Neale Ferguson <[email protected]>
Signed-off-by: David Teigland <[email protected]>

commit | commitdiff | tree

David S. Miller [Tue, 14 Oct 2014 20:09:38 +0000 (16:09 -0400)]

Merge branch 'xgene'

Iyappan Subramanian says:

====================
Adding SGMII based 1GbE basic support to APM X-Gene SoC ethernet driver.

v2: Address comments from v1
* Split the patchset into two, the first one being preparatory patch
* Added link_state function pointer to the xgene_mac_ops structure
* Added xgene_indirect_ctl structure for indirect read/write arguments

v1:
* Initial version
====================

Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Iyappan Subramanian [Tue, 14 Oct 2014 00:05:35 +0000 (17:05 -0700)]

drivers: net: xgene: Add SGMII based 1GbE ethtool support

Signed-off-by: Iyappan Subramanian <[email protected]>
Signed-off-by: Keyur Chudgar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Iyappan Subramanian [Tue, 14 Oct 2014 00:05:34 +0000 (17:05 -0700)]

drivers: net: xgene: Add SGMII based 1GbE support

Signed-off-by: Iyappan Subramanian <[email protected]>
Signed-off-by: Keyur Chudgar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Iyappan Subramanian [Tue, 14 Oct 2014 00:05:33 +0000 (17:05 -0700)]

drivers: net: xgene: Preparing for adding SGMII based 1GbE

- Added link_state function pointer to the xgene__mac_ops structure
- Moved ring manager (pdata->rm) assignment to xgene_enet_setup_ops
- Removed unused variable (pdata->phy_addr) and macro (FULL_DUPLEX)

Signed-off-by: Iyappan Subramanian <[email protected]>
Signed-off-by: Keyur Chudgar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Iyappan Subramanian [Tue, 14 Oct 2014 00:05:32 +0000 (17:05 -0700)]

dtb: Add SGMII based 1GbE node to APM X-Gene SoC device tree

Signed-off-by: Iyappan Subramanian <[email protected]>
Signed-off-by: Keyur Chudgar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Alexei Starovoitov [Tue, 14 Oct 2014 09:08:54 +0000 (02:08 -0700)]

net: filter: move common defines into bpf_common.h

userspace programs that use eBPF instruction macros need to include two files:
uapi/linux/filter.h and uapi/linux/bpf.h
Move common macro definitions that are shared between classic BPF and eBPF
into uapi/linux/bpf_common.h, so that user app can include only one bpf.h file

Cc: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Fabian Frederick [Tue, 14 Oct 2014 17:01:14 +0000 (19:01 +0200)]

caif_usb: use target structure member in memset

parent cfusbl was used instead of first structure member 'layer'

Suggested-by: Joe Perches <[email protected]>
Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Fabian Frederick [Tue, 14 Oct 2014 17:00:55 +0000 (19:00 +0200)]

caif_usb: remove redundant memory message

Let MM subsystem display out of memory messages.

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Fabian Frederick [Mon, 13 Oct 2014 20:21:46 +0000 (22:21 +0200)]

caif: replace kmalloc/memset 0 by kzalloc

Also add blank line after declaration

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Mugunthan V N [Mon, 13 Oct 2014 16:51:07 +0000 (22:21 +0530)]

drivers: net: cpsw: remove child devices while driver detach

remove all the child devices from the system to make sure that re-insert of
cpsw module doesn't fail on child device populated by of_platform_populate().

Signed-off-by: Mugunthan V N <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Mugunthan V N [Mon, 13 Oct 2014 16:51:06 +0000 (22:21 +0530)]

drivers: net: davinci_cpdma: remove spinlock as SOFTIRQ-unsafe lock order detected

remove spinlock in cpdma_desc_pool_destroy() as there is no active cpdma
channel and iounmap should be called without auquiring lock.

root@dra7xx-evm:~# modprobe -r ti_cpsw
[   50.539743]
[   50.541312] ======================================================
[   50.547796] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   50.554826] 3.14.19-02124-g95c5b7b #308 Not tainted
[   50.559939] ------------------------------------------------------
[   50.566416] modprobe/1921 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[   50.573347]  (vmap_area_lock){+.+...}, at: [<c01127fc>] find_vmap_area+0x10/0x6c
[   50.581132]
[   50.581132] and this task is already holding:
[   50.587249]  (&(&pool->lock)->rlock#2){..-...}, at: [<bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma]
[   50.597766] which would create a new lock dependency:
[   50.603048]  (&(&pool->lock)->rlock#2){..-...} -> (vmap_area_lock){+.+...}
[   50.610296]
[   50.610296] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   50.618601]  (&(&pool->lock)->rlock#2){..-...}
... which became SOFTIRQ-irq-safe at:
[   50.626829]   [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[   50.632677]   [<bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma]
[   50.640437]   [<bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma]
[   50.647289]   [<bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma]
[   50.654512]   [<bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma]
[   50.661455]   [<bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw]
[   50.667038]   [<c05844f4>] net_rx_action+0xc0/0x1e8
[   50.672150]   [<c0048234>] __do_softirq+0xcc/0x304
[   50.677183]   [<c004873c>] irq_exit+0xa8/0xfc
[   50.681751]   [<c000eeac>] handle_IRQ+0x50/0xb0
[   50.686513]   [<c0008638>] gic_handle_irq+0x28/0x5c
[   50.691628]   [<c06590a4>] __irq_svc+0x44/0x5c
[   50.696289]   [<c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44
[   50.702591]   [<c065a9c4>] do_page_fault.part.9+0x144/0x3c4
[   50.708433]   [<c065acb8>] do_page_fault+0x74/0x84
[   50.713453]   [<c00083dc>] do_DataAbort+0x34/0x98
[   50.718391]   [<c065923c>] __dabt_usr+0x3c/0x40
[   50.723148]
[   50.723148] to a SOFTIRQ-irq-unsafe lock:
[   50.728893]  (vmap_area_lock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
[   50.736476] ...  [<c06584e8>] _raw_spin_lock+0x28/0x38
[   50.741876]   [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[   50.747908]   [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[   50.754210]   [<c011486c>] get_vm_area_caller+0x3c/0x48
[   50.759692]   [<c0114be0>] vmap+0x40/0x78
[   50.763900]   [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[   50.769835]   [<c093eac0>] start_kernel+0x320/0x388
[   50.774952]   [<80008074>] 0x80008074
[   50.778793]
[   50.778793] other info that might help us debug this:
[   50.778793]
[   50.787181]  Possible interrupt unsafe locking scenario:
[   50.787181]
[   50.794295]        CPU0                    CPU1
[   50.799042]        ----                    ----
[   50.803785]   lock(vmap_area_lock);
[   50.807446]                                local_irq_disable();
[   50.813652]                                lock(&(&pool->lock)->rlock#2);
[   50.820782]                                lock(vmap_area_lock);
[   50.827086]   <Interrupt>
[   50.829823]     lock(&(&pool->lock)->rlock#2);
[   50.834490]
[   50.834490]  *** DEADLOCK ***
[   50.834490]
[   50.840695] 4 locks held by modprobe/1921:
[   50.844981]  #0:  (&__lockdep_no_validate__){......}, at: [<c03e53e8>] driver_detach+0x44/0xb8
[   50.854038]  #1:  (&__lockdep_no_validate__){......}, at: [<c03e53f4>] driver_detach+0x50/0xb8
[   50.863102]  #2:  (&(&ctlr->lock)->rlock){......}, at: [<bf017c34>] cpdma_ctlr_destroy+0x1c/0x114 [davinci_cpdma]
[   50.873890]  #3:  (&(&pool->lock)->rlock#2){..-...}, at: [<bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma]
[   50.884871]
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[   50.892827] -> (&(&pool->lock)->rlock#2){..-...} ops: 167 {
[   50.898703]    IN-SOFTIRQ-W at:
[   50.901995]                     [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[   50.909476]                     [<bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma]
[   50.918878]                     [<bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma]
[   50.927366]                     [<bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma]
[   50.936218]                     [<bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma]
[   50.944794]                     [<bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw]
[   50.952009]                     [<c05844f4>] net_rx_action+0xc0/0x1e8
[   50.958765]                     [<c0048234>] __do_softirq+0xcc/0x304
[   50.965432]                     [<c004873c>] irq_exit+0xa8/0xfc
[   50.971635]                     [<c000eeac>] handle_IRQ+0x50/0xb0
[   50.978035]                     [<c0008638>] gic_handle_irq+0x28/0x5c
[   50.984788]                     [<c06590a4>] __irq_svc+0x44/0x5c
[   50.991085]                     [<c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44
[   50.999023]                     [<c065a9c4>] do_page_fault.part.9+0x144/0x3c4
[   51.006510]                     [<c065acb8>] do_page_fault+0x74/0x84
[   51.013171]                     [<c00083dc>] do_DataAbort+0x34/0x98
[   51.019738]                     [<c065923c>] __dabt_usr+0x3c/0x40
[   51.026129]    INITIAL USE at:
[   51.029335]                    [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[   51.036729]                    [<bf017d78>] cpdma_chan_submit+0x4c/0x2f0 [davinci_cpdma]
[   51.045225]                    [<bf02863c>] cpsw_ndo_open+0x378/0x6bc [ti_cpsw]
[   51.052897]                    [<c058747c>] __dev_open+0x9c/0x104
[   51.059287]                    [<c05876ec>] __dev_change_flags+0x88/0x160
[   51.066420]                    [<c05877e4>] dev_change_flags+0x18/0x48
[   51.073270]                    [<c05ed51c>] devinet_ioctl+0x61c/0x6e0
[   51.080029]                    [<c056ee54>] sock_ioctl+0x5c/0x298
[   51.086418]                    [<c01350a4>] do_vfs_ioctl+0x78/0x61c
[   51.092993]                    [<c01356ac>] SyS_ioctl+0x64/0x74
[   51.099200]                    [<c000e580>] ret_fast_syscall+0x0/0x48
[   51.105956]  }
[   51.107696]  ... key      at: [<bf019000>] __key.21312+0x0/0xfffff650 [davinci_cpdma]
[   51.115912]  ... acquired at:
[   51.119019]    [<c00899ac>] lock_acquire+0x9c/0x104
[   51.124138]    [<c06584e8>] _raw_spin_lock+0x28/0x38
[   51.129341]    [<c01127fc>] find_vmap_area+0x10/0x6c
[   51.134547]    [<c0114960>] remove_vm_area+0x8/0x6c
[   51.139659]    [<c0114a7c>] __vunmap+0x20/0xf8
[   51.144318]    [<c001c350>] __arm_iounmap+0x10/0x18
[   51.149440]    [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]
[   51.156560]    [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw]
[   51.162407]    [<c03e62c8>] platform_drv_remove+0x18/0x1c
[   51.168063]    [<c03e4c44>] __device_release_driver+0x70/0xc8
[   51.174094]    [<c03e5458>] driver_detach+0xb4/0xb8
[   51.179212]    [<c03e4a6c>] bus_remove_driver+0x4c/0x90
[   51.184693]    [<c00b024c>] SyS_delete_module+0x10c/0x198
[   51.190355]    [<c000e580>] ret_fast_syscall+0x0/0x48
[   51.195661]
[   51.197217]
the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[   51.205986] -> (vmap_area_lock){+.+...} ops: 520 {
[   51.211032]    HARDIRQ-ON-W at:
[   51.214321]                     [<c06584e8>] _raw_spin_lock+0x28/0x38
[   51.221090]                     [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[   51.228750]                     [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[   51.236690]                     [<c011486c>] get_vm_area_caller+0x3c/0x48
[   51.243811]                     [<c0114be0>] vmap+0x40/0x78
[   51.249654]                     [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[   51.257239]                     [<c093eac0>] start_kernel+0x320/0x388
[   51.263994]                     [<80008074>] 0x80008074
[   51.269474]    SOFTIRQ-ON-W at:
[   51.272769]                     [<c06584e8>] _raw_spin_lock+0x28/0x38
[   51.279525]                     [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[   51.287190]                     [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[   51.295126]                     [<c011486c>] get_vm_area_caller+0x3c/0x48
[   51.302245]                     [<c0114be0>] vmap+0x40/0x78
[   51.308094]                     [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[   51.315669]                     [<c093eac0>] start_kernel+0x320/0x388
[   51.322423]                     [<80008074>] 0x80008074
[   51.327906]    INITIAL USE at:
[   51.331112]                    [<c06584e8>] _raw_spin_lock+0x28/0x38
[   51.337775]                    [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[   51.345352]                    [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[   51.353197]                    [<c011486c>] get_vm_area_caller+0x3c/0x48
[   51.360224]                    [<c0114be0>] vmap+0x40/0x78
[   51.365977]                    [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[   51.373464]                    [<c093eac0>] start_kernel+0x320/0x388
[   51.380131]                    [<80008074>] 0x80008074
[   51.385517]  }
[   51.387260]  ... key      at: [<c0a66948>] vmap_area_lock+0x10/0x20
[   51.393841]  ... acquired at:
[   51.396945]    [<c00899ac>] lock_acquire+0x9c/0x104
[   51.402060]    [<c06584e8>] _raw_spin_lock+0x28/0x38
[   51.407266]    [<c01127fc>] find_vmap_area+0x10/0x6c
[   51.412478]    [<c0114960>] remove_vm_area+0x8/0x6c
[   51.417592]    [<c0114a7c>] __vunmap+0x20/0xf8
[   51.422252]    [<c001c350>] __arm_iounmap+0x10/0x18
[   51.427369]    [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]
[   51.434487]    [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw]
[   51.440336]    [<c03e62c8>] platform_drv_remove+0x18/0x1c
[   51.446000]    [<c03e4c44>] __device_release_driver+0x70/0xc8
[   51.452031]    [<c03e5458>] driver_detach+0xb4/0xb8
[   51.457147]    [<c03e4a6c>] bus_remove_driver+0x4c/0x90
[   51.462628]    [<c00b024c>] SyS_delete_module+0x10c/0x198
[   51.468289]    [<c000e580>] ret_fast_syscall+0x0/0x48
[   51.473584]
[   51.475140]
[   51.475140] stack backtrace:
[   51.479703] CPU: 0 PID: 1921 Comm: modprobe Not tainted 3.14.19-02124-g95c5b7b #308
[   51.487744] [<c0016090>] (unwind_backtrace) from [<c0012060>] (show_stack+0x10/0x14)
[   51.495865] [<c0012060>] (show_stack) from [<c0652a20>] (dump_stack+0x78/0x94)
[   51.503444] [<c0652a20>] (dump_stack) from [<c0086f18>] (check_usage+0x408/0x594)
[   51.511293] [<c0086f18>] (check_usage) from [<c00870f8>] (check_irq_usage+0x54/0xb0)
[   51.519416] [<c00870f8>] (check_irq_usage) from [<c0088724>] (__lock_acquire+0xe54/0x1b90)
[   51.528077] [<c0088724>] (__lock_acquire) from [<c00899ac>] (lock_acquire+0x9c/0x104)
[   51.536291] [<c00899ac>] (lock_acquire) from [<c06584e8>] (_raw_spin_lock+0x28/0x38)
[   51.544417] [<c06584e8>] (_raw_spin_lock) from [<c01127fc>] (find_vmap_area+0x10/0x6c)
[   51.552726] [<c01127fc>] (find_vmap_area) from [<c0114960>] (remove_vm_area+0x8/0x6c)
[   51.560935] [<c0114960>] (remove_vm_area) from [<c0114a7c>] (__vunmap+0x20/0xf8)
[   51.568693] [<c0114a7c>] (__vunmap) from [<c001c350>] (__arm_iounmap+0x10/0x18)
[   51.576362] [<c001c350>] (__arm_iounmap) from [<bf017d08>] (cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma])
[   51.586494] [<bf017d08>] (cpdma_ctlr_destroy [davinci_cpdma]) from [<bf026294>] (cpsw_remove+0x48/0x8c [ti_cpsw])
[   51.597261] [<bf026294>] (cpsw_remove [ti_cpsw]) from [<c03e62c8>] (platform_drv_remove+0x18/0x1c)
[   51.606659] [<c03e62c8>] (platform_drv_remove) from [<c03e4c44>] (__device_release_driver+0x70/0xc8)
[   51.616237] [<c03e4c44>] (__device_release_driver) from [<c03e5458>] (driver_detach+0xb4/0xb8)
[   51.625264] [<c03e5458>] (driver_detach) from [<c03e4a6c>] (bus_remove_driver+0x4c/0x90)
[   51.633749] [<c03e4a6c>] (bus_remove_driver) from [<c00b024c>] (SyS_delete_module+0x10c/0x198)
[   51.642781] [<c00b024c>] (SyS_delete_module) from [<c000e580>] (ret_fast_syscall+0x0/0x48)

Signed-off-by: Mugunthan V N <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Mugunthan V N [Mon, 13 Oct 2014 16:51:05 +0000 (22:21 +0530)]

drivers: net: davinci_cpdma: remove kfree on objects allocated with devm_* apis

memories allocated with devm_* apis must not be freed with kfree apis,
so removing the kfree calls

Fixes: e194312854ed ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().')
Signed-off-by: Mugunthan V N <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Prashant Sreedharan [Mon, 13 Oct 2014 16:21:42 +0000 (09:21 -0700)]

tg3: Add skb->xmit_more support

Ring TX doorbell only if xmit_more is not set or the queue is stopped.

Suggested-by: Daniel Borkmann <[email protected]>
Signed-off-by: Prashant Sreedharan <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Jiri Pirko [Mon, 13 Oct 2014 14:34:10 +0000 (16:34 +0200)]

ipv4: fix nexthop attlen check in fib_nh_match

fib_nh_match does not match nexthops correctly. Example:

ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
nexthop via 192.168.122.13 dev eth0
ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
nexthop via 192.168.122.15 dev eth0

Del command is successful and route is removed. After this patch
applied, the route is correctly matched and result is:
RTNETLINK answers: No such process

Please consider this for stable trees as well.

Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config")
Signed-off-by: Jiri Pirko <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Eric Dumazet [Sat, 11 Oct 2014 22:17:29 +0000 (15:17 -0700)]

tcp: fix tcp_ack() performance problem

We worked hard to improve tcp_ack() performance, by not accessing
skb_shinfo() in fast path (cd7d8498c9a5 tcp: change tcp_skb_pcount()
location)

We still have one spurious access because of ACK timestamping,
added in commit e1c8a607b281 ("net-timestamp: ACK timestamp for
bytestreams")

By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set,
we can avoid two cache line misses for the common case.

While we are at it, add two prefetchw() :

One in tcp_ack() to bring skb at the head of write queue.

One in tcp_clean_rtx_queue() loop to bring following skb,
as we will delete skb from the write queue and dirty skb->next->prev.

Add a couple of [un]likely() clauses.

After this patch, tcp_ack() is no longer the most consuming
function in tcp stack.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Neal Cardwell <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Cc: Van Jacobson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree

Yan, Zheng [Tue, 14 Oct 2014 07:38:01 +0000 (15:38 +0800)]

ceph: fix divide-by-zero in __validate_layout()

The 'stripe_unit' field is 64 bits, casting it to 32 bits can result zero.

Signed-off-by: Yan, Zheng <[email protected]>

commit | commitdiff | tree

Ilya Dryomov [Fri, 10 Oct 2014 14:36:07 +0000 (18:36 +0400)]

rbd: rbd workqueues need a resque worker

Need to use WQ_MEM_RECLAIM for our workqueues to prevent I/O lockups
under memory pressure - we sit on the memory reclaim path.

Cc: [email protected] # 3.17, needs backporting for 3.16
Signed-off-by: Ilya Dryomov <[email protected]>
Tested-by: Micha Krause <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Ilya Dryomov [Fri, 10 Oct 2014 12:39:05 +0000 (16:39 +0400)]

libceph: ceph-msgr workqueue needs a resque worker

Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant")
effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq. This is
wrong - libceph is very much a memory reclaim path, so restore it.

Cc: [email protected] # needs backporting for < 3.12
Signed-off-by: Ilya Dryomov <[email protected]>
Tested-by: Micha Krause <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Fabian Frederick [Thu, 9 Oct 2014 21:16:35 +0000 (23:16 +0200)]

ceph: fix bool assignments

Fix some coccinelle warnings:
fs/ceph/caps.c:2400:6-10: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2401:6-15: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2402:6-17: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2403:6-22: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2404:6-22: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2405:6-19: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2440:4-20: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2469:3-16: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2490:2-18: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2519:3-7: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2549:3-12: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2575:2-6: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2589:3-7: WARNING: Assignment of bool to 0/1

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

commit | commitdiff | tree

Ilya Dryomov [Mon, 6 Oct 2014 14:40:27 +0000 (18:40 +0400)]

libceph: separate multiple ops with commas in debugfs output

For requests with multiple ops, separate ops with commas instead of \t,
which is a field separator here.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Ilya Dryomov [Thu, 2 Oct 2014 13:22:29 +0000 (17:22 +0400)]

libceph: sync osd op definitions in rados.h

Bring in missing osd ops and strings, use macros to eliminate multiple
points of maintenance.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Fabian Frederick [Tue, 30 Sep 2014 20:07:50 +0000 (22:07 +0200)]

libceph: remove redundant declaration

ceph_release_page_vector was defined twice in libceph.h

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

commit | commitdiff | tree

John Spray [Fri, 12 Sep 2014 15:58:49 +0000 (16:58 +0100)]

ceph: additional debugfs output

MDS session state and client global ID is
useful instrumentation when testing.

Signed-off-by: John Spray <[email protected]>

commit | commitdiff | tree

John Spray [Fri, 19 Sep 2014 12:51:08 +0000 (13:51 +0100)]

ceph: export ceph_session_state_name function

...so that it can be used from the ceph debugfs
code when dumping session info.

Signed-off-by: John Spray <[email protected]>

commit | commitdiff | tree

Yan, Zheng [Tue, 16 Sep 2014 12:35:17 +0000 (20:35 +0800)]

ceph: include the initial ACL in create/mkdir/mknod MDS requests

Current code set new file/directory's initial ACL in a non-atomic
manner.
Client first sends request to MDS to create new file/directory, then set
the initial ACL after the new file/directory is successfully created.

The fix is include the initial ACL in create/mkdir/mknod MDS requests.
So MDS can handle creating file/directory and setting the initial ACL in
one request.

Signed-off-by: Yan, Zheng <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Yan, Zheng [Tue, 16 Sep 2014 11:15:28 +0000 (19:15 +0800)]

ceph: use pagelist to present MDS request data

Current code uses page array to present MDS request data. Pages in the
array are allocated/freed by caller of ceph_mdsc_do_request(). If request
is interrupted, the pages can be freed while they are still being used by
the request message.

The fix is use pagelist to present MDS request data. Pagelist is
reference counted.

Signed-off-by: Yan, Zheng <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Yan, Zheng [Tue, 16 Sep 2014 09:50:45 +0000 (17:50 +0800)]

libceph: reference counting pagelist

this allow pagelist to present data that may be sent multiple times.

Signed-off-by: Yan, Zheng <[email protected]>
Reviewed-by: Sage Weil <[email protected]>

commit | commitdiff | tree

Yan, Zheng [Thu, 18 Sep 2014 08:11:12 +0000 (16:11 +0800)]

ceph: fix llistxattr on symlink

only regular file and directory have vxattrs.

Signed-off-by: Yan, Zheng <[email protected]>

commit | commitdiff | tree

John Spray [Tue, 9 Sep 2014 18:26:01 +0000 (19:26 +0100)]

ceph: send client metadata to MDS

Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax,
which includes additional client metadata to allow
the MDS to report on clients by user-sensible names
like hostname.

Signed-off-by: John Spray <[email protected]>
Reviewed-by: Yan, Zheng <[email protected]>

commit | commitdiff | tree

David S. Miller [Tue, 14 Oct 2014 19:05:39 +0000 (15:05 -0400)]

Merge branch 'isdn'

Tilman Schmidt says:

====================
Coverity patches for drivers/isdn

Here's a series of patches for the ISDN CAPI subsystem and the
Gigaset ISDN driver.
Patches 1 to 7 are specific fixes for Coverity warnings.
Patches 8 to 11 fix related problems with the handling of invalid
CAPI command codes I noticed while working on this.
Patch 12 fixes an unrelated problem I noticed during the subsequent
regression tests.
It would be great if these could still be merged.
====================

Signed-off-by: David S. Miller <[email protected]>

commit | commitdiff | tree