]> Git Repo - linux.git/log
linux.git
7 years agoarcnet: com20020: remove needless base_addr assignment
Michael Grzeschik [Wed, 28 Jun 2017 16:28:35 +0000 (18:28 +0200)]
arcnet: com20020: remove needless base_addr assignment

The assignment is superfluous.

Signed-off-by: Michael Grzeschik <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agoTrivial fix to spelling mistake in arc_printk message
Colin Ian King [Wed, 28 Jun 2017 16:28:34 +0000 (18:28 +0200)]
Trivial fix to spelling mistake in arc_printk message

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Michael Grzeschik <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agoarcnet: change irq handler to lock irqsave
Michael Grzeschik [Wed, 28 Jun 2017 16:28:33 +0000 (18:28 +0200)]
arcnet: change irq handler to lock irqsave

This patch prevents the arcnet driver from the following deadlock.

[   41.273910] ======================================================
[   41.280397] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   41.287433] 4.4.0-00034-gc0ae784 #536 Not tainted
[   41.292366] ------------------------------------------------------
[   41.298863] arcecho/233 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
[   41.305628]  (&(&lp->lock)->rlock){+.+...}, at: [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   41.315199]
[   41.315199] and this task is already holding:
[   41.321324]  (_xmit_ARCNET#2){+.-...}, at: [<c06b934c>] packet_direct_xmit+0xfc/0x1c8
[   41.329593] which would create a new lock dependency:
[   41.334893]  (_xmit_ARCNET#2){+.-...} -> (&(&lp->lock)->rlock){+.+...}
[   41.341801]
[   41.341801] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   41.350108]  (_xmit_ARCNET#2){+.-...}
... which became SOFTIRQ-irq-safe at:
[   41.357539]   [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.362677]   [<c063ab8c>] dev_watchdog+0x5c/0x264
[   41.367723]   [<c0094edc>] call_timer_fn+0x6c/0xf4
[   41.372759]   [<c00950b8>] run_timer_softirq+0x154/0x210
[   41.378340]   [<c0036b30>] __do_softirq+0x144/0x298
[   41.383469]   [<c0036fb4>] irq_exit+0xcc/0x130
[   41.388138]   [<c0085c50>] __handle_domain_irq+0x60/0xb4
[   41.393728]   [<c0014578>] __irq_svc+0x58/0x78
[   41.398402]   [<c0010274>] arch_cpu_idle+0x24/0x3c
[   41.403443]   [<c007127c>] cpu_startup_entry+0x1f8/0x25c
[   41.409029]   [<c09adc90>] start_kernel+0x3c0/0x3cc
[   41.414170]
[   41.414170] to a SOFTIRQ-irq-unsafe lock:
[   41.419931]  (&(&lp->lock)->rlock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
[   41.427996] ...  [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.433409]   [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.439646]   [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.445063]   [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.450661]   [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.455700]   [<c0087548>] irq_thread+0x13c/0x1dc
[   41.460649]   [<c0050f10>] kthread+0xe4/0xf8
[   41.465158]   [<c000f810>] ret_from_fork+0x14/0x24
[   41.470207]
[   41.470207] other info that might help us debug this:
[   41.470207]
[   41.478627]  Possible interrupt unsafe locking scenario:
[   41.478627]
[   41.485763]        CPU0                    CPU1
[   41.490521]        ----                    ----
[   41.495279]   lock(&(&lp->lock)->rlock);
[   41.499414]                                local_irq_disable();
[   41.505636]                                lock(_xmit_ARCNET#2);
[   41.511967]                                lock(&(&lp->lock)->rlock);
[   41.518741]   <Interrupt>
[   41.521490]     lock(_xmit_ARCNET#2);
[   41.525356]
[   41.525356]  *** DEADLOCK ***
[   41.525356]
[   41.531587] 1 lock held by arcecho/233:
[   41.535617]  #0:  (_xmit_ARCNET#2){+.-...}, at: [<c06b934c>] packet_direct_xmit+0xfc/0x1c8
[   41.544355]
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[   41.552362] -> (_xmit_ARCNET#2){+.-...} ops: 27 {
[   41.557357]    HARDIRQ-ON-W at:
[   41.560664]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.567445]                     [<c063ba28>] dev_deactivate_many+0x114/0x304
[   41.574866]                     [<c063bc3c>] dev_deactivate+0x24/0x38
[   41.581646]                     [<c0630374>] linkwatch_do_dev+0x40/0x74
[   41.588613]                     [<c06305d8>] __linkwatch_run_queue+0xec/0x140
[   41.596120]                     [<c0630658>] linkwatch_event+0x2c/0x34
[   41.602991]                     [<c004af30>] process_one_work+0x188/0x40c
[   41.610131]                     [<c004b200>] worker_thread+0x4c/0x480
[   41.616912]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.623048]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.629735]    IN-SOFTIRQ-W at:
[   41.633039]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.639820]                     [<c063ab8c>] dev_watchdog+0x5c/0x264
[   41.646508]                     [<c0094edc>] call_timer_fn+0x6c/0xf4
[   41.653190]                     [<c00950b8>] run_timer_softirq+0x154/0x210
[   41.660425]                     [<c0036b30>] __do_softirq+0x144/0x298
[   41.667201]                     [<c0036fb4>] irq_exit+0xcc/0x130
[   41.673518]                     [<c0085c50>] __handle_domain_irq+0x60/0xb4
[   41.680754]                     [<c0014578>] __irq_svc+0x58/0x78
[   41.687077]                     [<c0010274>] arch_cpu_idle+0x24/0x3c
[   41.693769]                     [<c007127c>] cpu_startup_entry+0x1f8/0x25c
[   41.701006]                     [<c09adc90>] start_kernel+0x3c0/0x3cc
[   41.707791]    INITIAL USE at:
[   41.711003]                    [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.717696]                    [<c063ba28>] dev_deactivate_many+0x114/0x304
[   41.725026]                    [<c063bc3c>] dev_deactivate+0x24/0x38
[   41.731718]                    [<c0630374>] linkwatch_do_dev+0x40/0x74
[   41.738593]                    [<c06305d8>] __linkwatch_run_queue+0xec/0x140
[   41.746011]                    [<c0630658>] linkwatch_event+0x2c/0x34
[   41.752789]                    [<c004af30>] process_one_work+0x188/0x40c
[   41.759847]                    [<c004b200>] worker_thread+0x4c/0x480
[   41.766541]                    [<c0050f10>] kthread+0xe4/0xf8
[   41.772596]                    [<c000f810>] ret_from_fork+0x14/0x24
[   41.779198]  }
[   41.780945]  ... key      at: [<c124d620>] netdev_xmit_lock_key+0x38/0x1c8
[   41.788192]  ... acquired at:
[   41.791309]    [<c007bed8>] lock_acquire+0x70/0x90
[   41.796361]    [<c06f9140>] _raw_spin_lock_irqsave+0x40/0x54
[   41.802324]    [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   41.808844]    [<c06b9380>] packet_direct_xmit+0x130/0x1c8
[   41.814622]    [<c06bc7e4>] packet_sendmsg+0x3b8/0x680
[   41.820034]    [<c05fe8b0>] sock_sendmsg+0x14/0x24
[   41.825091]    [<c05ffd68>] SyS_sendto+0xb8/0xe0
[   41.829956]    [<c05ffda8>] SyS_send+0x18/0x20
[   41.834638]    [<c000f780>] ret_fast_syscall+0x0/0x1c
[   41.839954]
[   41.841514]
the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[   41.850302] -> (&(&lp->lock)->rlock){+.+...} ops: 5 {
[   41.855644]    HARDIRQ-ON-W at:
[   41.858945]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.865726]                     [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.873607]                     [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.880666]                     [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.887901]                     [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.894593]                     [<c0087548>] irq_thread+0x13c/0x1dc
[   41.901195]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.907338]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.914025]    SOFTIRQ-ON-W at:
[   41.917328]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.924106]                     [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.931981]                     [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.939028]                     [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.946264]                     [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.952954]                     [<c0087548>] irq_thread+0x13c/0x1dc
[   41.959548]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.965689]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.972379]    INITIAL USE at:
[   41.975595]                    [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.982283]                    [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.990063]                    [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.997027]                    [<c03c1170>] regmap_irq_thread+0x190/0x314
[   42.004172]                    [<c0087244>] irq_thread_fn+0x1c/0x34
[   42.010766]                    [<c0087548>] irq_thread+0x13c/0x1dc
[   42.017267]                    [<c0050f10>] kthread+0xe4/0xf8
[   42.023314]                    [<c000f810>] ret_from_fork+0x14/0x24
[   42.029903]  }
[   42.031648]  ... key      at: [<bf0854cc>] __key.42091+0x0/0xfffff0f8 [arcnet]
[   42.039255]  ... acquired at:
[   42.042372]    [<c007bed8>] lock_acquire+0x70/0x90
[   42.047413]    [<c06f9140>] _raw_spin_lock_irqsave+0x40/0x54
[   42.053364]    [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   42.059872]    [<c06b9380>] packet_direct_xmit+0x130/0x1c8
[   42.065634]    [<c06bc7e4>] packet_sendmsg+0x3b8/0x680
[   42.071030]    [<c05fe8b0>] sock_sendmsg+0x14/0x24
[   42.076069]    [<c05ffd68>] SyS_sendto+0xb8/0xe0
[   42.080926]    [<c05ffda8>] SyS_send+0x18/0x20
[   42.085601]    [<c000f780>] ret_fast_syscall+0x0/0x1c
[   42.090918]
[   42.092481]
[   42.092481] stack backtrace:
[   42.097065] CPU: 0 PID: 233 Comm: arcecho Not tainted 4.4.0-00034-gc0ae784 #536
[   42.104751] Hardware name: Generic AM33XX (Flattened Device Tree)
[   42.111183] [<c0017ec8>] (unwind_backtrace) from [<c00139d0>] (show_stack+0x10/0x14)
[   42.119337] [<c00139d0>] (show_stack) from [<c02a82c4>] (dump_stack+0x8c/0x9c)
[   42.126937] [<c02a82c4>] (dump_stack) from [<c0078260>] (check_usage+0x4bc/0x63c)
[   42.134815] [<c0078260>] (check_usage) from [<c0078438>] (check_irq_usage+0x58/0xb0)
[   42.142964] [<c0078438>] (check_irq_usage) from [<c007aaa0>] (__lock_acquire+0x1524/0x20b0)
[   42.151740] [<c007aaa0>] (__lock_acquire) from [<c007bed8>] (lock_acquire+0x70/0x90)
[   42.159886] [<c007bed8>] (lock_acquire) from [<c06f9140>] (_raw_spin_lock_irqsave+0x40/0x54)
[   42.168768] [<c06f9140>] (_raw_spin_lock_irqsave) from [<bf083bc8>] (arcnet_send_packet+0x60/0x1c0 [arcnet])
[   42.179115] [<bf083bc8>] (arcnet_send_packet [arcnet]) from [<c06b9380>] (packet_direct_xmit+0x130/0x1c8)
[   42.189182] [<c06b9380>] (packet_direct_xmit) from [<c06bc7e4>] (packet_sendmsg+0x3b8/0x680)
[   42.198059] [<c06bc7e4>] (packet_sendmsg) from [<c05fe8b0>] (sock_sendmsg+0x14/0x24)
[   42.206199] [<c05fe8b0>] (sock_sendmsg) from [<c05ffd68>] (SyS_sendto+0xb8/0xe0)
[   42.213978] [<c05ffd68>] (SyS_sendto) from [<c05ffda8>] (SyS_send+0x18/0x20)
[   42.221388] [<c05ffda8>] (SyS_send) from [<c000f780>] (ret_fast_syscall+0x0/0x1c)

Signed-off-by: Michael Grzeschik <[email protected]>
   ---
   v1 -> v2: removed unneeded zero assignment of flags
Signed-off-by: David S. Miller <[email protected]>
7 years agorocker: move dereference before free
Dan Carpenter [Wed, 28 Jun 2017 11:44:21 +0000 (14:44 +0300)]
rocker: move dereference before free

My static checker complains that ofdpa_neigh_del() can sometimes free
"found".   It just makes sense to use it first before deleting it.

Fixes: ecf244f753e0 ("rocker: fix maybe-uninitialized warning")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agomlxsw: spectrum_router: Fix NULL pointer dereference
Ido Schimmel [Wed, 28 Jun 2017 06:03:12 +0000 (09:03 +0300)]
mlxsw: spectrum_router: Fix NULL pointer dereference

In case a VLAN device is enslaved to a bridge we shouldn't create a
router interface (RIF) for it when it's configured with an IP address.
This is already handled by the driver for other types of netdevs, such
as physical ports and LAG devices.

If this IP address is then removed and the interface is subsequently
unlinked from the bridge, a NULL pointer dereference can happen, as the
original 802.1d FID was replaced with an rFID which was then deleted.

To reproduce:
$ ip link set dev enp3s0np9 up
$ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111
$ ip link set dev enp3s0np9.111 up
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set enp3s0np9.111 master br0
$ ip address add dev enp3s0np9.111 192.168.0.1/24
$ ip address del dev enp3s0np9.111 192.168.0.1/24
$ ip link set dev enp3s0np9.111 nomaster

Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Petr Machata <[email protected]>
Tested-by: Petr Machata <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agonet: sched: Fix one possible panic when no destroy callback
Gao Feng [Wed, 28 Jun 2017 04:53:54 +0000 (12:53 +0800)]
net: sched: Fix one possible panic when no destroy callback

When qdisc fail to init, qdisc_create would invoke the destroy callback
to cleanup. But there is no check if the callback exists really. So it
would cause the panic if there is no real destroy callback like the qdisc
codel, fq, and so on.

Take codel as an example following:
When a malicious user constructs one invalid netlink msg, it would cause
codel_init->codel_change->nla_parse_nested failed.
Then kernel would invoke the destroy callback directly but qdisc codel
doesn't define one. It causes one panic as a result.

Now add one the check for destroy to avoid the possible panic.

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Gao Feng <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agovirtio-net: serialize tx routine during reset
Jason Wang [Wed, 28 Jun 2017 01:51:03 +0000 (09:51 +0800)]
virtio-net: serialize tx routine during reset

We don't hold any tx lock when trying to disable TX during reset, this
would lead a use after free since ndo_start_xmit() tries to access
the virtqueue which has already been freed. Fix this by using
netif_tx_disable() before freeing the vqs, this could make sure no tx
after vq freeing.

Reported-by: Jean-Philippe Menil <[email protected]>
Tested-by: Jean-Philippe Menil <[email protected]>
Fixes commit f600b6905015 ("virtio_net: Add XDP support")
Cc: John Fastabend <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Acked-by: Robert McCabe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agonvme: Makefile: remove dead build rule
Valentin Rothberg [Thu, 29 Jun 2017 06:59:07 +0000 (08:59 +0200)]
nvme: Makefile: remove dead build rule

Remove dead build rule for drivers/nvme/host/scsi.c which has been
removed by commit ("nvme: Remove SCSI translations").

Signed-off-by: Valentin Rothberg <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblk-mq: map all HWQ also in hyperthreaded system
Max Gurtovoy [Thu, 29 Jun 2017 14:40:11 +0000 (08:40 -0600)]
blk-mq: map all HWQ also in hyperthreaded system

This patch performs sequential mapping between CPUs and queues.
In case the system has more CPUs than HWQs then there are still
CPUs to map to HWQs. In hyperthreaded system, map the unmapped CPUs
and their siblings to the same HWQ.
This actually fixes a bug that found unmapped HWQs in a system with
2 sockets, 18 cores per socket, 2 threads per core (total 72 CPUs)
running NVMEoF (opens upto maximum of 64 HWQs).

Performance results running fio (72 jobs, 128 iodepth)
using null_blk (w/w.o patch):

bs      IOPS(read submit_queues=72)   IOPS(write submit_queues=72)   IOPS(read submit_queues=24)  IOPS(write submit_queues=24)
-----  ----------------------------  ------------------------------ ---------------------------- -----------------------------
512    4890.4K/4723.5K                 4524.7K/4324.2K                   4280.2K/4264.3K               3902.4K/3909.5K
1k     4910.1K/4715.2K                 4535.8K/4309.6K                   4296.7K/4269.1K               3906.8K/3914.9K
2k     4906.3K/4739.7K                 4526.7K/4330.6K                   4301.1K/4262.4K               3890.8K/3900.1K
4k     4918.6K/4730.7K                 4556.1K/4343.6K                   4297.6K/4264.5K               3886.9K/3893.9K
8k     4906.4K/4748.9K                 4550.9K/4346.7K                   4283.2K/4268.8K               3863.4K/3858.2K
16k    4903.8K/4782.6K                 4501.5K/4233.9K                   4292.3K/4282.3K               3773.1K/3773.5K
32k    4885.8K/4782.4K                 4365.9K/4184.2K                   4307.5K/4289.4K               3780.3K/3687.3K
64k    4822.5K/4762.7K                 2752.8K/2675.1K                   4308.8K/4312.3K               2651.5K/2655.7K
128k   2388.5K/2313.8K                 1391.9K/1375.7K                   2142.8K/2152.2K               1395.5K/1374.2K

Signed-off-by: Max Gurtovoy <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoftrace: Fix regression with module command in stack_trace_filter
Steven Rostedt (VMware) [Thu, 29 Jun 2017 14:05:45 +0000 (10:05 -0400)]
ftrace: Fix regression with module command in stack_trace_filter

When doing the following command:

 # echo ":mod:kvm_intel" > /sys/kernel/tracing/stack_trace_filter

it triggered a crash.

This happened with the clean up of probes. It required all callers to the
regex function (doing ftrace filtering) to have ops->private be a pointer to
a trace_array. But for the stack tracer, that is not the case.

Allow for the ops->private to be NULL, and change the function command
callbacks to handle the trace_array pointer being NULL as well.

Fixes: d2afd57a4b96 ("tracing/ftrace: Allow instances to have their own function probes")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
7 years agoRevert "pinctrl: rockchip: avoid hardirq-unsafe functions in irq_chip"
Brian Norris [Fri, 23 Jun 2017 20:59:11 +0000 (13:59 -0700)]
Revert "pinctrl: rockchip: avoid hardirq-unsafe functions in irq_chip"

This reverts commit 88bb94216f59e10802aaf78c858a4146085faf18.

It introduced a new CONFIG_DEBUG_ATOMIC_SLEEP warning in v4.12-rc1:

[ 7226.716713] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
[ 7226.716716] in_atomic(): 0, irqs_disabled(): 0, pid: 1708, name: bash
[ 7226.716722] CPU: 1 PID: 1708 Comm: bash Not tainted 4.12.0-rc6+ #1213
[ 7226.716724] Hardware name: Google Kevin (DT)
[ 7226.716726] Call trace:
[ 7226.716738] [<ffffff8008089928>] dump_backtrace+0x0/0x24c
[ 7226.716743] [<ffffff8008089b94>] show_stack+0x20/0x28
[ 7226.716749] [<ffffff8008371370>] dump_stack+0x90/0xb0
[ 7226.716755] [<ffffff80080cd2a0>] ___might_sleep+0x10c/0x124
[ 7226.716760] [<ffffff80080cd330>] __might_sleep+0x78/0x88
[ 7226.716765] [<ffffff800879e210>] mutex_lock+0x2c/0x64
[ 7226.716771] [<ffffff80083ad678>] rockchip_irq_bus_lock+0x30/0x3c
[ 7226.716777] [<ffffff80080f6d40>] __irq_get_desc_lock+0x78/0x98
[ 7226.716782] [<ffffff80080f7e6c>] irq_set_irq_wake+0x44/0x12c
[ 7226.716787] [<ffffff8008486e18>] dev_pm_arm_wake_irq+0x4c/0x58
[ 7226.716792] [<ffffff800848b80c>] device_wakeup_arm_wake_irqs+0x3c/0x58
[ 7226.716796] [<ffffff80084896fc>] dpm_suspend_noirq+0xf8/0x3a0
[ 7226.716800] [<ffffff80080f1384>] suspend_devices_and_enter+0x1a4/0x9a8
[ 7226.716803] [<ffffff80080f21ec>] pm_suspend+0x664/0x6a4
[ 7226.716807] [<ffffff80080f04d8>] state_store+0xd4/0xf8
...

It was reported on -rc1, and it's still not fixed in -rc6, so it should
just be reverted.

Cc: John Keeping <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Reviewed-by: Heiko Stuebner <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
7 years agogpio: acpi: Skip _AEI entries without a handler rather then aborting the scan
Hans de Goede [Fri, 23 Jun 2017 07:26:13 +0000 (09:26 +0200)]
gpio: acpi: Skip _AEI entries without a handler rather then aborting the scan

acpi_walk_resources will stop as soon as the callback passed in returns
an error status. On a x86 tablet I have the first GpioInt in the _AEI
resource list has no handler defined in the DSDT, causing
acpi_walk_resources to abort scanning the rest of the resource list,
which does define valid ACPI GPIO events.

This commit changes the return for not finding a handler from
AE_BAD_PARAMETER to AE_OK so that the rest of the resource list will
get scanned normally in case of missing event handlers.

Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Acked-by: Andy Shevchenko <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
7 years agogpiolib: fix filtering out unwanted events
Bartosz Golaszewski [Fri, 23 Jun 2017 11:45:16 +0000 (13:45 +0200)]
gpiolib: fix filtering out unwanted events

GPIOEVENT_REQUEST_BOTH_EDGES is not a single flag, but a binary OR of
GPIOEVENT_REQUEST_RISING_EDGE and GPIOEVENT_REQUEST_FALLING_EDGE.

The expression 'le->eflags & GPIOEVENT_REQUEST_BOTH_EDGES' we'll get
evaluated to true even if only one event type was requested.

Fix it by checking both RISING & FALLING flags explicitly.

Cc: [email protected]
Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events")
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
7 years agoEDAC, pnd2: Fix Apollo Lake DIMM detection
Tony Luck [Wed, 28 Jun 2017 23:44:07 +0000 (16:44 -0700)]
EDAC, pnd2: Fix Apollo Lake DIMM detection

Non-existent or empty DIMM slots result in error return from
RD_REGP(). But we shouldn't give up on failure.

So long as we find at least one DIMM we can continue.

Signed-off-by: Tony Luck <[email protected]>
Cc: Qiuxu Zhuo <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
7 years agoEDAC, i5000, i5400: Fix definition of NRECMEMB register
Jérémy Lefaure [Thu, 29 Jun 2017 00:57:29 +0000 (20:57 -0400)]
EDAC, i5000, i5400: Fix definition of NRECMEMB register

In the i5000 and i5400 drivers, the NRECMEMB register is defined as a
16-bit value, which results in wrong shifts in the code, as reported by
sparse.

In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21),
this register is a 32-bit register. A u32 value for the register fixes
the wrong shifts warnings and matches the datasheet.

Also fix the mask to access to the CAS bits [27:16] in the i5000 driver.

[1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf
[2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf

Signed-off-by: Jérémy Lefaure <[email protected]>
Cc: linux-edac <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
7 years agosched/numa: Hide numa_wake_affine() from UP build
Thomas Gleixner [Thu, 29 Jun 2017 06:25:52 +0000 (08:25 +0200)]
sched/numa: Hide numa_wake_affine() from UP build

Stephen reported the following build warning in UP:

kernel/sched/fair.c:2657:9: warning: 'struct sched_domain' declared inside
parameter list
         ^
/home/sfr/next/next/kernel/sched/fair.c:2657:9: warning: its scope is only this
definition or declaration, which is probably not what you want

Hide the numa_wake_affine() inline stub on UP builds to get rid of it.

Fixes: 3fed382b46ba ("sched/numa: Implement NUMA node level wake_affine()")
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
7 years agoarch: remove unused macro/function thread_saved_pc()
Tobias Klauser [Wed, 28 Jun 2017 13:30:02 +0000 (15:30 +0200)]
arch: remove unused macro/function thread_saved_pc()

The only user of thread_saved_pc() in non-arch-specific code was removed
in commit 8243d5597793 ("sched/core: Remove pointless printout in
sched_show_task()").  Remove the implementations as well.

Some architectures use thread_saved_pc() in their arch-specific code.
Leave their thread_saved_pc() intact.

Signed-off-by: Tobias Klauser <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
7 years agoblock: provide bio_uninit() free freeing integrity/task associations
Jens Axboe [Wed, 28 Jun 2017 21:30:13 +0000 (15:30 -0600)]
block: provide bio_uninit() free freeing integrity/task associations

Wen reports significant memory leaks with DIF and O_DIRECT:

"With nvme devive + T10 enabled, On a system it has 256GB and started
logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
leaking.

/proc/meminfo | grep SUnreclaim...

SUnreclaim:      6752128 kB
SUnreclaim:      6874880 kB
SUnreclaim:      7238080 kB
....
SUnreclaim:     22307264 kB
SUnreclaim:     22485888 kB
SUnreclaim:     22720256 kB

When testcases with T10 enabled call into __blkdev_direct_IO_simple,
code doesn't free memory allocated by bio_integrity_alloc. The patch
fixes the issue. HTX has been run with +60 hours without failure."

Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
doesn't go through the regular bio free. This means that any ancillary
data allocated with the bio through the stack is not freed. Hence, we
can leak the integrity data associated with the bio, if the device is
using DIF/DIX.

Fix this by providing a bio_uninit() and export it, so that we can use
it to free this data. Note that this is a minimal fix for this issue.
Any current user of bio's that are allocated outside of
bio_alloc_bioset() suffers from this issue, most notably some drivers.
We will fix those in a more comprehensive patch for 4.13. This also
means that the commit marked as being fixed by this isn't the real
culprit, it's just the most obvious one out there.

Fixes: 542ff7bf18c6 ("block: new direct I/O implementation")
Reported-by: Wen Xiong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoMerge tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Wed, 28 Jun 2017 20:27:15 +0000 (13:27 -0700)]
Merge tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Bugfixes include:

   - stable fix for exclusive create if the server supports the umask
     attribute

   - trunking detection should handle ERESTARTSYS/EINTR

   - stable fix for a race in the LAYOUTGET function

   - stable fix to revert "nfs_rename() handle -ERESTARTSYS dentry left
     behind"

   - nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()"

* tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4.1: nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()
  Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
  NFSv4.1: Fix a race in nfs4_proc_layoutget
  NFS: Trunking detection should handle ERESTARTSYS/EINTR
  NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umask

7 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 28 Jun 2017 20:22:26 +0000 (13:22 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "This is the final set of fixes for -rc8, just a few i915 and one
  vmwgfx ones.

  I'm off on holidays for a week, so if anything shows up for fixes I've
  asked Daniel or Sean Paul to herd it in the right direction"

[ The additional etnaviv fixes were already herded towards me as seen in
  my previous pull - Linus ]

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
  drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
  drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object
  drm/i915: Retire the VMA's fence tracker before unbinding

7 years agoMerge branch 'etnaviv/fixes' of git://git.pengutronix.de/git/lst/linux
Linus Torvalds [Wed, 28 Jun 2017 20:13:48 +0000 (13:13 -0700)]
Merge branch 'etnaviv/fixes' of git://git.pengutronix.de/git/lst/linux

Pull drm/etnaviv fixes from Lucas Stach:
 "I realized I just missed the cut-off point for the final drm fixes
  pull, but I have 2 more etnaviv fixes that need to go into 4.12, as
  they fix fallout from the explicit sync work introduced in the last
  merge window"

[ Pulling directly because Dave is on vacation. Noted by Daniel Vetter,
  and acked by Dave Airlie  - Linus ]

* 'etnaviv/fixes' of git://git.pengutronix.de/git/lst/linux:
  drm/etnaviv: Fix implicit/explicit sync sense inversion
  drm/etnaviv: fix submit flags getting overwritten by BO content

7 years agolocking/refcount: Create unchecked atomic_t implementation
Kees Cook [Wed, 21 Jun 2017 20:00:26 +0000 (13:00 -0700)]
locking/refcount: Create unchecked atomic_t implementation

Many subsystems will not use refcount_t unless there is a way to build the
kernel so that there is no regression in speed compared to atomic_t. This
adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
which has the validation but is slightly slower. When not enabled,
refcount_t uses the basic unchecked atomic_t routines, which results in
no code changes compared to just using atomic_t directly.

Signed-off-by: Kees Cook <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Hans Liljestrand <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: linux-arch <[email protected]>
Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast
Signed-off-by: Ingo Molnar <[email protected]>
7 years agonvmet-rdma: register ib_client to not deadlock in device removal
Sagi Grimberg [Tue, 27 Jun 2017 06:23:33 +0000 (09:23 +0300)]
nvmet-rdma: register ib_client to not deadlock in device removal

We can deadlock in case we got to a device removal
event on a queue which is already in the process of
destroying the cm_id is this is blocking until all
events on this cm_id will drain. On the other hand
we cannot guarantee that rdma_destroy_id was invoked
as we only have indication that the queue disconnect
flow has been queued (the queue state is updated before
the realease work has been queued).

So, we leave all the queue removal to a separate ib_client
to avoid this deadlock as ib_client device removal is in
a different context than the cm_id itself.

Reported-by: Shiraz Saleem <[email protected]>
Tested-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme_fc: fix error recovery on link down.
James Smart [Thu, 22 Jun 2017 00:43:21 +0000 (17:43 -0700)]
nvme_fc: fix error recovery on link down.

Currently, the fc transport invokes nvme_fc_error_recovery() on every
io in which the transport detects an error.  Which means:
a) it's really noisy on large io loads that all get hit by a link down.
b) we repeatively call nvme_stop_queues() even though queues are
 stopped upon the first error or as first steps of reset_work.

Correct by:
Errors are only meaningful if the controller is in the LIVE state.
Thus, enact the reset_work only if LIVE. If called repeatively, state
will have already transitioned.
There's no need to stop the queues here. Let the first steps of
reset_work do the queue stopping.

Signed-off-by: James Smart <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvmet_fc: fix crashes on bad opcodes
James Smart [Fri, 16 Jun 2017 06:41:41 +0000 (23:41 -0700)]
nvmet_fc: fix crashes on bad opcodes

if a nvme command is issued with an opcode that is not supported by
the target (example: opcode 21 - detach namespace), the target
crashes due to a null pointer.

nvmet_req_init() detects the bad opcode and immediately calls the nvme
command done routine with an error status, allowing the transport to
send the response. However, the FC transport was aborting the command
on error, so the abort freed the lldd point, but the rsp transmit path
referenced it psot the free.

Fix by removing the abort call on nvmet_req_init() failure.
The completion response will be sent with an error status code.

As the completion path will terminate the io, ensure the data_sg
lists show an unused state so that teardown paths are successful.

Signed-off-by: Paul Ely <[email protected]>
Signed-off-by: James Smart <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme_fc: Fix crash when nvme controller connection fails.
James Smart [Fri, 16 Jun 2017 06:40:54 +0000 (23:40 -0700)]
nvme_fc: Fix crash when nvme controller connection fails.

If a controller connection is attempted (say to a subsystem that
does not exist), the first attempt errors out.  If another connect
is attempted, it crashes.

Issue is the prior controller has yet execute it's final put, thus
its still on lists. However, opts points on it have been cleared, thus
causing the crash if they are referenced.

Fix is to add the missing put after the nvme_uninit_ctrl() call on
the attachment failure.

Signed-off-by: Paul Ely <[email protected]>
Signed-off-by: James Smart <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme_fc: replace ioabort msleep loop with completion
James Smart [Mon, 22 May 2017 22:28:42 +0000 (15:28 -0700)]
nvme_fc: replace ioabort msleep loop with completion

Per the recommendation by Sagi on:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html

Wait for io aborts to complete wait converted from msleep look to
using a struct completion.

Signed-off-by: James Smart <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme_fc: fix double calls to nvme_cleanup_cmd()
James Smart [Thu, 22 Jun 2017 00:43:05 +0000 (17:43 -0700)]
nvme_fc: fix double calls to nvme_cleanup_cmd()

Current fc transport code, on io termination, is calling
nvme_cleanup_cmd() followed by the transport dma unmap routine
which also calls nvme_cleanup_cmd(). Which means two kfrees occur
on the same address, raising havoc. This resulted in odd data errors,
effectively corruption..

Fix by removing the extraneous double calls. Call now occurs only in
teardown paths and as part of dma unmap routine.

Signed-off-by: James Smart <[email protected]>
Reviewed-by: Ewan D. Milne <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme-fabrics: verify that a controller returns the correct NQN
Christoph Hellwig [Mon, 26 Jun 2017 10:39:04 +0000 (12:39 +0200)]
nvme-fabrics: verify that a controller returns the correct NQN

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: simplify nvme_dev_attrs_are_visible
Christoph Hellwig [Mon, 26 Jun 2017 10:39:03 +0000 (12:39 +0200)]
nvme: simplify nvme_dev_attrs_are_visible

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: read the subsystem NQN from Identify Controller
Christoph Hellwig [Mon, 26 Jun 2017 10:39:02 +0000 (12:39 +0200)]
nvme: read the subsystem NQN from Identify Controller

NVMe 1.2.1 or later requires controllers to provide a subsystem NQN in the
Identify controller data structures.  Use this NQN for the subsysnqn
sysfs attribute by storing it in the nvme_ctrl structure after verifying
it.  For older controllers we generate a "fake" NQN per non-normative
text in the NVMe 1.3 spec.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: remove a misleading comment on struct nvme_ns
Christoph Hellwig [Mon, 26 Jun 2017 10:39:01 +0000 (12:39 +0200)]
nvme: remove a misleading comment on struct nvme_ns

While a NVMe Namespace is somewhat similar to a SCSI Logical Unit (and not
a Logical Unit Number anyway) there are subtile differences.  Remove the
misleading comment.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: explicitly disable APST on quirked devices
Kai-Heng Feng [Mon, 26 Jun 2017 20:39:54 +0000 (16:39 -0400)]
nvme: explicitly disable APST on quirked devices

A user reports APST is enabled, even when the NVMe is quirked or with
option "default_ps_max_latency_us=0".

The current logic will not set APST if the device is quirked. But the
NVMe in question will enable APST automatically.

Separate the logic "apst is supported" and "to enable apst", so we can
use the latter one to explicitly disable APST at initialiaztion.

BugLink: https://bugs.launchpad.net/bugs/1699004
Signed-off-by: Kai-Heng Feng <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: use a single NVME_AQ_DEPTH and relax it to 32
Sagi Grimberg [Sun, 18 Jun 2017 13:15:59 +0000 (16:15 +0300)]
nvme: use a single NVME_AQ_DEPTH and relax it to 32

No need to differentiate fabrics from pci/loop, also lower
it to 32 as we don't really need 256 inflight admin commands.

Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: add hostid token to fabric options
Johannes Thumshirn [Tue, 20 Jun 2017 12:23:01 +0000 (14:23 +0200)]
nvme: add hostid token to fabric options

Currently we have no way to define a stable host-id but always use the one
which is randomly generated when we add the host or use the default host.

Provide a "hostid=%s" for user-space to pass in a persistent host-id which
overrides the randomly generated one.

Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: Remove SCSI translations
Keith Busch [Tue, 20 Jun 2017 19:09:56 +0000 (15:09 -0400)]
nvme: Remove SCSI translations

The SCSI-to-NVMe translations were added to assist storage applications
utilizing SG_IO transitioning to NVMe. It was always recommended,
however, to use native NVMe for device management as too much is lost
in translation and the maintenance burden in keeping this kludgey
layer around has been neglected such that much of the translations are
completely broken.

This patch removes SG_IO handling from NVMe to avoid any confusion
regarding maintenance support for this interface. The config option for
NVMe SCSI emulation has been disabled by default since 4.5. The driver
has supported native nvme user commands since the beginning, and native
tooling is publicly available for use or as reference for anyone writing
their own tools, so there's no excuse for hanging onto a broken crutch.

Signed-off-by: Keith Busch <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Guan Junxiong <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme-pci: open-code polling logic in nvme_poll
Sagi Grimberg [Sun, 18 Jun 2017 14:28:10 +0000 (17:28 +0300)]
nvme-pci: open-code polling logic in nvme_poll

Given that the code is simple enough it seems better
then passing a tag by reference for each call site, also
we can now get rid of __nvme_process_cq.

Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme-pci: factor out the cqe reading mechanics from __nvme_process_cq
Sagi Grimberg [Sun, 18 Jun 2017 14:28:09 +0000 (17:28 +0300)]
nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq

Also, maintain a consumed counter to rely on for doorbell and
cqe_seen update instead of directly relying on the cq head and phase.

Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme-pci: factor out cqe handling into a dedicated routine
Sagi Grimberg [Sun, 18 Jun 2017 14:28:08 +0000 (17:28 +0300)]
nvme-pci: factor out cqe handling into a dedicated routine

Makes the code slightly more readable.

Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme-pci: Introduce nvme_ring_cq_doorbell
Sagi Grimberg [Sun, 18 Jun 2017 14:28:07 +0000 (17:28 +0300)]
nvme-pci: Introduce nvme_ring_cq_doorbell

Nice abstraction of the actual mechanics of how to do it.
Note the change that we call it after we assign nvmeq->cq_head
to avoid passing it.

Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agofs/fcntl: use copy_to/from_user() for u64 types
Jens Axboe [Wed, 28 Jun 2017 14:09:45 +0000 (08:09 -0600)]
fs/fcntl: use copy_to/from_user() for u64 types

Some architectures (at least PPC) doesn't like get/put_user with
64-bit types on a 32-bit system. Use the variably sized copy
to/from user variants instead.

Reported-by: Stephen Rothwell <[email protected]>
Fixes: c75b1d9421f8 ("fs: add fcntl() interface for setting/getting write life time hints")
Signed-off-by: Jens Axboe <[email protected]>
7 years agoiommu/amd: Fix interrupt remapping when disable guest_mode
Suravee Suthikulpanit [Mon, 26 Jun 2017 09:28:04 +0000 (04:28 -0500)]
iommu/amd: Fix interrupt remapping when disable guest_mode

Pass-through devices to VM guest can get updated IRQ affinity
information via irq_set_affinity() when not running in guest mode.
Currently, AMD IOMMU driver in GA mode ignores the updated information
if the pass-through device is setup to use vAPIC regardless of guest_mode.
This could cause invalid interrupt remapping.

Also, the guest_mode bit should be set and cleared only when
SVM updates posted-interrupt interrupt remapping information.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
Cc: Joerg Roedel <[email protected]>
Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
Signed-off-by: Joerg Roedel <[email protected]>
7 years agoovl: don't set origin on broken lower hardlink
Miklos Szeredi [Wed, 28 Jun 2017 11:41:22 +0000 (13:41 +0200)]
ovl: don't set origin on broken lower hardlink

When copying up a file that has multiple hard links we need to break any
association with the origin file.  This makes copy-up be essentially an
atomic replace.

The new file has nothing to do with the old one (except having the same
data and metadata initially), so don't set the overlay.origin attribute.

We can relax this in the future when we are able to index upper object by
origin.

Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 3a1e819b4e80 ("ovl: store file handle of lower inode on copy up")
7 years agoovl: copy-up: don't unlock between lookup and link
Miklos Szeredi [Wed, 28 Jun 2017 11:41:22 +0000 (13:41 +0200)]
ovl: copy-up: don't unlock between lookup and link

Nothing prevents mischief on upper layer while we are busy copying up the
data.

Move the lookup right before the looked up dentry is actually used.

Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 01ad3eb8a073 ("ovl: concurrent copy up of regular files")
Cc: <[email protected]> # v4.11
7 years agoALSA: hda - Fix endless loop of codec configure
Takashi Iwai [Wed, 28 Jun 2017 10:02:02 +0000 (12:02 +0200)]
ALSA: hda - Fix endless loop of codec configure

azx_codec_configure() loops over the codecs found on the given
controller via a linked list.  The code used to work in the past, but
in the current version, this may lead to an endless loop when a codec
binding returns an error.

The culprit is that the snd_hda_codec_configure() unregisters the
device upon error, and this eventually deletes the given codec object
from the bus.  Since the list is initialized via list_del_init(), the
next object points to the same device itself.  This behavior change
was introduced at splitting the HD-audio code code, and forgotten to
adapt it here.

For fixing this bug, just use a *_safe() version of list iteration.

Fixes: d068ebc25e6e ("ALSA: hda - Move some codes up to hdac_bus struct")
Reported-by: Daniel Vetter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
7 years agodrm/etnaviv: Fix implicit/explicit sync sense inversion
Daniel Stone [Thu, 22 Jun 2017 11:22:22 +0000 (12:22 +0100)]
drm/etnaviv: Fix implicit/explicit sync sense inversion

We were reading the no-implicit sync flag the wrong way around,
synchronizing too much for the explicit case, and not at all for the
implicit case. Oops.

Signed-off-by: Daniel Stone <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
7 years agodrm/etnaviv: fix submit flags getting overwritten by BO content
Lucas Stach [Tue, 27 Jun 2017 14:02:51 +0000 (16:02 +0200)]
drm/etnaviv: fix submit flags getting overwritten by BO content

The addition of the flags member to etnaviv_gem_submit structure didn't
take into account that the last member of this structure is a variable
length array.

Signed-off-by: Lucas Stach <[email protected]>
7 years agoMerge tag 'drm-intel-fixes-2017-06-27' of git://anongit.freedesktop.org/git/drm-intel...
Dave Airlie [Wed, 28 Jun 2017 07:07:15 +0000 (17:07 +1000)]
Merge tag 'drm-intel-fixes-2017-06-27' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

Just a few minor fixes. Important one is the execbuf async fix (aka
ANDROID_native_sync). There was another patch for a display coherency
corner case on APL, but we've random-walked in that space too much,
and the cherry-pick looked really invasive.

* tag 'drm-intel-fixes-2017-06-27' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
  drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object
  drm/i915: Retire the VMA's fence tracker before unbinding

7 years agoMerge branch 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux into...
Dave Airlie [Wed, 28 Jun 2017 07:06:58 +0000 (17:06 +1000)]
Merge branch 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Single vmwgfx fix
* 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr

7 years agoALSA: hda - set input_path bitmap to zero after moving it to new place
Hui Wang [Wed, 28 Jun 2017 00:59:16 +0000 (08:59 +0800)]
ALSA: hda - set input_path bitmap to zero after moving it to new place

Recently we met a problem, the codec has valid adcs and input pins,
and they can form valid input paths, but the driver does not build
valid controls for them like "Mic boost", "Capture Volume" and
"Capture Switch".

Through debugging, I found the driver needs to shrink the invalid
adcs and input paths for this machine, so it will move the whole
column bitmap value to the previous column, after moving it, the
driver forgets to set the original column bitmap value to zero, as a
result, the driver will invalidate the path whose index value is the
original colume bitmap value. After executing this function, all
valid input paths are invalidated by a mistake, there are no any
valid input paths, so the driver won't build controls for them.

Fixes: 3a65bcdc577a ("ALSA: hda - Fix inconsistent input_paths after ADC reduction")
Cc: <[email protected]>
Signed-off-by: Hui Wang <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
7 years agoNFSv4.1: nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()
Trond Myklebust [Tue, 27 Jun 2017 21:40:50 +0000 (17:40 -0400)]
NFSv4.1: nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()

The current code works only for the case where we have exactly one slot,
which is no longer true.
nfs4_free_slot() will automatically declare the callback channel to be
drained when all slots have been returned.

Signed-off-by: Trond Myklebust <[email protected]>
7 years agoRevert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
Benjamin Coddington [Fri, 16 Jun 2017 15:12:59 +0000 (11:12 -0400)]
Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"

This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
call d_move() without holding the directory's i_mutex, and reverts commit
d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
move", which was a follow-up fix.

Signed-off-by: Benjamin Coddington <[email protected]>
Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
Cc: [email protected] # v4.10+
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
7 years agoNFSv4.1: Fix a race in nfs4_proc_layoutget
Trond Myklebust [Tue, 27 Jun 2017 21:33:38 +0000 (17:33 -0400)]
NFSv4.1: Fix a race in nfs4_proc_layoutget

If the task calling layoutget is signalled, then it is possible for the
calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race,
in which case we leak a slot.
The fix is to move the call to nfs4_sequence_free_slot() into the
nfs4_layoutget_release() so that it gets called at task teardown time.

Fixes: 2e80dbe7ac51 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Cc: [email protected] # v4.8+
Signed-off-by: Trond Myklebust <[email protected]>
7 years agoNFS: Trunking detection should handle ERESTARTSYS/EINTR
Trond Myklebust [Wed, 21 Jun 2017 14:16:56 +0000 (10:16 -0400)]
NFS: Trunking detection should handle ERESTARTSYS/EINTR

Currently, it will return EIO in those cases.

Signed-off-by: Trond Myklebust <[email protected]>
7 years agoMIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
Aleksandar Markovic [Mon, 19 Jun 2017 15:50:12 +0000 (17:50 +0200)]
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately

If accumulator value is zero, just return the value of previously
calculated product. This brings logic in MADDF/MSUBF implementation
closer to the logic in ADD/SUB case.

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/16512/
Signed-off-by: Ralf Baechle <[email protected]>
7 years agodrbd: Drop unnecessary static
Julia Lawall [Tue, 27 Jun 2017 23:56:50 +0000 (17:56 -0600)]
drbd: Drop unnecessary static

Drop static on a local variable, when the variable is initialized before
any use, on every possible execution path through the function.  The
static has no benefit, and dropping it reduces the code size.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
     when strict
?x = e;
// </smpl>

The change in code size is indicates by the following output from the size
command.

before:
   text    data     bss     dec     hex filename
  67299    2291    1056   70646   113f6 drivers/block/drbd/drbd_nl.o

after:
   text    data     bss     dec     hex filename
  67283    2291    1056   70630   113e6 drivers/block/drbd/drbd_nl.o

Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Roland Kammerer <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme/pci: Fix stuck nvme reset
Keith Busch [Tue, 27 Jun 2017 23:44:05 +0000 (17:44 -0600)]
nvme/pci: Fix stuck nvme reset

The controller state is set to resetting prior to disabling the
controller, so this patch accounts for that state when deciding if it
needs to freeze the queues. Without this, an 'nvme reset /dev/nvme0'
blocks forever because the queues were never frozen.

Fixes: 82b057caefaf ("nvme-pci: fix multiple ctrl removal scheduling")
Signed-off-by: Keith Busch <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoMIPS: head: Reorder instructions missing a delay slot
Karl Beldan [Tue, 27 Jun 2017 19:22:16 +0000 (19:22 +0000)]
MIPS: head: Reorder instructions missing a delay slot

In this sequence the 'move' is assumed in the delay slot of the 'beq',
but head.S is in reorder mode and the former gets pushed one 'nop'
farther by the assembler.

The corrected behavior made booting with an UHI supplied dtb erratic.

Fixes: 15f37e158892 ("MIPS: store the appended dtb address in a variable")
Signed-off-by: Karl Beldan <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cc: Jonas Gorski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/16614/
Signed-off-by: Ralf Baechle <[email protected]>
7 years agonet: usb: asix88179_178a: Add support for the Belkin B2B128
Andrew F. Davis [Mon, 26 Jun 2017 17:41:20 +0000 (12:41 -0500)]
net: usb: asix88179_178a: Add support for the Belkin B2B128

The Belkin B2B128 is a USB 3.0 Hub + Gigabit Ethernet Adapter, the
Ethernet adapter uses the ASIX AX88179 USB 3.0 to Gigabit Ethernet
chip supported by this driver, add the USB ID for the same.

This patch is based on work by Geoffrey Tran <[email protected]>
who has indicated they would like this upstreamed by someone more
familiar with the upstreaming process.

Signed-off-by: Andrew F. Davis <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agofsl/fman: add dependency on HAS_DMA
Madalin Bucur [Mon, 26 Jun 2017 15:47:00 +0000 (18:47 +0300)]
fsl/fman: add dependency on HAS_DMA

A previous commit (5567e989198b5a8d) inserted a dependency on DMA
API that requires HAS_DMA to be added in Kconfig.

Signed-off-by: Madalin Bucur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agodm thin: do not queue freed thin mapping for next stage processing
Vallish Vaidyeshwara [Fri, 23 Jun 2017 18:53:06 +0000 (18:53 +0000)]
dm thin: do not queue freed thin mapping for next stage processing

process_prepared_discard_passdown_pt1() should cleanup
dm_thin_new_mapping in cases of error.

dm_pool_inc_data_range() can fail trying to get a block reference:

metadata operation 'dm_pool_inc_data_range' failed: error = -61

When dm_pool_inc_data_range() fails, dm thin aborts current metadata
transaction and marks pool as PM_READ_ONLY. Memory for thin mapping
is released as well. However, current thin mapping will be queued
onto next stage as part of queue_passdown_pt2() or passdown_endio().
This dangling thin mapping memory when processed and accessed in
next stage will lead to device mapper crashing.

Code flow without fix:
-> process_prepared_discard_passdown_pt1(m)
   -> dm_thin_remove_range()
   -> discard passdown
      --> passdown_endio(m) queues m onto next stage
   -> dm_pool_inc_data_range() fails, frees memory m
            but does not remove it from next stage queue

-> process_prepared_discard_passdown_pt2(m)
   -> processes freed memory m and crashes

One such stack:

Call Trace:
[<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool]
[<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool]
[<ffffffff8152bf54>] ? __schedule+0x244/0x680
[<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0
[<ffffffff81089f53>] process_one_work+0x153/0x3f0
[<ffffffff8108a71b>] worker_thread+0x12b/0x4b0
[<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350
[<ffffffff8108fd6a>] kthread+0xca/0xe0
[<ffffffff8108fca0>] ? kthread_park+0x60/0x60
[<ffffffff81530b45>] ret_from_fork+0x25/0x30

The fix is to first take the block ref count for discarded block and
then do a passdown discard of this block. If block ref count fails,
then bail out aborting current metadata transaction, mark pool as
PM_READ_ONLY and also free current thin mapping memory (existing error
handling code) without queueing this thin mapping onto next stage of
processing. If block ref count succeeds, then passdown discard of this
block. Discard callback of passdown_endio() will queue this thin mapping
onto next stage of processing.

Code flow with fix:
-> process_prepared_discard_passdown_pt1(m)
   -> dm_thin_remove_range()
   -> dm_pool_inc_data_range()
      --> if fails, free memory m and bail out
   -> discard passdown
      --> passdown_endio(m) queues m onto next stage

Cc: stable <[email protected]> # v4.9+
Reviewed-by: Eduardo Valentin <[email protected]>
Reviewed-by: Cristian Gafton <[email protected]>
Reviewed-by: Anchal Agarwal <[email protected]>
Signed-off-by: Vallish Vaidyeshwara <[email protected]>
Reviewed-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
7 years agonet: prevent sign extension in dev_get_stats()
Eric Dumazet [Tue, 27 Jun 2017 14:02:20 +0000 (07:02 -0700)]
net: prevent sign extension in dev_get_stats()

Similar to the fix provided by Dominik Heidler in commit
9b3dc0a17d73 ("l2tp: cast l2tp traffic counter to unsigned")
we need to take care of 32bit kernels in dev_get_stats().

When using atomic_long_read(), we add a 'long' to u64 and
might misinterpret high order bit, unless we cast to unsigned.

Fixes: caf586e5f23ce ("net: add a core netdev->rx_dropped counter")
Fixes: 015f0688f57ca ("net: net: add a core netdev->tx_dropped counter")
Fixes: 6e7333d315a76 ("net: add rx_nohandler stat counter")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jarod Wilson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
7 years agoblock, bfq: update wr_busy_queues if needed on a queue split
Paolo Valente [Tue, 27 Jun 2017 18:30:47 +0000 (12:30 -0600)]
block, bfq: update wr_busy_queues if needed on a queue split

This commit fixes a bug triggered by a non-trivial sequence of
events. These events are briefly described in the next two
paragraphs. The impatiens, or those who are familiar with queue
merging and splitting, can jump directly to the last paragraph.

On each I/O-request arrival for a shared bfq_queue, i.e., for a
bfq_queue that is the result of the merge of two or more bfq_queues,
BFQ checks whether the shared bfq_queue has become seeky (i.e., if too
many random I/O requests have arrived for the bfq_queue; if the device
is non rotational, then random requests must be also small for the
bfq_queue to be tagged as seeky). If the shared bfq_queue is actually
detected as seeky, then a split occurs: the bfq I/O context of the
process that has issued the request is redirected from the shared
bfq_queue to a new non-shared bfq_queue. As a degenerate case, if the
shared bfq_queue actually happens to be shared only by one process
(because of previous splits), then no new bfq_queue is created: the
state of the shared bfq_queue is just changed from shared to non
shared.

Regardless of whether a brand new non-shared bfq_queue is created, or
the pre-existing shared bfq_queue is just turned into a non-shared
bfq_queue, several parameters of the non-shared bfq_queue are set
(restored) to the original values they had when the bfq_queue
associated with the bfq I/O context of the process (that has just
issued an I/O request) was merged with the shared bfq_queue. One of
these parameters is the weight-raising state.

If, on the split of a shared bfq_queue,
1) a pre-existing shared bfq_queue is turned into a non-shared
bfq_queue;
2) the previously shared bfq_queue happens to be busy;
3) the weight-raising state of the previously shared bfq_queue happens
to change;
the number of weight-raised busy queues changes. The field
wr_busy_queues must then be updated accordingly, but such an update
was missing. This commit adds the missing update.

Reported-by: Luca Miccio <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agommc/block: remove a call to blk_queue_bounce_limit
Christoph Hellwig [Mon, 19 Jun 2017 07:26:28 +0000 (09:26 +0200)]
mmc/block: remove a call to blk_queue_bounce_limit

BLK_BOUNCE_ANY is the defauly now, so the call is superflous.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agodm: don't set bounce limit
Christoph Hellwig [Mon, 19 Jun 2017 07:26:27 +0000 (09:26 +0200)]
dm: don't set bounce limit

Now all queues allocators come without abounce limit by default,
dm doesn't have to override this anymore.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: don't set bounce limit in blk_init_queue
Christoph Hellwig [Mon, 19 Jun 2017 07:26:26 +0000 (09:26 +0200)]
block: don't set bounce limit in blk_init_queue

Instead move it to the callers.  Those that either don't use bio_data() or
page_address() or are specific to architectures that do not support highmem
are skipped.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: don't set bounce limit in blk_init_allocated_queue
Christoph Hellwig [Mon, 19 Jun 2017 07:26:25 +0000 (09:26 +0200)]
block: don't set bounce limit in blk_init_allocated_queue

And just move it into scsi_transport_sas which needs it due to low-level
drivers directly derferencing bio_data, and into blk_init_queue_node,
which will need a further push into the callers.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblk-mq: don't bounce by default
Christoph Hellwig [Mon, 19 Jun 2017 07:26:24 +0000 (09:26 +0200)]
blk-mq: don't bounce by default

For historical reasons we default to bouncing highmem pages for all block
queues.  But the blk-mq drivers are easy to audit to ensure that we don't
need this - scsi and mtip32xx set explicit limits and everyone else doesn't
have any particular ones.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: don't bother with bounce limits for make_request drivers
Christoph Hellwig [Mon, 19 Jun 2017 07:26:23 +0000 (09:26 +0200)]
block: don't bother with bounce limits for make_request drivers

We only call blk_queue_bounce for request-based drivers, so stop messing
with it for make_request based drivers.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: remove the queue_bounce_pfn helper
Christoph Hellwig [Mon, 19 Jun 2017 07:26:22 +0000 (09:26 +0200)]
block: remove the queue_bounce_pfn helper

Only used inside the bounce code, and opencoding it makes it more obvious
what is going on.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: move bounce declarations to block/blk.h
Christoph Hellwig [Mon, 19 Jun 2017 07:26:21 +0000 (09:26 +0200)]
block: move bounce declarations to block/blk.h

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblk-map: call blk_queue_bounce from blk_rq_append_bio
Christoph Hellwig [Tue, 27 Jun 2017 18:13:21 +0000 (12:13 -0600)]
blk-map: call blk_queue_bounce from blk_rq_append_bio

This makes moves the knowledge about bouncing out of the callers into the
block core (just like we do for the normal I/O path), and allows to unexport
blk_queue_bounce.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agopktcdvd: remove the call to blk_queue_bounce
Christoph Hellwig [Mon, 19 Jun 2017 07:26:19 +0000 (09:26 +0200)]
pktcdvd: remove the call to blk_queue_bounce

pktcdvd is a make_request based stacking driver and thus doesn't have any
addressing limits on it's own.  It also doesn't use bio_data() or
page_address(), so it doesn't need a lowmem bounce either.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agonvme: add support for streams and directives
Jens Axboe [Tue, 27 Jun 2017 18:03:06 +0000 (12:03 -0600)]
nvme: add support for streams and directives

This adds support for Directives in NVMe, particular for the Streams
directive. Support for Directives is a new feature in NVMe 1.3. It
allows a user to pass in information about where to store the data, so
that it the device can do so most effiently. If an application is
managing and writing data with different life times, mixing differently
retentioned data onto the same locations on flash can cause write
amplification to grow. This, in turn, will reduce performance and life
time of the device.

Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agobtrfs: add support for passing in write hints for buffered writes
Jens Axboe [Tue, 27 Jun 2017 17:51:28 +0000 (11:51 -0600)]
btrfs: add support for passing in write hints for buffered writes

Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoxfs: add support for passing in write hints for buffered writes
Jens Axboe [Tue, 27 Jun 2017 15:34:01 +0000 (09:34 -0600)]
xfs: add support for passing in write hints for buffered writes

Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoext4: add support for passing in write hints for buffered writes
Jens Axboe [Tue, 27 Jun 2017 15:32:37 +0000 (09:32 -0600)]
ext4: add support for passing in write hints for buffered writes

Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agofs: add support for buffered writeback to pass down write hints
Jens Axboe [Tue, 27 Jun 2017 15:30:05 +0000 (09:30 -0600)]
fs: add support for buffered writeback to pass down write hints

Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agofs: add O_DIRECT and aio support for sending down write life time hints
Jens Axboe [Tue, 27 Jun 2017 17:01:22 +0000 (11:01 -0600)]
fs: add O_DIRECT and aio support for sending down write life time hints

Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblk-mq: expose write hints through debugfs
Jens Axboe [Mon, 26 Jun 2017 14:15:27 +0000 (08:15 -0600)]
blk-mq: expose write hints through debugfs

Useful to verify that things are working the way they should.
Reading the file will return number of kb written with each
write hint. Writing the file will reset the statistics. No care
is taken to ensure that we don't race on updates.

Drivers will write to q->write_hints[] if they handle a given
write hint.

Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoblock: add support for write hints in a bio
Jens Axboe [Tue, 27 Jun 2017 15:22:02 +0000 (09:22 -0600)]
block: add support for write hints in a bio

No functional changes in this patch, we just use up some holes
in the bio and request structures to define a write hint that
we psas down the stack.

Ensure that we don't merge requests that have different life time
hints assigned to them, and that we inherit the write hint when
cloning a bio.

Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agofs: add fcntl() interface for setting/getting write life time hints
Jens Axboe [Tue, 27 Jun 2017 17:47:04 +0000 (11:47 -0600)]
fs: add fcntl() interface for setting/getting write life time hints

Define a set of write life time hints:

RWH_WRITE_LIFE_NOT_SET No hint information set
RWH_WRITE_LIFE_NONE No hints about write life time
RWH_WRITE_LIFE_SHORT Data written has a short life time
RWH_WRITE_LIFE_MEDIUM Data written has a medium life time
RWH_WRITE_LIFE_LONG Data written has a long life time
RWH_WRITE_LIFE_EXTREME Data written has an extremely long life time

The intent is for these values to be relative to each other, no
absolute meaning should be attached to these flag names.

Add an fcntl interface for querying these flags, and also for
setting them as well:

F_GET_RW_HINT Returns the read/write hint set on the
underlying inode.

F_SET_RW_HINT Set one of the above write hints on the
underlying inode.

F_GET_FILE_RW_HINT Returns the read/write hint set on the
file descriptor.

F_SET_FILE_RW_HINT Set one of the above write hints on the
file descriptor.

The user passes in a 64-bit pointer to get/set these values, and
the interface returns 0/-1 on success/error.

Sample program testing/implementing basic setting/getting of write
hints is below.

Add support for storing the write life time hint in the inode flags
and in struct file as well, and pass them to the kiocb flags. If
both a file and its corresponding inode has a write hint, then we
use the one in the file, if available. The file hint can be used
for sync/direct IO, for buffered writeback only the inode hint
is available.

This is in preparation for utilizing these hints in the block layer,
to guide on-media data placement.

/*
 * writehint.c: get or set an inode write hint
 */
 #include <stdio.h>
 #include <fcntl.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <stdbool.h>
 #include <inttypes.h>

 #ifndef F_GET_RW_HINT
 #define F_LINUX_SPECIFIC_BASE 1024
 #define F_GET_RW_HINT (F_LINUX_SPECIFIC_BASE + 11)
 #define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
 #endif

static char *str[] = { "RWF_WRITE_LIFE_NOT_SET", "RWH_WRITE_LIFE_NONE",
"RWH_WRITE_LIFE_SHORT", "RWH_WRITE_LIFE_MEDIUM",
"RWH_WRITE_LIFE_LONG", "RWH_WRITE_LIFE_EXTREME" };

int main(int argc, char *argv[])
{
uint64_t hint;
int fd, ret;

if (argc < 2) {
fprintf(stderr, "%s: file <hint>\n", argv[0]);
return 1;
}

fd = open(argv[1], O_RDONLY);
if (fd < 0) {
perror("open");
return 2;
}

if (argc > 2) {
hint = atoi(argv[2]);
ret = fcntl(fd, F_SET_RW_HINT, &hint);
if (ret < 0) {
perror("fcntl: F_SET_RW_HINT");
return 4;
}
}

ret = fcntl(fd, F_GET_RW_HINT, &hint);
if (ret < 0) {
perror("fcntl: F_GET_RW_HINT");
return 3;
}

printf("%s: hint %s\n", argv[1], str[hint]);
close(fd);
return 0;
}

Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoMerge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Tue, 27 Jun 2017 15:56:52 +0000 (08:56 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
 "Three more fixes:

   - Fix the previous fix merged in the last pull for the Thumb2
     decompressor.

   - A fix from Vladimir to correctly identify the V7M cache type.

   - The optimised 3G vmsplit case does not work with LPAE, so don't
     allow this to be selected for LPAE configurations"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8682/1: V7M: Set cacheid iff DminLine or IminLine is nonzero
  ARM: 8681/1: make VMSPLIT_3G_OPT depends on !ARM_LPAE
  ARM: 8680/1: boot/compressed: fix inappropriate Thumb2 mnemonic for __nop

7 years agoperf script: Add 'synth' field for synthesized event payloads
Adrian Hunter [Fri, 26 May 2017 08:17:22 +0000 (11:17 +0300)]
perf script: Add 'synth' field for synthesized event payloads

Add a field to display the content the raw_data of a synthesized event.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Resolved conflict with 106dacd86f04 ("perf script: Support -F brstackoff,dso") ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf auxtrace: Add itrace option to output power events
Adrian Hunter [Fri, 26 May 2017 08:17:25 +0000 (11:17 +0300)]
perf auxtrace: Add itrace option to output power events

Add itrace option to output power events.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf auxtrace: Add itrace option to output ptwrite events
Adrian Hunter [Fri, 26 May 2017 08:17:24 +0000 (11:17 +0300)]
perf auxtrace: Add itrace option to output ptwrite events

Add itrace option to output ptwrite events.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agotools include: Add byte-swapping macros to kernel.h
Adrian Hunter [Fri, 26 May 2017 08:17:23 +0000 (11:17 +0300)]
tools include: Add byte-swapping macros to kernel.h

Add byte-swapping macros to kernel.h

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf script: Add 'synth' event type for synthesized events
Adrian Hunter [Wed, 21 Jun 2017 10:17:19 +0000 (13:17 +0300)]
perf script: Add 'synth' event type for synthesized events

Instruction trace decoders such as Intel PT may have additional information
recorded in the trace. For example, Intel PT has power information and a
there is a new instruction 'ptwrite' that can write a value into a PTWRITE
trace packet.

Such information may be associated with an IP and so can be treated as a
sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the
sample as raw_data (PERF_SAMPLE_RAW).

However a means of identifying the raw data format is needed. That will
be done by synthesizing an attribute for it.

So add an attribute type for custom synthesized events.  Different
synthesized events will be identified by the attribute 'config'.

Committer notes:

Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U),
i.e. after perf_pmu_register() -> idr_alloc(end=0).

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agox86/insn: perf tools: Add new ptwrite instruction
Adrian Hunter [Fri, 19 May 2017 07:50:30 +0000 (10:50 +0300)]
x86/insn: perf tools: Add new ptwrite instruction

Add ptwrite to the op code map and the perf tools new instructions test.
To run the test:

  $ tools/perf/perf test "x86 ins"
  39: Test x86 instruction decoder - new instructions          : Ok

Or to see the details:

  $ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite

For information about ptwrite, refer the Intel SDM.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf jit: fix typo: "incalid" -> "invalid"
Colin Ian King [Tue, 27 Jun 2017 12:49:17 +0000 (13:49 +0100)]
perf jit: fix typo: "incalid" -> "invalid"

Trivial fix to typo in jvmti_close() warnx warning message.

Signed-off-by: Colin King <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf tools: Kill die()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:49:13 +0000 (11:49 -0300)]
perf tools: Kill die()

Finally can nuke this function, no more users.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf config: Do not die when parsing u64 or int config values
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:44:58 +0000 (11:44 -0300)]
perf config: Do not die when parsing u64 or int config values

Just warn the user and ignore those values.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf tools: Replace error() with pr_err()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:22:31 +0000 (11:22 -0300)]
perf tools: Replace error() with pr_err()

To consolidate the error reporting facility.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agolightnvm: if LUNs are already allocated fix return
Rakesh Pandit [Tue, 27 Jun 2017 11:55:33 +0000 (14:55 +0300)]
lightnvm: if LUNs are already allocated fix return

While creating new device with NVM_DEV_CREATE if LUNs are already
allocated ioctl would return -ENOMEM which is wrong.  This patch
propagates -EBUSY from nvm_reserve_luns which is correct response.

Fixes: ade69e243 ("lightnvm: merge gennvm with core")
Reviewed-by: Frans Klaver <[email protected]>
Signed-off-by: Rakesh Pandit <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
7 years agoperf tools: Remove warning()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:13:20 +0000 (11:13 -0300)]
perf tools: Remove warning()

Now everything uses pr_warning(), so ditch it.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf event-parse: Use pr_warning()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:08:14 +0000 (11:08 -0300)]
perf event-parse: Use pr_warning()

Convert sole user of warning() in this file to pr_warning(),
consolidating error reporting facilities.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf config: Use pr_warning()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:03:17 +0000 (11:03 -0300)]
perf config: Use pr_warning()

warning() is going away, consolidating error reporting.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf help: Use pr_warning()
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 14:01:17 +0000 (11:01 -0300)]
perf help: Use pr_warning()

Complete the switch to using te pr_{warning,error,etc} error reporting
facilities.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoperf help: Elliminate dup code for reporting
Arnaldo Carvalho de Melo [Tue, 27 Jun 2017 13:59:28 +0000 (10:59 -0300)]
perf help: Elliminate dup code for reporting

And switch from warning() to pr_warning(), to elliminate another
duplication: too many error reporting facilities.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
7 years agoACPI: hns_dsaf_acpi_dsm_guid can be static
kbuild test robot [Thu, 22 Jun 2017 15:45:23 +0000 (23:45 +0800)]
ACPI: hns_dsaf_acpi_dsm_guid can be static

Signed-off-by: Fengguang Wu <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
This page took 0.138199 seconds and 4 git commands to generate.