Peter Zijlstra [Thu, 16 Mar 2017 12:47:51 +0000 (13:47 +0100)]
perf/core: Better explain the inherit magic
While going through the event inheritance code Oleg got confused.
Add some comments to better explain the silent dissapearance of
orphaned events.
So what happens is that at perf_event_release_kernel() time; when an
event looses its connection to userspace (and ceases to exist from the
user's perspective) we can still have an arbitrary amount of inherited
copies of the event. We want to synchronously find and remove all
these child events.
Since that requires a bit of lock juggling, there is the possibility
that concurrent clone()s will create new child events. Therefore we
first mark the parent event as DEAD, which marks all the extant child
events as orphaned.
We then avoid copying orphaned events; in order to avoid getting more
of them.
Peter Zijlstra [Thu, 16 Mar 2017 12:47:49 +0000 (13:47 +0100)]
perf/core: Fix event inheritance on fork()
While hunting for clues to a use-after-free, Oleg spotted that
perf_event_init_context() can loose an error value with the result
that fork() can succeed even though we did not fully inherit the perf
event context.
This patch closes the hole by making perf_event_free_task() destroy the
task <-> ctx relation such that perf_event_release_kernel() will no longer
observe the now dead task.
sched/deadline: Use deadline instead of period when calculating overflow
I was testing Daniel's changes with his test case, and tweaked it a
little. Instead of having the runtime equal to the deadline, I
increased the deadline ten fold.
The results were rather surprising. The behavior that Daniel's patch
was fixing came back. The task started using much more than .1% of the
CPU. More like 20%.
Looking into this I found that it was due to the dl_entity_overflow()
constantly returning true. That's because it uses the relative period
against relative runtime vs the absolute deadline against absolute
runtime.
runtime / (deadline - t) > dl_runtime / dl_period
There's even a comment mentioning this, and saying that when relative
deadline equals relative period, that the equation is the same as using
deadline instead of period. That comment is backwards! What we really
want is:
We care about if the runtime can make its deadline, not its period. And
then we can say "when the deadline equals the period, the equation is
the same as using dl_period instead of dl_deadline".
After correcting this, now when the task gets enqueued, it can throttle
correctly, and Daniel's fix to the throttling of sleeping deadline
tasks works even when the runtime and deadline are not the same.
sched/deadline: Throttle a constrained deadline task activated after the deadline
During the activation, CBS checks if it can reuse the current task's
runtime and period. If the deadline of the task is in the past, CBS
cannot use the runtime, and so it replenishes the task. This rule
works fine for implicit deadline tasks (deadline == period), and the
CBS was designed for implicit deadline tasks. However, a task with
constrained deadline (deadine < period) might be awakened after the
deadline, but before the next period. In this case, replenishing the
task would allow it to run for runtime / deadline. As in this case
deadline < period, CBS enables a task to run for more than the
runtime / period. In a very loaded system, this can cause a domino
effect, making other tasks miss their deadlines.
To avoid this problem, in the activation of a constrained deadline
task after the deadline but before the next period, throttle the
task and set the replenishing timer to the begin of the next period,
unless it is boosted.
Reproducer:
--------------- %< ---------------
int main (int argc, char **argv)
{
int ret;
int flags = 0;
unsigned long l = 0;
struct timespec ts;
struct sched_attr attr;
if (ret < 0) {
perror("sched_setattr");
exit(-1);
}
for(;;) {
/* XXX: you may need to adjust the loop */
for (l = 0; l < 150000; l++);
/*
* The ideia is to go to sleep right before the deadline
* and then wake up before the next period to receive
* a new replenishment.
*/
nanosleep(&ts, NULL);
}
exit(0);
}
--------------- >% ---------------
On my box, this reproducer uses almost 50% of the CPU time, which is
obviously wrong for a task with 2/2000 reservation.
sched/deadline: Make sure the replenishment timer fires in the next period
Currently, the replenishment timer is set to fire at the deadline
of a task. Although that works for implicit deadline tasks because the
deadline is equals to the begin of the next period, that is not correct
for constrained deadline tasks (deadline < period).
For instance:
f.c:
--------------- %< ---------------
int main (void)
{
for(;;);
}
--------------- >% ---------------
The task is throttled after running 492318 ms, as expected:
f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> watchdog/3:32 [0]
But then, the task is replenished 500719 ms after the first
replenishment:
<idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R ==> f:11295 [-1]
Running for 490277 ms:
f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> swapper/3:0 [120]
Hence, in the first period, the task runs 2 * runtime, and that is a bug.
During the first replenishment, the next deadline is set one period away.
So the runtime / period starts to be respected. However, as the second
replenishment took place in the wrong instant, the next replenishment
will also be held in a wrong instant of time. Rather than occurring in
the nth period away from the first activation, it is taking place
in the (nth period - relative deadline).
Niklas Cassel [Sat, 25 Feb 2017 00:17:53 +0000 (01:17 +0100)]
locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y
We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:
INFO: task libupnp:21868 blocked for more than 120 seconds.
libupnp D 0 21868 1 0x08100008
...
Call Trace:
__schedule()
schedule()
__down_read()
do_exit()
do_group_exit()
__wake_up_parent()
This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:
Matt Fleming [Fri, 17 Feb 2017 12:07:31 +0000 (12:07 +0000)]
sched/loadavg: Use {READ,WRITE}_ONCE() for sample window
'calc_load_update' is accessed without any kind of locking and there's
a clear assumption in the code that only a single value is read or
written.
Make this explicit by using READ_ONCE() and WRITE_ONCE(), and avoid
unintentionally seeing multiple values, or having the load/stores
split.
Technically the loads in calc_global_*() don't require this since
those are the only functions that update 'calc_load_update', but I've
added the READ_ONCE() for consistency.
Matt Fleming [Fri, 17 Feb 2017 12:07:30 +0000 (12:07 +0000)]
sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to
the pending sample window time on exit, setting the next update not
one window into the future, but two.
This has the effect of delaying load average updates for potentially
up to ~9seconds.
This can result in huge spikes in the load average values due to
per-cpu uninterruptible task counts being out of sync when accumulated
across all CPUs.
It's safe to update the per-cpu active count if we wake between sample
windows because any load that we left in 'calc_load_idle' will have
been zero'd when the idle load was folded in calc_global_load().
This issue is easy to reproduce before,
commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")
just by forking short-lived process pipelines built from ps(1) and
grep(1) in a loop. I'm unable to reproduce the spikes after that
commit, but the bug still seems to be present from code review.
The DL task will be migrated to a suitable later deadline rq once the DL
timer fires and currnet rq is offline. The rq clock of the new rq should
be updated. This patch fixes it by updating the rq clock after holding
the new rq's rq lock.
Linus Torvalds [Wed, 15 Mar 2017 23:54:58 +0000 (16:54 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Four small fixes for this cycle:
- followup fix from Neil for a fix that went in before -rc2, ensuring
that we always see the full per-task bio_list.
- fix for blk-mq-sched from me that ensures that we retain similar
direct-to-issue behavior on running the queue.
- fix from Sagi fixing a potential NULL pointer dereference in blk-mq
on spurious CPU unplug.
- a memory leak fix in writeback from Tahsin, fixing a case where
device removal of a mounted device can leak a struct
wb_writeback_work"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq-sched: don't run the queue async from blk_mq_try_issue_directly()
writeback: fix memory leak in wb_queue_work()
blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
blk: Ensure users for current->bio_list can see the full list.
parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
The previously submitted patch did not resolve the random segmentation
faults observed on the phantom buildd system. There are still
unresolved problems with the Debian 4.8 and 4.9 kernels on C8000.
The attached patch removes the flush of the offset map pages and does a
whole data cache flush for large ranges. No other arch flushes the
offset map in these routines as far as I can tell.
I have not observed any random segmentation faults on rp3440 in two
weeks of testing with 4.10.0 and 4.10.1.
Mikulas Patocka [Tue, 14 Mar 2017 15:47:29 +0000 (11:47 -0400)]
parisc: support R_PARISC_SECREL32 relocation in modules
The parisc kernel doesn't work with CONFIG_MODVERSIONS since the commit 71810db27c1c853b335675bee335d893bc3d324b. It can't load modules with the
error: "module unix: Unknown relocation: 41".
The commit changes __kcrctab from 64-bit valus to 32-bit values. The
assembler generates R_PARISC_SECREL32 secrel relocation for them and the
module loader doesn't support this relocation.
This patch adds the R_PARISC_SECREL32 relocation to the module loader.
Linus Torvalds [Wed, 15 Mar 2017 17:44:19 +0000 (10:44 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a rather large set of fixes. The bulk are for lpfc correcting
a lot of issues in the new NVME driver code which just went in in the
merge window.
The others are:
- fix a hang in the vmware paravirt driver caused by incorrect
handling of the new MSI vector allocation
- long standing bug in storvsc, which recent block changes turned
from being a harmless annoyance into a hang
- yet more fallout (in mpt3sas) from the changes to device blocking
The remainder are small fixes and updates"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (34 commits)
scsi: lpfc: Add shutdown method for kexec
scsi: storvsc: Workaround for virtual DVD SCSI version
scsi: lpfc: revise version number to 11.2.0.10
scsi: lpfc: code cleanups in NVME initiator discovery
scsi: lpfc: code cleanups in NVME initiator base
scsi: lpfc: correct rdp diag portnames
scsi: lpfc: remove dead sli3 nvme code
scsi: lpfc: correct double print
scsi: lpfc: Rename LPFC_MAX_EQ_DELAY to LPFC_MAX_EQ_DELAY_EQID_CNT
scsi: lpfc: Rework lpfc Kconfig for NVME options
scsi: lpfc: add transport eh_timed_out reference
scsi: lpfc: Fix eh_deadline setting for sli3 adapters.
scsi: lpfc: add NVME exchange aborts
scsi: lpfc: Fix nvme allocation bug on failed nvme_fc_register_localport
scsi: lpfc: Fix IO submission if WQ is full
scsi: lpfc: Fix NVME CMD IU byte swapped word 1 problem
scsi: lpfc: Fix RCTL value on NVME LS request and response
scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters
scsi: lpfc: fix missing spin_unlock on sql_list_lock
scsi: lpfc: don't dereference dma_buf->iocbq before null check
...
Linus Torvalds [Wed, 15 Mar 2017 16:33:15 +0000 (09:33 -0700)]
Merge tag 'gfs2-4.11-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fix from Bob Peterson:
"This is an emergency patch for 4.11-rc3
The GFS2 developers uncovered a really nasty problem that can lead to
random corruption and kernel panic, much like the last one. Andreas
Gruenbacher wrote a simple one-line patch to fix the problem."
* tag 'gfs2-4.11-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Avoid alignment hole in struct lm_lockname
cpufreq: intel_pstate: Avoid percentages in limits-related computations
Currently, intel_pstate_update_perf_limits() first converts the
policy minimum and maximum limits into percentages of the maximum
turbo frequency (rounding up to an integer) and then converts these
percentages to fractions (by using fixed-point arithmetic to divide
them by 100).
That introduces a rounding error unnecessarily, because the fractions
can be obtained by carrying out fixed-point divisions directly on the
input numbers.
Rework the computations in intel_pstate_hwp_set() to use fractions
instead of percentages (and drop redundant local variables from
there) and modify intel_pstate_update_perf_limits() to compute the
fractions directly and percentages out of them.
While at it, introduce percent_ext_fp() for converting percentages
to fractions (with extended number of fraction bits) and use it in
the computations.
Stafford Horne [Sun, 12 Mar 2017 22:41:55 +0000 (07:41 +0900)]
openrisc: xchg: fix `computed is not used` warning
When building allmodconfig this warning shows.
fs/ocfs2/file.c: In function 'ocfs2_file_write_iter':
./arch/openrisc/include/asm/cmpxchg.h:81:3: warning: value computed is
not used [-Wunused-value]
((typeof(*(ptr)))__xchg((unsigned long)(with), (ptr), sizeof(*(ptr))))
^
Applying the same patch logic that was done to the cmpxchg macro.
Commit 88ffbf3e03 switches to using rhashtables for glocks, hashing over
the entire struct lm_lockname instead of its individual fields. On some
architectures, struct lm_lockname contains a hole of uninitialized
memory due to alignment rules, which now leads to incorrect hash values.
Get rid of that hole.
Darrick J. Wong [Wed, 15 Mar 2017 07:24:25 +0000 (00:24 -0700)]
xfs: verify inline directory data forks
When we're reading or writing the data fork of an inline directory,
check the contents to make sure we're not overflowing buffers or eating
garbage data. xfs/348 corrupts an inline symlink into an inline
directory, triggering a buffer overflow bug.
Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
---
v2: add more checks consistent with _dir2_sf_check and make the verifier
usable from anywhere.
1) Ensure that mtu is at least IPV6_MIN_MTU in ipv6 VTI tunnel driver,
from Steffen Klassert.
2) Fix crashes when user tries to get_next_key on an LPM bpf map, from
Alexei Starovoitov.
3) Fix detection of VLAN fitlering feature for bnx2x VF devices, from
Michal Schmidt.
4) We can get a divide by zero when TCP socket are morphed into
listening state, fix from Eric Dumazet.
5) Fix socket refcounting bugs in skb_complete_wifi_ack() and
skb_complete_tx_timestamp(). From Eric Dumazet.
6) Use after free in dccp_feat_activate_values(), also from Eric
Dumazet.
7) Like bonding team needs to use ETH_MAX_MTU as netdev->max_mtu, from
Jarod Wilson.
8) Fix use after free in vrf_xmit(), from David Ahern.
9) Don't do UDP Fragmentation Offload on IPComp ipsec packets, from
Alexey Kodanev.
10) Properly check napi_complete_done() return value in order to decide
whether to re-enable IRQs or not in amd-xgbe driver, from Thomas
Lendacky.
11) Fix double free of hwmon device in marvell phy driver, from Andrew
Lunn.
12) Don't crash on malformed netlink attributes in act_connmark, from
Etienne Noss.
13) Don't remove routes with a higher metric in ipv6 ECMP route replace,
from Sabrina Dubroca.
14) Don't write into a cloned SKB in ipv6 fragmentation handling, from
Florian Westphal.
15) Fix routing redirect races in dccp and tcp, basically the ICMP
handler can't modify the socket's cached route in it's locked by the
user at this moment. From Jon Maxwell.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (108 commits)
qed: Enable iSCSI Out-of-Order
qed: Correct out-of-bound access in OOO history
qed: Fix interrupt flags on Rx LL2
qed: Free previous connections when releasing iSCSI
qed: Fix mapping leak on LL2 rx flow
qed: Prevent creation of too-big u32-chains
qed: Align CIDs according to DORQ requirement
mlxsw: reg: Fix SPVMLR max record count
mlxsw: reg: Fix SPVM max record count
net: Resend IGMP memberships upon peer notification.
dccp: fix memory leak during tear-down of unsuccessful connection request
tun: fix premature POLLOUT notification on tun devices
dccp/tcp: fix routing redirect race
ucc/hdlc: fix two little issue
vxlan: fix ovs support
net: use net->count to check whether a netns is alive or not
bridge: drop netfilter fake rtable unconditionally
ipv6: avoid write to a possibly cloned skb
net: wimax/i2400m: fix NULL-deref at probe
isdn/gigaset: fix NULL-deref at probe
...
Dave Airlie [Wed, 15 Mar 2017 01:32:46 +0000 (11:32 +1000)]
Merge tag 'drm-intel-fixes-2017-03-14' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.11-rc3
* tag 'drm-intel-fixes-2017-03-14' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Fix forcewake active domain tracking
drm/i915: Nuke skl_update_plane debug message from the pipe update critical section
drm/i915: use correct node for handling cache domain eviction
drm/i915: Drain the freed state from the tail of the next commit
drm/i915: Nuke debug messages from the pipe update critical section
drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl
drm/i915: Store a permanent error in obj->mm.pages
drm/i915: Move updating color management to before vblank evasion
drm/i915/gen9: Increase PCODE request timeout to 50ms
drm/i915: Avoid tweaking evaluation thresholds on Baytrail v3
drm/i915: Remove the vma from the drm_mm if binding fails
drm/i915/fbdev: Stop repeating tile configuration on stagnation
drm/i915/glk: Fix watermark computations for third sprite plane
drm/i915: Squelch any ktime/jiffie rounding errors for wait-ioctl
Dave Airlie [Wed, 15 Mar 2017 01:29:33 +0000 (11:29 +1000)]
Merge tag 'tilcdc-4.11-fixes' of https://github.com/jsarha/linux into drm-fixes
drm/tilcdc fixes for Linux v4.11
* tag 'tilcdc-4.11-fixes' of https://github.com/jsarha/linux:
drm/tilcdc: Set framebuffer DMA address to HW only if CRTC is enabled
drm/tilcdc: Fix hardcoded fail-return value in tilcdc_crtc_create()
Arnd Bergmann [Tue, 14 Mar 2017 21:27:11 +0000 (22:27 +0100)]
drm: amd: remove broken include path
The AMD ACP driver adds "-I../acp -I../acp/include" to the gcc command
line, which makes no sense, since these are evaluated relative to the
build directory. When we build with "make W=1", they instead cause
a warning:
cc1: error: ../acp/: No such file or directory [-Werror=missing-include-dirs]
cc1: error: ../acp/include: No such file or directory [-Werror=missing-include-dirs]
cc1: all warnings being treated as errors
../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_drv.o' failed
../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_device.o' failed
../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_kms.o' failed
This removes the subdir-ccflags variable that evidently did not
serve any purpose here.
Linus Torvalds [Tue, 14 Mar 2017 22:00:43 +0000 (15:00 -0700)]
Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Three libata fixes:
- fix for a circular reference bug in sysfs code which prevented
pata_legacy devices from being released after probe failure, which
in turn prevented devres from releasing the associated resources.
- drop spurious WARN in the command issue path which can be triggered
by a legitimate passthrough command.
- an ahci_qoriq specific fix"
* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: qoriq: correct the sata ecc setting error
libata: drop WARN from protocol error in ata_sff_qc_issue()
libata: transport: Remove circular dependency at free time
Linus Torvalds [Tue, 14 Mar 2017 21:52:08 +0000 (14:52 -0700)]
Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"If a delayed work is queued with NULL @wq, workqueue code explodes
after the timer expires at which point it's difficult to tell who the
culprit was.
This actually happened and the offender was net/smc this time.
Add an explicit sanity check for it in the queueing path"
* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq
Linus Torvalds [Tue, 14 Mar 2017 21:48:50 +0000 (14:48 -0700)]
Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fixes from Tejun Heo:
- the allocation path was updating pcpu_nr_empty_pop_pages without the
required locking which can lead to incorrect handling of empty chunks
(e.g. keeping too many around), which is buggy but shouldn't lead to
critical failures. Fixed by adding the locking
- a trivial patch to drop an unused param from pcpu_get_pages()
* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: remove unused chunk_alloc parameter from pcpu_get_pages()
percpu: acquire pcpu_lock when updating pcpu_nr_empty_pop_pages
Jiri Olsa [Tue, 14 Mar 2017 14:20:53 +0000 (15:20 +0100)]
x86/intel_rdt: Put group node in rdtgroup_kn_unlock
The rdtgroup_kn_unlock waits for the last user to release and put its
node. But it's calling kernfs_put on the node which calls the
rdtgroup_kn_unlock, which might not be the group's directory node, but
another group's file node.
This race could be easily reproduced by running 2 instances
of following script:
Josh Poimboeuf [Tue, 14 Mar 2017 04:27:47 +0000 (23:27 -0500)]
x86/unwind: Fix last frame check for aligned function stacks
Pavel Machek reported the following warning on x86-32:
WARNING: kernel stack frame pointer at f50cdf98 in swapper/2:0 has bad value (null)
The warning is caused by the unwinder not realizing that it reached the
end of the stack, due to an unusual prologue which gcc sometimes
generates for aligned stacks. The prologue is based on a gcc feature
called the Dynamic Realign Argument Pointer (DRAP). It's almost always
enabled for aligned stacks when -maccumulate-outgoing-args isn't set.
This issue is similar to the one fixed by the following commit:
8023e0e2a48d ("x86/unwind: Adjust last frame check for aligned function stacks")
... but that fix was specific to x86-64.
Make the fix more generic to cover x86-32 as well, and also ensure that
the return address referred to by the frame pointer is a copy of the
original return address.
Arnd Bergmann [Tue, 14 Mar 2017 12:12:53 +0000 (13:12 +0100)]
mm, x86: Fix native_pud_clear build error
We still get a build error in random configurations, after this has been
modified a few times:
In file included from include/linux/mm.h:68:0,
from include/linux/suspend.h:8,
from arch/x86/kernel/asm-offsets.c:12:
arch/x86/include/asm/pgtable.h:66:26: error: redefinition of 'native_pud_clear'
#define pud_clear(pud) native_pud_clear(pud)
My interpretation is that the build error comes from a typo in __PAGETABLE_PUD_FOLDED,
so fix that typo now, and remove the incorrect #ifdef around the native_pud_clear
definition.
David S. Miller [Tue, 14 Mar 2017 18:37:06 +0000 (11:37 -0700)]
Merge branch 'qed-fixes'
Yuval Mintz says:
====================
qed: Fixes series
This address several different issues in qed.
The more significant portions:
Patch #1 would cause timeout when qedr utilizes the highest
CIDs availble for it [or when future qede adapters would utilize
queues in some constellations].
Patch #4 fixes a leak of mapped addresses; When iommu is enabled,
offloaded storage protocols might eventually run out of resources
and fail to map additional buffers.
Patches #6,#7 were missing in the initial iSCSI infrastructure
submissions, and would hamper qedi's stability when it reaches
out-of-order scenarios.
====================
Mintz, Yuval [Tue, 14 Mar 2017 13:26:04 +0000 (15:26 +0200)]
qed: Enable iSCSI Out-of-Order
Missing in the initial submission, qed fails to propagate qedi's
request to enable OOO to firmware.
Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI") Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Mintz, Yuval [Tue, 14 Mar 2017 13:26:03 +0000 (15:26 +0200)]
qed: Correct out-of-bound access in OOO history
Need to set the number of entries in database, otherwise the logic
would quickly surpass the array.
Fixes: 1d6cff4fca43 ("qed: Add iSCSI out of order packet handling") Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Ram Amrani [Tue, 14 Mar 2017 13:26:02 +0000 (15:26 +0200)]
qed: Fix interrupt flags on Rx LL2
Before iterating over the the LL2 Rx ring, the ring's
spinlock is taken via spin_lock_irqsave().
The actual processing of the packet [including handling
by the protocol driver] is done without said lock,
so qed releases the spinlock and re-claims it afterwards.
Problem is that the final spin_lock_irqrestore() at the end
of the iteration uses the original flags saved from the
initial irqsave() instead of the flags from the most recent
irqsave(). So it's possible that the interrupt status would
be incorrect at the end of the processing.
Mintz, Yuval [Tue, 14 Mar 2017 13:26:01 +0000 (15:26 +0200)]
qed: Free previous connections when releasing iSCSI
Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI") Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Tomer Tayar [Tue, 14 Mar 2017 13:25:59 +0000 (15:25 +0200)]
qed: Prevent creation of too-big u32-chains
Current Logic would allow the creation of a chain with U32_MAX + 1
elements, when the actual maximum supported by the driver infrastructure
is U32_MAX.
Ram Amrani [Tue, 14 Mar 2017 13:25:58 +0000 (15:25 +0200)]
qed: Align CIDs according to DORQ requirement
The Doorbell HW block can be configured at a granularity
of 16 x CIDs, so we need to make sure that the actual number
of CIDs configured would be a multiplication of 16.
Today, when RoCE is enabled - given that the number is unaligned,
doorbelling the higher CIDs would fail to reach the firmware and
would eventually timeout.
Fixes: dbb799c39717 ("qed: Initialize hardware for new protocols") Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Jiri Pirko [Tue, 14 Mar 2017 13:00:00 +0000 (14:00 +0100)]
mlxsw: reg: Fix SPVM max record count
The num_rec field is 8 bit, so the maximal count number is 255. This
fixes vlans not being enabled for wider ranges than 255.
Fixes: b2e345f9a454 ("mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Vlad Yasevich [Tue, 14 Mar 2017 12:58:08 +0000 (08:58 -0400)]
net: Resend IGMP memberships upon peer notification.
When we notify peers of potential changes, it's also good to update
IGMP memberships. For example, during VM migration, updating IGMP
memberships will redirect existing multicast streams to the VM at the
new location.
The setup/remove_state/instance() functions in the hotplug core code are
serialized against concurrent CPU hotplug, but unfortunately not serialized
against themself.
As a consequence a concurrent invocation of these function results in
corruption of the callback machinery because two instances try to invoke
callbacks on remote cpus at the same time. This results in missing callback
invocations and initiator threads waiting forever on the completion.
The obvious solution to replace get_cpu_online() with cpu_hotplug_begin()
is not possible because at least one callsite calls into these functions
from a get_online_cpu() locked region.
Extend the protection scope of the cpuhp_state_mutex from solely protecting
the state arrays to cover the callback invocation machinery as well.
Jens Axboe [Tue, 14 Mar 2017 17:51:59 +0000 (11:51 -0600)]
blk-mq-sched: don't run the queue async from blk_mq_try_issue_directly()
If we have scheduling enabled, we jump directly to insert-and-run.
That's fine, but we run the queue async and we don't pass in information
on whether we can block from this context or not. Fixup both these
cases.
Song Liu [Mon, 13 Mar 2017 20:44:35 +0000 (13:44 -0700)]
md/r5cache: fix set_syndrome_sources() for data in cache
Before this patch, device InJournal will be included in prexor
(SYNDROME_SRC_WANT_DRAIN) but not in reconstruct (SYNDROME_SRC_WRITTEN). So it
will break parity calculation. With srctype == SYNDROME_SRC_WRITTEN, we need
include both dev with non-null ->written and dev with R5_InJournal. This fixes
logic in 1e6d690(md/r5cache: caching phase of r5cache)
Jyri Sarha [Wed, 1 Mar 2017 08:30:28 +0000 (10:30 +0200)]
drm/tilcdc: Set framebuffer DMA address to HW only if CRTC is enabled
Touching HW while clocks are off is a serious error and for instance
breaks suspend functionality. After this patch tilcdc_crtc_update_fb()
always updates the primary plane's framebuffer pointer, increases fb's
reference count and stores vblank event. tilcdc_crtc_update_fb() only
writes the fb's DMA address to HW if the crtc is enabled, as
tilcdc_crtc_enable() takes care of writing the address on enable.
This patch also refactors the tilcdc_crtc_update_fb() a bit. Number of
subsequent small changes had made it almost unreadable. There should
be no other functional changes but checking the CRTC's enable
state. However, the locking goes a bit differently and some of the
redundant checks have been removed in this new version.
The enable_lock should be enough to protect the access to
tilcdc_crtc->enabled. The irq_lock protects the access to last_vblank
and next_fb. The check for vrefresh and last_vblank being valid is
redundant, as the vrefresh should be always valid if the CRTC is
enabled and now last_vblank should be too, because it is initialized
to current time when CRTC raster is enabled. If for some reason the
values are not correctly initialized the division by zero warning is
quite appropriate.
dccp: fix memory leak during tear-down of unsuccessful connection request
This patch fixes a memory leak, which happens if the connection request
is not fulfilled between parsing the DCCP options and handling the SYN
(because e.g. the backlog is full), because we forgot to free the
list of ack vectors.
tun: fix premature POLLOUT notification on tun devices
aszlig observed failing ssh tunnels (-w) during initialization since
commit cc9da6cc4f56e0 ("ipv6: addrconf: use stable address generator for
ARPHRD_NONE"). We already had reports that the mentioned commit breaks
Juniper VPN connections. I can't clearly say that the Juniper VPN client
has the same problem, but it is worth a try to hint to this patch.
Because of the early generation of link local addresses, the kernel now
can start asking for routers on the local subnet much earlier than usual.
Those router solicitation packets arrive inside the ssh channels and
should be transmitted to the tun fd before the configuration scripts
might have upped the interface and made it ready for transmission.
ssh polls on the interface and receives back a POLL_OUT. It tries to send
the earily router solicitation packet to the tun interface. Unfortunately
it hasn't been up'ed yet by config scripts, thus failing with -EIO. ssh
doesn't retry again and considers the tun interface broken forever.
Jon Maxwell [Fri, 10 Mar 2017 05:40:33 +0000 (16:40 +1100)]
dccp/tcp: fix routing redirect race
As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.
We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:
But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.
All the vmcores showed 2 significant clues:
- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.
- All vmcores showed a postitive LockDroppedIcmps value, e.g:
LockDroppedIcmps 267
A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:
do_redirect()->__sk_dst_check()-> dst_release().
Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.
To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.
The dccp/IPv6 code is very similar in this respect, so fixing it there too.
As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().
Zhao Qiang [Tue, 14 Mar 2017 01:38:33 +0000 (09:38 +0800)]
ucc/hdlc: fix two little issue
1. modify bd_status from u32 to u16 in function hdlc_rx_done,
because bd_status register is 16bits
2. write bd_length register before writing bd_status register
cpufreq: intel_pstate: Correct frequency setting in the HWP mode
In the functions intel_pstate_hwp_set(), min/max range from HWP capability
MSR along with max_perf_pct and min_perf_pct, is used to set the HWP
request MSR. In some cases this doesn't result in the correct HWP max/min
in HWP request.
For example: In the following case:
HWP capabilities from MSR 0x771
0x70a1220
Here cpufreq min/max frequencies from above MSR dump are 700MHz and 3.2GHz
respectively.
This will result in
hwp_min = 0x07
hwp_max = 0x20
To limit max frequency to 2GHz:
perf_limits->max_perf_pct = 63 (2GHz as a percent of 3.2GHz rounded up)
With the current calculation:
adj_range = max_perf_pct * range / 100;
adj_range = 63 * (32 - 7) / 100
adj_range = 15
max = hw_min + adj_range;
max = 7 + 15 = 22
This will result in HWP request of 0x160f, which will result in a
frequency cap of 2.2GHz not 2GHz.
The problem with the above calculation is that hwp_min of 7 is treated
as 0% in the range. But max_perf_pct is calculated with respect to minimum
as 0 and max as 3.2GHz or hwp_max, so adding hwp_min to it will result in
more than the desired.
Since the min_perf_pct and max_perf_pct is already a percent of max
frequency or hwp_max, this min/max HWP request value can be calculated
directly applying these percentage to hwp_max.
Linus Torvalds [Tue, 14 Mar 2017 02:48:22 +0000 (19:48 -0700)]
Merge tag 'powerpc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull some more powerpc fixes from Michael Ellerman:
"The main item is the addition of the Power9 Machine Check handler.
This was delayed to make sure some details were correct, and is as
minimal as possible.
The rest is small fixes, two for the Power9 PMU, two dealing with
obscure toolchain problems, two for the PowerNV IOMMU code (used by
VFIO), and one to fix a crash on 32-bit machines with macio devices
due to missing dma_ops.
Thanks to:
Alexey Kardashevskiy, Cyril Bur, Larry Finger, Madhavan Srinivasan,
Nicholas Piggin"
* tag 'powerpc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: POWER9 machine check handler
powerpc/64s: allow machine check handler to set severity and initiator
powerpc/64s: fix handling of non-synchronous machine checks
powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops
powerpc/powernv/ioda2: Update iommu table base on ownership change
powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
selftests/powerpc: Replace stxvx and lxvx with stxvd2x/lxvd2x
powerpc/perf: Handle sdar_mode for marked event in power9
powerpc/perf: Fix perf_get_data_addr() for power9 DD1
powerpc/boot: Fix zImage TOC alignment
Nicolas Dichtel [Mon, 13 Mar 2017 15:24:03 +0000 (16:24 +0100)]
vxlan: fix ovs support
The required changes in the function vxlan_dev_create() were missing
in commit 8bcdc4f3a20b.
The vxlan device is not registered anymore after this patch and the error
path causes an stack dump:
WARNING: CPU: 3 PID: 1498 at net/core/dev.c:6713 rollback_registered_many+0x9d/0x3f0
Andrey Vagin [Mon, 13 Mar 2017 04:36:18 +0000 (21:36 -0700)]
net: use net->count to check whether a netns is alive or not
The previous idea was to check whether a net namespace is in
net_exit_list or not. It doesn't work, because net->exit_list is used in
__register_pernet_operations and __unregister_pernet_operations where
all namespaces are added to a temporary list to make cleanup in a error
case, so list_empty(&net->exit_list) always returns false.
Reported-by: Mantas Mikulėnas <[email protected]> Fixes: 002d8a1a6c11 ("net: skip genenerating uevents for network namespaces that are exiting") Signed-off-by: Andrei Vagin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Andrey Ryabinin [Mon, 13 Mar 2017 16:33:37 +0000 (19:33 +0300)]
x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y
The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to crash.
Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().
cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()
Fix the debugfs interface for PID tuning to actually update
pid_params.sample_rate_ns on PID parameters updates, as changing
pid_params.sample_rate_ms via debugfs has no effect now.
Fixes: a4675fbc4a7a (cpufreq: intel_pstate: Replace timers with utilization update callbacks) Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
Florian Westphal [Mon, 13 Mar 2017 16:38:17 +0000 (17:38 +0100)]
bridge: drop netfilter fake rtable unconditionally
Andreas reports kernel oops during rmmod of the br_netfilter module.
Hannes debugged the oops down to a NULL rt6info->rt6i_indev.
Problem is that br_netfilter has the nasty concept of adding a fake
rtable to skb->dst; this happens in a br_netfilter prerouting hook.
A second hook (in bridge LOCAL_IN) is supposed to remove these again
before the skb is handed up the stack.
However, on module unload hooks get unregistered which means an
skb could traverse the prerouting hook that attaches the fake_rtable,
while the 'fake rtable remove' hook gets removed from the hooklist
immediately after.
Florian Westphal [Mon, 13 Mar 2017 15:24:28 +0000 (16:24 +0100)]
ipv6: avoid write to a possibly cloned skb
ip6_fragment, in case skb has a fraglist, checks if the
skb is cloned. If it is, it will move to the 'slow path' and allocates
new skbs for each fragment.
However, right before entering the slowpath loop, it updates the
nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
to account for the fragment header that will be inserted in the new
ipv6-fragment skbs.
In case original skb is cloned this munges nexthdr value of another
skb. Avoid this by doing the nexthdr update for each of the new fragment
skbs separately.
This was observed with tcpdump on a bridge device where netfilter ipv6
reassembly is active: tcpdump shows malformed fragment headers as
the l4 header (icmpv6, tcp, etc). is decoded as a fragment header.
Mike Travis [Tue, 7 Mar 2017 21:08:42 +0000 (15:08 -0600)]
x86/platform: Remove warning message for duplicate NMI handlers
Remove the WARNING message associated with multiple NMI handlers as
there are at least two that are legitimate. These are the KGDB and the
UV handlers and both want to be called if the NMI has not been claimed
by any other NMI handler.
Use of the UNKNOWN NMI call chain dramatically lowers the NMI call rate
when high frequency NMI tools are in use, notably the perf tools. It is
required on systems that cannot sustain a high NMI call rate without
adversely affecting the system operation.
Johan Hovold [Mon, 13 Mar 2017 12:42:03 +0000 (13:42 +0100)]
net: wimax/i2400m: fix NULL-deref at probe
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.
The endpoints are specifically dereferenced in the i2400m_bootrom_init
path during probe (e.g. in i2400mu_tx_bulk_out).
Fixes: f398e4240fce ("i2400m/USB: probe/disconnect, dev init/shutdown
and reset backends") Cc: Inaky Perez-Gonzalez <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Sabrina Dubroca [Mon, 13 Mar 2017 12:28:09 +0000 (13:28 +0100)]
ipv6: make ECMP route replacement less greedy
Commit 27596472473a ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because after
this nsiblings < 0.
Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.
Reproducer:
===========
#!/bin/sh
ip netns add ns1
ip netns add ns2
ip -net ns1 link set lo up
for x in 0 1 2 ; do
ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
ip -net ns1 link set eth$x up
ip -net ns2 link set veth$x up
done
ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048
echo "before replace, 3 routes"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
echo
ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2
echo "after replace, only 2 routes, metric 2048 is gone"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
Peter Zijlstra [Mon, 13 Mar 2017 14:57:12 +0000 (15:57 +0100)]
x86/tsc: Fix ART for TSC_KNOWN_FREQ
Subhransu reported that convert_art_to_tsc() isn't working for him.
The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefor the ART to TSC relation
is never established.
Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.
The problem has been there since the ART support was introdduced, but only
detected now because Subhransu tested the first time on hardware which has
TSC frequency enumerated via CPUID 15.
Note for stable: The conditional has changed from TSC_RELIABLE to
TSC_KNOWN_FREQUENCY.
[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]
Elena Reshetova [Mon, 6 Mar 2017 14:21:16 +0000 (16:21 +0200)]
drivers, xen: convert grant_map.users from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Tvrtko Ursulin [Fri, 10 Mar 2017 09:32:49 +0000 (09:32 +0000)]
drm/i915: Fix forcewake active domain tracking
In commit 003342a50021 ("drm/i915: Keep track of active
forcewake domains in a bitmask") I forgot to adjust the
newly introduce fw_domains_active state across reset.
This caused the assert_forcewakes_inactive to trigger
during suspend and resume if there were user held
forcewakes.
v2: Bitmask checks are required since vfuncs are not
always present.
v3: Move bitmask tracking to get/put vfunc for simplicity.
(Chris Wilson)
drm/i915: Nuke skl_update_plane debug message from the pipe update critical section
printks are slow so we should not be doing them from the vblank evade
critical section. These could explain why we sometimes seem to
blow past our 100 usec deadline.
The problem has been there ever since commit c331879ce8ea ("drm/i915:
skylake sprite plane scaling using shared scalers.") but it may not have
been readily visible until commit e1edbd44e23b ("drm/i915: Complain
if we take too long under vblank evasion.") increased our chances
of noticing it.
Matthew Auld [Mon, 6 Mar 2017 23:54:01 +0000 (23:54 +0000)]
drm/i915: use correct node for handling cache domain eviction
It looks like we were incorrectly comparing vma->node against itself
instead of the target node, when evicting for a node on systems where we
need guard pages between regions with different cache domains. As a
consequence we can end up trying to needlessly evict neighbouring nodes,
even if they have the same cache domain, and if they were pinned we
would fail the eviction.
Kinglong Mee [Mon, 6 Mar 2017 14:29:14 +0000 (22:29 +0800)]
NFSv4: fix a reference leak caused WARNING messages
Because nfs4_opendata_access() has close the state when access is denied,
so the state isn't leak.
Rather than revert the commit a974deee47, I'd like clean the strange state close.
Fixes: a974deee47 ("NFSv4: Fix memory and state leak in...") Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Tahsin Erdogan [Fri, 10 Mar 2017 20:09:49 +0000 (12:09 -0800)]
writeback: fix memory leak in wb_queue_work()
When WB_registered flag is not set, wb_queue_work() skips queuing the
work, but does not perform the necessary clean up. In particular, if
work->auto_free is true, it should free the memory.
The leak condition can be reprouced by following these steps:
mount /dev/sdb /mnt/sdb
/* In qemu console: device_del sdb */
umount /dev/sdb
Above will result in a wb_queue_work() call on an unregistered wb and
thus leak memory.
Sagi Grimberg [Mon, 13 Mar 2017 14:10:11 +0000 (16:10 +0200)]
blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
In case cpu was unplugged, we need to make sure not to assume
that the tags for that cpu are still allocated. so check
for null tags when reinitializing a tagset.
Consistently use types from linux/types.h like in other uapi drm/*_drm.h
header files to fix the following drm/omap_drm.h userspace compilation
errors:
/usr/include/drm/omap_drm.h:36:2: error: unknown type name 'uint64_t'
uint64_t param; /* in */
/usr/include/drm/omap_drm.h:37:2: error: unknown type name 'uint64_t'
uint64_t value; /* in (set_param), out (get_param) */
/usr/include/drm/omap_drm.h:56:2: error: unknown type name 'uint32_t'
uint32_t bytes; /* (for non-tiled formats) */
/usr/include/drm/omap_drm.h:58:3: error: unknown type name 'uint16_t'
uint16_t width;
/usr/include/drm/omap_drm.h:59:3: error: unknown type name 'uint16_t'
uint16_t height;
/usr/include/drm/omap_drm.h:65:2: error: unknown type name 'uint32_t'
uint32_t flags; /* in */
/usr/include/drm/omap_drm.h:66:2: error: unknown type name 'uint32_t'
uint32_t handle; /* out */
/usr/include/drm/omap_drm.h:67:2: error: unknown type name 'uint32_t'
uint32_t __pad;
/usr/include/drm/omap_drm.h:77:2: error: unknown type name 'uint32_t'
uint32_t handle; /* buffer handle (in) */
/usr/include/drm/omap_drm.h:78:2: error: unknown type name 'uint32_t'
uint32_t op; /* mask of omap_gem_op (in) */
/usr/include/drm/omap_drm.h:82:2: error: unknown type name 'uint32_t'
uint32_t handle; /* buffer handle (in) */
/usr/include/drm/omap_drm.h:83:2: error: unknown type name 'uint32_t'
uint32_t op; /* mask of omap_gem_op (in) */
/usr/include/drm/omap_drm.h:88:2: error: unknown type name 'uint32_t'
uint32_t nregions;
/usr/include/drm/omap_drm.h:89:2: error: unknown type name 'uint32_t'
uint32_t __pad;
/usr/include/drm/omap_drm.h:93:2: error: unknown type name 'uint32_t'
uint32_t handle; /* buffer handle (in) */
/usr/include/drm/omap_drm.h:94:2: error: unknown type name 'uint32_t'
uint32_t pad;
/usr/include/drm/omap_drm.h:95:2: error: unknown type name 'uint64_t'
uint64_t offset; /* mmap offset (out) */
/usr/include/drm/omap_drm.h:102:2: error: unknown type name 'uint32_t'
uint32_t size; /* virtual size for mmap'ing (out) */
/usr/include/drm/omap_drm.h:103:2: error: unknown type name 'uint32_t'
uint32_t __pad;
Fixes: ef6503e89194 ("drm: Kbuild: add omap_drm.h to the installed headers") Signed-off-by: Dmitry V. Levin <[email protected]> Signed-off-by: Tomi Valkeinen <[email protected]>
Tomi Valkeinen [Tue, 28 Feb 2017 08:11:45 +0000 (10:11 +0200)]
drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
omap_gem_dmabuf_mmap() returns an error (with a WARN) when called for a
buffer which is allocated with dma_alloc_*(). This prevents dmabuf mmap
from working on SoCs without DMM, e.g. AM4 and OMAP3.
I could not find any reason for omap_gem_dmabuf_mmap() rejecting such
buffers, and just removing the if() fixes the limitation.
Daniel Borkmann [Sat, 11 Mar 2017 15:55:49 +0000 (16:55 +0100)]
bpf: improve read-only handling
Improve bpf_{prog,jit_binary}_{un,}lock_ro() by throwing a
one-time warning in case of an error when the image couldn't
be set read-only, and also mark struct bpf_prog as locked when
bpf_prog_lock_ro() was called.
Reason for the latter is that bpf_prog_unlock_ro() is called from
various places including error paths, and we shouldn't mess with
page attributes when really not needed.
For bpf_jit_binary_unlock_ro() this is not needed as jited flag
implicitly indicates this, thus for archs with ARCH_HAS_SET_MEMORY
we're guaranteed to have a previously locked image. Overall, this
should also help us to identify any further potential issues with
set_memory_*() helpers.