Eric Dumazet [Wed, 2 May 2018 17:03:30 +0000 (10:03 -0700)]
net_sched: fq: take care of throttled flows before reuse
Normally, a socket can not be freed/reused unless all its TX packets
left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
this to happen.
With commit fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for
reused flows") we cleared f->time_next_packet but took no special
action if the flow was still in the throttled rb-tree.
Since f->time_next_packet is the key used in the rb-tree searches,
blindly clearing it might break rb-tree integrity. We need to make
sure the flow is no longer in the rb-tree to avoid this problem.
Fixes: fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for reused flows") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Ido Schimmel [Wed, 2 May 2018 19:41:56 +0000 (22:41 +0300)]
ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"
This reverts commit edd7ceb78296 ("ipv6: Allow non-gateway ECMP for
IPv6").
Eric reported a division by zero in rt6_multipath_rebalance() which is
caused by above commit that considers identical local routes to be
siblings. The division by zero happens because a nexthop weight is not
set for local routes.
Revert the commit as it does not fix a bug and has side effects.
To reproduce:
# ip -6 address add 2001:db8::1/64 dev dummy0
# ip -6 address add 2001:db8::1/64 dev dummy1
Fix three section mismatches:
1) Section mismatch in reference from the function ioread8() to the
function .init.text:pcibios_init_bridge()
2) Section mismatch in reference from the function free_initmem() to the
function .init.text:map_pages()
3) Section mismatch in reference from the function ccio_ioc_init() to
the function .init.text:count_parisc_driver()
Fix two section mismatches in drivers.c:
1) Section mismatch in reference from the function alloc_tree_node() to
the function .init.text:create_tree_node().
2) Section mismatch in reference from the function walk_native_bus() to
the function .init.text:alloc_pa_dev().
Daniel Borkmann [Wed, 2 May 2018 18:12:23 +0000 (20:12 +0200)]
bpf, x64: fix memleak when not converging on calls
The JIT logic in jit_subprogs() is as follows: for all subprogs we
allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here),
and pass it to bpf_int_jit_compile(). If a failure occurred during
JIT and prog->jited is not set, then we bail out from attempting to
JIT the whole program, and punt to the interpreter instead. In case
JITing went successful, we fixup BPF call offsets and do another
pass to bpf_int_jit_compile() (extra_pass is true at that point) to
complete JITing calls. Given that requires to pass JIT context around
addrs and jit_data from x86 JIT are freed in the extra_pass in
bpf_int_jit_compile() when calls are involved (if not, they can
be freed immediately). However, if in the original pass, the JIT
image didn't converge then we leak addrs and jit_data since image
itself is NULL, the prog->is_func is set and extra_pass is false
in that case, meaning both will become unreachable and are never
cleaned up, therefore we need to free as well on !image. Only x64
JIT is affected.
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
Daniel Borkmann [Wed, 2 May 2018 18:12:22 +0000 (20:12 +0200)]
bpf, x64: fix memleak when not converging after image
While reviewing x64 JIT code, I noticed that we leak the prior allocated
JIT image in the case where proglen != oldproglen during the JIT passes.
Prior to the commit e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT
compiler") we would just break out of the loop, and using the image as the
JITed prog since it could only shrink in size anyway. After e0ee9c12157d,
we would bail out to out_addrs label where we free addrs and jit_data but
not the image coming from bpf_jit_binary_alloc().
Ursula Braun [Wed, 2 May 2018 14:53:56 +0000 (16:53 +0200)]
net/smc: restrict non-blocking connect finish
The smc_poll code tries to finish connect() if the socket is in
state SMC_INIT and polling of the internal CLC-socket returns with
EPOLLOUT. This makes sense for a select/poll call following a connect
call, but not without preceding connect().
With this patch smc_poll starts connect logic only, if the CLC-socket
is no longer in its initial state TCP_CLOSE.
In addition, a poll error on the internal CLC-socket is always
propagated to the SMC socket.
With this patch the code path mentioned by syzbot
https://syzkaller.appspot.com/bug?extid=03faa2dc16b8b64be396
is no longer possible.
Darrick J. Wong [Tue, 17 Apr 2018 06:07:36 +0000 (23:07 -0700)]
xfs: cap the length of deduplication requests
Since deduplication potentially has to read in all the pages in both
files in order to compare the contents, cap the deduplication request
length at MAX_RW_COUNT/2 (roughly 1GB) so that we have /some/ upper bound
on the request length and can't just lock up the kernel forever. Found
by running generic/304 after commit 1ddae54555b62 ("common/rc: add
missing 'local' keywords").
Rasmus Villemoes [Thu, 22 Mar 2018 21:05:23 +0000 (22:05 +0100)]
modpost: delete stale comment
Commit 7840fea200cd ("kbuild: Fix computing srcversion for modules")
fixed the comment above parse_source_files to refer to the new source_
line, but left this one behind that could still give the impression that
drivers/net/dummy.c appears in the deps_ variable.
Xin Long [Wed, 2 May 2018 05:45:12 +0000 (13:45 +0800)]
sctp: fix the issue that the cookie-ack with auth can't get processed
When auth is enabled for cookie-ack chunk, in sctp_inq_pop, sctp
processes auth chunk first, then continues to the next chunk in
this packet if chunk_end + chunk_hdr size < skb_tail_pointer().
Otherwise, it will go to the next packet or discard this chunk.
However, it missed the fact that cookie-ack chunk's size is equal
to chunk_hdr size, which couldn't match that check, and thus this
chunk would not get processed.
This patch fixes it by changing the check to chunk_end + chunk_hdr
size <= skb_tail_pointer().
Fixes: 26b87c788100 ("net: sctp: fix remote memory pressure from excessive queueing") Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Xin Long [Wed, 2 May 2018 05:39:46 +0000 (13:39 +0800)]
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
When processing a duplicate cookie-echo chunk, for case 'D', sctp will
not process the param from this chunk. It means old asoc has nothing
to be updated, and the new temp asoc doesn't have the complete info.
So there's no reason to use the new asoc when creating the cookie-ack
chunk. Otherwise, like when auth is enabled for cookie-ack, the chunk
can not be set with auth, and it will definitely be dropped by peer.
This issue is there since very beginning, and we fix it by using the
old asoc instead.
Xin Long [Wed, 2 May 2018 05:37:44 +0000 (13:37 +0800)]
sctp: init active key for the new asoc in dupcook_a and dupcook_b
When processing a duplicate cookie-echo chunk, for case 'A' and 'B',
after sctp_process_init for the new asoc, if auth is enabled for the
cookie-ack chunk, the active key should also be initialized.
Otherwise, the cookie-ack chunk made later can not be set with auth
shkey properly, and a crash can even be caused by this, as after
Commit 1b1e0bc99474 ("sctp: add refcnt support for sh_key"), sctp
needs to hold the shkey when making control chunks.
Neal Cardwell [Wed, 2 May 2018 01:45:41 +0000 (21:45 -0400)]
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Previously the bbr->idle_restart tracking was zeroing out the
bbr->idle_restart bit upon ACKs that did not SACK or ACK anything,
e.g. receiving incoming data or receiver window updates. In such
situations BBR would forget that this was a restart-from-idle
situation, and if the min_rtt had expired it would unnecessarily enter
PROBE_RTT (even though we were actually restarting from idle but had
merely forgotten that fact).
The fix is simple: we need to remember we are restarting from idle
until we receive a S/ACK for some data (a S/ACK for the first flight
of data we send as we are restarting).
This commit is a stable candidate for kernels back as far as 4.9.
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
In dual_mac mode packets arrived on one port should not be forwarded by
switch hw to another port. Only Linux Host can forward packets between
ports. The below test case (reported in [1]) shows that packet arrived on
one port can be leaked to anoter (reproducible with dual port evms):
- connect port 1 (eth0) to linux Host 0 and run tcpdump or Wireshark
- connect port 2 (eth1) to linux Host 1 with vlan 1 configured
- ping <IPx> from Host 1 through vlan 1 interface.
ARP packets will be seen on Host 0.
Issue happens because dual_mac mode is implemnted using two vlans: 1 (Port
1+Port 0) and 2 (Port 2+Port 0), so there are vlan records created for for
each vlan. By default, the ALE will find valid vlan record in its table
when vlan 1 tagged packet arrived on Port 2 and so forwards packet to all
ports which are vlan 1 members (like Port.
To avoid such behaviorr the ALE VLAN ID Ingress Check need to be enabled
for each external CPSW port (ALE_PORTCTLn.VID_INGRESS_CHECK) so ALE will
drop ingress packets if Rx port is not VLAN member.
Thomas Gleixner [Mon, 30 Apr 2018 19:47:46 +0000 (21:47 +0200)]
x86/cpu: Restore CPUID_8000_0008_EBX reload
The recent commt which addresses the x86_phys_bits corruption with
encrypted memory on CPUID reload after a microcode update lost the reload
of CPUID_8000_0008_EBX as well.
As a consequence IBRS and IBRS_FW are not longer detected
Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX
back. This restore has a twist due to the convoluted way the cpuid analysis
works:
CPUID_8000_0008_EBX is used by AMD to enumerate IBRB, IBRS, STIBP. On Intel
EBX is not used. But the speculation control code sets the AMD bits when
running on Intel depending on the Intel specific speculation control
bits. This was done to use the same bits for alternatives.
The change which moved the 8000_0008 evaluation out of get_cpu_cap() broke
this nasty scheme due to ordering. So that on Intel the store to
CPUID_8000_0008_EBX clears the IBRB, IBRS, STIBP bits which had been set
before by software.
So the actual CPUID_8000_0008_EBX needs to go back to the place where it
was and the phys/virt address space calculation cannot touch it.
In hindsight this should have used completely synthetic bits for IBRB,
IBRS, STIBP instead of reusing the AMD bits, but that's for 4.18.
/me needs to find time to cleanup that steaming pile of ...
Peter Zijlstra [Mon, 30 Apr 2018 10:00:13 +0000 (12:00 +0200)]
clocksource: Consistent de-rate when marking unstable
When a registered clocksource gets marked unstable the watchdog_kthread
will de-rate and re-select the clocksource. Ensure it also de-rates
when getting called on an unregistered clocksource.
Peter Zijlstra [Mon, 30 Apr 2018 10:00:12 +0000 (12:00 +0200)]
x86/tsc: Fix mark_tsc_unstable()
mark_tsc_unstable() also needs to affect tsc_early, Now that
clocksource_mark_unstable() can be used on a clocksource irrespective of
its registration state, use it on both tsc_early and tsc.
This does however require cs->list to be initialized empty, otherwise it
cannot tell the registation state before registation.
Peter Zijlstra [Mon, 30 Apr 2018 10:00:11 +0000 (12:00 +0200)]
clocksource: Initialize cs->wd_list
A number of places relies on list_empty(&cs->wd_list), however the
list_head does not get initialized. Do so upon registration, such that
thereafter it is possible to rely on list_empty() correctly reflecting
the list membership status.
Peter Zijlstra [Mon, 23 Apr 2018 15:28:55 +0000 (17:28 +0200)]
clocksource: Allow clocksource_mark_unstable() on unregistered clocksources
Because of how the code flips between tsc-early and tsc clocksources
it might need to mark one or both unstable. The current code in
mark_tsc_unstable() only worked because previously it registered the
tsc clocksource once and then never touched it.
Since it now unregisters the tsc-early clocksource, it needs to know
if a clocksource got unregistered and the current cs->mult test
doesn't work for that. Instead use list_empty(&cs->list) to test for
registration.
Furthermore, since clocksource_mark_unstable() needs to place the cs
on the wd_list, it links the cs->list and cs->wd_list serialization.
It must not see a clocsource registered (!empty cs->list) but already
past dequeue_watchdog(). So place {en,de}queue{,_watchdog}() under the
same lock.
Provided cs->list is initialized to empty, this then allows us to
unconditionally use clocksource_mark_unstable(), regardless of the
registration state.
Peter Zijlstra [Mon, 30 Apr 2018 10:00:09 +0000 (12:00 +0200)]
x86/tsc: Always unregister clocksource_tsc_early
Don't leave the tsc-early clocksource registered if it errors out
early.
This was reported by Diego, who on his Core2 era machine got TSC
invalidated while it was running with tsc-early (due to C-states).
This results in keeping tsc-early with very bad effects.
When the interrupts for a combiner span multiple registers it must be
checked if any interrupts have been asserted on each register before
checking for spurious interrupts.
Checking each register seperately leads to false positive warnings.
Filipe Manana [Mon, 30 Apr 2018 18:05:07 +0000 (19:05 +0100)]
Btrfs: send, fix missing truncate for inode with prealloc extent past eof
An incremental send operation can miss a truncate operation when an inode
has an increased size in the send snapshot and a prealloc extent beyond
its size.
Consider the following scenario where a necessary truncate operation is
missing in the incremental send stream:
1) In the parent snapshot an inode has a size of 1282957 bytes and it has
no prealloc extents beyond its size;
2) In the the send snapshot it has a size of 5738496 bytes and has a new
extent at offsets 1884160 (length of 106496 bytes) and a prealloc
extent beyond eof at offset 6729728 (and a length of 339968 bytes);
3) When processing the prealloc extent, at offset 6729728, we end up at
send.c:send_write_or_clone() and set the @len variable to a value of 18446744073708560384 because @offset plus the original @len value is
larger then the inode's size (6729728 + 339968 > 5738496). We then
call send_extent_data(), with that @offset and @len, which in turn
calls send_write(), and then the later calls fill_read_buf(). Because
the offset passed to fill_read_buf() is greater then inode's i_size,
this function returns 0 immediately, which makes send_write() and
send_extent_data() do nothing and return immediately as well. When
we get back to send.c:send_write_or_clone() we adjust the value
of sctx->cur_inode_next_write_offset to @offset plus @len, which
corresponds to 6729728 + 18446744073708560384 = 5738496, which is
precisely the the size of the inode in the send snapshot;
4) Later when at send.c:finish_inode_if_needed() we determine that
we don't need to issue a truncate operation because the value of
sctx->cur_inode_next_write_offset corresponds to the inode's new
size, 5738496 bytes. This is wrong because the last write operation
that was issued started at offset 1884160 with a length of 106496
bytes, so the correct value for sctx->cur_inode_next_write_offset
should be 1990656 (1884160 + 106496), so that a truncate operation
with a value of 5738496 bytes would have been sent to insert a
trailing hole at the destination.
So fix the issue by making send.c:send_write_or_clone() not attempt
to send write or clone operations for extents that start beyond the
inode's size, since such attempts do nothing but waste time by
calling helper functions and allocating path structures, and send
currently has no fallocate command in order to create prealloc extents
at the destination (either beyond a file's eof or not).
The issue was found running the test btrfs/007 from fstests using a seed
value of 1524346151 for fsstress.
Reported-by: Gu, Jinxiang <[email protected]> Fixes: ffa7c4296e93 ("Btrfs: send, do not issue unnecessary truncate operations") Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
btrfs: Take trans lock before access running trans in check_delayed_ref
In preivous patch:
Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist
We avoid starting btrfs transaction and get this information from
fs_info->running_transaction directly.
When accessing running_transaction in check_delayed_ref, there's a
chance that current transaction will be freed by commit transaction
after the NULL pointer check of running_transaction is passed.
After looking all the other places using fs_info->running_transaction,
they are either protected by trans_lock or holding the transactions.
Fix this by using trans_lock and increasing the use_count.
Fixes: e4c3b2dcd144 ("Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist") CC: [email protected] # 4.14+ Signed-off-by: ethanwu <[email protected]> Signed-off-by: David Sterba <[email protected]>
If we get an invalid device configuration from a palm 3 type device, we
might incorrectly parse things, and we have the potential to crash in
"interesting" ways.
Fix this up by verifying the size of the configuration passed to us by
the device, and only if it is correct, will we handle it.
Note that this also fixes an information leak of slab data.
Takashi Iwai [Wed, 2 May 2018 06:48:46 +0000 (08:48 +0200)]
ALSA: pcm: Check PCM state at xfern compat ioctl
Since snd_pcm_ioctl_xfern_compat() has no PCM state check, it may go
further and hit the sanity check pcm_sanity_check() when the ioctl is
called right after open. It may eventually spew a kernel warning, as
triggered by syzbot, depending on kconfig.
The lack of PCM state check there was just an oversight. Although
it's no real crash, the spurious kernel warning is annoying, so let's
add the proper check.
John Hurley [Tue, 1 May 2018 22:49:49 +0000 (15:49 -0700)]
nfp: flower: set tunnel ttl value to net default
Firmware requires that the ttl value for an encapsulating ipv4 tunnel
header be included as an action field. Prior to the support of Geneve
tunnel encap (when ttl set was removed completely), ttl value was
extracted from the tunnel key. However, tests have shown that this can
still produce a ttl of 0.
Fix the issue by setting the namespace default value for each new tunnel.
Follow up patch for net-next will do a full route lookup.
Fixes: 3ca3059dc3a9 ("nfp: flower: compile Geneve encap actions") Fixes: b27d6a95a70d ("nfp: compile flower vxlan tunnel set actions") Signed-off-by: John Hurley <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
tls_push_sg already does a loop over all sending sg's, so ignore
any tls_write_space notifications until we are done sending.
We then have to call the previous write_space to wake up
poll() waiters after we are done with the send loop.
Input: atmel_mxt_ts - add missing compatible strings to OF device table
Commit af503716ac14 ("i2c: core: report OF style module alias for devices
registered via OF") fixed how the I2C core reports the module alias when
devices are registered via OF.
But the atmel_mxt_ts driver only has an "atmel,maxtouch" compatible in its
OF device ID table, so if a Device Tree is using a different one, autoload
won't be working for the module (the matching works because the I2C device
ID table is used as a fallback).
So add compatible strings for each of the entries in the I2C device table.
Fixes: af503716ac14 ("i2c: core: report OF style module alias for devices registered via OF") Reported-by: Enric Balletbo i Serra <[email protected]> Signed-off-by: Javier Martinez Canillas <[email protected]> Tested-by: Enric Balletbo i Serra <[email protected]> Reviewed-by: Rob Herring <[email protected]>
[dtor: document which compatibles are deprecated and should not be used] Signed-off-by: Dmitry Torokhov <[email protected]>
Stephen Boyd [Tue, 1 May 2018 21:44:16 +0000 (14:44 -0700)]
Merge tag 'meson-clk-fixes-4.17-1' of https://github.com/BayLibre/clk-meson into clk-fixes
Pull meson clk fixes from Jerome Brunet:
- fix typos in two meson8 clock names
- remove unused clock ops declaration
* tag 'meson-clk-fixes-4.17-1' of https://github.com/BayLibre/clk-meson:
clk: meson: meson8b: fix meson8b_cpu_clk parent clock name
clk: meson: meson8b: fix meson8b_fclk_div3_div clock name
clk: meson: drop meson_aoclk_gate_regmap_ops
We already have memcpy_toio(), but not memset_io(), so let's
add the obvious version to allow building an allmodconfig kernel
without errors like
drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy':
drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration]
Nick Dyer [Tue, 1 May 2018 18:40:18 +0000 (11:40 -0700)]
Input: atmel_mxt_ts - fix the firmware update
The automatic update mechanism will trigger an update if the
info block CRCs are different between maxtouch configuration
file (maxtouch.cfg) and chip.
The driver compared the CRCs without retrieving the chip CRC,
resulting always in a failure and firmware flashing action
triggered. Fix this issue by retrieving the chip info block
CRC before the check.
Note that this solution has the benefit that by reading the
information block and the object table into a contiguous region
of memory, we can verify the checksum at probe time. This means
we make sure that we are indeed talking to a chip that supports
object protocol correctly.
Using this patch on a kevin chromebook, the touchscreen and
touchpad drivers are able to match the CRC:
Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
This patch adds the correct platform data information for the Caroline
Chromebook, so that the mouse button does not get stuck in pressed state
after the first click.
The Samus button keymap and platform data definition are the correct
ones for Caroline, so they have been reused here.
Thomas Winter [Mon, 30 Apr 2018 21:15:29 +0000 (09:15 +1200)]
ipv6: Allow non-gateway ECMP for IPv6
It is valid to have static routes where the nexthop
is an interface not an address such as tunnels.
For IPv4 it was possible to use ECMP on these routes
but not for IPv6.
Wenwen Wang [Mon, 30 Apr 2018 17:31:13 +0000 (12:31 -0500)]
ethtool: fix a potential missing-check bug
In ethtool_get_rxnfc(), the object "info" is firstly copied from
user-space. If the FLOW_RSS flag is set in the member field flow_type of
"info" (and cmd is ETHTOOL_GRXFH), info needs to be copied again from
user-space because FLOW_RSS is newer and has new definition, as mentioned
in the comment. However, given that the user data resides in user-space, a
malicious user can race to change the data after the first copy. By doing
so, the user can inject inconsistent data. For example, in the second
copy, the FLOW_RSS flag could be cleared in the field flow_type of "info".
In the following execution, "info" will be used in the function
ops->get_rxnfc(). Such inconsistent data can potentially lead to unexpected
information leakage since ops->get_rxnfc() will prepare various types of
data according to flow_type, and the prepared data will be eventually
copied to user-space. This inconsistent data may also cause undefined
behaviors based on how ops->get_rxnfc() is implemented.
This patch simply re-verifies the flow_type field of "info" after the
second copy. If the value is not as expected, an error code will be
returned.
Linus Torvalds [Tue, 1 May 2018 16:11:45 +0000 (09:11 -0700)]
Merge tag 'xfs-4.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Here are a few more bug fixes for xfs for 4.17-rc4. Most of them are
fixes for bad behavior.
This series has been run through a full xfstests run during LSF and
through a quick xfstests run against this morning's master, with no
major failures reported.
Summary:
- Enhance inode fork verifiers to prevent loading of corrupted
metadata.
- Fix a crash when we try to convert extents format inodes to btree
format, we run out of space, but forget to revert the in-core state
changes.
- Fix file size checks when doing INSERT_RANGE that could cause files
to end up negative size if there previously was an extent mapped at
s_maxbytes.
- Fix a bug when doing a remove-then-add ATTR_REPLACE xattr update
where we forget to clear ATTR_REPLACE after the remove, which
causes the attr to be lost and the fs to shut down due to (what it
thinks is) inconsistent in-core state"
* tag 'xfs-4.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't fail when converting shortform attr to long form during ATTR_REPLACE
xfs: prevent creating negative-sized file via INSERT_RANGE
xfs: set format back to extents if xfs_bmap_extents_to_btree
xfs: enhance dinode verifier
Merge tag 'errseq-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull errseq infrastructure fix from Jeff Layton:
"The PostgreSQL developers recently had a spirited discussion about the
writeback error handling in Linux, and reached out to us about a
behavoir change to the code that bit them when the errseq_t changes
were merged.
When we changed to using errseq_t for tracking writeback errors, we
lost the ability for an application to see a writeback error that
occurred before the open on which the fsync was issued. This was
problematic for PostgreSQL which offloads fsync calls to a completely
separate process from the DB writers.
This patch restores that ability. If the errseq_t value in the inode
does not have the SEEN flag set, then we just return 0 for the sample.
That ensures that any recorded error is always delivered at least
once.
Note that we might still lose the error if the inode gets evicted from
the cache before anything can reopen it, but that was the case before
errseq_t was merged. At LSF/MM we had some discussion about keeping
inodes with unreported writeback errors around in the cache for longer
(possibly indefinitely), but that's really a separate problem"
* tag 'errseq-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
errseq: Always report a writeback error once
- Fixup license text for oradax driver, from Rob Gardner.
- Release device object with put_device() instead of straight kfree(),
from Arvind Yadav.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: vio: use put_device() instead of kfree()
sparc64: Fix mistake in oradax license text
Boris Brezillon [Mon, 30 Apr 2018 13:32:32 +0000 (15:32 +0200)]
drm/vc4: Make sure vc4_bo_{inc,dec}_usecnt() calls are balanced
Commit b9f19259b84d ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl")
introduced a mechanism to mark some BOs as purgeable to allow the driver
to drop them under memory pressure. In order to implement this feature
we had to add a mechanism to mark BOs as currently used by a piece of
hardware which materialized through the ->usecnt counter.
Plane code is supposed to increment usecnt when it attaches a BO to a
plane and decrement it when it's done with this BO, which was done in
the ->prepare_fb() and ->cleanup_fb() hooks. The problem is, async page
flip logic does not go through the regular atomic update path, and
->prepare_fb() and ->cleanup_fb() are not called in this case.
Fix that by manually calling vc4_bo_{inc,dec}_usecnt() in the
async-page-flip path.
Note that all this should go away as soon as we get generic async page
flip support in the core, in the meantime, this fix should do the
trick.
Currently, the kernel protects access to the agent ID allocator on a per
port basis using a spinlock, so it is impossible for two apps/threads on
the same port to get the same TID, but it is entirely possible for two
threads on different ports to end up with the same TID.
As this can be confusing (regardless of it being legal according to the
IB Spec 1.3, C13-18.1.1, in section 13.4.6.4 - TransactionID usage),
and as the rdma-core user space API for /dev/umad devices implies unique
TIDs even across ports, make the TID an atomic type so that no two
allocations, regardless of port number, will be the same.
Bin Liu [Mon, 30 Apr 2018 16:20:54 +0000 (11:20 -0500)]
usb: musb: trace: fix NULL pointer dereference in musb_g_tx()
The usb_request pointer could be NULL in musb_g_tx(), where the
tracepoint call would trigger the NULL pointer dereference failure when
parsing the members of the usb_request pointer.
Move the tracepoint call to where the usb_request pointer is already
checked to solve the issue.
musb_start_urb() doesn't check the pass-in parameter if it is NULL. But
in musb_bulk_nak_timeout() the parameter passed to musb_start_urb() is
returned from first_qh(), which could be NULL.
So wrap the musb_start_urb() call here with a if condition check to
avoid the potential NULL pointer dereference.
Tracepoint should only warn when a kernel API user does not respect the
required preconditions (e.g. same tracepoint enabled twice, or called
to remove a tracepoint that does not exist).
Silence warning in out-of-memory conditions, given that the error is
returned to the caller.
This ensures that out-of-memory error-injection testing does not trigger
warnings in tracepoint.c, which were seen by syzbot.
cpufreq / CPPC: Set platform specific transition_delay_us
Add support to specify platform specific transition_delay_us instead
of using the transition delay derived from PCC.
With commit 3d41386d556d (cpufreq: CPPC: Use transition_delay_us
depending transition_latency) we are setting transition_delay_us
directly and not applying the LATENCY_MULTIPLIER. Because of that,
on Qualcomm Centriq we can end up with a very high rate of frequency
change requests when using the schedutil governor (default
rate_limit_us=10 compared to an earlier value of 10000).
The PCC subspace describes the rate at which the platform can accept
commands on the CPPC's PCC channel. This includes read and write
command on the PCC channel that can be used for reasons other than
frequency transitions. Moreover the same PCC subspace can be used by
multiple freq domains and deriving transition_delay_us from it as we
do now can be sub-optimal.
Moreover if a platform does not use PCC for desired_perf register then
there is no way to compute the transition latency or the delay_us.
CPPC does not have a standard defined mechanism to get the transition
rate or the latency at the moment.
Given the above limitations, it is simpler to have a platform specific
transition_delay_us and rely on PCC derived value only if a platform
specific value is not available.
Signed-off-by: Prashanth Prakash <[email protected]> Cc: 4.14+ <[email protected]> # 4.14+ Fixes: 3d41386d556d (cpufreq: CPPC: Use transition_delay_us depending transition_latency) Signed-off-by: Rafael J. Wysocki <[email protected]>
ALSA: aloop: Add missing cable lock to ctl API callbacks
Some control API callbacks in aloop driver are too lazy to take the
loopback->cable_lock and it results in possible races of cable access
while it's being freed. It eventually lead to a UAF, as reported by
fuzzer recently.
This patch covers such control API callbacks and add the proper mutex
locks.
Hangbin Liu [Fri, 27 Apr 2018 12:59:24 +0000 (20:59 +0800)]
bridge: check iface upper dev when setting master via ioctl
When we set a bond slave's master to bridge via ioctl, we only check
the IFF_BRIDGE_PORT flag. Although we will find the slave's real master
at netdev_master_upper_dev_link() later, it already does some settings
and allocates some resources. It would be better to return as early
as possible.
v1 -> v2:
use netdev_master_upper_dev_get() instead of netdev_has_any_upper_dev()
to check if we have a master, because not all upper devs are masters,
e.g. vlan device.
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Another set of x86 related updates:
- Fix the long broken x32 version of the IPC user space headers which
was noticed by Arnd Bergman in course of his ongoing y2038 work.
GLIBC seems to have non broken private copies of these headers so
this went unnoticed.
- Two microcode fixlets which address some more fallout from the
recent modifications in that area:
- Unconditionally save the microcode patch, which was only saved
when CPU_HOTPLUG was enabled causing failures in the late
loading mechanism
- Make the later loader synchronization finally work under all
circumstances. It was exiting early and causing timeout failures
due to a missing synchronization point.
- Do not use mwait_play_dead() on AMD systems to prevent excessive
power consumption as the CPU cannot go into deep power states from
there.
- Address an annoying sparse warning due to lost type qualifiers of
the vmemmap and vmalloc base address constants.
- Prevent reserving crash kernel region on Xen PV as this leads to
the wrong perception that crash kernels actually work there which
is not the case. Xen PV has its own crash mechanism handled by the
hypervisor.
- Add missing TLB cpuid values to the table to make the printout on
certain machines correct.
- Enumerate the new CLDEMOTE instruction
- Fix an incorrect SPDX identifier
- Remove stale macros"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds
x86/setup: Do not reserve a crash kernel region if booted on Xen PV
x86/cpu/intel: Add missing TLB cpuid values
x86/smpboot: Don't use mwait_play_dead() on AMD systems
x86/mm: Make vmemmap and vmalloc base address constants unsigned long
x86/vector: Remove the unused macro FPU_IRQ
x86/vector: Remove the macro VECTOR_OFFSET_START
x86/cpufeatures: Enumerate cldemote instruction
x86/microcode: Do not exit early from __reload_late()
x86/microcode/intel: Save microcode patch unconditionally
x86/jailhouse: Fix incorrect SPDX identifier
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti fixes from Thomas Gleixner:
"A set of updates for the x86/pti related code:
- Preserve r8-r11 in int $0x80. r8-r11 need to be preserved, but the
int$80 entry code removed that quite some time ago. Make it correct
again.
- A set of fixes for the Global Bit work which went into 4.17 and
caused a bunch of interesting regressions:
- Triggering a BUG in the page attribute code due to a missing
check for early boot stage
- Warnings in the page attribute code about holes in the kernel
text mapping which are caused by the freeing of the init code.
Handle such holes gracefully.
- Reduce the amount of kernel memory which is set global to the
actual text and do not incidentally overlap with data.
- Disable the global bit when RANDSTRUCT is enabled as it
partially defeats the hardening.
- Make the page protection setup correct for vma->page_prot
population again. The adjustment of the protections fell through
the crack during the Global bit rework and triggers warnings on
machines which do not support certain features, e.g. NX"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64/compat: Preserve r8-r11 in int $0x80
x86/pti: Filter at vma->vm_page_prot population
x86/pti: Disallow global kernel text with RANDSTRUCT
x86/pti: Reduce amount of kernel text allowed to be Global
x86/pti: Fix boot warning from Global-bit setting
x86/pti: Fix boot problems from Global-bit setting
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Two fixes from the timer departement:
- Fix a long standing issue in the NOHZ tick code which causes RB
tree corruption, delayed timers and other malfunctions. The cause
for this is code which modifies the expiry time of an enqueued
hrtimer.
- Revert the CLOCK_MONOTONIC/CLOCK_BOOTTIME unification due to
regression reports. Seems userspace _is_ relying on the documented
behaviour despite our hope that it wont"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME
tick/sched: Do not mess with an enqueued hrtimer
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"The perf update contains the following bits:
x86:
- Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP
perf stat:
- Keep the '/' event modifier separator in fallback, for example when
fallbacking from 'cpu/cpu-cycles/' to user level only, where it
should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri
Olsa)
- Disable write_backward and other event attributes for !group events
in a group, fixing, for instance this group: '{cycles,msr/aperf/}:S'
that has leader sampling (:S) and where just the 'cycles', the
leader event, should have the write_backward attribute set, in this
case it all fails because the PMU where 'msr/aperf/' lives doesn't
accepts write_backward style sampling (Jiri Olsa)
- Only fall back group read for leader (Kan Liang)
- Fix core PMU alias list for x86 platform (Kan Liang)
- Print out hint for mixed PMU group error (Kan Liang)
- Fix duplicate PMU name for interval print (Kan Liang)
Core:
- Set main kernel end address properly when reading kernel and module
maps (Namhyung Kim)
perf mem:
- Fix incorrect entries and add missing man options (Sangwon Hong)
s/390:
- Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter)
- Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390
- Fix s390 undefined record__auxtrace_init() return value in 'perf
record' (Thomas Richter)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
perf stat: Fix duplicate PMU name for interval print
perf evsel: Only fall back group read for leader
perf stat: Print out hint for mixed PMU group error
perf pmu: Fix core PMU alias list for X86 platform
perf record: Fix s390 undefined record__auxtrace_init() return value
perf mem: Document incorrect and missing options
perf evsel: Disable write_backward for leader sampling group events
perf pmu: Fix pmu events parsing rule
perf stat: Keep the / modifier separator in fallback
perf test: Adapt test case record+probe_libc_inet_pton.sh for s390
perf list: Remove s390 specific strcmp_cpuid_cmp function
perf machine: Set main kernel end address properly
ALSA: dice: fix kernel NULL pointer dereference due to invalid calculation for array index
At a commit f91c9d7610a ('ALSA: firewire-lib: cache maximum length of
payload to reduce function calls'), maximum size of payload for tx
isochronous packet is cached to reduce the number of function calls.
This cache was programmed to updated at a first callback of ohci1394 IR
context. However, the maximum size is required to queueing packets before
starting the isochronous context.
As a result, the cached value is reused to queue packets in next time to
starting the isochronous context. Then the cache is updated in a first
callback of the isochronous context. This can cause kernel NULL pointer
dereference in a below call graph:
The issued dereference occurs in a case that:
- target unit supports different stream formats for sampling transmission
frequency.
- maximum length of payload for tx stream in a first trial is bigger
than the length in a second trial.
In this case, correct number of pages are allocated for DMA and the 'pages'
array has enough elements, while index of the element is wrongly calculated
according to the old value of length of payload in a call of
'queue_in_packet()'. Then it causes the issue.
This commit fixes the critical bug. This affects all of drivers in ALSA
firewire stack in Linux kernel v4.12 or later.
Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix misc bugs and a regression for ext4"
* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add MODULE_SOFTDEP to ensure crc32c is included in the initramfs
ext4: fix bitmap position validation
ext4: set h_journal if there is a failure starting a reserved handle
ext4: prevent right-shifting extents beyond EXT_MAX_BLOCKS
Amir Goldstein [Mon, 5 Feb 2018 17:32:18 +0000 (19:32 +0200)]
<linux/stringhash.h>: fix end_name_hash() for 64bit long
The comment claims that this helper will try not to loose bits, but for
64bit long it looses the high bits before hashing 64bit long into 32bit
int. Use the helper hash_long() to do the right thing for 64bit long.
For 32bit long, there is no change.
All the callers of end_name_hash() either assign the result to
qstr->hash, which is u32 or return the result as an int value (e.g.
full_name_hash()). Change the helper return type to int to conform to
its users.
[ It took me a while to apply this, because my initial reaction to it
was - incorrectly - that it could make for slower code.
After having looked more at it, I take back all my complaints about
the patch, Amir was right and I was mis-reading things or just being
stupid.
I also don't worry too much about the possible performance impact of
this on 64-bit, since most architectures that actually care about
performance end up not using this very much (the dcache code is the
most performance-critical, but the word-at-a-time case uses its own
hashing anyway).
So this ends up being mostly used for filesystems that do their own
degraded hashing (usually because they want a case-insensitive
comparison function).
A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
and then this potentially makes things more expensive on 64-bit
architectures with slow or lacking multipliers even for the normal
case.
That said, realistically the only such architecture I can think of is
PA-RISC. Nobody really cares about performance on that, it's more of a
"look ma, I've got warts^W an odd machine" platform.
So the patch is fine, and all my initial worries were just misplaced
from not looking at this properly. - Linus ]
David Sterba [Sat, 28 Apr 2018 17:05:04 +0000 (19:05 +0200)]
MAINTAINERS: add myself as maintainer of AFFS
The AFFS filesystem is still in use by m68k community (Link #2), but as
there was no code activity and no maintainer, the filesystem appeared on
the list of candidates for staging/removal (Link #1).
I volunteer to act as a maintainer of AFFS to collect any fixes that
might show up and to guard fs/affs/ against another spring cleaning.
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- two driver fixes
- better parameter check for the core
- Documentation updates
- part of a tree-wide HAS_DMA cleanup
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sprd: Fix the i2c count issue
i2c: sprd: Prevent i2c accesses after suspend is called
i2c: dev: prevent ZERO_SIZE_PTR deref in i2cdev_ioctl_rdwr()
Documentation/i2c: adopt kernel commenting style in examples
Documentation/i2c: sync docs with current state of i2c-tools
Documentation/i2c: whitespace cleanup
i2c: Remove depends on HAS_DMA in case of platform dependency
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- crypto API regression that may cause sporadic alloc failures
- double-free bug in drbg
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: drbg - set freed buffers to NULL
crypto: api - fix finding algorithm currently being tested
Merge tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"A few security related fixes for SMB3, most importantly for SMB3.11
encryption"
* tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: smbd: Avoid allocating iov on the stack
cifs: smbd: Don't use RDMA read/write when signing is used
SMB311: Fix reconnect
SMB3: Fix 3.11 encryption to Windows and handle encrypted smb3 tcon
CIFS: set *resp_buf_type to NO_BUFFER on error
Merge tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A bunch of fixes, mostly for existing code and going to stable.
Our memory hot-unplug path wasn't flushing the cache before removing
memory. That is a problem now that we are doing memory hotplug on bare
metal.
Three fixes for the NPU code that supports devices connected via
NVLink (ie. GPUs). The main one tweaks the TLB flush algorithm to
avoid soft lockups for large flushes.
A fix for our memory error handling where we would loop infinitely,
returning back to the bad access and hard lockup the CPU.
Fixes for the OPAL RTC driver, which wasn't handling some error cases
correctly.
A fix for a hardlockup in the powernv cpufreq driver.
And finally two fixes to our smp_send_stop(), required due to a recent
change to use it on shutdown.
Thanks to: Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh
Salgaonkar, Mark Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri
G Bhat"
* tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/kvm/booke: Fix altivec related build break
powerpc: Fix deadlock with multiple calls to smp_send_stop
cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt
powerpc: Fix smp_send_stop NMI IPI handling
rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
powerpc/mce: Fix a bug where mce loops on memory UE.
powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters
powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy
powerpc/powernv/memtrace: Let the arch hotunplug code flush cache
powerpc/mm: Flush cache on memory hot(un)plug
David S. Miller [Sat, 28 Apr 2018 00:21:08 +0000 (20:21 -0400)]
Merge branch 'sfc-more-ARFS-fixes'
Edward Cree says:
====================
sfc: more ARFS fixes
A couple more bits of breakage in my recent ARFS and async filters work.
Patch #1 in particular fixes a bug that leads to memory trampling and
consequent crashes.
====================
Edward Cree [Fri, 27 Apr 2018 14:08:57 +0000 (15:08 +0100)]
sfc: fix ARFS expiry check on EF10
Owing to a missing conditional, the result of rps_may_expire_flow() was
being ignored and filters were being removed even if we'd decided not to
expire them.
Fixes: f8d6203780b7 ("sfc: ARFS filter IDs") Signed-off-by: Edward Cree <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Edward Cree [Fri, 27 Apr 2018 14:08:41 +0000 (15:08 +0100)]
sfc: Use filter index rather than ID for rps_flow_id table
efx->type->filter_insert() returns an ID rather than the index that
efx->type->filter_async_insert() used to, which causes it to exceed
efx->type->max_rx_ip_filters on some EF10 configurations, leading to out-
of-bounds array writes.
So, in efx_filter_rfs_work(), convert this back into an index (which is
what the remove call in the expiry path expects, anyway).
Fixes: 3af0f34290f6 ("sfc: replace asynchronous filter operations") Signed-off-by: Edward Cree <[email protected]> Signed-off-by: David S. Miller <[email protected]>
For the x32 ABI, struct timeval has two 64-bit fields. However
the kernel currently interprets the user-space values used for
the SO_RCVTIMEO and SO_SNDTIMEO socket options as having a pair
of 32-bit fields.
When the seconds portion of the requested timeout is less than 2**32,
the seconds portion of the effective timeout is correct but the
microseconds portion is zero. When the seconds portion of the
requested timeout is zero and the microseconds portion is non-zero,
the kernel interprets the timeout as zero (never timeout).
Fix by using 64-bit time for SO_RCVTIMEO/SO_SNDTIMEO as required
for the ABI.
Fixes: 515c7af85ed9 ("x32: Use compat shims for {g,s}etsockopt") Reported-by: Gopal RajagopalSai <[email protected]> Signed-off-by: Lance Richardson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
rMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- PSCI selection API, a leftover from 4.16 (for stable)
- Kick vcpu on active interrupt affinity change
- Plug a VMID allocation race on oversubscribed systems
- Silence debug messages
- Update Christoffer's email address (linaro -> arm)
x86:
- Expose userspace-relevant bits of a newly added feature
- Fix TLB flushing on VMX with VPID, but without EPT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI
kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use
arm/arm64: KVM: Add PSCI version selection API
KVM: arm/arm64: vgic: Kick new VCPU on interrupt migration
arm64: KVM: Demote SVE and LORegion warnings to debug only
MAINTAINERS: Update e-mail address for Christoffer Dall
KVM: arm/arm64: Close VMID generation race
Within run_tests target, the whole script needs to be executed within
the same shell and not as separate subshells, so the initial test_num
variable set to 0 is still present when executing "test_num=`echo
$$test_num+1 | bc`;".
Demonstration of the issue (make run_tests):
TAP version 13
(standard_in) 1: syntax error
selftests: basic_test
========================================
ok 1.. selftests: basic_test [PASS]
(standard_in) 1: syntax error
selftests: basic_percpu_ops_test
========================================
ok 1.. selftests: basic_percpu_ops_test [PASS]
(standard_in) 1: syntax error
selftests: param_test
========================================
ok 1.. selftests: param_test [PASS]
With fix applied:
TAP version 13
selftests: basic_test
========================================
ok 1..1 selftests: basic_test [PASS]
selftests: basic_percpu_ops_test
========================================
ok 1..2 selftests: basic_percpu_ops_test [PASS]
selftests: param_test
========================================
ok 1..3 selftests: param_test [PASS]
Vivien Didelot [Thu, 26 Apr 2018 23:47:35 +0000 (19:47 -0400)]
MAINTAINERS: add davem in NETWORKING DRIVERS
"./scripts/get_maintainer.pl -f" does not actually show us David as the
maintainer of drivers/net directories such as team, bonding, phy or dsa.
Adding him in an M: entry of NETWORKING DRIVERS fixes this.
When a CQ is shared by multiple QPs, c4iw_flush_hw_cq() needs to acquire
corresponding QP lock before moving the CQEs into its corresponding SW
queue and accessing the SQ contents for completing a WR.
Ignore CQEs if corresponding QP is already flushed.
This pull request includes fixes for mlx5 core and netdev driver.
Please pull and let me know if there's any problems.
For -stable v4.12
net/mlx5e: TX, Use correct counter in dma_map error flow
For -stable v4.13
net/mlx5: Avoid cleaning flow steering table twice during error flow
For -stable v4.14
net/mlx5e: Allow offloading ipv4 header re-write for icmp
For -stable v4.15
net/mlx5e: DCBNL fix min inline header size for dscp
For -stable v4.16
net/mlx5: Fix mlx5_get_vector_affinity function
====================
IB/uverbs: Fix kernel crash during MR deregistration flow
This patch fixes a crash that happens due to access to an
uninitialized DM pointer within the MR object.
The change makes sure the DM pointer in the MR object is set to
NULL during a non-DM MR creation to prevent a false indication
that this MR is related to a DM in the dereg flow.
IB/uverbs: Prevent reregistration of DM_MR to regular MR
This patch adds a check in the ib_uverbs_rereg_mr flow to make
sure there's no attempt to rereg a device memory MR to regular MR.
In such case the command will fail with -EINVAL status.