linux.git
4 years agouprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB...
Oleg Nesterov [Thu, 23 Jul 2020 15:44:20 +0000 (17:44 +0200)]
uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression

If a tracee is uprobed and it hits int3 inserted by debugger, handle_swbp()
does send_sig(SIGTRAP, current, 0) which means si_code == SI_USER. This used
to work when this code was written, but then GDB started to validate si_code
and now it simply can't use breakpoints if the tracee has an active uprobe:

# cat test.c
void unused_func(void)
{
}
int main(void)
{
return 0;
}

# gcc -g test.c -o test
# perf probe -x ./test -a unused_func
# perf record -e probe_test:unused_func gdb ./test -ex run
GNU gdb (GDB) 10.0.50.20200714-git
...
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7ddf909 in dl_main () from /lib64/ld-linux-x86-64.so.2
(gdb)

The tracee hits the internal breakpoint inserted by GDB to monitor shared
library events but GDB misinterprets this SIGTRAP and reports a signal.

Change handle_swbp() to use force_sig(SIGTRAP), this matches do_int3_user()
and fixes the problem.

This is the minimal fix for -stable, arch/x86/kernel/uprobes.c is equally
wrong; it should use send_sigtrap(TRAP_TRACE) instead of send_sig(SIGTRAP),
but this doesn't confuse GDB and needs another x86-specific patch.

Reported-by: Aaron Merey <amerey@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200723154420.GA32043@redhat.com
4 years agosched: Warn if garbage is passed to default_wake_function()
Chris Wilson [Thu, 23 Jul 2020 20:10:42 +0000 (21:10 +0100)]
sched: Warn if garbage is passed to default_wake_function()

Since the default_wake_function() passes its flags onto
try_to_wake_up(), warn if those flags collide with internal values.

Given that the supplied flags are garbage, no repair can be done but at
least alert the user to the damage they are causing.

In the belief that these errors should be picked up during testing, the
warning is only compiled in under CONFIG_SCHED_DEBUG.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200723201042.18861-1-chris@chris-wilson.co.uk
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller [Fri, 24 Jul 2020 00:22:09 +0000 (17:22 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for net:

1) Fix NAT hook deletion when table is dormant, from Florian Westphal.

2) Fix IPVS sync stalls, from guodeqing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agogeneve: fix an uninitialized value in geneve_changelink()
Cong Wang [Thu, 23 Jul 2020 01:56:25 +0000 (18:56 -0700)]
geneve: fix an uninitialized value in geneve_changelink()

geneve_nl2info() sets 'df' conditionally, so we have to
initialize it by copying the value from existing geneve
device in geneve_changelink().

Fixes: 56c09de347e4 ("geneve: allow changing DF behavior after creation")
Reported-by: syzbot+7ebc2e088af5e4c0c9fa@syzkaller.appspotmail.com
Cc: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobonding: check return value of register_netdevice() in bond_newlink()
Cong Wang [Wed, 22 Jul 2020 23:31:54 +0000 (16:31 -0700)]
bonding: check return value of register_netdevice() in bond_newlink()

Very similar to commit 544f287b8495
("bonding: check error value of register_netdevice() immediately"),
we should immediately check the return value of register_netdevice()
before doing anything else.

Fixes: 005db31d5f5f ("bonding: set carrier off for devices created through netlink")
Reported-and-tested-by: syzbot+bbc3a11c4da63c1b74d6@syzkaller.appspotmail.com
Cc: Beniamino Galvani <bgalvani@redhat.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoRevert "cifs: Fix the target file was deleted when rename failed."
Steve French [Thu, 23 Jul 2020 19:41:29 +0000 (14:41 -0500)]
Revert "cifs: Fix the target file was deleted when rename failed."

This reverts commit 9ffad9263b467efd8f8dc7ae1941a0a655a2bab2.

Upon additional testing with older servers, it was found that
the original commit introduced a regression when using the old SMB1
dialect and rsyncing over an existing file.

The patch will need to be respun to address this, likely including
a larger refactoring of the SMB1 and SMB3 rename code paths to make
it less confusing and also to address some additional rename error
cases that SMB3 may be able to workaround.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reported-by: Patrick Fernie <patrick.fernie@gmail.com>
CC: Stable <stable@vger.kernel.org>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Acked-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
4 years agoMerge tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux...
Linus Torvalds [Thu, 23 Jul 2020 20:42:46 +0000 (13:42 -0700)]
Merge tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into master

Pull s390 fixes from Heiko Carstens:

 - Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH
   to reflect reality and avoid further confusion. This is a user space
   visible change therefore the commit has also a stable tag for 5.7,
   where this counter was introduced.

 - Add Matthew Rosato as s390 IOMMU maintainer.

* tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: add Matthew for s390 IOMMU
  s390/cpum_cf,perf: change DFLT_CCERROR counter name

4 years agoi2c: i2c-qcom-geni: Fix DMA transfer race
Douglas Anderson [Wed, 22 Jul 2020 22:00:21 +0000 (15:00 -0700)]
i2c: i2c-qcom-geni: Fix DMA transfer race

When I have KASAN enabled on my kernel and I start stressing the
touchscreen my system tends to hang.  The touchscreen is one of the
only things that does a lot of big i2c transfers and ends up hitting
the DMA paths in the geni i2c driver.  It appears that KASAN adds
enough delay in my system to tickle a race condition in the DMA setup
code.

When the system hangs, I found that it was running the geni_i2c_irq()
over and over again.  It had these:

m_stat   = 0x04000080
rx_st    = 0x30000011
dm_tx_st = 0x00000000
dm_rx_st = 0x00000000
dma      = 0x00000001

Notably we're in DMA mode but are getting M_RX_IRQ_EN and
M_RX_FIFO_WATERMARK_EN over and over again.

Putting some traces in geni_i2c_rx_one_msg() showed that when we
failed we were getting to the start of geni_i2c_rx_one_msg() but were
never executing geni_se_rx_dma_prep().

I believe that the problem here is that we are starting the geni
command before we run geni_se_rx_dma_prep().  If a transfer makes it
far enough before we do that then we get into the state I have
observed.  Let's change the order, which seems to work fine.

Although problems were seen on the RX path, code inspection suggests
that the TX should be changed too.  Change it as well.

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agoi2c: rcar: always clear ICSAR to avoid side effects
Wolfram Sang [Sat, 4 Jul 2020 13:38:29 +0000 (15:38 +0200)]
i2c: rcar: always clear ICSAR to avoid side effects

On R-Car Gen2, we get a timeout when reading from the address set in
ICSAR, even though the slave interface is disabled. Clearing it fixes
this situation. Note that Gen3 is not affected.

To reproduce: bind and undbind an I2C slave on some bus, run
'i2cdetect' on that bus.

Fixes: de20d1857dd6 ("i2c: rcar: add slave support")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agotcp: allow at most one TLP probe per flight
Yuchung Cheng [Thu, 23 Jul 2020 19:00:06 +0000 (12:00 -0700)]
tcp: allow at most one TLP probe per flight

Previously TLP may send multiple probes of new data in one
flight. This happens when the sender is cwnd limited. After the
initial TLP containing new data is sent, the sender receives another
ACK that acks partial inflight.  It may re-arm another TLP timer
to send more, if no further ACK returns before the next TLP timeout
(PTO) expires. The sender may send in theory a large amount of TLP
until send queue is depleted. This only happens if the sender sees
such irregular uncommon ACK pattern. But it is generally undesirable
behavior during congestion especially.

The original TLP design restrict only one TLP probe per inflight as
published in "Reducing Web Latency: the Virtue of Gentle Aggression",
SIGCOMM 2013. This patch changes TLP to send at most one probe
per inflight.

Note that if the sender is app-limited, TLP retransmits old data
and did not have this issue.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoAX.25: Prevent integer overflows in connect and sendmsg
Dan Carpenter [Thu, 23 Jul 2020 14:49:57 +0000 (17:49 +0300)]
AX.25: Prevent integer overflows in connect and sendmsg

We recently added some bounds checking in ax25_connect() and
ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because
they were no longer required.

Unfortunately, I believe they are required to prevent integer overflows
so I have added them back.

Fixes: 8885bb0621f0 ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()")
Fixes: 2f2a7ffad5c6 ("AX.25: Fix out-of-bounds read in ax25_connect()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodm integrity: fix integrity recalculation that is improperly skipped
Mikulas Patocka [Thu, 23 Jul 2020 14:42:09 +0000 (10:42 -0400)]
dm integrity: fix integrity recalculation that is improperly skipped

Commit adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 ("dm: report suspended
device during destroy") broke integrity recalculation.

The problem is dm_suspended() returns true not only during suspend,
but also during resume. So this race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.

To fix this race condition, add a function dm_post_suspending that is
only true during the postsuspend phase and use it instead of
dm_suspended().

Signed-off-by: Mikulas Patocka <mpatocka redhat com>
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable vger kernel org # v4.18+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoio_uring: missed req_init_async() for IOSQE_ASYNC
Pavel Begunkov [Thu, 23 Jul 2020 17:17:20 +0000 (20:17 +0300)]
io_uring: missed req_init_async() for IOSQE_ASYNC

IOSQE_ASYNC branch of io_queue_sqe() is another place where an
unitialised req->work can be accessed (i.e. prior io_req_init_async()).
Nothing really bad though, it just looses IO_WQ_WORK_CONCURRENT flag.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoarm64: vdso32: Fix '--prefix=' value for newer versions of clang
Nathan Chancellor [Thu, 23 Jul 2020 04:15:10 +0000 (21:15 -0700)]
arm64: vdso32: Fix '--prefix=' value for newer versions of clang

Newer versions of clang only look for $(COMPAT_GCC_TOOLCHAIN_DIR)as [1],
rather than $(COMPAT_GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE_COMPAT)as,
resulting in the following build error:

$ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
CROSS_COMPILE_COMPAT=arm-linux-gnueabi- LLVM=1 O=out/aarch64 distclean \
defconfig arch/arm64/kernel/vdso32/
...
/home/nathan/cbl/toolchains/llvm-binutils/bin/as: unrecognized option '-EL'
clang-12: error: assembler command failed with exit code 1 (use -v to see invocation)
make[3]: *** [arch/arm64/kernel/vdso32/Makefile:181: arch/arm64/kernel/vdso32/note.o] Error 1
...

Adding the value of CROSS_COMPILE_COMPAT (adding notdir to account for a
full path for CROSS_COMPILE_COMPAT) fixes this issue, which matches the
solution done for the main Makefile [2].

[1]: https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6ac76ffd90
[2]: https://lore.kernel.org/lkml/20200721173125.1273884-1-maskray@google.com/

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
Link: https://lore.kernel.org/r/20200723041509.400450-1-natechancellor@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
4 years agoMerge tag 'amd-drm-fixes-5.8-2020-07-22' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Thu, 23 Jul 2020 04:06:14 +0000 (14:06 +1000)]
Merge tag 'amd-drm-fixes-5.8-2020-07-22' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.8-2020-07-22:

amdgpu:
- Fix crash when overclocking VegaM
- Fix possible crash when editing dpm levels

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723032608.3865-1-alexander.deucher@amd.com
4 years agoMerge tag 'drm-misc-fixes-2020-07-22' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 23 Jul 2020 04:05:28 +0000 (14:05 +1000)]
Merge tag 'drm-misc-fixes-2020-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

 * sun4i: Fix inverted HPD result; fixes an earlier fix
 * lima: fix timeout during reset

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200722070321.GA29190@linux-uq9g
4 years agocxgb4: add missing release on skb in uld_send()
Navid Emamdoost [Thu, 23 Jul 2020 02:58:39 +0000 (21:58 -0500)]
cxgb4: add missing release on skb in uld_send()

In the implementation of uld_send(), the skb is consumed on all
execution paths except one. Release skb when returning NET_XMIT_DROP.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: fix PTP on AQC10X
Egor Pomozov [Wed, 22 Jul 2020 19:09:58 +0000 (22:09 +0300)]
net: atlantic: fix PTP on AQC10X

This patch fixes PTP on AQC10X.
PTP support on AQC10X requires FW involvement and FW configures the
TPS data arb mode itself.
So we must make sure driver doesn't touch TPS data arb mode on AQC10x
if PTP is enabled. Otherwise, there are no timestamps even though
packets are flowing.

Fixes: 2deac71ac492a ("net: atlantic: QoS implementation: min_rate")
Signed-off-by: Egor Pomozov <epomozov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoAX.25: Prevent out-of-bounds read in ax25_sendmsg()
Peilin Ye [Wed, 22 Jul 2020 16:05:12 +0000 (12:05 -0400)]
AX.25: Prevent out-of-bounds read in ax25_sendmsg()

Checks on `addr_len` and `usax->sax25_ndigis` are insufficient.
ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals to 7
or 8. Fix it.

It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`

Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'sctp-shrink-stream-outq-in-the-right-place'
David S. Miller [Thu, 23 Jul 2020 01:00:12 +0000 (18:00 -0700)]
Merge branch 'sctp-shrink-stream-outq-in-the-right-place'

Xin Long says:

====================
sctp: shrink stream outq in the right place

Patch 1 is an improvement, and Patch 2 is a bug fix.
====================

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: shrink stream outq when fails to do addstream reconf
Xin Long [Wed, 22 Jul 2020 15:52:12 +0000 (23:52 +0800)]
sctp: shrink stream outq when fails to do addstream reconf

When adding a stream with stream reconf, the new stream firstly is in
CLOSED state but new out chunks can still be enqueued. Then once gets
the confirmation from the peer, the state will change to OPEN.

However, if the peer denies, it needs to roll back the stream. But when
doing that, it only sets the stream outcnt back, and the chunks already
in the new stream don't get purged. It caused these chunks can still be
dequeued in sctp_outq_dequeue_data().

As its stream is still in CLOSE, the chunk will be enqueued to the head
again by sctp_outq_head_data(). This chunk will never be sent out, and
the chunks after it can never be dequeued. The assoc will be 'hung' in
a dead loop of sending this chunk.

To fix it, this patch is to purge these chunks already in the new
stream by calling sctp_stream_shrink_out() when failing to do the
addstream reconf.

Fixes: 11ae76e67a17 ("sctp: implement receiver-side procedures for the Reconf Response Parameter")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosctp: shrink stream outq only when new outcnt < old outcnt
Xin Long [Wed, 22 Jul 2020 15:52:11 +0000 (23:52 +0800)]
sctp: shrink stream outq only when new outcnt < old outcnt

It's not necessary to go list_for_each for outq->out_chunk_list
when new outcnt >= old outcnt, as no chunk with higher sid than
new (outcnt - 1) exists in the outqueue.

While at it, also move the list_for_each code in a new function
sctp_stream_shrink_out(), which will be used in the next patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoAX.25: Fix out-of-bounds read in ax25_connect()
Peilin Ye [Wed, 22 Jul 2020 15:19:01 +0000 (11:19 -0400)]
AX.25: Fix out-of-bounds read in ax25_connect()

Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient.
ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis`
equals to 7 or 8. Fix it.

This issue has been reported as a KMSAN uninit-value bug, because in such
a case, ax25_connect() reaches into the uninitialized portion of the
`struct sockaddr_storage` statically allocated in __sys_connect().

It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`.

Reported-by: syzbot+c82752228ed975b0a623@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=55ef9d629f3b3d7d70b69558015b63b48d01af66
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoenetc: Remove the mdio bus on PF probe bailout
Claudiu Manoil [Wed, 22 Jul 2020 14:40:12 +0000 (17:40 +0300)]
enetc: Remove the mdio bus on PF probe bailout

For ENETC ports that register an external MDIO bus,
the bus doesn't get removed on the error bailout path
of enetc_pf_probe().

This issue became much more visible after recent:
commit 07095c025ac2 ("net: enetc: Use DT protocol information to set up the ports")
Before this commit, one could make probing fail on the error
path only by having register_netdev() fail, which is unlikely.
But after this commit, because it moved the enetc_of_phy_get()
call up in the probing sequence, now we can trigger an mdiobus_free()
bug just by forcing enetc_alloc_msix() to return error, i.e. with the
'pci=nomsi' kernel bootarg (since ENETC relies on MSI support to work),
as the calltrace below shows:

kernel BUG at /home/eiz/work/enetc/net/drivers/net/phy/mdio_bus.c:648!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
Hardware name: LS1028A RDB Board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
pc : mdiobus_free+0x50/0x58
lr : devm_mdiobus_free+0x14/0x20
[...]
Call trace:
 mdiobus_free+0x50/0x58
 devm_mdiobus_free+0x14/0x20
 release_nodes+0x138/0x228
 devres_release_all+0x38/0x60
 really_probe+0x1c8/0x368
 driver_probe_device+0x5c/0xc0
 device_driver_attach+0x74/0x80
 __driver_attach+0x8c/0xd8
 bus_for_each_dev+0x7c/0xd8
 driver_attach+0x24/0x30
 bus_add_driver+0x154/0x200
 driver_register+0x64/0x120
 __pci_register_driver+0x44/0x50
 enetc_pf_driver_init+0x24/0x30
 do_one_initcall+0x60/0x1c0
 kernel_init_freeable+0x1fc/0x274
 kernel_init+0x14/0x110
 ret_from_fork+0x10/0x34

Fixes: ebfcb23d62ab ("enetc: Add ENETC PF level external MDIO support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'efi-urgent-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Thomas Gleixner [Wed, 22 Jul 2020 22:46:44 +0000 (00:46 +0200)]
Merge tag 'efi-urgent-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent

Pull EFI fixes from Ard Biesheuvel:

 - Fix the layering violation in the use of the EFI runtime services
   availability mask in users of the 'efivars' abstraction
 - Revert build fix for GCC v4.8 which is no longer supported
 - Some fixes for build issues found by Atish while working on RISC-V support
 - Avoid --whole-archive when linking the stub on arm64
 - Some x86 EFI stub cleanups from Arvind

4 years agox86/dumpstack: Dump user space code correctly again
Thomas Gleixner [Wed, 22 Jul 2020 08:39:54 +0000 (10:39 +0200)]
x86/dumpstack: Dump user space code correctly again

H.J. reported that post 5.7 a segfault of a user space task does not longer
dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It
prints 'Code: Bad RIP value.' instead.

This was broken by a recent change which made probe_kernel_read() reject
non-kernel addresses.

Update show_opcodes() so it retrieves user space opcodes via
copy_from_user_nmi().

Fixes: 98a23609b103 ("maccess: always use strict semantics for probe_kernel_read")
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87h7tz306w.fsf@nanos.tec.linutronix.de
4 years agox86/stacktrace: Fix reliable check for empty user task stacks
Josh Poimboeuf [Fri, 17 Jul 2020 14:04:26 +0000 (09:04 -0500)]
x86/stacktrace: Fix reliable check for empty user task stacks

If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack.  But arch_stack_walk_reliable()
incorrectly treats it as unreliable.

That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks.  Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.

Thanks to commit 71c95825289f ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached.  So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
4 years agox86/unwind/orc: Fix ORC for newly forked tasks
Josh Poimboeuf [Fri, 17 Jul 2020 14:04:25 +0000 (09:04 -0500)]
x86/unwind/orc: Fix ORC for newly forked tasks

The ORC unwinder fails to unwind newly forked tasks which haven't yet
run on the CPU.  It correctly reads the 'ret_from_fork' instruction
pointer from the stack, but it incorrectly interprets that value as a
call stack address rather than a "signal" one, so the address gets
incorrectly decremented in the call to orc_find(), resulting in bad ORC
data.

Fix it by forcing 'ret_from_fork' frames to be signal frames.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f91a8778dde8aae7f71884b5df2b16d552040441.1594994374.git.jpoimboe@redhat.com
4 years agonfsd4: fix NULL dereference in nfsd/clients display code
J. Bruce Fields [Wed, 15 Jul 2020 17:31:36 +0000 (13:31 -0400)]
nfsd4: fix NULL dereference in nfsd/clients display code

We hold the cl_lock here, and that's enough to keep stateid's from going
away, but it's not enough to prevent the files they point to from going
away.  Take fi_lock and a reference and check for NULL, as we do in
other code.

Reported-by: NeilBrown <neilb@suse.de>
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
4 years agoMerge tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Wed, 22 Jul 2020 18:56:00 +0000 (11:56 -0700)]
Merge tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into master

Pull media fixes from Mauro Carvalho Chehab:
 "A series of fixes for the upcoming atomisp driver. They solve issues
  when probing atomisp on devices with multiple cameras and get rid of
  warnings when built with W=1.

  The diffstat is a bit long, as this driver has several abstractions.
  The patches that solved the issues with W=1 had to get rid of some
  duplicated code (there used to have 2 versions of the same code, one
  for ISP2401 and another one for ISP2400).

  As this driver is not in 5.7, such changes won't cause regressions"

* tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (38 commits)
  Revert "media: atomisp: keep the ISP powered on when setting it"
  media: atomisp: fix mask and shift operation on ISPSSPM0
  media: atomisp: move system_local consts into a C file
  media: atomisp: get rid of version-specific system_local.h
  media: atomisp: move global stuff into a common header
  media: atomisp: remove non-used 32-bits consts at system_local
  media: atomisp: get rid of some unused static vars
  media: atomisp: Fix error code in ov5693_probe()
  media: atomisp: Replace trace_printk by pr_info
  media: atomisp: Fix __func__ style warnings
  media: atomisp: fix help message for ISP2401 selection
  media: atomisp: i2c: atomisp-ov2680.c: fixed a brace coding style issue.
  media: atomisp: make const arrays static, makes object smaller
  media: atomisp: Clean up non-existing folders from Makefile
  media: atomisp: Get rid of ACPI specifics in gmin_subdev_add()
  media: atomisp: Provide Gmin subdev as parameter to gmin_subdev_add()
  media: atomisp: Use temporary variable for device in gmin_subdev_add()
  media: atomisp: Refactor PMIC detection to a separate function
  media: atomisp: Deduplicate return ret in gmin_i2c_write()
  media: atomisp: Make pointer to PMIC client global
  ...

4 years agoMerge tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkin...
Linus Torvalds [Wed, 22 Jul 2020 18:30:07 +0000 (11:30 -0700)]
Merge tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat into master

Pull exfat fixes from Namjae Jeon:

 - fix overflow issue at sector calculation

 - fix wrong hint_stat initialization

 - fix wrong size update of stream entry

 - fix endianness of upname in name_hash computation

* tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix name_hash computation on big endian systems
  exfat: fix wrong size update of stream entry by typo
  exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()
  exfat: fix overflow issue in exfat_cluster_to_sector()

4 years agoRevert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
Bjorn Helgaas [Fri, 17 Jul 2020 22:21:28 +0000 (17:21 -0500)]
Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"

This reverts commit ec411e02b7a2e785a4ed9ed283207cd14f48699d.

Patrick reported that this commit broke hybrid graphics on a ThinkPad X1
Extreme 2nd with Intel UHD Graphics 630 and NVIDIA GeForce GTX 1650 Mobile:

  nouveau 0000:01:00.0: fifo: PBDMA0: 01000000 [] ch 0 [00ff992000 DRM] subc 0 mthd 0008 data 00000000

Karol reported that this commit broke Nouveau firmware loading on a Lenovo
P1G2 with Intel UHD Graphics 630 and NVIDIA TU117GLM [Quadro T1000 Mobile]:

  nouveau 0000:01:00.0: acr: AHESASC binary failed

In both cases, reverting ec411e02b7a2 solved the problem.  Unfortunately,
this revert will reintroduce the "Thunderbolt bridges take long time to
resume from D3cold" problem:
https://bugzilla.kernel.org/show_bug.cgi?id=206837

Link: https://lore.kernel.org/r/CAErSpo5sTeK_my1dEhWp7aHD0xOp87+oHYWkTjbL7ALgDbXo-Q@mail.gmail.com
Link: https://lore.kernel.org/r/CACO55tsAEa5GXw5oeJPG=mcn+qxNvspXreJYWDJGZBy5v82JDA@mail.gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208597
Reported-by: Patrick Volkerding <volkerdi@gmail.com>
Reported-by: Karol Herbst <kherbst@redhat.com>
Fixes: ec411e02b7a2 ("PCI/PM: Assume ports without DLL Link Active train links in 100 ms")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agovirtio-mmio: Reject invalid IRQ 0 command line argument
Bjorn Helgaas [Wed, 1 Jul 2020 20:53:15 +0000 (15:53 -0500)]
virtio-mmio: Reject invalid IRQ 0 command line argument

The "virtio_mmio.device=" command line argument allows a user to specify
the size, address, and IRQ of a virtio device.  Previously the only
requirement for the IRQ was that it be an unsigned integer.

Zero is an unsigned integer but an invalid IRQ number, and after
a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is invalid"),
attempts to use IRQ 0 cause warnings.

If the user specifies IRQ 0, return failure instead of registering a device
with IRQ 0.

Fixes: a85a6c86c25be ("driver core: platform: Clarify that IRQ 0 is invalid")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
4 years agoiommu/qcom: Use domain rather than dev as tlb cookie
Rob Clark [Mon, 20 Jul 2020 15:52:17 +0000 (08:52 -0700)]
iommu/qcom: Use domain rather than dev as tlb cookie

The device may be torn down, but the domain should still be valid.  Lets
use that as the tlb flush ops cookie.

Fixes a problem reported in [1]

[1] https://lkml.org/lkml/2020/7/20/104

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 09b5dfff9ad6 ("iommu/qcom: Use accessor functions for iommu private data")
Link: https://lore.kernel.org/r/20200720155217.274994-1-robdclark@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
4 years agoMAINTAINERS: add Matthew for s390 IOMMU
Gerald Schaefer [Tue, 21 Jul 2020 13:04:30 +0000 (15:04 +0200)]
MAINTAINERS: add Matthew for s390 IOMMU

Acked-By: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
4 years agoMAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu
Ludovic Desroches [Thu, 9 Jul 2020 08:42:33 +0000 (10:42 +0200)]
MAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu

My colleague Codrin Ciubotariu, now, maintains this driver internally.
Then I handover the mainline maintenance to him.

Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agoi2c: drop duplicated word in the header file
Randy Dunlap [Fri, 17 Jul 2020 23:38:15 +0000 (16:38 -0700)]
i2c: drop duplicated word in the header file

Drop the doubled word "be" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agoi2c: cadence: Clear HOLD bit at correct time in Rx path
Raviteja Narayanam [Fri, 3 Jul 2020 13:56:12 +0000 (19:26 +0530)]
i2c: cadence: Clear HOLD bit at correct time in Rx path

There are few issues on Zynq SOC observed in the stress tests causing
timeout errors. Even though all the data is received, timeout error
is thrown. This is due to an IP bug in which the COMP bit in ISR is
not set at end of transfer and completion interrupt is not generated.

This bug is seen on Zynq platforms when the following condition occurs:
Master read & HOLD bit set & Transfer size register reaches '0'.

One workaround is to clear the HOLD bit before the transfer size
register reaches '0'. The current implementation checks for this at
the start of the loop and also only for less than FIFO DEPTH case
(ignoring the equal to case).

So clear the HOLD bit when the data yet to receive is less than or
equal to the FIFO DEPTH. This avoids the IP bug condition.

Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agoRevert "i2c: cadence: Fix the hold bit setting"
Raviteja Narayanam [Fri, 3 Jul 2020 13:55:49 +0000 (19:25 +0530)]
Revert "i2c: cadence: Fix the hold bit setting"

This reverts commit d358def706880defa4c9e87381c5bf086a97d5f9.

There are two issues with "i2c: cadence: Fix the hold bit setting" commit.

1. In case of combined message request from user space, when the HOLD
bit is cleared in cdns_i2c_mrecv function, a STOP condition is sent
on the bus even before the last message is started. This is because when
the HOLD bit is cleared, the FIFOS are empty and there is no pending
transfer. The STOP condition should occur only after the last message
is completed.

2. The code added by the commit is redundant. Driver is handling the
setting/clearing of HOLD bit in right way before the commit.

The setting of HOLD bit based on 'bus_hold_flag' is taken care in
cdns_i2c_master_xfer function even before cdns_i2c_msend/cdns_i2c_recv
functions.

The clearing of HOLD bit is taken care at the end of cdns_i2c_msend and
cdns_i2c_recv functions based on bus_hold_flag and byte count.
Since clearing of HOLD bit is done after the slave address is written to
the register (writing to address register triggers the message transfer),
it is ensured that STOP condition occurs at the right time after
completion of the pending transfer (last message).

Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
4 years agosched: Fix race against ptrace_freeze_trace()
Peter Zijlstra [Mon, 20 Jul 2020 15:20:21 +0000 (17:20 +0200)]
sched: Fix race against ptrace_freeze_trace()

There is apparently one site that violates the rule that only current
and ttwu() will modify task->state, namely ptrace_{,un}freeze_traced()
will change task->state for a remote task.

Oleg explains:

  "TASK_TRACED/TASK_STOPPED was always protected by siglock. In
particular, ttwu(__TASK_TRACED) must be always called with siglock
held. That is why ptrace_freeze_traced() assumes it can safely do
s/TASK_TRACED/__TASK_TRACED/ under spin_lock(siglock)."

This breaks the ordering scheme introduced by commit:

  dbfb089d360b ("sched: Fix loadavg accounting race")

Specifically, the reload not matching no longer implies we don't have
to block.

Simply things by noting that what we need is a LOAD->STORE ordering
and this can be provided by a control dependency.

So replace:

prev_state = prev->state;
raw_spin_lock(&rq->lock);
smp_mb__after_spinlock(); /* SMP-MB */
if (... && prev_state && prev_state == prev->state)
deactivate_task();

with:

prev_state = prev->state;
if (... && prev_state) /* CTRL-DEP */
deactivate_task();

Since that already implies the 'prev->state' load must be complete
before allowing the 'prev->on_rq = 0' store to become visible.

Fixes: dbfb089d360b ("sched: Fix loadavg accounting race")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
4 years agox86, vmlinux.lds: Page-align end of ..page_aligned sections
Joerg Roedel [Tue, 21 Jul 2020 09:34:48 +0000 (11:34 +0200)]
x86, vmlinux.lds: Page-align end of ..page_aligned sections

On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.

As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.

This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.

Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.

Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.

Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.

[ tglx: Amended changelog ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
4 years agonet: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
Murali Karicheri [Fri, 17 Jul 2020 12:19:32 +0000 (15:19 +0300)]
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload

Currently drive supports taprio offload which is a tc feature offloaded
to cpsw hardware. So driver has to set the hw feature flag, NETIF_F_HW_TC
in the net device to be compliant. This patch adds the flag.

Fixes: 8127224c2708 ("ethernet: ti: am65-cpsw-qos: add TAPRIO offload support")
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ave: Fix error returns in ave_init
Wang Hai [Fri, 17 Jul 2020 02:50:49 +0000 (10:50 +0800)]
net: ethernet: ave: Fix error returns in ave_init

When regmap_update_bits failed in ave_init(), calls of the functions
reset_control_assert() and clk_disable_unprepare() were missed.
Add goto out_reset_assert to do this.

Fixes: 57878f2f4697 ("net: ethernet: ave: add support for phy-mode setting of system controller")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrivers/net/wan/x25_asy: Fix to make it work
Xie He [Thu, 16 Jul 2020 23:44:33 +0000 (16:44 -0700)]
drivers/net/wan/x25_asy: Fix to make it work

This driver is not working because of problems of its receiving code.
This patch fixes it to make it work.

When the driver receives an LAPB frame, it should first pass the frame
to the LAPB module to process. After processing, the LAPB module passes
the data (the packet) back to the driver, the driver should then add a
one-byte pseudo header and pass the data to upper layers.

The changes to the "x25_asy_bump" function and the
"x25_asy_data_indication" function are to correctly implement this
procedure.

Also, the "x25_asy_unesc" function ignores any frame that is shorter
than 3 bytes. However the shortest frames are 2-byte long. So we need
to change it to allow 2-byte frames to pass.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Reviewed-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoipvs: fix the connection sync failed in some cases
guodeqing [Thu, 16 Jul 2020 08:12:08 +0000 (16:12 +0800)]
ipvs: fix the connection sync failed in some cases

The sync_thread_backup only checks sk_receive_queue is empty or not,
there is a situation which cannot sync the connection entries when
sk_receive_queue is empty and sk_rmem_alloc is larger than sk_rcvbuf,
the sync packets are dropped in __udp_enqueue_schedule_skb, this is
because the packets in reader_queue is not read, so the rmem is
not reclaimed.

Here I add the check of whether the reader_queue of the udp sock is
empty or not to solve this problem.

Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception")
Reported-by: zhouxudong <zhouxudong8@huawei.com>
Signed-off-by: guodeqing <geffrey.guo@huawei.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
4 years agoselftest: txtimestamp: fix net ns entry logic
Paolo Pisati [Tue, 21 Jul 2020 16:17:10 +0000 (18:17 +0200)]
selftest: txtimestamp: fix net ns entry logic

According to 'man 8 ip-netns', if `ip netns identify` returns an empty string,
there's no net namespace associated with current PID: fix the net ns entrance
logic.

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoxtensa: fix access check in csum_and_copy_from_user
Max Filippov [Tue, 21 Jul 2020 22:00:35 +0000 (15:00 -0700)]
xtensa: fix access check in csum_and_copy_from_user

Commit d341659f470b ("xtensa: switch to providing
csum_and_copy_from_user()") introduced access check, but incorrectly
tested dst instead of src.
Fix access_ok argument in csum_and_copy_from_user.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: d341659f470b ("xtensa: switch to providing csum_and_copy_from_user()")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years agoMerge branch 'qed-suppress-irrelevant-error-messages-on-HW-init'
David S. Miller [Tue, 21 Jul 2020 23:07:34 +0000 (16:07 -0700)]
Merge branch 'qed-suppress-irrelevant-error-messages-on-HW-init'

Alexander Lobakin says:

====================
qed: suppress irrelevant error messages on HW init

This raises the verbosity level of several error/warning messages on
driver/module initialization, most of which are false-positives, and
the one actively spamming the log for no reason.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoqed: suppress false-positives interrupt error messages on HW init
Alexander Lobakin [Tue, 21 Jul 2020 14:41:43 +0000 (17:41 +0300)]
qed: suppress false-positives interrupt error messages on HW init

It was found that qed_pglueb_rbc_attn_handler() can produce a lot of
false-positive error detections on driver load/reload (especially after
crashes/recoveries) and spam the kernel log:

[    4.958275] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d00ff0
[ 2079.146764] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2116.374631] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[ 2135.250564] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0
[...]

Reduce the logging level of two false-positive prone error messages from
notice to verbose on initialization (only) to not mix it with real error
attentions while debugging.

Fixes: 666db4862f2d ("qed: Revise load sequence to avoid PCI errors")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoqed: suppress "don't support RoCE & iWARP" flooding on HW init
Alexander Lobakin [Tue, 21 Jul 2020 14:41:42 +0000 (17:41 +0300)]
qed: suppress "don't support RoCE & iWARP" flooding on HW init

Change the verbosity of the "don't support RoCE & iWARP simultaneously"
warning to debug level to stop flooding on driver/hardware initialization:

[    4.783230] qede 01:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth0]
[    4.810020] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    4.861186] qede 01:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth1]
[    4.893311] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.181713] qede a1:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth2]
[    5.224740] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.276449] qede a1:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth3]
[    5.318671] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[    5.369548] qede a1:00.02: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth4]
[    5.411645] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only

Fixes: e0a8f9de16fc ("qed: Add iWARP enablement support")
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonetdevsim: fix unbalaced locking in nsim_create()
Taehee Yoo [Tue, 21 Jul 2020 14:51:50 +0000 (14:51 +0000)]
netdevsim: fix unbalaced locking in nsim_create()

In the nsim_create(), rtnl_lock() is called before nsim_bpf_init().
If nsim_bpf_init() is failed, rtnl_unlock() should be called,
but it isn't called.
So, unbalanced locking would occur.

Fixes: e05b2d141fef ("netdevsim: move netdev creation/destruction to dev probe")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: microchip: call phy_remove_link_mode during probe
Helmut Grohne [Tue, 21 Jul 2020 11:07:39 +0000 (13:07 +0200)]
net: dsa: microchip: call phy_remove_link_mode during probe

When doing "ip link set dev ... up" for a ksz9477 backed link,
ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
called. Doing so reverts any previous change to advertised link modes
e.g. using a udevd .link file.

phy_remove_link_mode is not meant to be used while opening a link and
should be called during phy probe when the link is not yet available to
userspace.

Therefore move the phy_remove_link_mode calls into
ksz9477_switch_register. It indirectly calls dsa_register_switch, which
creates the relevant struct phy_devices and we update the link modes
right after that. At that time dev->features is already initialized by
ksz9477_switch_detect.

Remove phy_setup from ksz_dev_ops as no users remain.

Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/
Fixes: 42fc6a4c613019 ("net: dsa: microchip: prepare PHY for proper advertisement")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'hns3-fixes'
David S. Miller [Tue, 21 Jul 2020 22:49:17 +0000 (15:49 -0700)]
Merge branch 'hns3-fixes'

Huazhong Tan says:

====================
net: hns3: fixes for -net

There are some bugfixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix return value error when query MAC link status fail
Jian Shen [Tue, 21 Jul 2020 11:03:54 +0000 (19:03 +0800)]
net: hns3: fix return value error when query MAC link status fail

Currently, PF queries the MAC link status per second by calling
function hclge_get_mac_link_status(). It return the error code
when failed to send cmdq command to firmware. It's incorrect,
because this return value is used as the MAC link status, which
0 means link down, and none-zero means link up. So fixes it.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix error handling for desc filling
Yunsheng Lin [Tue, 21 Jul 2020 11:03:53 +0000 (19:03 +0800)]
net: hns3: fix error handling for desc filling

The content of the TX desc is automatically cleared by the HW
when the HW has sent out the packet to the wire. When desc filling
fails in hns3_nic_net_xmit(), it will call hns3_clear_desc() to do
the error handling, which miss zeroing of the TX desc and the
checking if a unmapping is needed.

So add the zeroing and checking in hns3_clear_desc() to avoid the
above problem. Also add DESC_TYPE_UNKNOWN to indicate the info in
desc_cb is not valid, because hns3_nic_reclaim_desc() may treat
the desc_cb->type of zero as packet and add to the sent pkt
statistics accordingly.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix for not calculating TX BD send size correctly
Yunsheng Lin [Tue, 21 Jul 2020 11:03:52 +0000 (19:03 +0800)]
net: hns3: fix for not calculating TX BD send size correctly

With GRO and fraglist support, the SKB can be aggregated to
a total size of 65535, and when that SKB is forwarded through
a bridge, the size of the SKB may be pushed to exceed the size
of 65535 when br_dev_queue_push_xmit() is called.

The max send size of BD supported by the HW is 65535, when a SKB
with a headlen of over 65535 is sent to the driver, the driver
needs to use multi BD to send the linear data, and the send size
of the last BD is calculated incorrectly by the driver who is
using '&' operation, which causes a TX error.

Use '%' operation to fix this problem.

Fixes: 3fe13ed95dd3 ("net: hns3: avoid mult + div op in critical data path")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix for not unmapping TX buffer correctly
Yunsheng Lin [Tue, 21 Jul 2020 11:03:51 +0000 (19:03 +0800)]
net: hns3: fix for not unmapping TX buffer correctly

When a big TX buffer is sent using multi BD, the driver maps the
whole TX buffer, and unmaps it using info in desc_cb corresponding
to each BD, but only the info in the desc_cb of first BD is correct,
other info in desc_cb is wrong, which causes TX unmapping problem
when SMMU is on.

Only set the mapping and freeing info in the desc_cb of first BD to
fix this problem, because the TX buffer only need to be unmapped and
freed once.

Fixes: 1e8a7977d09f("net: hns3: add handling for big TX fragment")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: udp: Fix wrong clean up for IS_UDPLITE macro
Miaohe Lin [Tue, 21 Jul 2020 09:11:44 +0000 (17:11 +0800)]
net: udp: Fix wrong clean up for IS_UDPLITE macro

We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is
checked.

Fixes: b2bf1e2659b1 ("[UDP]: Clean up for IS_UDPLITE macro")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet-sysfs: add a newline when printing 'tx_timeout' by sysfs
Xiongfeng Wang [Tue, 21 Jul 2020 07:02:57 +0000 (15:02 +0800)]
net-sysfs: add a newline when printing 'tx_timeout' by sysfs

When I cat 'tx_timeout' by sysfs, it displays as follows. It's better to
add a newline for easy reading.

root@syzkaller:~# cat /sys/devices/virtual/net/lo/queues/tx-0/tx_timeout
0root@syzkaller:~#

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ravb: exit if re-initialization fails in tx timeout
Yoshihiro Shimoda [Tue, 21 Jul 2020 06:23:12 +0000 (15:23 +0900)]
net: ethernet: ravb: exit if re-initialization fails in tx timeout

According to the report of [1], this driver is possible to cause
the following error in ravb_tx_timeout_work().

ravb e6800000.ethernet ethernet: failed to switch device to config mode

This error means that the hardware could not change the state
from "Operation" to "Configuration" while some tx and/or rx queue
are operating. After that, ravb_config() in ravb_dmac_init() will fail,
and then any descriptors will be not allocaled anymore so that NULL
pointer dereference happens after that on ravb_start_xmit().

To fix the issue, the ravb_tx_timeout_work() should check
the return values of ravb_stop_dma() and ravb_dmac_init().
If ravb_stop_dma() fails, ravb_tx_timeout_work() re-enables TX and RX
and just exits. If ravb_dmac_init() fails, just exits.

[1]
https://lore.kernel.org/linux-renesas-soc/20200518045452.2390-1-dirk.behme@de.bosch.com/

Reported-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'udp-Fix-reuseport-selection-with-connected-sockets'
David S. Miller [Tue, 21 Jul 2020 22:31:03 +0000 (15:31 -0700)]
Merge branch 'udp-Fix-reuseport-selection-with-connected-sockets'

Kuniyuki Iwashima says:

====================
udp: Fix reuseport selection with connected sockets.

This patch set addresses two issues which happen when both connected and
unconnected sockets are in the same UDP reuseport group.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoudp: Improve load balancing for SO_REUSEPORT.
Kuniyuki Iwashima [Tue, 21 Jul 2020 06:15:31 +0000 (15:15 +0900)]
udp: Improve load balancing for SO_REUSEPORT.

Currently, SO_REUSEPORT does not work well if connected sockets are in a
UDP reuseport group.

Then reuseport_has_conns() returns true and the result of
reuseport_select_sock() is discarded. Also, unconnected sockets have the
same score, hence only does the first unconnected socket in udp_hslot
always receive all packets sent to unconnected sockets.

So, the result of reuseport_select_sock() should be used for load
balancing.

The noteworthy point is that the unconnected sockets placed after
connected sockets in sock_reuseport.socks will receive more packets than
others because of the algorithm in reuseport_select_sock().

    index | connected | reciprocal_scale | result
    ---------------------------------------------
    0     | no        | 20%              | 40%
    1     | no        | 20%              | 20%
    2     | yes       | 20%              | 0%
    3     | no        | 20%              | 40%
    4     | yes       | 20%              | 0%

If most of the sockets are connected, this can be a problem, but it still
works better than now.

Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets")
CC: Willem de Bruijn <willemb@google.com>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoudp: Copy has_conns in reuseport_grow().
Kuniyuki Iwashima [Tue, 21 Jul 2020 06:15:30 +0000 (15:15 +0900)]
udp: Copy has_conns in reuseport_grow().

If an unconnected socket in a UDP reuseport group connect()s, has_conns is
set to 1. Then, when a packet is received, udp[46]_lib_lookup2() scans all
sockets in udp_hslot looking for the connected socket with the highest
score.

However, when the number of sockets bound to the port exceeds max_socks,
reuseport_grow() resets has_conns to 0. It can cause udp[46]_lib_lookup2()
to return without scanning all sockets, resulting in that packets sent to
connected sockets may be distributed to unconnected sockets.

Therefore, reuseport_grow() should copy has_conns.

Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets")
CC: Willem de Bruijn <willemb@google.com>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobtrfs: fix mount failure caused by race with umount
Boris Burkov [Thu, 16 Jul 2020 20:29:46 +0000 (13:29 -0700)]
btrfs: fix mount failure caused by race with umount

It is possible to cause a btrfs mount to fail by racing it with a slow
umount. The crux of the sequence is generic_shutdown_super not yet
calling sop->put_super before btrfs_mount_root calls btrfs_open_devices.
If that occurs, btrfs_open_devices will decide the opened counter is
non-zero, increment it, and skip resetting fs_devices->total_rw_bytes to
0. From here, mount will call sget which will result in grab_super
trying to take the super block umount semaphore. That semaphore will be
held by the slow umount, so mount will block. Before up-ing the
semaphore, umount will delete the super block, resulting in mount's sget
reliably allocating a new one, which causes the mount path to dutifully
fill it out, and increment total_rw_bytes a second time, which causes
the mount to fail, as we see double the expected bytes.

Here is the sequence laid out in greater detail:

CPU0                                                    CPU1
down_write sb->s_umount
btrfs_kill_super
  kill_anon_super(sb)
    generic_shutdown_super(sb);
      shrink_dcache_for_umount(sb);
      sync_filesystem(sb);
      evict_inodes(sb); // SLOW

                                              btrfs_mount_root
                                                btrfs_scan_one_device
                                                fs_devices = device->fs_devices
                                                fs_info->fs_devices = fs_devices
                                                // fs_devices-opened makes this a no-op
                                                btrfs_open_devices(fs_devices, mode, fs_type)
                                                s = sget(fs_type, test, set, flags, fs_info);
                                                  find sb in s_instances
                                                  grab_super(sb);
                                                    down_write(&s->s_umount); // blocks

      sop->put_super(sb)
        // sb->fs_devices->opened == 2; no-op
      spin_lock(&sb_lock);
      hlist_del_init(&sb->s_instances);
      spin_unlock(&sb_lock);
      up_write(&sb->s_umount);
                                                    return 0;
                                                  retry lookup
                                                  don't find sb in s_instances (deleted by CPU0)
                                                  s = alloc_super
                                                  return s;
                                                btrfs_fill_super(s, fs_devices, data)
                                                  open_ctree // fs_devices total_rw_bytes improperly set!
                                                    btrfs_read_chunk_tree
                                                      read_one_dev // increment total_rw_bytes again!!
                                                      super_total_bytes < fs_devices->total_rw_bytes // ERROR!!!

To fix this, we clear total_rw_bytes from within btrfs_read_chunk_tree
before the calls to read_one_dev, while holding the sb umount semaphore
and the uuid mutex.

To reproduce, it is sufficient to dirty a decent number of inodes, then
quickly umount and mount.

  for i in $(seq 0 500)
  do
    dd if=/dev/zero of="/mnt/foo/$i" bs=1M count=1
  done
  umount /mnt/foo&
  mount /mnt/foo

does the trick for me.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agobtrfs: fix page leaks after failure to lock page for delalloc
Robbie Ko [Mon, 20 Jul 2020 01:42:09 +0000 (09:42 +0800)]
btrfs: fix page leaks after failure to lock page for delalloc

When locking pages for delalloc, we check if it's dirty and mapping still
matches. If it does not match, we need to return -EAGAIN and release all
pages. Only the current page was put though, iterate over all the
remaining pages too.

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agobtrfs: qgroup: fix data leak caused by race between writeback and truncate
Qu Wenruo [Fri, 17 Jul 2020 07:12:05 +0000 (15:12 +0800)]
btrfs: qgroup: fix data leak caused by race between writeback and truncate

[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:

  BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
  ------------[ cut here ]------------
  WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
  RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
  Call Trace:
   btrfs_put_super+0x15/0x17 [btrfs]
   generic_shutdown_super+0x72/0x110
   kill_anon_super+0x18/0x30
   btrfs_kill_super+0x17/0x30 [btrfs]
   deactivate_locked_super+0x3b/0xa0
   deactivate_super+0x40/0x50
   cleanup_mnt+0x135/0x190
   __cleanup_mnt+0x12/0x20
   task_work_run+0x64/0xb0
   __prepare_exit_to_usermode+0x1bc/0x1c0
   __syscall_return_slowpath+0x47/0x230
   do_syscall_64+0x64/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ---[ end trace caf08beafeca2392 ]---
  BTRFS error (device dm-3): qgroup reserved space leaked

[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0

The following sequence of events could happen after the writev():
CPU1 (writeback) | CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages() |
|- extent_write_cache_pages() |
   |- Got page for 1003520 |
   |  1003520 is Dirty, no writeback |
   |  So (!clear_page_dirty_for_io())   |
   |  gets called for it |
   |- Now page 1003520 is Clean. |
   | | btrfs_setattr()
   | | |- btrfs_setsize()
   | |    |- truncate_setsize()
   | |       New i_size is 18388
   |- __extent_writepage() |
   |  |- page_offset() > i_size |
      |- btrfs_invalidatepage() |
 |- Page is clean, so no qgroup |
    callback executed

This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.

[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().

As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.

Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agodrm/amdgpu: Fix NULL dereference in dpm sysfs handlers
Paweł Gronowski [Sun, 19 Jul 2020 15:54:53 +0000 (17:54 +0200)]
drm/amdgpu: Fix NULL dereference in dpm sysfs handlers

NULL dereference occurs when string that is not ended with space or
newline is written to some dpm sysfs interface (for example pp_dpm_sclk).
This happens because strsep replaces the tmp with NULL if the delimiter
is not present in string, which is then dereferenced by tmp[0].

Reproduction example:
sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk'

Signed-off-by: Paweł Gronowski <me@woland.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/amd/powerplay: fix a crash when overclocking Vega M
Qiu Wenbo [Fri, 17 Jul 2020 07:09:57 +0000 (15:09 +0800)]
drm/amd/powerplay: fix a crash when overclocking Vega M

Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and
vddci_voltage_table is empty. It has been tested on Intel Hades Canyon
(i7-8809G).

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489
Fixes: ac7822b0026f ("drm/amd/powerplay: add smumgr support for VEGAM (v2)")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agobtrfs: fix double free on ulist after backref resolution failure
Filipe Manana [Mon, 13 Jul 2020 14:11:56 +0000 (15:11 +0100)]
btrfs: fix double free on ulist after backref resolution failure

At btrfs_find_all_roots_safe() we allocate a ulist and set the **roots
argument to point to it. However if later we fail due to an error returned
by find_parent_nodes(), we free that ulist but leave a dangling pointer in
the **roots argument. Upon receiving the error, a caller of this function
can attempt to free the same ulist again, resulting in an invalid memory
access.

One such scenario is during qgroup accounting:

btrfs_qgroup_account_extents()

 --> calls btrfs_find_all_roots() passes &new_roots (a stack allocated
     pointer) to btrfs_find_all_roots()

   --> btrfs_find_all_roots() just calls btrfs_find_all_roots_safe()
       passing &new_roots to it

     --> allocates ulist and assigns its address to **roots (which
         points to new_roots from btrfs_qgroup_account_extents())

     --> find_parent_nodes() returns an error, so we free the ulist
         and leave **roots pointing to it after returning

 --> btrfs_qgroup_account_extents() sees btrfs_find_all_roots() returned
     an error and jumps to the label 'cleanup', which just tries to
     free again the same ulist

Stack trace example:

 ------------[ cut here ]------------
 BTRFS: tree first key check failed
 WARNING: CPU: 1 PID: 1763215 at fs/btrfs/disk-io.c:422 btrfs_verify_level_key+0xe0/0x180 [btrfs]
 Modules linked in: dm_snapshot dm_thin_pool (...)
 CPU: 1 PID: 1763215 Comm: fsstress Tainted: G        W         5.8.0-rc3-btrfs-next-64 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_verify_level_key+0xe0/0x180 [btrfs]
 Code: 28 5b 5d (...)
 RSP: 0018:ffffb89b473779a0 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff90397759bf08 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000000027 RDI: 00000000ffffffff
 RBP: ffff9039a419c000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: ffffb89b43301000 R12: 000000000000005e
 R13: ffffb89b47377a2e R14: ffffb89b473779af R15: 0000000000000000
 FS:  00007fc47e1e1000(0000) GS:ffff9039ac200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fc47e1df000 CR3: 00000003d9e4e001 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  read_block_for_search+0xf6/0x350 [btrfs]
  btrfs_next_old_leaf+0x242/0x650 [btrfs]
  resolve_indirect_refs+0x7cf/0x9e0 [btrfs]
  find_parent_nodes+0x4ea/0x12c0 [btrfs]
  btrfs_find_all_roots_safe+0xbf/0x130 [btrfs]
  btrfs_qgroup_account_extents+0x9d/0x390 [btrfs]
  btrfs_commit_transaction+0x4f7/0xb20 [btrfs]
  btrfs_sync_file+0x3d4/0x4d0 [btrfs]
  do_fsync+0x38/0x70
  __x64_sys_fdatasync+0x13/0x20
  do_syscall_64+0x5c/0xe0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fc47e2d72e3
 Code: Bad RIP value.
 RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3
 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003
 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8
 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0
 softirqs last  enabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace 8639237550317b48 ]---
 BTRFS error (device sdc): tree first key mismatch detected, bytenr=62324736 parent_transid=94 key expected=(262,108,1351680) has=(259,108,1921024)
 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
 CPU: 2 PID: 1763215 Comm: fsstress Tainted: G        W         5.8.0-rc3-btrfs-next-64 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:ulist_release+0x14/0x60 [btrfs]
 Code: c7 07 00 (...)
 RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282
 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840
 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840
 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840
 FS:  00007fc47e1e1000(0000) GS:ffff9039ac600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f8c1c0a51c8 CR3: 00000003d9e4e004 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  ulist_free+0x13/0x20 [btrfs]
  btrfs_qgroup_account_extents+0xf3/0x390 [btrfs]
  btrfs_commit_transaction+0x4f7/0xb20 [btrfs]
  btrfs_sync_file+0x3d4/0x4d0 [btrfs]
  do_fsync+0x38/0x70
  __x64_sys_fdatasync+0x13/0x20
  do_syscall_64+0x5c/0xe0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fc47e2d72e3
 Code: Bad RIP value.
 RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3
 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003
 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8
 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50
 Modules linked in: dm_snapshot dm_thin_pool (...)
 ---[ end trace 8639237550317b49 ]---
 RIP: 0010:ulist_release+0x14/0x60 [btrfs]
 Code: c7 07 00 (...)
 RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282
 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840
 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840
 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840
 FS:  00007fc47e1e1000(0000) GS:ffff9039ad200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f6a776f7d40 CR3: 00000003d9e4e002 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fix this by making btrfs_find_all_roots_safe() set *roots to NULL after
it frees the ulist.

Fixes: 8da6d5815c592b ("Btrfs: added btrfs_find_all_roots()")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agoRDMA/mlx5: Prevent prefetch from racing with implicit destruction
Jason Gunthorpe [Sun, 19 Jul 2020 06:54:35 +0000 (09:54 +0300)]
RDMA/mlx5: Prevent prefetch from racing with implicit destruction

Prefetch work in mlx5_ib_prefetch_mr_work can be queued and able to run
concurrently with destruction of the implicit MR. The num_deferred_work
was intended to serialize this, but there is a race:

       CPU0                                          CPU1

    mlx5_ib_free_implicit_mr()
      xa_erase(odp_mkeys)
      synchronize_srcu()
      __xa_erase(implicit_children)
                                      mlx5_ib_prefetch_mr_work()
                                        pagefault_mr()
                                         pagefault_implicit_mr()
                                          implicit_get_child_mr()
                                           xa_cmpxchg()
                                        atomic_dec_and_test(num_deferred_mr)
      wait_event(imr->q_deferred_work)
      ib_umem_odp_release(odp_imr)
        kfree(odp_imr)

At this point in mlx5_ib_free_implicit_mr() the implicit_children list is
supposed to be empty forever so that destroy_unused_implicit_child_mr()
and related are not and will not be running.

Since it is not empty the destroy_unused_implicit_child_mr() flow ends up
touching deallocated memory as mlx5_ib_free_implicit_mr() already tore down the
imr parent.

The solution is to flush out the prefetch wq by driving num_deferred_work
to zero after creation of new prefetch work is blocked.

Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065435.130722-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoMerge tag 'timers-v5.8-rc7' of https://git.linaro.org/people/daniel.lezcano/linux...
Thomas Gleixner [Tue, 21 Jul 2020 15:18:32 +0000 (17:18 +0200)]
Merge tag 'timers-v5.8-rc7' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent

Pull a timer chip fix from Daniel Lezcano:

 - Fix kernel panic at suspend / resume time on TI am3/am4 (Tony Lindgren)

4 years agoMerge tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Tue, 21 Jul 2020 15:06:45 +0000 (08:06 -0700)]
Merge tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into master

Pull sound fixes from Takashi Iwai:
 "This became fairly large, containing mostly the collection of ASoC
  fixes that slipped from the previous request, so I sent now a bit
  earlier than usual. But all changes look small and mostly
  device-specific, hence nothing to worry too much.

  Majority of changes are for x86 based platforms and their CODEC
  drivers, in order to address some issues hit by their recent tests and
  fuzzing. The rest are other ASoC device-specific fixes (imx, qcom,
  wm8974, amd, rockchip) as well as a trivial fix for a kernel WARNING
  hit by syzkaller"

* tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
  ALSA: hda/realtek: Fixed ALC298 sound bug by adding quirk for Samsung Notebook Pen S
  ALSA: info: Drop WARN_ON() from buffer NULL sanity check
  ASoC: rt5682: Report the button event in the headset type only
  ASoC: Intel: bytcht_es8316: Add missed put_device()
  ASoC: rt5682: Enable Vref2 under using PLL2
  ASoC: rt286: fix unexpected interrupt happens
  ASoC: wm8974: remove unsupported clock mode
  ASoC: wm8974: fix Boost Mixer Aux Switch
  ASoC: SOF: core: fix null-ptr-deref bug during device removal
  ASoc: codecs: max98373: remove Idle_bias_on to let codec suspend
  ASoC: codecs: max98373: Removed superfluous volume control from chip default
  ASoC: topology: fix tlvs in error handling for widget_dmixer
  ASoC: topology: fix kernel oops on route addition error
  ASoC: SOF: imx: add min/max channels for SAI/ESAI on i.MX8/i.MX8M
  ASoC: Intel: bdw-rt5677: fix non BE conversion
  ASoC: soc-dai: set dai_link dpcm_ flags with a helper
  MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
  ASoC: core: Remove only the registered component in devm functions
  MAINTAINERS: Change Maintainer for some at91 drivers
  ASoC: dt-bindings: simple-card: Fix 'make dt_binding_check' warnings
  ...

4 years agoclocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Tony Lindgren [Mon, 13 Jul 2020 16:26:01 +0000 (09:26 -0700)]
clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4

Carlos Hernandez <ceh@ti.com> reported that we now have a suspend and
resume regresssion on am3 and am4 compared to the earlier kernels. While
suspend and resume works with v5.8-rc3, we now get errors with rtcwake:

pm33xx pm33xx: PM: Could not transition all powerdomains to target state
...
rtcwake: write error

This is because we now fail to idle the system timer clocks that the
idle code checks and the error gets propagated to the rtcwake.

Turns out there are several issues that need to be fixed:

1. Ignore no-idle and no-reset configured timers for the ti-sysc
   interconnect target driver as otherwise it will keep the system timer
   clocks enabled

2. Toggle the system timer functional clock for suspend for am3 and am4
   (but not for clocksource on am3)

3. Only reconfigure type1 timers in dmtimer_systimer_disable()

4. Use of_machine_is_compatible() instead of of_device_is_compatible()
   for checking the SoC type

Fixes: 52762fbd1c47 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Reported-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200713162601.6829-1-tony@atomide.com
4 years agos390/cpum_cf,perf: change DFLT_CCERROR counter name
Thomas Richter [Fri, 17 Jul 2020 09:27:22 +0000 (11:27 +0200)]
s390/cpum_cf,perf: change DFLT_CCERROR counter name

Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.

Fixes: d68d5d51dc89 ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e402 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
4 years agoriscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
Vincent Chen [Fri, 10 Jul 2020 02:40:54 +0000 (10:40 +0800)]
riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence

It fails to boot the v5.8-rc4 kernel with CONFIG_KASAN because kasan_init
and kasan_early_init use uninitialized __sbi_rfence as executing the
tlb_flush_all(). Actually, at this moment, only the CPU which is
responsible for the system initialization enables the MMU. Other CPUs are
parking at the .Lsecondary_start. Hence the tlb_flush_all() is able to be
replaced by local_tlb_flush_all() to avoid using uninitialized
__sbi_rfence.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
4 years agotipc: allow to build NACK message in link timeout function
Tung Nguyen [Tue, 21 Jul 2020 01:57:05 +0000 (08:57 +0700)]
tipc: allow to build NACK message in link timeout function

Commit 02288248b051 ("tipc: eliminate gap indicator from ACK messages")
eliminated sending of the 'gap' indicator in regular ACK messages and
only allowed to build NACK message with enabled probe/probe_reply.
However, necessary correction for building NACK message was missed
in tipc_link_timeout() function. This leads to significant delay and
link reset (due to retransmission failure) in lossy environment.

This commit fixes it by setting the 'probe' flag to 'true' when
the receive deferred queue is not empty. As a result, NACK message
will be built to send back to another peer.

Fixes: 02288248b051 ("tipc: eliminate gap indicator from ACK messages")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoexfat: fix name_hash computation on big endian systems
Ilya Ponetayev [Thu, 16 Jul 2020 08:27:53 +0000 (17:27 +0900)]
exfat: fix name_hash computation on big endian systems

On-disk format for name_hash field is LE, so it must be explicitly
transformed on BE system for proper result.

Fixes: 370e812b3ec1 ("exfat: add nls operations")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Chen Minqiang <ptpt52@gmail.com>
Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
4 years agoexfat: fix wrong size update of stream entry by typo
Hyeongseok Kim [Wed, 8 Jul 2020 09:52:33 +0000 (18:52 +0900)]
exfat: fix wrong size update of stream entry by typo

The stream.size field is updated to the value of create timestamp
of the file entry. Fix this to use correct stream entry pointer.

Fixes: 29bbb14bfc80 ("exfat: fix incorrect update of stream entry in __exfat_truncate()")
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
4 years agoexfat: fix wrong hint_stat initialization in exfat_find_dir_entry()
Namjae Jeon [Fri, 3 Jul 2020 02:19:46 +0000 (11:19 +0900)]
exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()

We found the wrong hint_stat initialization in exfat_find_dir_entry().
It should be initialized when cluster is EXFAT_EOF_CLUSTER.

Fixes: ca06197382bd ("exfat: add directory operations")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
4 years agoexfat: fix overflow issue in exfat_cluster_to_sector()
Namjae Jeon [Fri, 3 Jul 2020 02:16:32 +0000 (11:16 +0900)]
exfat: fix overflow issue in exfat_cluster_to_sector()

An overflow issue can occur while calculating sector in
exfat_cluster_to_sector(). It needs to cast clus's type to sector_t
before left shifting.

Fixes: 1acf1a564b60 ("exfat: add in-memory and on-disk structures and headers")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
4 years agonet: neterion: vxge: reduce stack usage in VXGE_COMPLETE_VPATH_TX
Bixuan Cui [Mon, 20 Jul 2020 01:58:39 +0000 (09:58 +0800)]
net: neterion: vxge: reduce stack usage in VXGE_COMPLETE_VPATH_TX

Fix the warning: [-Werror=-Wframe-larger-than=]

drivers/net/ethernet/neterion/vxge/vxge-main.c:
In function'VXGE_COMPLETE_VPATH_TX.isra.37':
drivers/net/ethernet/neterion/vxge/vxge-main.c:119:1:
warning: the frame size of 1056 bytes is larger than 1024 bytes

Dropping the NR_SKB_COMPLETED to 16 is appropriate that won't
have much impact on performance and functionality.

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ag71xx: add missed clk_disable_unprepare in error path of probe
Huang Guobin [Mon, 20 Jul 2020 01:46:14 +0000 (21:46 -0400)]
net: ag71xx: add missed clk_disable_unprepare in error path of probe

The ag71xx_mdio_probe() forgets to call clk_disable_unprepare() when
of_reset_control_get_exclusive() failed. Add the missed call to fix it.

Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/sched: act_ct: fix restore the qdisc_skb_cb after defrag
wenxu [Sun, 19 Jul 2020 12:30:37 +0000 (20:30 +0800)]
net/sched: act_ct: fix restore the qdisc_skb_cb after defrag

The fragment packets do defrag in tcf_ct_handle_fragments
will clear the skb->cb which make the qdisc_skb_cb clear
too. So the qdsic_skb_cb should be store before defrag and
restore after that.
It also update the pkt_len after all the
fragments finish the defrag to one packet and make the
following actions counter correct.

Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonfc: s3fwrn5: add missing release on skb in s3fwrn5_recv_frame
Navid Emamdoost [Sat, 18 Jul 2020 05:31:49 +0000 (00:31 -0500)]
nfc: s3fwrn5: add missing release on skb in s3fwrn5_recv_frame

The implementation of s3fwrn5_recv_frame() is supposed to consume skb on
all execution paths. Release skb before returning -ENODEV.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocrypto/chtls: correct net_device reference count
Vinay Kumar Yadav [Fri, 17 Jul 2020 19:11:07 +0000 (00:41 +0530)]
crypto/chtls: correct net_device reference count

ip_dev_find() call holds net_device reference which is not needed,
use __ip_dev_find() which does not hold reference.

v1->v2:
- Correct submission tree.
- Add fixes tag.

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocrypto/chtls: fix tls alert messages corrupted by tls data
Vinay Kumar Yadav [Fri, 17 Jul 2020 19:01:42 +0000 (00:31 +0530)]
crypto/chtls: fix tls alert messages corrupted by tls data

When tls data skb is pending for Tx and tls alert comes , It
is wrongly overwrite the record type of tls data to tls alert
record type. fix the issue correcting it.

v1->v2:
- Correct submission tree.
- Add fixes tag.

Fixes: 6919a8264a32 ("Crypto/chtls: add/delete TLS header in driver")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'ionic-locking-and-filter-fixes'
David S. Miller [Tue, 21 Jul 2020 01:09:38 +0000 (18:09 -0700)]
Merge branch 'ionic-locking-and-filter-fixes'

Shannon Nelson says:

====================
ionic: locking and filter fixes

These patches address an ethtool show regs problem, some locking sightings,
and issues with RSS hash and filter_id tracking after a managed FW update.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: use mutex to protect queue operations
Shannon Nelson [Mon, 20 Jul 2020 23:00:17 +0000 (16:00 -0700)]
ionic: use mutex to protect queue operations

The ionic_wait_on_bit_lock() was a open-coded mutex knock-off
used only for protecting the queue reset operations, and there
was no reason not to use the real thing.  We can use the lock
more correctly and to better protect the queue stop and start
operations from cross threading.  We can also remove a useless
and expensive bit operation from the Rx path.

This fixes a case found where the link_status_check from a link
flap could run into an MTU change and cause a crash.

Fixes: beead698b173 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: keep rss hash after fw update
Shannon Nelson [Mon, 20 Jul 2020 23:00:16 +0000 (16:00 -0700)]
ionic: keep rss hash after fw update

Make sure the RSS hash key is kept across a fw update by not
de-initing it when an update is happening.

Fixes: c672412f6172 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: update filter id after replay
Shannon Nelson [Mon, 20 Jul 2020 23:00:15 +0000 (16:00 -0700)]
ionic: update filter id after replay

When we replay the rx filters after a fw-upgrade we get new
filter_id values from the FW, which we need to save and update
in our local filter list.  This allows us to delete the filters
with the correct filter_id when we're done.

Fixes: 7e4d47596b68 ("ionic: replay filters after fw upgrade")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: fix up filter locks and debug msgs
Shannon Nelson [Mon, 20 Jul 2020 23:00:14 +0000 (16:00 -0700)]
ionic: fix up filter locks and debug msgs

Add in a couple of forgotten spinlocks and fix up some of
the debug messages around filter management.

Fixes: c1e329ebec8d ("ionic: Add management of rx filters")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoionic: use offset for ethtool regs data
Shannon Nelson [Mon, 20 Jul 2020 23:00:13 +0000 (16:00 -0700)]
ionic: use offset for ethtool regs data

Use an offset to write the second half of the regs data into the
second half of the buffer instead of overwriting the first half.

Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hsr: check for return value of skb_put_padto()
Murali Karicheri [Mon, 20 Jul 2020 16:43:27 +0000 (12:43 -0400)]
net: hsr: check for return value of skb_put_padto()

skb_put_padto() can fail. So check for return type and return NULL
for skb. Caller checks for skb and acts correctly if it is NULL.

Fixes: 6d6148bc78d2 ("net: hsr: fix incorrect lsdu size in the tag of HSR frames for small frames")
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoDocumentation: bareudp: update iproute2 sample commands
Guillaume Nault [Mon, 20 Jul 2020 15:46:58 +0000 (17:46 +0200)]
Documentation: bareudp: update iproute2 sample commands

bareudp.rst was written before iproute2 gained support for this new
type of tunnel. Therefore, the sample command lines didn't match the
final iproute2 implementation.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: destroy workqueue when trap_register in mlxsw_emad_init
Liu Jian [Mon, 20 Jul 2020 14:31:49 +0000 (22:31 +0800)]
mlxsw: destroy workqueue when trap_register in mlxsw_emad_init

When mlxsw_core_trap_register fails in mlxsw_emad_init,
destroy_workqueue() shouled be called to destroy mlxsw_core->emad_wq.

Fixes: d965465b60ba ("mlxsw: core: Fix possible deadlock")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodpaa_eth: Fix one possible memleak in dpaa_eth_probe
Liu Jian [Mon, 20 Jul 2020 14:28:29 +0000 (22:28 +0800)]
dpaa_eth: Fix one possible memleak in dpaa_eth_probe

When dma_coerce_mask_and_coherent() fails, the alloced netdev need to be freed.

Fixes: 060ad66f9795 ("dpaa_eth: change DMA device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'smc-fixes'
David S. Miller [Tue, 21 Jul 2020 00:52:25 +0000 (17:52 -0700)]
Merge branch 'smc-fixes'

Karsten Graul says:

====================
net/smc: fixes 2020-07-20

Please apply the following patch series for smc to netdev's net tree.

Patch 1 fixes a problem with a buffer that is not put back when the
connection was killed in the meantime.
Patch 2 fixes a wrong behaviour when the maximum dmb buffer count
exceeded.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/smc: fix dmb buffer shortage
Karsten Graul [Mon, 20 Jul 2020 14:24:29 +0000 (16:24 +0200)]
net/smc: fix dmb buffer shortage

There is a current limit of 1920 registered dmb buffers per ISM device
for smc-d. One link group can contain 255 connections, each connection
is using one dmb buffer. When the connection is closed then the
registered buffer is held in a queue and is reused by the next
connection. When a link group is 'full' then another link group is
created and uses an own buffer pool. The link groups are added to a
list using list_add() which puts a new link group to the first position
in the list.
In the situation that many connections are opened (>1920) and a few of
them stay open while others are closed quickly we end up with at least 8
link groups. For a new connection a matching link group is looked up,
iterating over the list of link groups. The trailing 7 link groups
all have registered dmb buffers which could be reused, while the first
link group has only a few dmb buffers and then hit the 1920 limit.
Because the first link group is not full (255 connection limit not
reached) it is chosen and finally the connection falls back to TCP
because there is no dmb buffer available in this link group.
There are multiple ways to fix that: using list_add_tail() allows
to scan older link groups first for free buffers which ensures that
buffers are reused first. This fixes the problem for smc-r link groups
as well. For smc-d there is an even better way to address this problem
because smc-d does not have the 255 connections per link group limit.
So fix the problem for smc-d by allowing large link groups.

Fixes: c6ba7c9ba43d ("net/smc: add base infrastructure for SMC-D and ISM")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/smc: put slot when connection is killed
Karsten Graul [Mon, 20 Jul 2020 14:24:28 +0000 (16:24 +0200)]
net/smc: put slot when connection is killed

To get a send slot smc_wr_tx_get_free_slot() is called, which might
wait for a free slot. When smc_wr_tx_get_free_slot() returns there is a
check if the connection was killed in the meantime. In that case don't
only return an error, but also put back the free slot.

Fixes: b290098092e4 ("net/smc: cancel send and receive for terminated socket")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agorxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATA
David Howells [Mon, 20 Jul 2020 11:41:46 +0000 (12:41 +0100)]
rxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATA

rxrpc_sendmsg() returns EPIPE if there's an outstanding error, such as if
rxrpc_recvmsg() indicating ENODATA if there's nothing for it to read.

Change rxrpc_recvmsg() to return EAGAIN instead if there's nothing to read
as this particular error doesn't get stored in ->sk_err by the networking
core.

Also change rxrpc_sendmsg() so that it doesn't fail with delayed receive
errors (there's no way for it to report which call, if any, the error was
caused by).

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This page took 0.137935 seconds and 4 git commands to generate.