linux.git
5 years agowil6210: use writel_relaxed in wil_debugfs_iomem_x32_set
Lior David [Tue, 10 Sep 2019 13:46:39 +0000 (16:46 +0300)]
wil6210: use writel_relaxed in wil_debugfs_iomem_x32_set

writel_relaxed can be used in wil_debugfs_iomem_x32_set
since there is a wmb call immediately after.

Signed-off-by: Lior David <liord@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: report boottime_ns in scan results
Maya Erez [Tue, 10 Sep 2019 13:46:36 +0000 (16:46 +0300)]
wil6210: report boottime_ns in scan results

Call cfg80211_inform_bss_frame_data to report cfg80211 on the
boottime_ns in order to prevent the scan results filtering due to
aging.

Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: properly initialize discovery_expired_work
Dedy Lansky [Tue, 10 Sep 2019 13:46:34 +0000 (16:46 +0300)]
wil6210: properly initialize discovery_expired_work

Upon driver rmmod, cancel_work_sync() can be invoked on
p2p.discovery_expired_work before this work struct was initialized.
This causes a WARN_ON with newer kernel version.

Add initialization of discovery_expired_work inside wil_vif_init().

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: verify cid value is valid
Alexei Avshalom Lazar [Tue, 10 Sep 2019 13:46:31 +0000 (16:46 +0300)]
wil6210: verify cid value is valid

cid value is not being verified in wmi_evt_delba(),
verification is added.

Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: make sure DR bit is read before rest of the status message
Dedy Lansky [Tue, 10 Sep 2019 13:46:29 +0000 (16:46 +0300)]
wil6210: make sure DR bit is read before rest of the status message

Due to compiler optimization, it's possible that dr_bit (descriptor
ready) is read last from the status message.
Due to race condition between HW writing the status message and
driver reading it, other fields that were read earlier (before dr_bit)
could have invalid values.

Fix this by explicitly reading the dr_bit first and then using rmb
before reading the rest of the status message.

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: fix PTK re-key race
Ahmad Masri [Tue, 10 Sep 2019 13:46:26 +0000 (16:46 +0300)]
wil6210: fix PTK re-key race

Fix a race between cfg80211 add_key call and transmitting of 4/4 EAP
packet. In case the transmit is delayed until after the add key takes
place, message 4/4 will be encrypted with the new key, and the
receiver side (AP) will drop it due to MIC error.

Wil6210 will monitor and look for the transmitted packet 4/4 eap key.
In case add_key takes place before the transmission completed, then
wil6210 will let the FW store the key and wil6210 will notify the FW
to use the PTK key only after 4/4 eap packet transmission was
completed.

Signed-off-by: Ahmad Masri <amasri@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: add debugfs to show PMC ring content
Dedy Lansky [Tue, 10 Sep 2019 13:46:24 +0000 (16:46 +0300)]
wil6210: add debugfs to show PMC ring content

PMC is a hardware debug mechanism which allows capturing real time
debug data and stream it to host memory. The driver allocates memory
buffers and set them inside PMC ring of descriptors.
Add pmcring debugfs that application can use to read the binary
content of descriptors inside the PMC ring (cat pmcring).

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agowil6210: add wil_netif_rx() helper function
Dedy Lansky [Tue, 10 Sep 2019 13:46:21 +0000 (16:46 +0300)]
wil6210: add wil_netif_rx() helper function

Move common part of wil_netif_rx_any into new helper function and add
support for non-gro receive using netif_rx_ni.

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agoath10k: fix channel info parsing for non tlv target
Rakesh Pillai [Fri, 8 Mar 2019 11:26:06 +0000 (16:56 +0530)]
ath10k: fix channel info parsing for non tlv target

The tlv targets such as WCN3990 send more data in the chan info event, which is
not sent by the non tlv targets. There is a minimum size check in the wmi event
for non-tlv targets and hence we cannot update the common channel info
structure as it was done in commit 13104929d2ec ("ath10k: fill the channel
survey results for WCN3990 correctly"). This broke channel survey results on
10.x firmware versions.

If the common channel info structure is updated, the size check for chan info
event for non-tlv targets will fail and return -EPROTO and we see the below
error messages

   ath10k_pci 0000:01:00.0: failed to parse chan info event: -71

Add tlv specific channel info structure and restore the original size of the
common channel info structure to mitigate this issue.

Tested HW: WCN3990
   QCA9887
Tested FW: WLAN.HL.3.1-00784-QCAHLSWMTPLZ-1
   10.2.4-1.0-00037

Fixes: 13104929d2ec ("ath10k: fill the channel survey results for WCN3990 correctly")
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agoath10k: adjust skb length in ath10k_sdio_mbox_rx_packet
Nicolas Boichat [Tue, 10 Sep 2019 13:46:17 +0000 (16:46 +0300)]
ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet

When the FW bundles multiple packets, pkt->act_len may be incorrect
as it refers to the first packet only (however, the FW will only
bundle packets that fit into the same pkt->alloc_len).

Before this patch, the skb length would be set (incorrectly) to
pkt->act_len in ath10k_sdio_mbox_rx_packet, and then later manually
adjusted in ath10k_sdio_mbox_rx_process_packet.

The first problem is that ath10k_sdio_mbox_rx_process_packet does not
use proper skb_put commands to adjust the length (it directly changes
skb->len), so we end up with a mismatch between skb->head + skb->tail
and skb->data + skb->len. This is quite serious, and causes corruptions
in the TCP stack, as the stack tries to coalesce packets, and relies
on skb->tail being correct (that is, skb_tail_pointer must point to
the first byte_after_ the data).

Instead of re-adjusting the size in ath10k_sdio_mbox_rx_process_packet,
this moves the code to ath10k_sdio_mbox_rx_packet, and also add a
bounds check, as skb_put would crash the kernel if not enough space is
available.

Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00007-QCARMSWP-1.

Fixes: 8530b4e7b22bc3b ("ath10k: sdio: set skb len for all rx packets")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agoath10k: free beacon buf later in vdev teardown
Ben Greear [Tue, 10 Sep 2019 13:46:15 +0000 (16:46 +0300)]
ath10k: free beacon buf later in vdev teardown

My wave-1 firmware often crashes when I am bringing down
AP vdevs, and sometimes at least some machines lockup hard
after spewing IOMMU errors.

I don't see the same issue in STA mode, so I suspect beacons
are the issue.

Moving the beacon buf deletion to later in the vdev teardown
logic appears to help this problem.  Firmware still crashes
often, but several iterations did not show IOMMU errors and
machine didn't hang.

Tested hardware: QCA9880
Tested firmware: ath10k-ct from beginning of 2019, exact version unknown

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agoRevert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"
Chris Wilson [Thu, 12 Sep 2019 12:56:34 +0000 (13:56 +0100)]
Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"

The userptr put_pages can be called from inside try_to_unmap, and so
enters with the page lock held on one of the object's backing pages. We
cannot take the page lock ourselves for fear of recursion.

Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Martin Wilck <Martin.Wilck@suse.com>
Reported-by: Leo Kraav <leho@kraav.com>
Fixes: aa56a292ce62 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMerge tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/braune...
Linus Torvalds [Thu, 12 Sep 2019 13:50:14 +0000 (14:50 +0100)]
Merge tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux

Pull clone3 fix from Christian Brauner:
 "This is a last-minute bugfix for clone3() that should go in before we
  release 5.3 with clone3().

  clone3() did not verify that the exit_signal argument was set to a
  valid signal. This can be used to cause a crash by specifying a signal
  greater than NSIG. e.g. -1.

  The commit from Eugene adds a check to copy_clone_args_from_user() to
  verify that the exit signal is limited by CSIGNAL as with legacy
  clone() and that the signal is valid. With this we don't get the
  legacy clone behavior were an invalid signal could be handed down and
  would only be detected and then ignored in do_notify_parent(). Users
  of clone3() will now get a proper error right when they pass an
  invalid exit signal. Note, that this is not a change in user-visible
  behavior since no kernel with clone3() has been released yet"

* tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
  fork: block invalid exit signals with clone3()

5 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 12 Sep 2019 13:47:35 +0000 (14:47 +0100)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "A KVM guest fix, and a kdump kernel relocation errors fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/timer: Force PIT initialization when !X86_FEATURE_ARAT
  x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large to fix kexec relocation errors

5 years agoMerge tag 'drm-misc-fixes-2019-09-12' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 12 Sep 2019 13:14:29 +0000 (23:14 +1000)]
Merge tag 'drm-misc-fixes-2019-09-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v5.3 final:
- Constify modes whitelist harder.
- Fix lima driver gem_wait ioctl.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/99e52e7a-d4ce-6a2c-0501-bc559a710955@linux.intel.com
5 years agoMerge tag 'drm-intel-fixes-2019-09-11' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 12 Sep 2019 13:11:36 +0000 (23:11 +1000)]
Merge tag 'drm-intel-fixes-2019-09-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Final drm/i915 fixes for v5.3:
- Fox DP MST high color depth regression
- Fix GPU hangs on Vulkan compute workloads

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/877e6e27qm.fsf@intel.com
5 years agofork: block invalid exit signals with clone3()
Eugene Syromiatnikov [Wed, 11 Sep 2019 17:45:40 +0000 (18:45 +0100)]
fork: block invalid exit signals with clone3()

Previously, higher 32 bits of exit_signal fields were lost when copied
to the kernel args structure (that uses int as a type for the respective
field). Moreover, as Oleg has noted, exit_signal is used unchecked, so
it has to be checked for sanity before use; for the legacy syscalls,
applying CSIGNAL mask guarantees that it is at least non-negative;
however, there's no such thing is done in clone3() code path, and that
can break at least thread_group_leader.

This commit adds a check to copy_clone_args_from_user() to verify that
the exit signal is limited by CSIGNAL as with legacy clone() and that
the signal is valid. With this we don't get the legacy clone behavior
were an invalid signal could be handed down and would only be detected
and ignored in do_notify_parent(). Users of clone3() will now get a
proper error when they pass an invalid exit signal. Note, that this is
not user-visible behavior since no kernel with clone3() has been
released yet.

The following program will cause a splat on a non-fixed clone3() version
and will fail correctly on a fixed version:

 #define _GNU_SOURCE
 #include <linux/sched.h>
 #include <linux/types.h>
 #include <sched.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/syscall.h>
 #include <sys/wait.h>
 #include <unistd.h>

 int main(int argc, char *argv[])
 {
        pid_t pid = -1;
        struct clone_args args = {0};
        args.exit_signal = -1;

        pid = syscall(__NR_clone3, &args, sizeof(struct clone_args));
        if (pid < 0)
                exit(EXIT_FAILURE);

        if (pid == 0)
                exit(EXIT_SUCCESS);

        wait(NULL);

        exit(EXIT_SUCCESS);
 }

Fixes: 7f192e3cd316 ("fork: add clone3")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Link: https://lore.kernel.org/r/4b38fa4ce420b119a4c6345f42fe3cec2de9b0b5.1568223594.git.esyr@redhat.com
[christian.brauner@ubuntu.com: simplify check and rework commit message]
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
5 years agoKVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
Thomas Huth [Thu, 12 Sep 2019 11:54:38 +0000 (13:54 +0200)]
KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl

When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
an interrupt, we convert them from the legacy struct kvm_s390_interrupt
to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
However, this function does not take care of all types of interrupts
that we can inject into the guest later (see do_inject_vcpu()). Since we
do not clear out the s390irq values before calling s390int_to_s390irq(),
there is a chance that we copy random data from the kernel stack which
could be leaked to the userspace later.

Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
interrupt: s390int_to_s390irq() does not handle it, and the function
__inject_pfault_init() later copies irq->u.ext which contains the
random kernel stack data. This data can then be leaked either to
the guest memory in __deliver_pfault_init(), or the userspace might
retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl.

Fix it by handling that interrupt type in s390int_to_s390irq(), too,
and by making sure that the s390irq struct is properly pre-initialized.
And while we're at it, make sure that s390int_to_s390irq() now
directly returns -EINVAL for unknown interrupt types, so that we
immediately get a proper error code in case we add more interrupt
types to do_inject_vcpu() without updating s390int_to_s390irq()
sometime in the future.

Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
5 years agosctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'
Christophe JAILLET [Wed, 11 Sep 2019 16:02:39 +0000 (18:02 +0200)]
sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'

The '.exit' functions from 'pernet_operations' structure should be marked
as __net_exit, not __net_init.

Fixes: 8e2d61e0aed2 ("sctp: fix race on protocol/netns initialization")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Fix spelling typos
Arkadiusz Drabczyk [Tue, 10 Sep 2019 20:49:01 +0000 (22:49 +0200)]
cxgb4: Fix spelling typos

Fix several spelling typos in comments in t4_hw.c.

Signed-off-by: Arkadiusz Drabczyk <arkadiusz@drabczyk.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoixgbe: Fix secpath usage for IPsec TX offload.
Steffen Klassert [Thu, 12 Sep 2019 11:01:44 +0000 (13:01 +0200)]
ixgbe: Fix secpath usage for IPsec TX offload.

The ixgbe driver currently does IPsec TX offloading
based on an existing secpath. However, the secpath
can also come from the RX side, in this case it is
misinterpreted for TX offload and the packets are
dropped with a "bad sa_idx" error. Fix this by using
the xfrm_offload() function to test for TX offload.

Fixes: 592594704761 ("ixgbe: process the Tx ipsec offload")
Reported-by: Michael Marley <michael@michaelmarley.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoBtrfs: fix unwritten extent buffers and hangs on future writeback attempts
Filipe Manana [Wed, 11 Sep 2019 16:42:00 +0000 (17:42 +0100)]
Btrfs: fix unwritten extent buffers and hangs on future writeback attempts

The lock_extent_buffer_io() returns 1 to the caller to tell it everything
went fine and the callers needs to start writeback for the extent buffer
(submit a bio, etc), 0 to tell the caller everything went fine but it does
not need to start writeback for the extent buffer, and a negative value if
some error happened.

When it's about to return 1 it tries to lock all pages, and if a try lock
on a page fails, and we didn't flush any existing bio in our "epd", it
calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or
an error. The page might have been locked elsewhere, not with the goal
of starting writeback of the extent buffer, and even by some code other
than btrfs, like page migration for example, so it does not mean the
writeback of the extent buffer was already started by some other task,
so returning a 0 tells the caller (btree_write_cache_pages()) to not
start writeback for the extent buffer. Note that epd might currently have
either no bio, so flush_write_bio() returns 0 (success) or it might have
a bio for another extent buffer with a lower index (logical address).

Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the
extent buffer and writeback is never started for the extent buffer,
future attempts to writeback the extent buffer will hang forever waiting
on that bit to be cleared, since it can only be cleared after writeback
completes. Such hang is reported with a trace like the following:

  [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds.
  [49887.347059]       Not tainted 5.2.13-gentoo #2
  [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [49887.347062] btrfs-transacti D    0  1752      2 0x80004000
  [49887.347064] Call Trace:
  [49887.347069]  ? __schedule+0x265/0x830
  [49887.347071]  ? bit_wait+0x50/0x50
  [49887.347072]  ? bit_wait+0x50/0x50
  [49887.347074]  schedule+0x24/0x90
  [49887.347075]  io_schedule+0x3c/0x60
  [49887.347077]  bit_wait_io+0x8/0x50
  [49887.347079]  __wait_on_bit+0x6c/0x80
  [49887.347081]  ? __lock_release.isra.29+0x155/0x2d0
  [49887.347083]  out_of_line_wait_on_bit+0x7b/0x80
  [49887.347084]  ? var_wake_function+0x20/0x20
  [49887.347087]  lock_extent_buffer_for_io+0x28c/0x390
  [49887.347089]  btree_write_cache_pages+0x18e/0x340
  [49887.347091]  do_writepages+0x29/0xb0
  [49887.347093]  ? kmem_cache_free+0x132/0x160
  [49887.347095]  ? convert_extent_bit+0x544/0x680
  [49887.347097]  filemap_fdatawrite_range+0x70/0x90
  [49887.347099]  btrfs_write_marked_extents+0x53/0x120
  [49887.347100]  btrfs_write_and_wait_transaction.isra.4+0x38/0xa0
  [49887.347102]  btrfs_commit_transaction+0x6bb/0x990
  [49887.347103]  ? start_transaction+0x33e/0x500
  [49887.347105]  transaction_kthread+0x139/0x15c

So fix this by not overwriting the return value (ret) with the result
from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK
bit in case flush_write_bio() returns an error, otherwise it will hang
any future attempts to writeback the extent buffer, and undo all work
done before (set back EXTENT_BUFFER_DIRTY, etc).

This is a regression introduced in the 5.2 kernel.

Fixes: 2e3c25136adfb ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
Fixes: f4340622e0226 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up")
Reported-by: Zdenek Sojka <zsojka@seznam.cz>
Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u
Reported-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t
Reported-by: Drazen Kacar <drazen.kacar@oradian.com>
Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
5 years agoBtrfs: fix assertion failure during fsync and use of stale transaction
Filipe Manana [Tue, 10 Sep 2019 14:26:49 +0000 (15:26 +0100)]
Btrfs: fix assertion failure during fsync and use of stale transaction

Sometimes when fsync'ing a file we need to log that other inodes exist and
when we need to do that we acquire a reference on the inodes and then drop
that reference using iput() after logging them.

That generally is not a problem except if we end up doing the final iput()
(dropping the last reference) on the inode and that inode has a link count
of 0, which can happen in a very short time window if the logging path
gets a reference on the inode while it's being unlinked.

In that case we end up getting the eviction callback, btrfs_evict_inode(),
invoked through the iput() call chain which needs to drop all of the
inode's items from its subvolume btree, and in order to do that, it needs
to join a transaction at the helper function evict_refill_and_join().
However because the task previously started a transaction at the fsync
handler, btrfs_sync_file(), it has current->journal_info already pointing
to a transaction handle and therefore evict_refill_and_join() will get
that transaction handle from btrfs_join_transaction(). From this point on,
two different problems can happen:

1) evict_refill_and_join() will often change the transaction handle's
   block reserve (->block_rsv) and set its ->bytes_reserved field to a
   value greater than 0. If evict_refill_and_join() never commits the
   transaction, the eviction handler ends up decreasing the reference
   count (->use_count) of the transaction handle through the call to
   btrfs_end_transaction(), and after that point we have a transaction
   handle with a NULL ->block_rsv (which is the value prior to the
   transaction join from evict_refill_and_join()) and a ->bytes_reserved
   value greater than 0. If after the eviction/iput completes the inode
   logging path hits an error or it decides that it must fallback to a
   transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
   non-zero value from btrfs_log_dentry_safe(), and because of that
   non-zero value it tries to commit the transaction using a handle with
   a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
   the transaction commit hit an assertion failure at
   btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
   the ->block_rsv is NULL. The produced stack trace for that is like the
   following:

   [192922.917158] assertion failed: !trans->bytes_reserved, file: fs/btrfs/transaction.c, line: 816
   [192922.917553] ------------[ cut here ]------------
   [192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
   [192922.918310] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
   [192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: G        W         5.1.4-btrfs-next-47 #1
   [192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
   [192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
   (...)
   [192922.920925] RSP: 0018:ffffaebdc8a27da8 EFLAGS: 00010286
   [192922.921315] RAX: 0000000000000051 RBX: ffff95c9c16a41c0 RCX: 0000000000000000
   [192922.921692] RDX: 0000000000000000 RSI: ffff95cab6b16838 RDI: ffff95cab6b16838
   [192922.922066] RBP: ffff95c9c16a41c0 R08: 0000000000000000 R09: 0000000000000000
   [192922.922442] R10: ffffaebdc8a27e70 R11: 0000000000000000 R12: ffff95ca731a0980
   [192922.922820] R13: 0000000000000000 R14: ffff95ca84c73338 R15: ffff95ca731a0ea8
   [192922.923200] FS:  00007f337eda4e80(0000) GS:ffff95cab6b00000(0000) knlGS:0000000000000000
   [192922.923579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   [192922.923948] CR2: 00007f337edad000 CR3: 00000001e00f6002 CR4: 00000000003606e0
   [192922.924329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   [192922.924711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   [192922.925105] Call Trace:
   [192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
   [192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
   [192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
   [192922.926731]  do_fsync+0x38/0x60
   [192922.927138]  __x64_sys_fdatasync+0x13/0x20
   [192922.927543]  do_syscall_64+0x60/0x1c0
   [192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
   (...)
   [192922.934077] ---[ end trace f00808b12068168f ]---

2) If evict_refill_and_join() decides to commit the transaction, it will
   be able to do it, since the nested transaction join only increments the
   transaction handle's ->use_count reference counter and it does not
   prevent the transaction from getting committed. This means that after
   eviction completes, the fsync logging path will be using a transaction
   handle that refers to an already committed transaction. What happens
   when using such a stale transaction can be unpredictable, we are at
   least having a use-after-free on the transaction handle itself, since
   the transaction commit will call kmem_cache_free() against the handle
   regardless of its ->use_count value, or we can end up silently losing
   all the updates to the log tree after that iput() in the logging path,
   or using a transaction handle that in the meanwhile was allocated to
   another task for a new transaction, etc, pretty much unpredictable
   what can happen.

In order to fix both of them, instead of using iput() during logging, use
btrfs_add_delayed_iput(), so that the logging path of fsync never drops
the last reference on an inode, that step is offloaded to a safe context
(usually the cleaner kthread).

The assertion failure issue was sporadically triggered by the test case
generic/475 from fstests, which loads the dm error target while fsstress
is running, which lead to fsync failing while logging inodes with -EIO
errors and then trying later to commit the transaction, triggering the
assertion failure.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
5 years agoKVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target...
Igor Mammedov [Wed, 11 Sep 2019 07:52:18 +0000 (03:52 -0400)]
KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()

If userspace doesn't set KVM_MEM_LOG_DIRTY_PAGES on memslot before calling
kvm_s390_vm_start_migration(), kernel will oops with:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 0000000000000000 TEID: 0000000000000483
  Fault in home space mode while using kernel ASCE.
  AS:0000000002a2000b R2:00000001bff8c00b R3:00000001bff88007 S:00000001bff91000 P:000000000000003d
  Oops: 0004 ilc:2 [#1] SMP
  ...
  Call Trace:
  ([<001fffff804ec552>] kvm_s390_vm_set_attr+0x347a/0x3828 [kvm])
   [<001fffff804ecfc0>] kvm_arch_vm_ioctl+0x6c0/0x1998 [kvm]
   [<001fffff804b67e4>] kvm_vm_ioctl+0x51c/0x11a8 [kvm]
   [<00000000008ba572>] do_vfs_ioctl+0x1d2/0xe58
   [<00000000008bb284>] ksys_ioctl+0x8c/0xb8
   [<00000000008bb2e2>] sys_ioctl+0x32/0x40
   [<000000000175552c>] system_call+0x2b8/0x2d8
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [<0000000000dbaf60>] __memset+0xc/0xa0

due to ms->dirty_bitmap being NULL, which might crash the host.

Make sure that ms->dirty_bitmap is set before using it or
return -EINVAL otherwise.

Cc: <stable@vger.kernel.org>
Fixes: afdad61615cc ("KVM: s390: Fix storage attributes migration with memory slots")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/kvm/20190911075218.29153-1-imammedo@redhat.com/
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
5 years agonet: qrtr: fix memort leak in qrtr_tun_write_iter
Navid Emamdoost [Wed, 11 Sep 2019 15:09:02 +0000 (10:09 -0500)]
net: qrtr: fix memort leak in qrtr_tun_write_iter

In qrtr_tun_write_iter the allocated kbuf should be release in case of
error or success return.

v2 Update: Thanks to David Miller for pointing out the release on success
path as well.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Fix null de-reference of device refcount
Subash Abhinov Kasiviswanathan [Tue, 10 Sep 2019 20:02:57 +0000 (14:02 -0600)]
net: Fix null de-reference of device refcount

In event of failure during register_netdevice, free_netdev is
invoked immediately. free_netdev assumes that all the netdevice
refcounts have been dropped prior to it being called and as a
result frees and clears out the refcount pointer.

However, this is not necessarily true as some of the operations
in the NETDEV_UNREGISTER notifier handlers queue RCU callbacks for
invocation after a grace period. The IPv4 callback in_dev_rcu_put
tries to access the refcount after free_netdev is called which
leads to a null de-reference-

44837.761523:   <6> Unable to handle kernel paging request at
                    virtual address 0000004a88287000
44837.761651:   <2> pc : in_dev_finish_destroy+0x4c/0xc8
44837.761654:   <2> lr : in_dev_finish_destroy+0x2c/0xc8
44837.762393:   <2> Call trace:
44837.762398:   <2>  in_dev_finish_destroy+0x4c/0xc8
44837.762404:   <2>  in_dev_rcu_put+0x24/0x30
44837.762412:   <2>  rcu_nocb_kthread+0x43c/0x468
44837.762418:   <2>  kthread+0x118/0x128
44837.762424:   <2>  ret_from_fork+0x10/0x1c

Fix this by waiting for the completion of the call_rcu() in
case of register_netdevice errors.

Fixes: 93ee31f14f6f ("[NET]: Fix free_netdev on register_netdev failure.")
Cc: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'add-ksz9567-with-I2C-support-to-ksz9477-driver'
David S. Miller [Thu, 12 Sep 2019 10:36:12 +0000 (11:36 +0100)]
Merge branch 'add-ksz9567-with-I2C-support-to-ksz9477-driver'

George McCollister says:

====================
add ksz9567 with I2C support to ksz9477 driver

Resurrect KSZ9477 I2C driver support patch originally sent to the list
by Tristram Ha and resolve outstanding issues. It now works as similarly to
the ksz9477 SPI driver as possible, using the same regmap macros.

Add support for ksz9567 to the ksz9477 driver (tested on a board with
ksz9567 connected via I2C).

Remove NET_DSA_TAG_KSZ_COMMON since it's not needed.

Changes since v1:
Put ksz9477_i2c.c includes in alphabetical order.
Added Reviewed-Bys.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: microchip: remove NET_DSA_TAG_KSZ_COMMON
George McCollister [Tue, 10 Sep 2019 13:18:36 +0000 (08:18 -0500)]
net: dsa: microchip: remove NET_DSA_TAG_KSZ_COMMON

Remove the superfluous NET_DSA_TAG_KSZ_COMMON and just use the existing
NET_DSA_TAG_KSZ. Update the description to mention the three switch
families it supports. No functional change.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: microchip: add ksz9567 to ksz9477 driver
George McCollister [Tue, 10 Sep 2019 13:18:35 +0000 (08:18 -0500)]
net: dsa: microchip: add ksz9567 to ksz9477 driver

Add support for the KSZ9567 7-Port Gigabit Ethernet Switch to the
ksz9477 driver. The KSZ9567 supports both SPI and I2C. Oddly the
ksz9567 is already in the device tree binding documentation.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: microchip: add KSZ9477 I2C driver
Tristram Ha [Tue, 10 Sep 2019 13:18:34 +0000 (08:18 -0500)]
net: dsa: microchip: add KSZ9477 I2C driver

Add KSZ9477 I2C driver support.  The code ksz9477.c and ksz_common.c are
used together to generate the I2C driver.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
[george.mccollister@gmail.com: bring up to date, use ksz_common regmap macros]
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'
Christophe JAILLET [Tue, 10 Sep 2019 11:29:59 +0000 (13:29 +0200)]
ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'

The '.exit' functions from 'pernet_operations' structure should be marked
as __net_exit, not __net_init.

Fixes: d862e5461423 ("net: ipv6: Implement /proc/net/icmp6.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotun: fix use-after-free when register netdev failed
Yang Yingliang [Tue, 10 Sep 2019 10:56:57 +0000 (18:56 +0800)]
tun: fix use-after-free when register netdev failed

I got a UAF repport in tun driver when doing fuzzy test:

[  466.269490] ==================================================================
[  466.271792] BUG: KASAN: use-after-free in tun_chr_read_iter+0x2ca/0x2d0
[  466.271806] Read of size 8 at addr ffff888372139250 by task tun-test/2699
[  466.271810]
[  466.271824] CPU: 1 PID: 2699 Comm: tun-test Not tainted 5.3.0-rc1-00001-g5a9433db2614-dirty #427
[  466.271833] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[  466.271838] Call Trace:
[  466.271858]  dump_stack+0xca/0x13e
[  466.271871]  ? tun_chr_read_iter+0x2ca/0x2d0
[  466.271890]  print_address_description+0x79/0x440
[  466.271906]  ? vprintk_func+0x5e/0xf0
[  466.271920]  ? tun_chr_read_iter+0x2ca/0x2d0
[  466.271935]  __kasan_report+0x15c/0x1df
[  466.271958]  ? tun_chr_read_iter+0x2ca/0x2d0
[  466.271976]  kasan_report+0xe/0x20
[  466.271987]  tun_chr_read_iter+0x2ca/0x2d0
[  466.272013]  do_iter_readv_writev+0x4b7/0x740
[  466.272032]  ? default_llseek+0x2d0/0x2d0
[  466.272072]  do_iter_read+0x1c5/0x5e0
[  466.272110]  vfs_readv+0x108/0x180
[  466.299007]  ? compat_rw_copy_check_uvector+0x440/0x440
[  466.299020]  ? fsnotify+0x888/0xd50
[  466.299040]  ? __fsnotify_parent+0xd0/0x350
[  466.299064]  ? fsnotify_first_mark+0x1e0/0x1e0
[  466.304548]  ? vfs_write+0x264/0x510
[  466.304569]  ? ksys_write+0x101/0x210
[  466.304591]  ? do_preadv+0x116/0x1a0
[  466.304609]  do_preadv+0x116/0x1a0
[  466.309829]  do_syscall_64+0xc8/0x600
[  466.309849]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  466.309861] RIP: 0033:0x4560f9
[  466.309875] Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
[  466.309889] RSP: 002b:00007ffffa5166e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000127
[  466.322992] RAX: ffffffffffffffda RBX: 0000000000400460 RCX: 00000000004560f9
[  466.322999] RDX: 0000000000000003 RSI: 00000000200008c0 RDI: 0000000000000003
[  466.323007] RBP: 00007ffffa516700 R08: 0000000000000004 R09: 0000000000000000
[  466.323014] R10: 0000000000000000 R11: 0000000000000206 R12: 000000000040cb10
[  466.323021] R13: 0000000000000000 R14: 00000000006d7018 R15: 0000000000000000
[  466.323057]
[  466.323064] Allocated by task 2605:
[  466.335165]  save_stack+0x19/0x80
[  466.336240]  __kasan_kmalloc.constprop.8+0xa0/0xd0
[  466.337755]  kmem_cache_alloc+0xe8/0x320
[  466.339050]  getname_flags+0xca/0x560
[  466.340229]  user_path_at_empty+0x2c/0x50
[  466.341508]  vfs_statx+0xe6/0x190
[  466.342619]  __do_sys_newstat+0x81/0x100
[  466.343908]  do_syscall_64+0xc8/0x600
[  466.345303]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  466.347034]
[  466.347517] Freed by task 2605:
[  466.348471]  save_stack+0x19/0x80
[  466.349476]  __kasan_slab_free+0x12e/0x180
[  466.350726]  kmem_cache_free+0xc8/0x430
[  466.351874]  putname+0xe2/0x120
[  466.352921]  filename_lookup+0x257/0x3e0
[  466.354319]  vfs_statx+0xe6/0x190
[  466.355498]  __do_sys_newstat+0x81/0x100
[  466.356889]  do_syscall_64+0xc8/0x600
[  466.358037]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  466.359567]
[  466.360050] The buggy address belongs to the object at ffff888372139100
[  466.360050]  which belongs to the cache names_cache of size 4096
[  466.363735] The buggy address is located 336 bytes inside of
[  466.363735]  4096-byte region [ffff888372139100ffff88837213a100)
[  466.367179] The buggy address belongs to the page:
[  466.368604] page:ffffea000dc84e00 refcount:1 mapcount:0 mapping:ffff8883df1b4f00 index:0x0 compound_mapcount: 0
[  466.371582] flags: 0x2fffff80010200(slab|head)
[  466.372910] raw: 002fffff80010200 dead000000000100 dead000000000122 ffff8883df1b4f00
[  466.375209] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[  466.377778] page dumped because: kasan: bad access detected
[  466.379730]
[  466.380288] Memory state around the buggy address:
[  466.381844]  ffff888372139100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  466.384009]  ffff888372139180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  466.386131] >ffff888372139200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  466.388257]                                                  ^
[  466.390234]  ffff888372139280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  466.392512]  ffff888372139300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  466.394667] ==================================================================

tun_chr_read_iter() accessed the memory which freed by free_netdev()
called by tun_set_iff():

        CPUA                                           CPUB
  tun_set_iff()
    alloc_netdev_mqs()
    tun_attach()
                                                  tun_chr_read_iter()
                                                    tun_get()
                                                    tun_do_read()
                                                      tun_ring_recv()
    register_netdevice() <-- inject error
    goto err_detach
    tun_detach_all() <-- set RCV_SHUTDOWN
    free_netdev() <-- called from
                     err_free_dev path
      netdev_freemem() <-- free the memory
                        without check refcount
      (In this path, the refcount cannot prevent
       freeing the memory of dev, and the memory
       will be used by dev_put() called by
       tun_chr_read_iter() on CPUB.)
                                                     (Break from tun_ring_recv(),
                                                     because RCV_SHUTDOWN is set)
                                                   tun_put()
                                                     dev_put() <-- use the memory
                                                                   freed by netdev_freemem()

Put the publishing of tfile->tun after register_netdevice(),
so tun_get() won't get the tun pointer that freed by
err_detach path if register_netdevice() failed.

Fixes: eb0fb363f920 ("tuntap: attach queue 0 before registering netdevice")
Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Thu, 12 Sep 2019 10:07:31 +0000 (11:07 +0100)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Last minute bugfixes.

  A couple of security things.

  And an error handling bugfix that is never encountered by most people,
  but that also makes it kind of safe to push at the last minute, and it
  helps push the fix to stable a bit sooner"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: make sure log_num < in_num
  vhost: block speculation of translated descriptors
  virtio_ring: fix unmap of indirect descriptors

5 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 12 Sep 2019 10:04:50 +0000 (11:04 +0100)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fix from Ingo Molnar:
 "Fix an initialization bug in the hw-breakpoints, which triggered on
  the ARM platform"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization

5 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 12 Sep 2019 10:02:00 +0000 (11:02 +0100)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix a race in the IRQ resend mechanism, which can result in a NULL
  dereference crash"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Prevent NULL pointer dereference in resend_irqs()

5 years agoMerge tag 'pinctrl-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 12 Sep 2019 09:58:47 +0000 (10:58 +0100)]
Merge tag 'pinctrl-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fix from Linus Walleij:
 "Hopefully last pin control fix: a single patch for some Aspeed
  problems. The BMCs are much happier now"

* tag 'pinctrl-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: aspeed: Fix spurious mux failures on the AST2500

5 years agoMerge tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Thu, 12 Sep 2019 08:53:38 +0000 (09:53 +0100)]
Merge tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "I don't really like to send so many fixes at the very last minute, but
  the bug-sport activity is unpredictable.

  Four fixes, three are -stable material that will go everywhere, one is
  for the current cycle:

   - An ACPI DSDT error fixup of the type we always see and Hans
     invariably gets to fix.

   - A OF quirk fix for the current release (v5.3)

   - Some consistency checks on the userspace ABI.

   - A memory leak"

* tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist
  gpiolib: of: fix fallback quirks handling
  gpio: fix line flag validation in lineevent_create
  gpio: fix line flag validation in linehandle_create
  gpio: mockup: add missing single_release()

5 years agopinctrl: aspeed: Fix spurious mux failures on the AST2500
Andrew Jeffery [Thu, 29 Aug 2019 07:17:38 +0000 (16:47 +0930)]
pinctrl: aspeed: Fix spurious mux failures on the AST2500

Commit 674fa8daa8c9 ("pinctrl: aspeed-g5: Delay acquisition of regmaps")
was determined to be a partial fix to the problem of acquiring the LPC
Host Controller and GFX regmaps: The AST2500 pin controller may need to
fetch syscon regmaps during expression evaluation as well as when
setting mux state. For example, this case is hit by attempting to export
pins exposing the LPC Host Controller as GPIOs.

An optional eval() hook is added to the Aspeed pinmux operation struct
and called from aspeed_sig_expr_eval() if the pointer is set by the
SoC-specific driver. This enables the AST2500 to perform the custom
action of acquiring its regmap dependencies as required.

John Wang tested the fix on an Inspur FP5280G2 machine (AST2500-based)
where the issue was found, and I've booted the fix on Witherspoon
(AST2500) and Palmetto (AST2400) machines, and poked at relevant pins
under QEMU by forcing mux configurations via devmem before exporting
GPIOs to exercise the driver.

Fixes: 7d29ed88acbb ("pinctrl: aspeed: Read and write bits in LPC and GFX controllers")
Fixes: 674fa8daa8c9 ("pinctrl: aspeed-g5: Delay acquisition of regmaps")
Reported-by: John Wang <wangzqbj@inspur.com>
Tested-by: John Wang <wangzqbj@inspur.com>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Link: https://lore.kernel.org/r/20190829071738.2523-1-andrew@aj.id.au
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Wed, 11 Sep 2019 23:05:52 +0000 (00:05 +0100)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2019-09-11

This series contains fixes to ixgbe.

Alex fixes up the adaptive ITR scheme for ixgbe which could result in a
value that was either 0 or something less than 10 which was causing
issues with hardware features, like RSC, that do not function well with
ITR values that low.

Ilya Maximets fixes the ixgbe driver to limit the number of transmit
descriptors to clean by the number of transmit descriptors used in the
transmit ring, so that the driver does not try to "double" clean the
same descriptors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: read chip model from the PluDevice register
Dirk van der Merwe [Wed, 11 Sep 2019 15:21:18 +0000 (16:21 +0100)]
nfp: read chip model from the PluDevice register

The PluDevice register provides the authoritative chip model/revision.

Since the model number is purely used for reporting purposes, follow
the hardware team convention of subtracting 0x10 from the PluDevice
register to obtain the chip model/revision number.

Suggested-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: force a PSH flag on TSO packets
Eric Dumazet [Tue, 10 Sep 2019 21:49:28 +0000 (14:49 -0700)]
tcp: force a PSH flag on TSO packets

When tcp sends a TSO packet, adding a PSH flag on it
reduces the sojourn time of GRO packet in GRO receivers.

This is particularly the case under pressure, since RX queues
receive packets for many concurrent flows.

A sender can give a hint to GRO engines when it is
appropriate to flush a super-packet, especially when pacing
is in the picture, since next packet is probably delayed by
one ms.

Having less packets in GRO engine reduces chance
of LRU eviction or inflated RTT, and reduces GRO cost.

We found recently that we must not set the PSH flag on
individual full-size MSS segments [1] :

 Under pressure (CWR state), we better let the packet sit
 for a small delay (depending on NAPI logic) so that the
 ACK packet is delayed, and thus next packet we send is
 also delayed a bit. Eventually the bottleneck queue can
 be drained. DCTCP flows with CWND=1 have demonstrated
 the issue.

This patch allows to slowdown the aggregate traffic without
involving high resolution timers on senders and/or
receivers.

It has been used at Google for about four years,
and has been discussed at various networking conferences.

[1] segments smaller than MSS already have PSH flag set
    by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE
    has been requested by the user.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR
Neal Cardwell [Mon, 9 Sep 2019 20:56:02 +0000 (16:56 -0400)]
tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR

Fix tcp_ecn_withdraw_cwr() to clear the correct bit:
TCP_ECN_QUEUE_CWR.

Rationale: basically, TCP_ECN_DEMAND_CWR is a bit that is purely about
the behavior of data receivers, and deciding whether to reflect
incoming IP ECN CE marks as outgoing TCP th->ece marks. The
TCP_ECN_QUEUE_CWR bit is purely about the behavior of data senders,
and deciding whether to send CWR. The tcp_ecn_withdraw_cwr() function
is only called from tcp_undo_cwnd_reduction() by data senders during
an undo, so it should zero the sender-side state,
TCP_ECN_QUEUE_CWR. It does not make sense to stop the reflection of
incoming CE bits on incoming data packets just because outgoing
packets were spuriously retransmitted.

The bug has been reproduced with packetdrill to manifest in a scenario
with RFC3168 ECN, with an incoming data packet with CE bit set and
carrying a TCP timestamp value that causes cwnd undo. Before this fix,
the IP CE bit was ignored and not reflected in the TCP ECE header bit,
and sender sent a TCP CWR ('W') bit on the next outgoing data packet,
even though the cwnd reduction had been undone.  After this fix, the
sender properly reflects the CE bit and does not set the W bit.

Note: the bug actually predates 2005 git history; this Fixes footer is
chosen to be the oldest SHA1 I have tested (from Sep 2007) for which
the patch applies cleanly (since before this commit the code was in a
.h file).

Fixes: bdf1ee5d3bd3 ("[TCP]: Move code from tcp_ecn.h to tcp*.c and tcp.h & remove it")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Don't use dst gateway directly in ip6_confirm_neigh()
Stefano Brivio [Mon, 9 Sep 2019 20:44:06 +0000 (22:44 +0200)]
ipv6: Don't use dst gateway directly in ip6_confirm_neigh()

This is the equivalent of commit 2c6b55f45d53 ("ipv6: fix neighbour
resolution with raw socket") for ip6_confirm_neigh(): we can send a
packet with MSG_CONFIRM on a raw socket for a connected route, so the
gateway would be :: here, and we should pick the next hop using
rt6_nexthop() instead.

This was found by code review and, to the best of my knowledge, doesn't
actually fix a practical issue: the destination address from the packet
is not considered while confirming a neighbour, as ip6_confirm_neigh()
calls choose_neigh_daddr() without passing the packet, so there are no
similar issues as the one fixed by said commit.

A possible source of issues with the existing implementation might come
from the fact that, if we have a cached dst, we won't consider it,
while rt6_nexthop() takes care of that. I might just not be creative
enough to find a practical problem here: the only way to affect this
with cached routes is to have one coming from an ICMPv6 redirect, but
if the next hop is a directly connected host, there should be no
topology for which a redirect applies here, and tests with redirected
routes show no differences for MSG_CONFIRM (and MSG_PROBE) packets on
raw sockets destined to a directly connected host.

However, directly using the dst gateway here is not consistent anymore
with neighbour resolution, and, in general, as we want the next hop,
using rt6_nexthop() looks like the only sane way to fetch it.

Reported-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: pci: Add HAPS support using GMAC5
Jose Abreu [Mon, 9 Sep 2019 16:54:26 +0000 (18:54 +0200)]
net: stmmac: pci: Add HAPS support using GMAC5

Add the support for Synopsys HAPS board that uses GMAC5.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Add SGMII mode type switching
Vitaly Gaiduk [Mon, 9 Sep 2019 17:19:24 +0000 (20:19 +0300)]
net: phy: dp83867: Add SGMII mode type switching

This patch adds ability to switch beetween two PHY SGMII modes.
Some hardware, for example, FPGA IP designs may use 6-wire mode
which enables differential SGMII clock to MAC.

Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Add documentation for SGMII mode type
Vitaly Gaiduk [Mon, 9 Sep 2019 17:19:25 +0000 (20:19 +0300)]
net: phy: dp83867: Add documentation for SGMII mode type

Add documentation of ti,sgmii-ref-clock-output-enable
which can be used to select SGMII mode type (4 or 6-wire).

Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovhost: make sure log_num < in_num
yongduan [Wed, 11 Sep 2019 09:44:24 +0000 (17:44 +0800)]
vhost: make sure log_num < in_num

The code assumes log_num < in_num everywhere, and that is true as long as
in_num is incremented by descriptor iov count, and log_num by 1. However
this breaks if there's a zero sized descriptor.

As a result, if a malicious guest creates a vring desc with desc.len = 0,
it may cause the host kernel to crash by overflowing the log array. This
bug can be triggered during the VM migration.

There's no need to log when desc.len = 0, so just don't increment log_num
in this case.

Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
Cc: stable@vger.kernel.org
Reviewed-by: Lidong Chen <lidongchen@tencent.com>
Signed-off-by: ruippan <ruippan@tencent.com>
Signed-off-by: yongduan <yongduan@tencent.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
5 years agovhost: block speculation of translated descriptors
Michael S. Tsirkin [Sun, 8 Sep 2019 11:04:08 +0000 (07:04 -0400)]
vhost: block speculation of translated descriptors

iovec addresses coming from vhost are assumed to be
pre-validated, but in fact can be speculated to a value
out of range.

Userspace address are later validated with array_index_nospec so we can
be sure kernel info does not leak through these addresses, but vhost
must also not leak userspace info outside the allowed memory table to
guests.

Following the defence in depth principle, make sure
the address is not validated out of node range.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
5 years agoixgbe: fix double clean of Tx descriptors with xdp
Ilya Maximets [Thu, 22 Aug 2019 17:12:37 +0000 (20:12 +0300)]
ixgbe: fix double clean of Tx descriptors with xdp

Tx code doesn't clear the descriptors' status after cleaning.
So, if the budget is larger than number of used elems in a ring, some
descriptors will be accounted twice and xsk_umem_complete_tx will move
prod_tail far beyond the prod_head breaking the completion queue ring.

Fix that by limiting the number of descriptors to clean by the number
of used descriptors in the Tx ring.

'ixgbe_clean_xdp_tx_irq()' function refactored to look more like
'ixgbe_xsk_clean_tx_ring()' since we're allowed to directly use
'next_to_clean' and 'next_to_use' indexes.

CC: stable@vger.kernel.org
Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoixgbe: Prevent u8 wrapping of ITR value to something less than 10us
Alexander Duyck [Wed, 4 Sep 2019 15:07:11 +0000 (08:07 -0700)]
ixgbe: Prevent u8 wrapping of ITR value to something less than 10us

There were a couple cases where the ITR value generated via the adaptive
ITR scheme could exceed 126. This resulted in the value becoming either 0
or something less than 10. Switching back and forth between a value less
than 10 and a value greater than 10 can cause issues as certain hardware
features such as RSC to not function well when the ITR value has dropped
that low.

CC: stable@vger.kernel.org
Fixes: b4ded8327fea ("ixgbe: Update adaptive ITR algorithm")
Reported-by: Gregg Leventhal <gleventhal@janestreet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: fix potential RX buffer starvation for AF_XDP
Magnus Karlsson [Mon, 9 Sep 2019 16:55:38 +0000 (09:55 -0700)]
i40e: fix potential RX buffer starvation for AF_XDP

When the RX rings are created they are also populated with buffers
so that packets can be received. Usually these are kernel buffers,
but for AF_XDP in zero-copy mode, these are user-space buffers and
in this case the application might not have sent down any buffers
to the driver at this point. And if no buffers are allocated at ring
creation time, no packets can be received and no interrupts will be
generated so the NAPI poll function that allocates buffers to the
rings will never get executed.

To rectify this, we kick the NAPI context of any queue with an
attached AF_XDP zero-copy socket in two places in the code. Once
after an XDP program has loaded and once after the umem is registered.
This take care of both cases: XDP program gets loaded first then AF_XDP
socket is created, and the reverse, AF_XDP socket is created first,
then XDP program is loaded.

Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agonet/ixgbevf: make array api static const, makes object smaller
Colin Ian King [Fri, 6 Sep 2019 11:33:56 +0000 (12:33 +0100)]
net/ixgbevf: make array api static const, makes object smaller

Don't populate the array API on the stack but instead make it
static const. Makes the object code smaller by 58 bytes.

Before:
   text    data     bss     dec     hex filename
  82969    9763     256   92988   16b3c ixgbevf/ixgbevf_main.o

After:
   text    data     bss     dec     hex filename
  82815    9859     256   92930   16b02 ixgbevf/ixgbevf_main.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoiavf: fix MAC address setting for VFs when filter is rejected
Stefan Assmann [Thu, 5 Sep 2019 06:34:22 +0000 (08:34 +0200)]
iavf: fix MAC address setting for VFs when filter is rejected

Currently iavf unconditionally applies MAC address change requests. This
brings the VF in a state where it is no longer able to pass traffic if
the PF rejects a MAC filter change for the VF.
A typical scenario for a rejected MAC filter is for an untrusted VF to
request to change the MAC address when an administratively set MAC is
present.

To keep iavf working in this scenario the MAC filter handling in iavf
needs to act on the PF reply regarding the MAC filter change. In the
case of an ack the new MAC address gets set, whereas in the case of a
nack the previous MAC address needs to stay in place.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate
Stefan Assmann [Tue, 3 Sep 2019 06:08:10 +0000 (08:08 +0200)]
i40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate

In the case of an invalid min Tx rate being requested
i40e_ndo_set_vf_bw() immediately returns -EINVAL instead of releasing
__I40E_VIRTCHNL_OP_PENDING first.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: use BIT macro to specify the cloud filter field flags
Jacob Keller [Mon, 26 Aug 2019 18:16:55 +0000 (11:16 -0700)]
i40e: use BIT macro to specify the cloud filter field flags

The macros used to specify the cloud filter fields are intended to be
individual bits. Declare them using the BIT() macro to make their
intention a little more clear.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: Fix message for other card without FEC.
Czeslaw Zagorski [Mon, 26 Aug 2019 18:16:54 +0000 (11:16 -0700)]
i40e: Fix message for other card without FEC.

When variable "req_fec, fec, an" are empty,
dmesg shows log with "Requested FEC: , Negotiated FEC: , Autoneg:".
Add link dmesg log for cards without FEC.

Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: fix missed "Negotiated" string in i40e_print_link_message()
Aleksandr Loktionov [Mon, 26 Aug 2019 18:16:53 +0000 (11:16 -0700)]
i40e: fix missed "Negotiated" string in i40e_print_link_message()

The "Negotiated" string in i40e_print_link_message() function was missed.
This string has been added to the dmesg and small refactoring done removing
common substrings and unifying link status message format.
Without this patch it was not clear that FEC is related to negotiated FEC.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: mark additional missing bits as reserved
Jacob Keller [Mon, 26 Aug 2019 18:16:52 +0000 (11:16 -0700)]
i40e: mark additional missing bits as reserved

Mark bits 0xD through 0xF for the command flags of a cloud filter as
reserved. These bits are not yet defined and are considered as reserved
in the data sheet.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP
Jacob Keller [Mon, 26 Aug 2019 18:16:51 +0000 (11:16 -0700)]
i40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP

The bit 0x0001 used in the cloud filters adminq command is reserved, and
is not actually a valid type.

The Linux driver has never used this type, and it's not clear if any
driver ever has.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: use ktime_get_real_ts64 instead of ktime_to_timespec64
Jacob Keller [Mon, 26 Aug 2019 18:16:50 +0000 (11:16 -0700)]
i40e: use ktime_get_real_ts64 instead of ktime_to_timespec64

Remove a call to ktime_to_timespec64 by calling ktime_get_real_ts64
directly.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoixgbe: use skb_get_queue_mapping in tx path
Tonghao Zhang [Thu, 22 Aug 2019 10:56:46 +0000 (18:56 +0800)]
ixgbe: use skb_get_queue_mapping in tx path

Use the common api, and don't access queue_mapping directly.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoi40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
Stefan Assmann [Wed, 21 Aug 2019 14:09:29 +0000 (16:09 +0200)]
i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask

While testing VF spawn/destroy the following panic occurred.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
[...]
Workqueue: i40e i40e_service_task [i40e]
RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e]
[...]
Call Trace:
 ? __switch_to_asm+0x35/0x70
 ? __switch_to_asm+0x41/0x70
 ? __switch_to_asm+0x35/0x70
 ? _cond_resched+0x15/0x30
 i40e_sync_filters_subtask+0x56/0x70 [i40e]
 i40e_service_task+0x382/0x11b0 [i40e]
 ? __switch_to_asm+0x41/0x70
 ? __switch_to_asm+0x41/0x70
 process_one_work+0x1a7/0x3b0
 worker_thread+0x30/0x390
 ? create_worker+0x1a0/0x1a0
 kthread+0x112/0x130
 ? kthread_bind+0x30/0x30
 ret_from_fork+0x35/0x40

Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
accessed by the watchdog via i40e_sync_filters_subtask() although
i40e_free_vfs() already free'd pf->vf.
To avoid this the call to i40e_sync_vsi_filters() in
i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
which is also used by i40e_free_vfs().

Note: put the __I40E_VF_DISABLE check after the
__I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
trigger.

CC: stable@vger.kernel.org
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoixgbe: fix memory leaks
Wenwen Wang [Sun, 11 Aug 2019 20:07:47 +0000 (15:07 -0500)]
ixgbe: fix memory leaks

In ixgbe_configure_clsu32(), 'jump', 'input', and 'mask' are allocated
through kzalloc() respectively in a for loop body. Then,
ixgbe_clsu32_build_input() is invoked to build the input. If this process
fails, next iteration of the for loop will be executed. However, the
allocated 'jump', 'input', and 'mask' are not deallocated on this execution
path, leading to memory leaks.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agodt-bindings: net: dwmac: document 'mac-mode' property
Alexandru Ardelean [Fri, 6 Sep 2019 13:02:56 +0000 (16:02 +0300)]
dt-bindings: net: dwmac: document 'mac-mode' property

This change documents the 'mac-mode' property that was introduced in the
'stmmac' driver to support passive mode converters that can sit in-between
the MAC & PHY.

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: implement support for passive mode converters via dt
Alexandru Ardelean [Fri, 6 Sep 2019 13:02:55 +0000 (16:02 +0300)]
net: stmmac: implement support for passive mode converters via dt

In-between the MAC & PHY there can be a mode converter, which converts one
mode to another (e.g. GMII-to-RGMII).

The converter, can be passive (i.e. no driver or OS/SW information
required), so the MAC & PHY need to be configured differently.

For the `stmmac` driver, this is implemented via a `mac-mode` property in
the device-tree, which configures the MAC into a certain mode, and for the
PHY a `phy_interface` field will hold the mode of the PHY. The mode of the
PHY will be passed to the PHY and from there-on it work in a different
mode. If unspecified, the default `phy-mode` will be used for both.

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlx4: fix spelling mistake "veify" -> "verify"
Colin Ian King [Wed, 11 Sep 2019 14:18:11 +0000 (15:18 +0100)]
mlx4: fix spelling mistake "veify" -> "verify"

There is a spelling mistake in a mlx4_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix spelling mistake "undeflow" -> "underflow"
Colin Ian King [Wed, 11 Sep 2019 14:08:16 +0000 (15:08 +0100)]
net: hns3: fix spelling mistake "undeflow" -> "underflow"

There is a spelling mistake in a .msg literal string. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'qed-Fix-series'
David S. Miller [Wed, 11 Sep 2019 14:15:23 +0000 (15:15 +0100)]
Merge branch 'qed-Fix-series'

Sudarsana Reddy Kalluru says:

====================
qed* Fix series.

The patch series addresses couple of issues in the recent commits.
Patch (1) populates the actual dump-size of config attribute instead of
providing a fixed size value.
Patch(2) updates frame format of flash config buffer as required by
management FW (MFW).

Please consider applying it to net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Fix Config attribute frame format.
Sudarsana Reddy Kalluru [Wed, 11 Sep 2019 11:42:51 +0000 (04:42 -0700)]
qed: Fix Config attribute frame format.

MFW associates the entity id to a config attribute instead of assigning
one entity id for all the config attributes.
This patch incorporates driver changes to link entity id to a config id
attribute.

Fixes: 0dabbe1bb3a4 ("qed: Add driver API for flashing the config attributes.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed*: Fix size of config attribute dump.
Sudarsana Reddy Kalluru [Wed, 11 Sep 2019 11:42:50 +0000 (04:42 -0700)]
qed*: Fix size of config attribute dump.

Driver currently returns max-buf-size as size of the config attribute.
This patch incorporates changes to read this value from MFW (if available)
and provide it to the user. Also did a trivial clean up in this path.

Fixes: d44a3ced7023 ("qede: Add support for reading the config id attributes.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: lmc: fix spelling mistake "runnin" -> "running"
Colin Ian King [Wed, 11 Sep 2019 11:37:34 +0000 (12:37 +0100)]
net: lmc: fix spelling mistake "runnin" -> "running"

There is a spelling mistake in the lmc_trace message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'devlink-unknown'
David S. Miller [Wed, 11 Sep 2019 14:10:05 +0000 (15:10 +0100)]
Merge branch 'devlink-unknown'

Simon Horman says:

====================
devlink: add unknown 'fw_load_policy' value

Dirk says:

Recently we added an unknown value for the 'reset_dev_on_drv_probe' devlink
parameter. Extend the 'fw_load_policy' parameter in the same way.

The only driver that uses this right now is the nfp driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: devlink: set unknown fw_load_policy
Dirk van der Merwe [Wed, 11 Sep 2019 11:08:33 +0000 (12:08 +0100)]
nfp: devlink: set unknown fw_load_policy

If the 'app_fw_from_flash' HWinfo key is invalid, set the
'fw_load_policy' devlink parameter value to unknown.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: add unknown 'fw_load_policy' value
Dirk van der Merwe [Wed, 11 Sep 2019 11:08:32 +0000 (12:08 +0100)]
devlink: add unknown 'fw_load_policy' value

Similar to the 'reset_dev_on_drv_probe' devlink parameter, it is useful
to have an unknown value which can be used by drivers to report that the
hardware value isn't recognized or is otherwise invalid instead of
failing the operation.

This is especially useful for u8/enum parameters.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoNFC: st95hf: fix spelling mistake "receieve" -> "receive"
Colin Ian King [Wed, 11 Sep 2019 10:38:48 +0000 (11:38 +0100)]
NFC: st95hf: fix spelling mistake "receieve" -> "receive"

There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/rds: An rds_sock is added too early to the hash table
Ka-Cheong Poon [Wed, 11 Sep 2019 09:58:05 +0000 (02:58 -0700)]
net/rds: An rds_sock is added too early to the hash table

In rds_bind(), an rds_sock is added to the RDS bind hash table before
rs_transport is set.  This means that the socket can be found by the
receive code path when rs_transport is NULL.  And the receive code
path de-references rs_transport for congestion update check.  This can
cause a panic.  An rds_sock should not be added to the bind hash table
before all the needed fields are set.

Reported-by: syzbot+4b4f8163c2e246df3c4c@syzkaller.appspotmail.com
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomac80211: Do not send Layer 2 Update frame before authorization
Jouni Malinen [Wed, 11 Sep 2019 13:03:05 +0000 (16:03 +0300)]
mac80211: Do not send Layer 2 Update frame before authorization

The Layer 2 Update frame is used to update bridges when a station roams
to another AP even if that STA does not transmit any frames after the
reassociation. This behavior was described in IEEE Std 802.11F-2003 as
something that would happen based on MLME-ASSOCIATE.indication, i.e.,
before completing 4-way handshake. However, this IEEE trial-use
recommended practice document was published before RSN (IEEE Std
802.11i-2004) and as such, did not consider RSN use cases. Furthermore,
IEEE Std 802.11F-2003 was withdrawn in 2006 and as such, has not been
maintained amd should not be used anymore.

Sending out the Layer 2 Update frame immediately after association is
fine for open networks (and also when using SAE, FT protocol, or FILS
authentication when the station is actually authenticated by the time
association completes). However, it is not appropriate for cases where
RSN is used with PSK or EAP authentication since the station is actually
fully authenticated only once the 4-way handshake completes after
authentication and attackers might be able to use the unauthenticated
triggering of Layer 2 Update frame transmission to disrupt bridge
behavior.

Fix this by postponing transmission of the Layer 2 Update frame from
station entry addition to the point when the station entry is marked
authorized. Similarly, send out the VLAN binding update only if the STA
entry has already been authorized.

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host controller"
Daniel Drake [Thu, 5 Sep 2019 04:55:57 +0000 (12:55 +0800)]
Revert "mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host controller"

This reverts commit 414126f9e5abf1973c661d24229543a9458fa8ce.

This commit broke eMMC storage access on a new consumer MiniPC based on
AMD SoC, which has eMMC connected to:

02:00.0 SD Host controller: O2 Micro, Inc. Device 8620 (rev 01) (prog-if 01)
Subsystem: O2 Micro, Inc. Device 0002

During probe, several errors are seen including:

  mmc1: Got data interrupt 0x02000000 even though no data operation was in progress.
  mmc1: Timeout waiting for hardware interrupt.
  mmc1: error -110 whilst initialising MMC card

Reverting this commit allows the eMMC storage to be detected & usable
again.

Signed-off-by: Daniel Drake <drake@endlessm.com>
Fixes: 414126f9e5ab ("mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host
controller")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 years agoRevert "mmc: bcm2835: Terminate timeout work synchronously"
Stefan Wahren [Sun, 8 Sep 2019 07:45:52 +0000 (09:45 +0200)]
Revert "mmc: bcm2835: Terminate timeout work synchronously"

The commit 37fefadee8bb ("mmc: bcm2835: Terminate timeout work
synchronously") causes lockups in case of hardware timeouts due the
timeout work also calling cancel_delayed_work_sync() on its own.
So revert it.

Fixes: 37fefadee8bb ("mmc: bcm2835: Terminate timeout work synchronously")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
5 years agoMerge tag 'mac80211-next-for-davem-2019-09-11' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 11 Sep 2019 13:57:17 +0000 (14:57 +0100)]
Merge tag 'mac80211-next-for-davem-2019-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
We have a number of changes, but things are settling down:
 * a fix in the new 6 GHz channel support
 * a fix for recent minstrel (rate control) updates
   for an infinite loop
 * handle interface type changes better wrt. management frame
   registrations (for management frames sent to userspace)
 * add in-BSS RX time to survey information
 * handle HW rfkill properly if !CONFIG_RFKILL
 * send deauth on IBSS station expiry, to avoid state mismatches
 * handle deferred crypto tailroom updates in mac80211 better
   when device restart happens
 * fix a spectre-v1 - really a continuation of a previous patch
 * advertise NL80211_CMD_UPDATE_FT_IES as supported if so
 * add some missing parsing in VHT extended NSS support
 * support HE in mac80211_hwsim
 * let mac80211 drivers determine the max MTU themselves
along with the usual cleanups etc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agogpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist
Hans de Goede [Tue, 27 Aug 2019 20:28:35 +0000 (22:28 +0200)]
gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist

Another day; another DSDT bug we need to workaround...

Since commit ca876c7483b6 ("gpiolib-acpi: make sure we trigger edge events
at least once on boot") we call _AEI edge handlers at boot.

In some rare cases this causes problems. One example of this is the Minix
Neo Z83-4 mini PC, this device has a clear DSDT bug where it has some copy
and pasted code for dealing with Micro USB-B connector host/device role
switching, while the mini PC does not even have a micro-USB connector.
This code, which should not be there, messes with the DDC data pin from
the HDMI connector (switching it to GPIO mode) breaking HDMI support.

To avoid problems like this, this commit adds a new
gpiolib_acpi.run_edge_events_on_boot kernel commandline option, which
allows disabling the running of _AEI edge event handlers at boot.

The default value is -1/auto which uses a DMI based blacklist, the initial
version of this blacklist contains the Neo Z83-4 fixing the HDMI breakage.

Cc: stable@vger.kernel.org
Cc: Daniel Drake <drake@endlessm.com>
Cc: Ian W MORRISON <ianwmorrison@gmail.com>
Reported-by: Ian W MORRISON <ianwmorrison@gmail.com>
Suggested-by: Ian W MORRISON <ianwmorrison@gmail.com>
Fixes: ca876c7483b6 ("gpiolib-acpi: make sure we trigger edge events at least once on boot")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20190827202835.213456-1-hdegoede@redhat.com
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Ian W MORRISON <ianwmorrison@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agocfg80211: Purge frame registrations on iftype change
Denis Kenzior [Wed, 28 Aug 2019 21:11:10 +0000 (16:11 -0500)]
cfg80211: Purge frame registrations on iftype change

Currently frame registrations are not purged, even when changing the
interface type.  This can lead to potentially weird situations where
frames possibly not allowed on a given interface type remain registered
due to the type switching happening after registration.

The kernel currently relies on userspace apps to actually purge the
registrations themselves, this is not something that the kernel should
rely on.

Add a call to cfg80211_mlme_purge_registrations() to forcefully remove
any registrations left over prior to switching the iftype.

Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190828211110.15005-1-denkenz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
5 years agolib/Kconfig: fix OBJAGG in lib/ menu structure
Randy Dunlap [Mon, 9 Sep 2019 21:54:21 +0000 (14:54 -0700)]
lib/Kconfig: fix OBJAGG in lib/ menu structure

Keep the "Library routines" menu intact by moving OBJAGG into it.
Otherwise OBJAGG is displayed/presented as an orphan in the
various config menus.

Fixes: 0a020d416d0a ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'mlx5-updates-2019-09-10' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 11 Sep 2019 08:26:34 +0000 (09:26 +0100)]
Merge tag 'mlx5-updates-2019-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-09-10

Misc build warnings cleanup for mlx5:

1) Reduce stack usage in FW trace
2) Fix addr's type in mlx5dr_icm_dm
3) Fix rt's type in dr_action_create_reformat_action
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'stmmac-next'
David S. Miller [Wed, 11 Sep 2019 08:21:34 +0000 (09:21 +0100)]
Merge branch 'stmmac-next'

Jose Abreu says:

====================
net: stmmac: Improvements for -next

Misc patches for -next. It includes:
 - Two fixes for features in -next only
 - New features support for GMAC cores (which includes GMAC4 and GMAC5)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: ARP Offload for GMAC4+ Cores
Jose Abreu [Tue, 10 Sep 2019 14:41:27 +0000 (16:41 +0200)]
net: stmmac: ARP Offload for GMAC4+ Cores

Implement the ARP Offload feature in GMAC4 and GMAC5 cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add support for VLAN Insertion Offload in GMAC4+
Jose Abreu [Tue, 10 Sep 2019 14:41:26 +0000 (16:41 +0200)]
net: stmmac: Add support for VLAN Insertion Offload in GMAC4+

Adds support for TX VLAN Offload using descriptors based features
available in GMAC4/5.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add support for SA Insertion/Replacement in GMAC4+
Jose Abreu [Tue, 10 Sep 2019 14:41:25 +0000 (16:41 +0200)]
net: stmmac: Add support for SA Insertion/Replacement in GMAC4+

Add the support for Source Address Insertion and Replacement in GMAC4
and GMAC5 cores. Two methods are supported: Descriptor based and
register based.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: xgmac: Reinitialize correctly a variable
Jose Abreu [Tue, 10 Sep 2019 14:41:24 +0000 (16:41 +0200)]
net: stmmac: xgmac: Reinitialize correctly a variable

'value' was being or'ed with a value from another register. This is a
typo and could cause new written value to be wrong. Fix it.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add VLAN HASH filtering support in GMAC4+
Jose Abreu [Tue, 10 Sep 2019 14:41:23 +0000 (16:41 +0200)]
net: stmmac: Add VLAN HASH filtering support in GMAC4+

Adds the support for VLAN HASH Filtering in GMAC4/5 cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Prevent divide-by-zero
Jose Abreu [Tue, 10 Sep 2019 14:41:22 +0000 (16:41 +0200)]
net: stmmac: Prevent divide-by-zero

When RX Coalesce settings are set to all zero (which is a valid setting)
we will currently get a divide-by-zero error. Fix it.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sonic: replace dev_kfree_skb in sonic_send_packet
Mao Wenan [Wed, 11 Sep 2019 01:36:23 +0000 (09:36 +0800)]
net: sonic: replace dev_kfree_skb in sonic_send_packet

sonic_send_packet will be processed in irq or non-irq
context, so it would better use dev_kfree_skb_any
instead of dev_kfree_skb.

Fixes: d9fb9f384292 ("*sonic/natsemi/ns83829: Move the National Semi-conductor drivers")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agowimax: i2400: fix memory leak
Navid Emamdoost [Tue, 10 Sep 2019 23:01:40 +0000 (18:01 -0500)]
wimax: i2400: fix memory leak

In i2400m_op_rfkill_sw_toggle cmd buffer should be released along with
skb response.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'hns3-next'
David S. Miller [Wed, 11 Sep 2019 08:08:46 +0000 (09:08 +0100)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: add a feature & bugfixes & cleanups

This patch-set includes a VF feature, bugfixes and cleanups for the HNS3
ethernet controller driver.

[patch 01/07] adds ethtool_ops.set_channels support for HNS3 VF driver

[patch 02/07] adds a recovery for setting channel fail.

[patch 03/07] fixes an error related to shaper parameter algorithm.

[patch 04/07] fixes an error related to ksetting.

[patch 05/07] adds cleanups for some log pinting.

[patch 06/07] adds a NULL pointer check before function calling.

[patch 07/07] adds some debugging information for reset issue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: add some DFX info for reset issue
Huazhong Tan [Wed, 11 Sep 2019 02:40:39 +0000 (10:40 +0800)]
net: hns3: add some DFX info for reset issue

This patch adds more information for reset DFX. Also, adds some
cleanups to reset info, move reset_fail_cnt into struct
hclge_rst_stats, and modifies some print formats.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: check NULL pointer before use
Guangbin Huang [Wed, 11 Sep 2019 02:40:38 +0000 (10:40 +0800)]
net: hns3: check NULL pointer before use

This patch checks ops->set_default_reset_request whether is NULL
before using it in function hns3_slot_reset.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: modify some logs format
Guangbin Huang [Wed, 11 Sep 2019 02:40:37 +0000 (10:40 +0800)]
net: hns3: modify some logs format

The pfc_en and pfc_map need to be displayed in hexadecimal notation,
printing dma address should use %pad, and the end of printed string
needs to be add "\n".

This patch modifies them.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix port setting handle for fibre port
Guangbin Huang [Wed, 11 Sep 2019 02:40:36 +0000 (10:40 +0800)]
net: hns3: fix port setting handle for fibre port

For hardware doesn't support use specified speed and duplex
to negotiate, it's unnecessary to check and modify the port
speed and duplex for fibre port when autoneg is on.

Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: fix shaper parameter algorithm
Yonglong Liu [Wed, 11 Sep 2019 02:40:35 +0000 (10:40 +0800)]
net: hns3: fix shaper parameter algorithm

Currently when hns3 driver configures the tm shaper to limit
bandwidth below 20Mbit using the parameters calculated by
hclge_shaper_para_calc(), the actual bandwidth limited by tm
hardware module is not accurate enough, for example, 1.28 Mbit
when the user is configuring 1 Mbit.

This patch adjusts the ir_calc to be closer to ir, and
always calculate the ir_b parameter when user is configuring
a small bandwidth. Also, removes an unnecessary parenthesis
when calculating denominator.

Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: revert to old channel when setting new channel num fail
Peng Li [Wed, 11 Sep 2019 02:40:34 +0000 (10:40 +0800)]
net: hns3: revert to old channel when setting new channel num fail

After setting new channel num, it needs free old ring memory and
allocate new ring memory. If there is no enough memory and allocate
new ring memory fail, the ring may initialize fail. To make sure
the network interface can work normally, driver should revert the
channel to the old configuration.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This page took 0.137039 seconds and 4 git commands to generate.