]> Git Repo - linux.git/log
linux.git
6 years agodrm/vmwgfx: don't check for old_crtc_state enable status
Deepak Rawat [Thu, 13 Sep 2018 10:33:49 +0000 (12:33 +0200)]
drm/vmwgfx: don't check for old_crtc_state enable status

During atomic check to prepare the new topology no need to check if
old_crtc_state was enabled or not. This will cause atomic_check to fail
because due to connector routing a crtc can be in atomic_state even if
there was no change to enable status.

Detected this issue with igt run.

Signed-off-by: Deepak Rawat <[email protected]>
Reviewed-by: Sinclair Yeh <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
6 years agonet: ibm: remove redundant local variables 'act_nr_of_entries' and 'act_pages'
zhong jiang [Wed, 19 Sep 2018 14:30:49 +0000 (22:30 +0800)]
net: ibm: remove redundant local variables 'act_nr_of_entries' and 'act_pages'

That local variable are never used after being assigned.
hence it should be redundant and can be removed.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: ibm: remove a redundant local variable 'k'
zhong jiang [Wed, 19 Sep 2018 14:23:15 +0000 (22:23 +0800)]
net: ibm: remove a redundant local variable 'k'

The local variable 'k' is never used after being assigned.
hence it should be redundant adn can be removed.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvneta: fix the Rx desc buffer DMA unmapping
Antoine Tenart [Wed, 19 Sep 2018 13:29:06 +0000 (15:29 +0200)]
net: mvneta: fix the Rx desc buffer DMA unmapping

With CONFIG_DMA_API_DEBUG enabled we now get a warning when using the
mvneta driver:

  mvneta d0030000.ethernet: DMA-API: device driver frees DMA memory with
  wrong function [device address=0x000000001165b000] [size=4096 bytes]
  [mapped as page] [unmapped as single]

This is because when using the s/w buffer management, the Rx descriptor
buffer is mapped with dma_map_page but unmapped with dma_unmap_single.
This patch fixes this by using the right unmapping function.

Fixes: 562e2f467e71 ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoip6_tunnel: be careful when accessing the inner header
Paolo Abeni [Wed, 19 Sep 2018 13:02:07 +0000 (15:02 +0200)]
ip6_tunnel: be careful when accessing the inner header

the ip6 tunnel xmit ndo assumes that the processed skb always
contains an ip[v6] header, but syzbot has found a way to send
frames that fall short of this assumption, leading to the following splat:

BUG: KMSAN: uninit-value in ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307
[inline]
BUG: KMSAN: uninit-value in ip6_tnl_start_xmit+0x7d2/0x1ef0
net/ipv6/ip6_tunnel.c:1390
CPU: 0 PID: 4504 Comm: syz-executor558 Not tainted 4.16.0+ #87
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x185/0x1d0 lib/dump_stack.c:53
  kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
  ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307 [inline]
  ip6_tnl_start_xmit+0x7d2/0x1ef0 net/ipv6/ip6_tunnel.c:1390
  __netdev_start_xmit include/linux/netdevice.h:4066 [inline]
  netdev_start_xmit include/linux/netdevice.h:4075 [inline]
  xmit_one net/core/dev.c:3026 [inline]
  dev_hard_start_xmit+0x5f1/0xc70 net/core/dev.c:3042
  __dev_queue_xmit+0x27ee/0x3520 net/core/dev.c:3557
  dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
  packet_snd net/packet/af_packet.c:2944 [inline]
  packet_sendmsg+0x7c70/0x8a30 net/packet/af_packet.c:2969
  sock_sendmsg_nosec net/socket.c:630 [inline]
  sock_sendmsg net/socket.c:640 [inline]
  ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
  __sys_sendmmsg+0x42d/0x800 net/socket.c:2136
  SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167
  SyS_sendmmsg+0x63/0x90 net/socket.c:2162
  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x441819
RSP: 002b:00007ffe58ee8268 EFLAGS: 00000213 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441819
RDX: 0000000000000002 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006cd018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000402510
R13: 00000000004025a0 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
  kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
  kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
  kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
  slab_post_alloc_hook mm/slab.h:445 [inline]
  slab_alloc_node mm/slub.c:2737 [inline]
  __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
  __kmalloc_reserve net/core/skbuff.c:138 [inline]
  __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
  alloc_skb include/linux/skbuff.h:984 [inline]
  alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
  sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
  packet_alloc_skb net/packet/af_packet.c:2803 [inline]
  packet_snd net/packet/af_packet.c:2894 [inline]
  packet_sendmsg+0x6454/0x8a30 net/packet/af_packet.c:2969
  sock_sendmsg_nosec net/socket.c:630 [inline]
  sock_sendmsg net/socket.c:640 [inline]
  ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
  __sys_sendmmsg+0x42d/0x800 net/socket.c:2136
  SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167
  SyS_sendmmsg+0x63/0x90 net/socket.c:2162
  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

This change addresses the issue adding the needed check before
accessing the inner header.

The ipv4 side of the issue is apparently there since the ipv4 over ipv6
initial support, and the ipv6 side predates git history.

Fixes: c4d3efafcc93 ("[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel.")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: [email protected]
Tested-by: Alexander Potapenko <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoipv6: Allow the l3mdev to be a loopback
Robert Shearman [Wed, 19 Sep 2018 12:56:53 +0000 (13:56 +0100)]
ipv6: Allow the l3mdev to be a loopback

There is no way currently for an IPv6 client connect using a loopback
address in a VRF, whereas for IPv4 the loopback address can be added:

    $ sudo ip addr add dev vrfred 127.0.0.1/8
    $ sudo ip -6 addr add ::1/128 dev vrfred
    RTNETLINK answers: Cannot assign requested address

So allow ::1 to be configured on an L3 master device. In order for
this to be usable ip_route_output_flags needs to not consider ::1 to
be a link scope address (since oif == l3mdev and so it would be
dropped), and ipv6_rcv needs to consider the l3mdev to be a loopback
device so that it doesn't drop the packets.

Signed-off-by: Robert Shearman <[email protected]>
Signed-off-by: Mike Manning <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'hns3-fixes'
David S. Miller [Thu, 20 Sep 2018 04:20:23 +0000 (21:20 -0700)]
Merge branch 'hns3-fixes'

Salil Mehta says:

====================
Fixes, cleanups & minor additions to HNS3 driver

This patch-set present some fixes, cleanups to the HNS3 PF and VF driver.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Fix parameter type for q_id in hclge_tm_q_to_qs_map_cfg()
Jian Shen [Wed, 19 Sep 2018 17:29:58 +0000 (18:29 +0100)]
net: hns3: Fix parameter type for q_id in hclge_tm_q_to_qs_map_cfg()

So far all the places calling hclge_tm_q_to_qs_map_cfg() are assigning
an u16 type value to "q_id", and in the processing of
hclge_tm_q_to_qs_map_cfg(), it also converts the "q_id" to le16.

The max tqp number for pf can be more than 256, we should use "u16" to
store the queue id, instead of "u8", which may cause data lost.

Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Fix client initialize state issue when roce client initialize failed
Jian Shen [Wed, 19 Sep 2018 17:29:57 +0000 (18:29 +0100)]
net: hns3: Fix client initialize state issue when roce client initialize failed

When roce is loaded before nic, the roce client will not be initialized
until nic client is initialized, but roce init flag is set before it.
Furthermore, in this case of nic initialized success and roce failed,
the nic init flag is not set, and roce init flag is not cleared.

This patch fixes it by set init flag only after the client is initialized
successfully.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Clear client pointer when initialize client failed or unintialize finished
Jian Shen [Wed, 19 Sep 2018 17:29:56 +0000 (18:29 +0100)]
net: hns3: Clear client pointer when initialize client failed or unintialize finished

If initialize client failed or finish uninitializing client, we should
clear the client pointer. It may cause unexpected result when use
uninitialized client. Meanwhile, we also should check whether client
exist when uninitialize it.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Fix cmdq registers initialization issue for vf
Jian Shen [Wed, 19 Sep 2018 17:29:55 +0000 (18:29 +0100)]
net: hns3: Fix cmdq registers initialization issue for vf

According to hardware's description, the head pointer register should
be written before the tail pointer register while initializing the vf
command queue. Otherwise, it may trigger an interrupt even though there
is no command received.

Fixes: fedd0c15d288 ("net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Fix for setting speed for phy failed problem
Fuyun Liang [Wed, 19 Sep 2018 17:29:54 +0000 (18:29 +0100)]
net: hns3: Fix for setting speed for phy failed problem

The function of genphy_read_status is that reading phy information
from HW and using these information to update SW variable. If user
is using ethtool to setting the speed of phy and service task is calling
by hclge_get_mac_phy_link, the result of speed setting is uncertain.
Because ethtool cmd will modified phydev and hclge_get_mac_phy_link also
will modified phydev.

Because phy state machine will update phy link periodically, we can
just use phydev->link to check the link status. This patch removes
function call of genphy_read_status. To ensure accuracy, this patch
adds a phy state check. If phy state is not PHY_RUNNING, we consider
link is down. Because in some scenarios, phydev->link may be link up,
but phy state is not PHY_RUNNING. This is just an intermediate state.
In fact, the link is not ready yet.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Check hdev state when getting link status
Peng Li [Wed, 19 Sep 2018 17:29:53 +0000 (18:29 +0100)]
net: hns3: Check hdev state when getting link status

By default, HW link status is up. If hclge_update_link_status is called
before net up, driver will print "link up". It is not suitable. hdev
state check is needed when getting link status.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Set STATE_DOWN bit of hdev state when stopping net
Fuyun Liang [Wed, 19 Sep 2018 17:29:52 +0000 (18:29 +0100)]
net: hns3: Set STATE_DOWN bit of hdev state when stopping net

We clear STATE_DOWN bit of hdev state when starting net, but do not set
it again when stopping net. It causes that the net is down, but hdev state
is still up. STATE_DOWN bit of hdev state should be set when stopping net.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Add support for hns3_nic_netdev_ops.ndo_do_ioctl
Xi Wang [Wed, 19 Sep 2018 17:29:51 +0000 (18:29 +0100)]
net: hns3: Add support for hns3_nic_netdev_ops.ndo_do_ioctl

This patch adds the .ndo_do_ioctl net_device_ops operation to support
the PHY MII ioctl for PF driver.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Xi Wang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Remove packet statistics of public
Peng Li [Wed, 19 Sep 2018 17:29:50 +0000 (18:29 +0100)]
net: hns3: Remove packet statistics of public

All pf have permission to read packet statistics of public in hardware,
but the read operation will clear registers which cause statistical
inaccuracy.

This patch removes all packet statistics of public.

Signed-off-by: Junxin Chen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Remove tx budget to clean more TX descriptors in a napi
Peng Li [Wed, 19 Sep 2018 17:29:49 +0000 (18:29 +0100)]
net: hns3: Remove tx budget to clean more TX descriptors in a napi

The the actual Tx work is minimal, driver can clean up as more
Tx descriptors as possible in a irq.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Add unlikely for buf_num check
Peng Li [Wed, 19 Sep 2018 17:29:48 +0000 (18:29 +0100)]
net: hns3: Add unlikely for buf_num check

This patch adds unlikely for buf_num check.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns3: Add default irq affinity
Peng Li [Wed, 19 Sep 2018 17:29:47 +0000 (18:29 +0100)]
net: hns3: Add default irq affinity

All irq will float to cpu0 if do not set irq affinity.
This patch adds default irq affinity in hns3 driver, users can
also change the irq affinity in OS.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: sun: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 11:21:32 +0000 (19:21 +0800)]
net: sun: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Acked-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: amd: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 10:50:17 +0000 (18:50 +0800)]
net: amd: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: broadcom: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 10:45:12 +0000 (18:45 +0800)]
net: broadcom: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: xilinx: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 10:32:40 +0000 (18:32 +0800)]
net: xilinx: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: toshiba: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 10:23:39 +0000 (18:23 +0800)]
net: toshiba: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: marvell: fix return type of ndo_start_xmit function
YueHaibing [Wed, 19 Sep 2018 10:19:26 +0000 (18:19 +0800)]
net: marvell: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'phylink-ensure-the-carrier-is-off-when-starting-phylink'
David S. Miller [Thu, 20 Sep 2018 04:15:02 +0000 (21:15 -0700)]
Merge branch 'phylink-ensure-the-carrier-is-off-when-starting-phylink'

Antoine Tenart says:

====================
net: phy: phylink: ensure the carrier is off when starting phylink

Following the discussion we had regarding the phylink issue related to
the carrier link state not being off when starting phylink, I sent a fix
patch a few days ago for the PPv2 driver:
https://lkml.org/lkml/2018/9/14/633

The idea was to send a patch which could go to the stable branches, but
a better solution would be to directly call netif_carrier_off() from
within phylink_start(). This is the aim of this series.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvneta: do not explicitly set the carrier state in open
Antoine Tenart [Wed, 19 Sep 2018 09:39:33 +0000 (11:39 +0200)]
net: mvneta: do not explicitly set the carrier state in open

This patch removes the explicit call to netif_carrier_off() in
mvneta_open() as this is already handled in phylink_start().

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: do not explicitly set the carrier state in open
Antoine Tenart [Wed, 19 Sep 2018 09:39:32 +0000 (11:39 +0200)]
net: mvpp2: do not explicitly set the carrier state in open

This patch removes the explicit call to netif_carrier_off() in PPv2's
open() path, as this is now handled in phylink_start().

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: phy: phylink: ensure the carrier is off when starting phylink
Antoine Tenart [Wed, 19 Sep 2018 09:39:31 +0000 (11:39 +0200)]
net: phy: phylink: ensure the carrier is off when starting phylink

Phylink made an assumption about the carrier state being down when
calling phylink_start(). If this assumption isn't satisfied, the
internal phylink state could misbehave and a net device could end up not
being functional.

This patch fixes this by explicitly calling netif_carrier_off() in
phylink_start().

Signed-off-by: Antoine Tenart <[email protected]>
Acked-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'net-mvpp2-improve-the-interrupt-usage'
David S. Miller [Thu, 20 Sep 2018 04:09:55 +0000 (21:09 -0700)]
Merge branch 'net-mvpp2-improve-the-interrupt-usage'

Antoine Tenart says:

====================
net: mvpp2: improve the interrupt usage

This series aims to improve the interrupts descriptions and usage in the
Marvell PPv2 driver.

- Before the series interrupts were named after their s/w usage,
  which in fact can be configured. The series rename all those
  interrupts and add a description of the ones left over.

- In PPv2 the interrupts are mapped to vectors. Those vectors were
  directly mapped to a given CPU, and per-cpu accesses were done. While
  this worked on our cases, the registers accesses mapped to the vectors
  are not actually linked to a given CPU. They instead are linked to
  what is called a "s/w thread". The series modify this so that the s/w
  threads are used instead of the CPU numbers, by adding an indirection.
  This means we now can have systems with more CPUs than s/w threads.

This is based on today's net-next, and was tested on various boards
using both versions of the PPv2 engine.

Two more patches will be coming, to update the device trees describing a
PPv2 engine. The patches are ready, but will go through a different
tree. I'll send them once this series will be accepted. This is not an
issue as the PPv2 driver keeps the dt bindings backward compatibility.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: rename mvpp2_percpu function to mvpp2_thread
Antoine Tenart [Wed, 19 Sep 2018 09:27:11 +0000 (11:27 +0200)]
net: mvpp2: rename mvpp2_percpu function to mvpp2_thread

As the mvpp2_percpu_read/write/... functions aren't really per-cpu but
per s/w thread, rename them to include 'thread' instead of 'percpu'.
This is a cosmetic patch.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: handle cases where more CPUs are available than s/w threads
Antoine Tenart [Wed, 19 Sep 2018 09:27:10 +0000 (11:27 +0200)]
net: mvpp2: handle cases where more CPUs are available than s/w threads

The Marvell PPv2 network controller has 9 internal threads. The driver
works fine when there are less CPUs available than threads. This isn't
true if more CPUs are available. As this is a valid use case, handle
this particular case.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: map the CPUs to threads
Antoine Tenart [Wed, 19 Sep 2018 09:27:09 +0000 (11:27 +0200)]
net: mvpp2: map the CPUs to threads

This patch maps all uses of the CPU to threads. All this_cpu calls are
replaced, and all smp_processor_id() calls are wrapped into the
indirection.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: do not use the CPU number to access the per-thread registers
Antoine Tenart [Wed, 19 Sep 2018 09:27:08 +0000 (11:27 +0200)]
net: mvpp2: do not use the CPU number to access the per-thread registers

This patch reworks the Marvell PPv2 driver to stop using directly the
CPU number to access per-thread registers.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: make mvpp2_read_relaxed static
Antoine Tenart [Wed, 19 Sep 2018 09:27:07 +0000 (11:27 +0200)]
net: mvpp2: make mvpp2_read_relaxed static

In the Marvell PPv2 driver the mvpp2_read_relaxed function is only used
in a single file. Make it static and remove its prototype from the
header.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: make the per-cpu helpers static
Antoine Tenart [Wed, 19 Sep 2018 09:27:06 +0000 (11:27 +0200)]
net: mvpp2: make the per-cpu helpers static

The Marvell PPv2 driver has per-cpu functions. As they only are used in
the main file, make them static and remove their prototype from the
header.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: cpu should always be unsigned
Antoine Tenart [Wed, 19 Sep 2018 09:27:05 +0000 (11:27 +0200)]
net: mvpp2: cpu should always be unsigned

Updates the PPv2 driver so that all CPU variables are unsigned, as it
makes no sense to have a negative CPU number. This patch is cosmetic.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: fix the number of queues per cpu for PPv2.2
Antoine Tenart [Wed, 19 Sep 2018 09:27:04 +0000 (11:27 +0200)]
net: mvpp2: fix the number of queues per cpu for PPv2.2

The Marvell PPv2.2 engine only has 8 Rx queues per CPU, while PPv2.1 has
16 of them. This patch updates the code so that the Rx queues mask width
is selected given the version of the network controller used.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: do not update the queue mode while probing
Antoine Tenart [Wed, 19 Sep 2018 09:27:03 +0000 (11:27 +0200)]
net: mvpp2: do not update the queue mode while probing

This patch updates the probing function so that the queue mode isn't
updated while probing, as the driver would silently end up using a
configuration not wanted by the user. The patch adds an extra check to
validate the chosen queue mode instead, and the driver will fail to
probe if the configuration is invalid.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoDocumentation/bindings: net: marvell-pp2: update the IRQs description
Antoine Tenart [Wed, 19 Sep 2018 09:27:02 +0000 (11:27 +0200)]
Documentation/bindings: net: marvell-pp2: update the IRQs description

This patch updates the interrupts part of the Marvell PPv2 driver
bindings documentation, to keep it in sync with the driver.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: rename the IRQs to match the hardware
Antoine Tenart [Wed, 19 Sep 2018 09:27:01 +0000 (11:27 +0200)]
net: mvpp2: rename the IRQs to match the hardware

This patch renames the IRQs in the Marvell PPv2 driver as their current
names match the way they are used in software. But this will change in
the future, and those IRQs have nothing to do with Rx/Tx interrupts
(this can be configured). The new binding also describe more interrupts
as some where left out.

The old binding support is kept for backward compatibility.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: increase the number of s/w threads to 9
Antoine Tenart [Wed, 19 Sep 2018 09:27:00 +0000 (11:27 +0200)]
net: mvpp2: increase the number of s/w threads to 9

This patch sets the number of s/w threads to 9, its maximum value,
instead of 8. This is not a fix as only 4 of the s/w threads were used
so far, but more could be used in the future.

Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoravb: remove tx buffer addr 4byte alilgnment restriction for R-Car Gen3
Kazuya Mizuguchi [Wed, 19 Sep 2018 08:06:21 +0000 (10:06 +0200)]
ravb: remove tx buffer addr 4byte alilgnment restriction for R-Car Gen3

This patch sets from two descriptor to one descriptor because R-Car Gen3
does not have the 4 bytes alignment restriction of the transmission buffer.

Signed-off-by: Kazuya Mizuguchi <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'phy_stop-synchronous'
David S. Miller [Thu, 20 Sep 2018 04:06:46 +0000 (21:06 -0700)]
Merge branch 'phy_stop-synchronous'

Heiner Kallweit says:

====================
net: phy: make phy_stop() synchronous

There have been few not that successful attempts in the past to make
phy_stop() a synchronous call instead of just changing the state.
Patch 1 of this series addresses an issue which prevented this change.
At least for me it works fine now. Would appreciate if Geert could
re-test as well that suspend doesn't throw an error.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: phy: call state machine synchronously in phy_stop
Heiner Kallweit [Tue, 18 Sep 2018 19:56:32 +0000 (21:56 +0200)]
net: phy: call state machine synchronously in phy_stop

phy_stop() may be called e.g. when suspending, therefore all needed
actions should be performed synchronously. Therefore add a synchronous
call to the state machine.

Signed-off-by: Heiner Kallweit <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: linkwatch: add check for netdevice being present to linkwatch_do_dev
Heiner Kallweit [Tue, 18 Sep 2018 19:55:36 +0000 (21:55 +0200)]
net: linkwatch: add check for netdevice being present to linkwatch_do_dev

When bringing down the netdevice (incl. detaching it) and calling
netif_carrier_off directly or indirectly the latter triggers an
asynchronous linkwatch event.
This linkwatch event eventually may fail to access chip registers in
the ndo_get_stats/ndo_get_stats64 callback because the device isn't
accessible any longer, see call trace in [0].

To prevent this scenario don't check for IFF_UP only, but also make
sure that the netdevice is present.

[0] https://lists.openwall.net/netdev/2018/03/15/62

Signed-off-by: Heiner Kallweit <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'net-Use-FIELD_SIZEOF-directly-instead-of-reimplementing-its-function'
David S. Miller [Thu, 20 Sep 2018 03:58:05 +0000 (20:58 -0700)]
Merge branch 'net-Use-FIELD_SIZEOF-directly-instead-of-reimplementing-its-function'

zhong jiang says:

====================
net: Use FIELD_SIZEOF directly instead of reimplementing its function

The issue is detected with the help of Coccinelle.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: ti: Use FIELD_SIZEOF directly instead of reimplementing its function
zhong jiang [Wed, 19 Sep 2018 11:32:14 +0000 (19:32 +0800)]
net: ti: Use FIELD_SIZEOF directly instead of reimplementing its function

FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: qede: Use FIELD_SIZEOF directly instead of reimplementing its function
zhong jiang [Wed, 19 Sep 2018 11:32:13 +0000 (19:32 +0800)]
net: qede: Use FIELD_SIZEOF directly instead of reimplementing its function

FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: core: Use FIELD_SIZEOF directly instead of reimplementing its function
zhong jiang [Wed, 19 Sep 2018 11:32:12 +0000 (19:32 +0800)]
net: core: Use FIELD_SIZEOF directly instead of reimplementing its function

FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: sched: Use FIELD_SIZEOF directly instead of reimplementing its function
zhong jiang [Wed, 19 Sep 2018 11:32:11 +0000 (19:32 +0800)]
net: sched: Use FIELD_SIZEOF directly instead of reimplementing its function

FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: iucv: Use FIELD_SIZEOF directly instead of reimplementing its function
zhong jiang [Wed, 19 Sep 2018 11:32:10 +0000 (19:32 +0800)]
net: iucv: Use FIELD_SIZEOF directly instead of reimplementing its function

FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge tag 'batadv-next-for-davem-20180919' of git://git.open-mesh.org/linux-merge
David S. Miller [Thu, 20 Sep 2018 03:35:37 +0000 (20:35 -0700)]
Merge tag 'batadv-next-for-davem-20180919' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Inform users about debugfs interface deprecation, by Sven Eckelmann

 - Implement tracing, planned to replace debugfs log messages,
   by Sven Eckelmann

 - Move OGM rebroadcasts to per interface struct, by Sven Eckelmann

 - Enable LockLess TX to increase throughput, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agodrm/amdgpu: add new polaris pci id
Alex Deucher [Tue, 18 Sep 2018 20:28:24 +0000 (15:28 -0500)]
drm/amdgpu: add new polaris pci id

Add new pci id.

Reviewed-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
6 years agoMerge tag 'batadv-net-for-davem-20180919' of git://git.open-mesh.org/linux-merge
David S. Miller [Thu, 20 Sep 2018 03:32:46 +0000 (20:32 -0700)]
Merge tag 'batadv-net-for-davem-20180919' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
pull request for net: batman-adv 2018-09-19

here are some bugfixes which we would like to see integrated into net.

We forgot to bump the version number in the last round for net-next, so
the belated patch to do that is included - we hope you can adopt it.
This will most likely create a merge conflict later when merging into
net-next with this rounds net-next patchset, but net-next should keep
the 2018.4 version[1].

[1] resolution:

--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -25,11 +25,7 @@
 #define BATADV_DRIVER_DEVICE "batman-adv"

 #ifndef BATADV_SOURCE_VERSION
-<<<<<<<
-#define BATADV_SOURCE_VERSION "2018.3"
-=======
 #define BATADV_SOURCE_VERSION "2018.4"
->>>>>>>
 #endif

 /* B.A.T.M.A.N. parameters */

Please pull or let me know of any problem!

Here are some batman-adv bugfixes:

 - Avoid ELP information leak, by Sven Eckelmann

 - Fix sysfs segfault issues, by Sven Eckelmann (2 patches)

 - Fix locking when adding entries in various lists,
   by Sven Eckelmann (5 patches)

 - Fix refcount if queue_work() fails, by Marek Lindner (2 patches)

 - Fixup forgotten version bump, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge tag 'drm-intel-fixes-2018-09-19' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 20 Sep 2018 00:01:46 +0000 (10:01 +1000)]
Merge tag 'drm-intel-fixes-2018-09-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Only fixes coming from gvt containing "Two more BXT fixes from Colin,
one srcu locking fix and one fix for GGTT clear when destroy vGPU."

Signed-off-by: Dave Airlie <[email protected]>
From: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
6 years agoMerge tag 'drm-misc-fixes-2018-09-19' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Thu, 20 Sep 2018 00:00:31 +0000 (10:00 +1000)]
Merge tag 'drm-misc-fixes-2018-09-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v4.19-rc5:
- Fix crash in vgem in drm_drv_uses_atomic_modeset.
- Allow atomic drivers that don't set DRIVER_ATOMIC to create debugfs entries.
- Fix compiler warning for unused connector_funcs.
- Fix null pointer deref on UDL unplug.
- Disable DRM support for sun4i's R40 for now.
  (Not all patches went in for v4.19, so it has to wait a cycle.)
- NULL-terminate the of_device_id table in pl111.
- Make sure vc4 NV12 planar format works when displaying an unscaled fb.

Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
6 years agokvm: selftests: Add platform_info_test
Drew Schmitt [Mon, 20 Aug 2018 17:32:16 +0000 (10:32 -0700)]
kvm: selftests: Add platform_info_test

Test guest access to MSR_PLATFORM_INFO when the capability is enabled
or disabled.

Signed-off-by: Drew Schmitt <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: x86: Control guest reads of MSR_PLATFORM_INFO
Drew Schmitt [Mon, 20 Aug 2018 17:32:15 +0000 (10:32 -0700)]
KVM: x86: Control guest reads of MSR_PLATFORM_INFO

Add KVM_CAP_MSR_PLATFORM_INFO so that userspace can disable guest access
to reads of MSR_PLATFORM_INFO.

Disabling access to reads of this MSR gives userspace the control to "expose"
this platform-dependent information to guests in a clear way. As it exists
today, guests that read this MSR would get unpopulated information if userspace
hadn't already set it (and prior to this patch series, only the CPUID faulting
information could have been populated). This existing interface could be
confusing if guests don't handle the potential for incorrect/incomplete
information gracefully (e.g. zero reported for base frequency).

Signed-off-by: Drew Schmitt <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: x86: Turbo bits in MSR_PLATFORM_INFO
Drew Schmitt [Mon, 20 Aug 2018 17:32:14 +0000 (10:32 -0700)]
KVM: x86: Turbo bits in MSR_PLATFORM_INFO

Allow userspace to set turbo bits in MSR_PLATFORM_INFO. Previously, only
the CPUID faulting bit was settable. But now any bit in
MSR_PLATFORM_INFO would be settable. This can be used, for example, to
convey frequency information about the platform on which the guest is
running.

Signed-off-by: Drew Schmitt <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agonVMX x86: Check VPID value on vmentry of L2 guests
Krish Sadhukhan [Tue, 4 Sep 2018 18:42:58 +0000 (14:42 -0400)]
nVMX x86: Check VPID value on vmentry of L2 guests

According to section "Checks on VMX Controls" in Intel SDM vol 3C, the
following check needs to be enforced on vmentry of L2 guests:

    If the 'enable VPID' VM-execution control is 1, the value of the
    of the VPID VM-execution control field must not be 0000H.

Signed-off-by: Krish Sadhukhan <[email protected]>
Reviewed-by: Mark Kanda <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agonVMX x86: check posted-interrupt descriptor addresss on vmentry of L2
Krish Sadhukhan [Fri, 24 Aug 2018 00:03:03 +0000 (20:03 -0400)]
nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2

According to section "Checks on VMX Controls" in Intel SDM vol 3C,
the following check needs to be enforced on vmentry of L2 guests:

   - Bits 5:0 of the posted-interrupt descriptor address are all 0.
   - The posted-interrupt descriptor address does not set any bits
     beyond the processor's physical-address width.

Signed-off-by: Krish Sadhukhan <[email protected]>
Reviewed-by: Mark Kanda <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Reviewed-by: Karl Heubaum <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv
Liran Alon [Tue, 4 Sep 2018 07:56:52 +0000 (10:56 +0300)]
KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv

In case L1 do not intercept L2 HLT or enter L2 in HLT activity-state,
it is possible for a vCPU to be blocked while it is in guest-mode.

According to Intel SDM 26.6.5 Interrupt-Window Exiting and
Virtual-Interrupt Delivery: "These events wake the logical processor
if it just entered the HLT state because of a VM entry".
Therefore, if L1 enters L2 in HLT activity-state and L2 has a pending
deliverable interrupt in vmcs12->guest_intr_status.RVI, then the vCPU
should be waken from the HLT state and injected with the interrupt.

In addition, if while the vCPU is blocked (while it is in guest-mode),
it receives a nested posted-interrupt, then the vCPU should also be
waken and injected with the posted interrupt.

To handle these cases, this patch enhances kvm_vcpu_has_events() to also
check if there is a pending interrupt in L2 virtual APICv provided by
L1. That is, it evaluates if there is a pending virtual interrupt for L2
by checking RVI[7:4] > VPPR[7:4] as specified in Intel SDM 29.2.1
Evaluation of Pending Interrupts.

Note that this also handles the case of nested posted-interrupt by the
fact RVI is updated in vmx_complete_nested_posted_interrupt() which is
called from kvm_vcpu_check_block() -> kvm_arch_vcpu_runnable() ->
kvm_vcpu_running() -> vmx_check_nested_events() ->
vmx_complete_nested_posted_interrupt().

Reviewed-by: Nikita Leshenko <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: VMX: check nested state and CR4.VMXE against SMM
Paolo Bonzini [Tue, 18 Sep 2018 13:19:17 +0000 (15:19 +0200)]
KVM: VMX: check nested state and CR4.VMXE against SMM

VMX cannot be enabled under SMM, check it when CR4 is set and when nested
virtualization state is restored.

This should fix some WARNs reported by syzkaller, mostly around
alloc_shadow_vmcs.

Signed-off-by: Paolo Bonzini <[email protected]>
6 years agokvm: x86: make kvm_{load|put}_guest_fpu() static
Sebastian Andrzej Siewior [Wed, 12 Sep 2018 13:33:45 +0000 (15:33 +0200)]
kvm: x86: make kvm_{load|put}_guest_fpu() static

The functions
kvm_load_guest_fpu()
kvm_put_guest_fpu()

are only used locally, make them static. This requires also that both
functions are moved because they are used before their implementation.
Those functions were exported (via EXPORT_SYMBOL) before commit
e5bb40251a920 ("KVM: Drop kvm_{load,put}_guest_fpu() exports").

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agox86/hyper-v: rename ipi_arg_{ex,non_ex} structures
Vitaly Kuznetsov [Mon, 27 Aug 2018 16:48:57 +0000 (18:48 +0200)]
x86/hyper-v: rename ipi_arg_{ex,non_ex} structures

These structures are going to be used from KVM code so let's make
their names reflect their Hyper-V origin.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Roman Kagan <[email protected]>
Acked-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: VMX: use preemption timer to force immediate VMExit
Sean Christopherson [Mon, 27 Aug 2018 22:21:12 +0000 (15:21 -0700)]
KVM: VMX: use preemption timer to force immediate VMExit

A VMX preemption timer value of '0' is guaranteed to cause a VMExit
prior to the CPU executing any instructions in the guest.  Use the
preemption timer (if it's supported) to trigger immediate VMExit
in place of the current method of sending a self-IPI.  This ensures
that pending VMExit injection to L1 occurs prior to executing any
instructions in the guest (regardless of nesting level).

When deferring VMExit injection, KVM generates an immediate VMExit
from the (possibly nested) guest by sending itself an IPI.  Because
hardware interrupts are blocked prior to VMEnter and are unblocked
(in hardware) after VMEnter, this results in taking a VMExit(INTR)
before any guest instruction is executed.  But, as this approach
relies on the IPI being received before VMEnter executes, it only
works as intended when KVM is running as L0.  Because there are no
architectural guarantees regarding when IPIs are delivered, when
running nested the INTR may "arrive" long after L2 is running e.g.
L0 KVM doesn't force an immediate switch to L1 to deliver an INTR.

For the most part, this unintended delay is not an issue since the
events being injected to L1 also do not have architectural guarantees
regarding their timing.  The notable exception is the VMX preemption
timer[1], which is architecturally guaranteed to cause a VMExit prior
to executing any instructions in the guest if the timer value is '0'
at VMEnter.  Specifically, the delay in injecting the VMExit causes
the preemption timer KVM unit test to fail when run in a nested guest.

Note: this approach is viable even on CPUs with a broken preemption
timer, as broken in this context only means the timer counts at the
wrong rate.  There are no known errata affecting timer value of '0'.

[1] I/O SMIs also have guarantees on when they arrive, but I have
    no idea if/how those are emulated in KVM.

Signed-off-by: Sean Christopherson <[email protected]>
[Use a hook for SVM instead of leaving the default in x86.c - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: VMX: modify preemption timer bit only when arming timer
Sean Christopherson [Mon, 27 Aug 2018 22:21:11 +0000 (15:21 -0700)]
KVM: VMX: modify preemption timer bit only when arming timer

Provide a singular location where the VMX preemption timer bit is
set/cleared so that future usages of the preemption timer can ensure
the VMCS bit is up-to-date without having to modify unrelated code
paths.  For example, the preemption timer can be used to force an
immediate VMExit.  Cache the status of the timer to avoid redundant
VMREAD and VMWRITE, e.g. if the timer stays armed across multiple
VMEnters/VMExits.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: VMX: immediately mark preemption timer expired only for zero value
Sean Christopherson [Mon, 27 Aug 2018 22:21:10 +0000 (15:21 -0700)]
KVM: VMX: immediately mark preemption timer expired only for zero value

A VMX preemption timer value of '0' at the time of VMEnter is
architecturally guaranteed to cause a VMExit prior to the CPU
executing any instructions in the guest.  This architectural
definition is in place to ensure that a previously expired timer
is correctly recognized by the CPU as it is possible for the timer
to reach zero and not trigger a VMexit due to a higher priority
VMExit being signalled instead, e.g. a pending #DB that morphs into
a VMExit.

Whether by design or coincidence, commit f4124500c2c1 ("KVM: nVMX:
Fully emulate preemption timer") special cased timer values of '0'
and '1' to ensure prompt delivery of the VMExit.  Unlike '0', a
timer value of '1' has no has no architectural guarantees regarding
when it is delivered.

Modify the timer emulation to trigger immediate VMExit if and only
if the timer value is '0', and document precisely why '0' is special.
Do this even if calibration of the virtual TSC failed, i.e. VMExit
will occur immediately regardless of the frequency of the timer.
Making only '0' a special case gives KVM leeway to be more aggressive
in ensuring the VMExit is injected prior to executing instructions in
the nested guest, and also eliminates any ambiguity as to why '1' is
a special case, e.g. why wasn't the threshold for a "short timeout"
set to 10, 100, 1000, etc...

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: SVM: Switch to bitmap_zalloc()
Andy Shevchenko [Thu, 30 Aug 2018 11:49:59 +0000 (14:49 +0300)]
KVM: SVM: Switch to bitmap_zalloc()

Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM/MMU: Fix comment in walk_shadow_page_lockless_end()
Tianyu Lan [Fri, 7 Sep 2018 05:45:02 +0000 (05:45 +0000)]
KVM/MMU: Fix comment in walk_shadow_page_lockless_end()

kvm_commit_zap_page() has been renamed to kvm_mmu_commit_zap_page()
This patch is to fix the commit.

Signed-off-by: Lan Tianyu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agokvm: selftests: use -pthread instead of -lpthread
Lei Yang [Wed, 29 Aug 2018 07:04:08 +0000 (15:04 +0800)]
kvm: selftests: use -pthread instead of -lpthread

I run into the following error

testing/selftests/kvm/dirty_log_test.c:285: undefined reference to `pthread_create'
testing/selftests/kvm/dirty_log_test.c:297: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

my gcc version is gcc version 4.8.4
"-pthread" would work everywhere

Signed-off-by: Lei Yang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agoKVM: x86: don't reset root in kvm_mmu_setup()
Wei Yang [Fri, 7 Sep 2018 11:59:47 +0000 (19:59 +0800)]
KVM: x86: don't reset root in kvm_mmu_setup()

Here is the code path which shows kvm_mmu_setup() is invoked after
kvm_mmu_create(). Since kvm_mmu_setup() is only invoked in this code path,
this means the root_hpa and prev_roots are guaranteed to be invalid. And
it is not necessary to reset it again.

    kvm_vm_ioctl_create_vcpu()
        kvm_arch_vcpu_create()
            vmx_create_vcpu()
                kvm_vcpu_init()
                    kvm_arch_vcpu_init()
                        kvm_mmu_create()
        kvm_arch_vcpu_setup()
            kvm_mmu_setup()
                kvm_init_mmu()

This patch set reset_roots to false in kmv_mmu_setup().

Fixes: 50c28f21d045dde8c52548f8482d456b3f0956f5
Signed-off-by: Wei Yang <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agokvm: mmu: Don't read PDPTEs when paging is not enabled
Junaid Shahid [Thu, 9 Aug 2018 00:45:24 +0000 (17:45 -0700)]
kvm: mmu: Don't read PDPTEs when paging is not enabled

kvm should not attempt to read guest PDPTEs when CR0.PG = 0 and
CR4.PAE = 1.

Signed-off-by: Junaid Shahid <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agox86/kvm/lapic: always disable MMIO interface in x2APIC mode
Vitaly Kuznetsov [Thu, 2 Aug 2018 15:08:16 +0000 (17:08 +0200)]
x86/kvm/lapic: always disable MMIO interface in x2APIC mode

When VMX is used with flexpriority disabled (because of no support or
if disabled with module parameter) MMIO interface to lAPIC is still
available in x2APIC mode while it shouldn't be (kvm-unit-tests):

PASS: apic_disable: Local apic enabled in x2APIC mode
PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set
FAIL: apic_disable: *0xfee00030: 50014

The issue appears because we basically do nothing while switching to
x2APIC mode when APIC access page is not used. apic_mmio_{read,write}
only check if lAPIC is disabled before proceeding to actual write.

When APIC access is virtualized we correctly manipulate with VMX controls
in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes
in x2APIC mode so there's no issue.

Disabling MMIO interface seems to be easy. The question is: what do we
do with these reads and writes? If we add apic_x2apic_mode() check to
apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will
go to userspace. When lAPIC is in kernel, Qemu uses this interface to
inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This
somehow works with disabled lAPIC but when we're in xAPIC mode we will
get a real injected MSI from every write to lAPIC. Not good.

The simplest solution seems to be to just ignore writes to the region
and return ~0 for all reads when we're in x2APIC mode. This is what this
patch does. However, this approach is inconsistent with what currently
happens when flexpriority is enabled: we allocate APIC access page and
create KVM memory region so in x2APIC modes all reads and writes go to
this pre-allocated page which is, btw, the same for all vCPUs.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
6 years agotools: bpf: fix license for a compat header file
Jakub Kicinski [Tue, 18 Sep 2018 17:13:59 +0000 (10:13 -0700)]
tools: bpf: fix license for a compat header file

libc_compat.h is used by libbpf so make sure it's licensed under
LGPL or BSD license.  The license change should be OK, I'm the only
author of the file.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Quentin Monnet <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
6 years agoflow_dissector: fix build failure without CONFIG_NET
Willem de Bruijn [Tue, 18 Sep 2018 20:20:18 +0000 (16:20 -0400)]
flow_dissector: fix build failure without CONFIG_NET

If boolean CONFIG_BPF_SYSCALL is enabled, kernel/bpf/syscall.c will
call flow_dissector functions from net/core/flow_dissector.c.

This causes this build failure if CONFIG_NET is disabled:

    kernel/bpf/syscall.o: In function `__x64_sys_bpf':
    syscall.c:(.text+0x3278): undefined reference to
    `skb_flow_dissector_bpf_prog_attach'
    syscall.c:(.text+0x3310): undefined reference to
    `skb_flow_dissector_bpf_prog_detach'
    kernel/bpf/syscall.o:(.rodata+0x3f0): undefined reference to
    `flow_dissector_prog_ops'
    kernel/bpf/verifier.o:(.rodata+0x250): undefined reference to
    `flow_dissector_verifier_ops'

Analogous to other optional BPF program types in syscall.c, add stubs
if the relevant functions are not compiled and move the BPF_PROG_TYPE
definition in the #ifdef CONFIG_NET block.

Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Acked-by: Randy Dunlap <[email protected]> # build-tested
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
6 years agoMerge tag 'hwmon-for-linus-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Wed, 19 Sep 2018 20:59:30 +0000 (22:59 +0200)]
Merge tag 'hwmon-for-linus-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Guenter writes:
   "Various bug fixes for nct6775 driver"

6 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Greg Kroah-Hartman [Wed, 19 Sep 2018 20:34:22 +0000 (22:34 +0200)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

James writes:
  "SCSI fixes on 20180919

   A couple of small but important fixes, one affecting big endian and
   the other fixing a BUG_ON in scatterlist processing.

Signed-off-by: James E.J. Bottomley <[email protected]>"
6 years agoxen: issue warning message when out of grant maptrack entries
Juergen Gross [Wed, 19 Sep 2018 13:42:33 +0000 (15:42 +0200)]
xen: issue warning message when out of grant maptrack entries

When a driver domain (e.g. dom0) is running out of maptrack entries it
can't map any more foreign domain pages. Instead of silently stalling
the affected domUs issue a rate limited warning in this case in order
to make it easier to detect that situation.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
6 years agoxen/x86/vpmu: Zero struct pt_regs before calling into sample handling code
Boris Ostrovsky [Thu, 12 Jul 2018 17:27:00 +0000 (13:27 -0400)]
xen/x86/vpmu: Zero struct pt_regs before calling into sample handling code

Otherwise we may leak kernel stack for events that sample user
registers.

Reported-by: Mark Rutland <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
Cc: [email protected]
6 years agoMAINTAINERS: Add Borislav to the x86 maintainers
Thomas Gleixner [Wed, 19 Sep 2018 11:48:41 +0000 (13:48 +0200)]
MAINTAINERS: Add Borislav to the x86 maintainers

Borislav is effectivly maintaining parts of X86 already, make it official.

Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
6 years agoMerge tag 'perf-urgent-for-mingo-4.19-20180918' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Wed, 19 Sep 2018 11:25:35 +0000 (13:25 +0200)]
Merge tag 'perf-urgent-for-mingo-4.19-20180918' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - Fix the build on !_GNU_SOURCE libc systems such as Alpine Linux/musl
   libc due to usage of strerror_r glibc variant on libbpf (Arnaldo Carvalho de Melo)

 - Fix out-of-tree asciidoctor man page generation (Ben Hutchings)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
6 years agox86/paravirt: Fix some warning messages
Dan Carpenter [Wed, 19 Sep 2018 10:35:53 +0000 (13:35 +0300)]
x86/paravirt: Fix some warning messages

The first argument to WARN_ONCE() is a condition.

Fixes: 5800dc5c19f3 ("x86/paravirt: Fix spectre-v2 mitigations for paravirt guests")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Alok Kataria <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/20180919103553.GD9238@mwanda
6 years agodrm: sun4i: drop second PLL from A64 HDMI PHY
Icenowy Zheng [Sun, 16 Sep 2018 04:34:06 +0000 (12:34 +0800)]
drm: sun4i: drop second PLL from A64 HDMI PHY

The A64 HDMI PHY seems to be not able to use the second video PLL as
clock parent in experiments.

Drop the support for the second PLL from A64 HDMI PHY driver.

Fixes: b46e2c9f5f64 ("drm/sun4i: Add support for A64 HDMI PHY")
Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
6 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Greg Kroah-Hartman [Wed, 19 Sep 2018 06:29:42 +0000 (08:29 +0200)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Crypto stuff from Herbert:
  "This push fixes a potential boot hang in ccp and an incorrect
   CPU capability check in aegis/morus on x86."

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2
  crypto: ccp - add timeout support in the SEV command

6 years agoMerge tag 'trace-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Greg Kroah-Hartman [Wed, 19 Sep 2018 05:41:46 +0000 (07:41 +0200)]
Merge tag 'trace-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Steven writes:
  "Vaibhav Nagarnaik found that modifying the ring buffer size could cause
   a huge latency in the system because it does a while loop to free pages
   without releasing the CPU (on non preempt kernels). In a case where there
   are hundreds of thousands of pages to free it could actually cause a system
   stall. A properly place cond_resched() solves this issue."

6 years agoMerge tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform...
Greg Kroah-Hartman [Wed, 19 Sep 2018 05:21:21 +0000 (07:21 +0200)]
Merge tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86

Darren writes:
  "platform-drivers-x86 for v4.19-2

   Free allocated ACPI buffers in two drivers.

   The following is an automated git shortlog grouped by driver:

   alienware-wmi:
    -  Correct a memory leak

   dell-smbios-wmi:
    -  Correct a memory leak"

* tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: alienware-wmi: Correct a memory leak
  platform/x86: dell-smbios-wmi: Correct a memory leak

6 years agoMerge branch 'ipv6-fix-issues-on-accessing-fib6_metrics'
David S. Miller [Wed, 19 Sep 2018 03:17:01 +0000 (20:17 -0700)]
Merge branch 'ipv6-fix-issues-on-accessing-fib6_metrics'

Wei Wang says:

====================
ipv6: fix issues on accessing fib6_metrics

The latest fix on the memory leak of fib6_metrics still causes
use-after-free.
This patch series first revert the previous fix and propose a new fix
that is more inline with ipv4 logic and is tested to fix the
use-after-free issue reported.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agoipv6: fix memory leak on dst->_metrics
Wei Wang [Tue, 18 Sep 2018 20:45:00 +0000 (13:45 -0700)]
ipv6: fix memory leak on dst->_metrics

When dst->_metrics and f6i->fib6_metrics share the same memory, both
take reference count on the dst_metrics structure. However, when dst is
destroyed, ip6_dst_destroy() only invokes dst_destroy_metrics_generic()
which does not take care of READONLY metrics and does not release refcnt.
This causes memory leak.
Similar to ipv4 logic, the fix is to properly release refcnt and free
the memory space pointed by dst->_metrics if refcnt becomes 0.

Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
Reported-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoRevert "ipv6: fix double refcount of fib6_metrics"
Wei Wang [Tue, 18 Sep 2018 20:44:59 +0000 (13:44 -0700)]
Revert "ipv6: fix double refcount of fib6_metrics"

This reverts commit e70a3aad44cc8b24986687ffc98c4a4f6ecf25ea.

This change causes use-after-free on dst->_metrics.
The crash trace looks like this:
[   97.763269] BUG: KASAN: use-after-free in ip6_mtu+0x116/0x140
[   97.769038] Read of size 4 at addr ffff881781d2cf84 by task svw_NetThreadEv/8801

[   97.777954] CPU: 76 PID: 8801 Comm: svw_NetThreadEv Not tainted 4.15.0-smp-DEV #11
[   97.777956] Hardware name: Default string Default string/Indus_QC_02, BIOS 5.46.4 03/29/2018
[   97.777957] Call Trace:
[   97.777971]  [<ffffffff895709db>] dump_stack+0x4d/0x72
[   97.777985]  [<ffffffff881651df>] print_address_description+0x6f/0x260
[   97.777997]  [<ffffffff88165747>] kasan_report+0x257/0x370
[   97.778001]  [<ffffffff894488e6>] ? ip6_mtu+0x116/0x140
[   97.778004]  [<ffffffff881658b9>] __asan_report_load4_noabort+0x19/0x20
[   97.778008]  [<ffffffff894488e6>] ip6_mtu+0x116/0x140
[   97.778013]  [<ffffffff892bb91e>] tcp_current_mss+0x12e/0x280
[   97.778016]  [<ffffffff892bb7f0>] ? tcp_mtu_to_mss+0x2d0/0x2d0
[   97.778022]  [<ffffffff887b45b8>] ? depot_save_stack+0x138/0x4a0
[   97.778037]  [<ffffffff87c38985>] ? __mmdrop+0x145/0x1f0
[   97.778040]  [<ffffffff881643b1>] ? save_stack+0xb1/0xd0
[   97.778046]  [<ffffffff89264c82>] tcp_send_mss+0x22/0x220
[   97.778059]  [<ffffffff89273a49>] tcp_sendmsg_locked+0x4f9/0x39f0
[   97.778062]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778066]  [<ffffffff89273550>] ? tcp_sendpage+0x60/0x60
[   97.778070]  [<ffffffff881cb359>] ? rw_copy_check_uvector+0x69/0x280
[   97.778075]  [<ffffffff8873c65f>] ? import_iovec+0x9f/0x430
[   97.778078]  [<ffffffff88164be7>] ? kasan_slab_free+0x87/0xc0
[   97.778082]  [<ffffffff8873c5c0>] ? memzero_page+0x140/0x140
[   97.778085]  [<ffffffff881642b4>] ? kasan_check_write+0x14/0x20
[   97.778088]  [<ffffffff89276f6c>] tcp_sendmsg+0x2c/0x50
[   97.778092]  [<ffffffff89276f6c>] ? tcp_sendmsg+0x2c/0x50
[   97.778098]  [<ffffffff89352d43>] inet_sendmsg+0x103/0x480
[   97.778102]  [<ffffffff89352c40>] ? inet_gso_segment+0x15b0/0x15b0
[   97.778105]  [<ffffffff890294da>] sock_sendmsg+0xba/0xf0
[   97.778108]  [<ffffffff8902ab6a>] ___sys_sendmsg+0x6ca/0x8e0
[   97.778113]  [<ffffffff87dccac1>] ? hrtimer_try_to_cancel+0x71/0x3b0
[   97.778116]  [<ffffffff8902a4a0>] ? copy_msghdr_from_user+0x3d0/0x3d0
[   97.778119]  [<ffffffff881646d1>] ? memset+0x31/0x40
[   97.778123]  [<ffffffff87a0cff5>] ? schedule_hrtimeout_range_clock+0x165/0x380
[   97.778127]  [<ffffffff87a0ce90>] ? hrtimer_nanosleep_restart+0x250/0x250
[   97.778130]  [<ffffffff87dcc700>] ? __hrtimer_init+0x180/0x180
[   97.778133]  [<ffffffff87dd1f82>] ? ktime_get_ts64+0x172/0x200
[   97.778137]  [<ffffffff8822b8ec>] ? __fget_light+0x8c/0x2f0
[   97.778141]  [<ffffffff8902d5c6>] __sys_sendmsg+0xe6/0x190
[   97.778144]  [<ffffffff8902d5c6>] ? __sys_sendmsg+0xe6/0x190
[   97.778147]  [<ffffffff8902d4e0>] ? SyS_shutdown+0x20/0x20
[   97.778152]  [<ffffffff87cd4370>] ? wake_up_q+0xe0/0xe0
[   97.778155]  [<ffffffff8902d670>] ? __sys_sendmsg+0x190/0x190
[   97.778158]  [<ffffffff8902d683>] SyS_sendmsg+0x13/0x20
[   97.778162]  [<ffffffff87a1600c>] do_syscall_64+0x2ac/0x430
[   97.778166]  [<ffffffff87c17515>] ? do_page_fault+0x35/0x3d0
[   97.778171]  [<ffffffff8960131f>] ? page_fault+0x2f/0x50
[   97.778174]  [<ffffffff89600071>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   97.778177] RIP: 0033:0x7f83fa36000d
[   97.778178] RSP: 002b:00007f83ef9229e0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   97.778180] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f83fa36000d
[   97.778182] RDX: 0000000000004000 RSI: 00007f83ef922f00 RDI: 0000000000000036
[   97.778183] RBP: 00007f83ef923040 R08: 00007f83ef9231f8 R09: 00007f83ef923168
[   97.778184] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f83f69c5b40
[   97.778185] R13: 000000000000001c R14: 0000000000000001 R15: 0000000000004000

[   97.779684] Allocated by task 5919:
[   97.783185]  save_stack+0x46/0xd0
[   97.783187]  kasan_kmalloc+0xad/0xe0
[   97.783189]  kmem_cache_alloc_trace+0xdf/0x580
[   97.783190]  ip6_convert_metrics.isra.79+0x7e/0x190
[   97.783192]  ip6_route_info_create+0x60a/0x2480
[   97.783193]  ip6_route_add+0x1d/0x80
[   97.783195]  inet6_rtm_newroute+0xdd/0xf0
[   97.783198]  rtnetlink_rcv_msg+0x641/0xb10
[   97.783200]  netlink_rcv_skb+0x27b/0x3e0
[   97.783202]  rtnetlink_rcv+0x15/0x20
[   97.783203]  netlink_unicast+0x4be/0x720
[   97.783204]  netlink_sendmsg+0x7bc/0xbf0
[   97.783205]  sock_sendmsg+0xba/0xf0
[   97.783207]  ___sys_sendmsg+0x6ca/0x8e0
[   97.783208]  __sys_sendmsg+0xe6/0x190
[   97.783209]  SyS_sendmsg+0x13/0x20
[   97.783211]  do_syscall_64+0x2ac/0x430
[   97.783213]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[   97.784709] Freed by task 0:
[   97.785056] knetbase: Error: /proc/sys/net/core/txcs_enable does not exist
[   97.794497]  save_stack+0x46/0xd0
[   97.794499]  kasan_slab_free+0x71/0xc0
[   97.794500]  kfree+0x7c/0xf0
[   97.794501]  fib6_info_destroy_rcu+0x24f/0x310
[   97.794504]  rcu_process_callbacks+0x38b/0x1730
[   97.794506]  __do_softirq+0x1c8/0x5d0

Reported-by: John Sperbeck <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agosfp: fix oops with ethtool -m
Russell King [Tue, 18 Sep 2018 15:48:53 +0000 (16:48 +0100)]
sfp: fix oops with ethtool -m

If a network interface is created prior to the SFP socket being
available, ethtool can request module information.  This unfortunately
leads to an oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = (ptrval)
[00000008] *pgd=7c400831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1480 Comm: ethtool Not tainted 4.19.0-rc3 #138
Hardware name: Broadcom Northstar Plus SoC
PC is at sfp_get_module_info+0x8/0x10
LR is at dev_ethtool+0x218c/0x2afc

Fix this by not filling in the network device's SFP bus pointer until
SFP is fully bound, thereby avoiding the core calling into the SFP bus
code.

Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Reported-by: Florian Fainelli <[email protected]>
Tested-by: Florian Fainelli <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: fix a txq_done race condition
Antoine Tenart [Tue, 18 Sep 2018 14:58:47 +0000 (16:58 +0200)]
net: mvpp2: fix a txq_done race condition

When no Tx IRQ is available, the txq_done() routine (called from
tx_done()) shouldn't be called from the polling function, as in such
case it is already called in the Tx path thanks to an hrtimer. This
mostly occurred when using PPv2.1, as the engine then do not have Tx
IRQs.

Fixes: edc660fa09e2 ("net: mvpp2: replace TX coalescing interrupts with hrtimer")
Reported-by: Stefan Chulski <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'net-smc-fixes'
David S. Miller [Wed, 19 Sep 2018 03:11:43 +0000 (20:11 -0700)]
Merge branch 'net-smc-fixes'

Ursula Braun says:

====================
net/smc: fixes 2018-09-18

here are some fixes in different areas of the smc code for the net
tree.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet/smc: fix sizeof to int comparison
YueHaibing [Tue, 18 Sep 2018 13:46:38 +0000 (15:46 +0200)]
net/smc: fix sizeof to int comparison

Comparing an int to a size, which is unsigned, causes the int to become
unsigned, giving the wrong result. kernel_sendmsg can return a negative
error code.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet/smc: no urgent data check for listen sockets
Karsten Graul [Tue, 18 Sep 2018 13:46:37 +0000 (15:46 +0200)]
net/smc: no urgent data check for listen sockets

Don't check a listen socket for pending urgent data in smc_poll().

Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet/smc: enable fallback for connection abort in state INIT
Ursula Braun [Tue, 18 Sep 2018 13:46:36 +0000 (15:46 +0200)]
net/smc: enable fallback for connection abort in state INIT

If a linkgroup is terminated abnormally already due to failing
LLC CONFIRM LINK or LLC ADD LINK, fallback to TCP is still possible.
In this case do not switch to state SMC_PEERABORTWAIT and do not set
sk_err.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet/smc: remove duplicate mutex_unlock
Ursula Braun [Tue, 18 Sep 2018 13:46:35 +0000 (15:46 +0200)]
net/smc: remove duplicate mutex_unlock

For a failing smc_listen_rdma_finish() smc_listen_decline() is
called. If fallback is possible, the new socket is already enqueued
to be accepted in smc_listen_decline(). Avoid enqueuing a second time
afterwards in this case, otherwise the smc_create_lgr_pending lock
is released twice:
[  373.463976] WARNING: bad unlock balance detected!
[  373.463978] 4.18.0-rc7+ #123 Tainted: G           O
[  373.463979] -------------------------------------
[  373.463980] kworker/1:1/30 is trying to release lock (smc_create_lgr_pending) at:
[  373.463990] [<000003ff801205fc>] smc_listen_work+0x22c/0x5d0 [smc]
[  373.463991] but there are no more locks to release!
[  373.463991]
other info that might help us debug this:
[  373.463993] 2 locks held by kworker/1:1/30:
[  373.463994]  #0: 00000000772cbaed ((wq_completion)"events"){+.+.}, at: process_one_work+0x1ec/0x6b0
[  373.464000]  #1: 000000003ad0894a ((work_completion)(&new_smc->smc_listen_work)){+.+.}, at: process_one_work+0x1ec/0x6b0
[  373.464003]
stack backtrace:
[  373.464005] CPU: 1 PID: 30 Comm: kworker/1:1 Kdump: loaded Tainted: G           O      4.18.0-rc7uschi+ #123
[  373.464007] Hardware name: IBM 2827 H43 738 (LPAR)
[  373.464010] Workqueue: events smc_listen_work [smc]
[  373.464011] Call Trace:
[  373.464015] ([<0000000000114100>] show_stack+0x60/0xd8)
[  373.464019]  [<0000000000a8c9bc>] dump_stack+0x9c/0xd8
[  373.464021]  [<00000000001dcaf8>] print_unlock_imbalance_bug+0xf8/0x108
[  373.464022]  [<00000000001e045c>] lock_release+0x114/0x4f8
[  373.464025]  [<0000000000aa87fa>] __mutex_unlock_slowpath+0x4a/0x300
[  373.464027]  [<000003ff801205fc>] smc_listen_work+0x22c/0x5d0 [smc]
[  373.464029]  [<0000000000197a68>] process_one_work+0x2a8/0x6b0
[  373.464030]  [<0000000000197ec2>] worker_thread+0x52/0x410
[  373.464033]  [<000000000019fd0e>] kthread+0x15e/0x178
[  373.464035]  [<0000000000aaf58a>] kernel_thread_starter+0x6/0xc
[  373.464052]  [<0000000000aaf584>] kernel_thread_starter+0x0/0xc
[  373.464054] INFO: lockdep is turned off.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet/smc: fix non-blocking connect problem
Ursula Braun [Tue, 18 Sep 2018 13:46:34 +0000 (15:46 +0200)]
net/smc: fix non-blocking connect problem

In state SMC_INIT smc_poll() delegates polling to the internal
CLC socket. This means, once the connect worker has finished
its kernel_connect() step, the poll wake-up may occur. This is not
intended. The wake-up should occur from the wake up call in
smc_connect_work() after __smc_connect() has finished.
Thus in state SMC_INIT this patch now calls sock_poll_wait() on the
main SMC socket.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoravb: do not write 1 to reserved bits
Kazuya Mizuguchi [Tue, 18 Sep 2018 10:22:26 +0000 (12:22 +0200)]
ravb: do not write 1 to reserved bits

EtherAVB hardware requires 0 to be written to status register bits in
order to clear them, however, care must be taken not to:

1. Clear other bits, by writing zero to them
2. Write one to reserved bits

This patch corrects the ravb driver with respect to the second point above.
This is done by defining reserved bit masks for the affected registers and,
after auditing the code, ensure all sites that may write a one to a
reserved bit use are suitably masked.

Signed-off-by: Kazuya Mizuguchi <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Sergei Shtylyov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
This page took 0.123823 seconds and 4 git commands to generate.