Git Repo - linux.git/log

net: sched: null actions array pointer before releasing action

Currently, tcf_action_delete() nulls the actions array pointer after putting
and deleting the action. However, if tcf_idr_delete_index() returns an error,
the pointer to the action is not set to null. That results in the action being
released a second time in the error handling code of tca_action_gd().
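
The ordering problem can be illustrated with a small userspace model. This is
only a sketch: struct act, act_put(), put_many(), delete_all() and
delete_or_cleanup() are hypothetical stand-ins, not the kernel functions, and
a counter going below zero takes the place of a real double free.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTS 4

/* Hypothetical stand-in for a reference-counted action. */
struct act {
    int refs;   /* references held by the caller; below 0 means over-released */
    int busy;   /* nonzero: the index delete step fails for this action */
};

static void act_put(struct act *a)
{
    a->refs--;
}

/* Error-path cleanup: put every action still present in the array. */
static void put_many(struct act *acts[])
{
    for (int i = 0; i < MAX_ACTS; i++)
        if (acts[i])
            act_put(acts[i]);
}

/* Put each action, then try to delete it from the index.
 * null_first != 0 selects the fixed ordering: clear the slot before the put. */
static int delete_all(struct act *acts[], int null_first)
{
    assert(acts != NULL);
    for (int i = 0; i < MAX_ACTS && acts[i]; i++) {
        struct act *a = acts[i];

        if (null_first)
            acts[i] = NULL;
        act_put(a);
        if (a->busy)
            return -1;          /* old ordering: acts[i] still points at a */
        if (!null_first)
            acts[i] = NULL;
    }
    return 0;
}

/* Caller: on failure, release whatever the array still references. */
static int delete_or_cleanup(struct act *acts[], int null_first)
{
    int ret = delete_all(acts, null_first);

    if (ret < 0)
        put_many(acts);
    return ret;
}
```

With the old ordering, the action whose delete step fails is put once in
delete_all() and again in put_many(). With the slot cleared first, the cleanup
pass skips it.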

Kasan error:

[  807.367755] ==================================================================
[  807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
[  807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732

[  807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G        W         4.19.0-rc1+ #799
[  807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[  807.407948] Call Trace:
[  807.410763]  dump_stack+0x92/0xeb
[  807.414456]  print_address_description+0x70/0x360
[  807.419549]  kasan_report+0x14d/0x300
[  807.423582]  ? tc_setup_cb_call+0x14e/0x250
[  807.428150]  tc_setup_cb_call+0x14e/0x250
[  807.432539]  ? nla_put+0x65/0xe0
[  807.436146]  fl_dump+0x394/0x3f0 [cls_flower]
[  807.440890]  ? fl_tmplt_dump+0x140/0x140 [cls_flower]
[  807.446327]  ? lock_downgrade+0x320/0x320
[  807.450702]  ? lock_acquire+0xe2/0x220
[  807.454819]  ? is_bpf_text_address+0x5/0x140
[  807.459475]  ? memcpy+0x34/0x50
[  807.462980]  ? nla_put+0x65/0xe0
[  807.466582]  tcf_fill_node+0x341/0x430
[  807.470717]  ? tcf_block_put+0xe0/0xe0
[  807.474859]  tcf_node_dump+0xdb/0xf0
[  807.478821]  fl_walk+0x8e/0x170 [cls_flower]
[  807.483474]  tcf_chain_dump+0x35a/0x4d0
[  807.487703]  ? tfilter_notify+0x170/0x170
[  807.492091]  ? tcf_fill_node+0x430/0x430
[  807.496411]  tc_dump_tfilter+0x362/0x3f0
[  807.500712]  ? tc_del_tfilter+0x850/0x850
[  807.505104]  ? kasan_unpoison_shadow+0x30/0x40
[  807.509940]  ? __mutex_unlock_slowpath+0xcf/0x410
[  807.515031]  netlink_dump+0x263/0x4f0
[  807.519077]  __netlink_dump_start+0x2a0/0x300
[  807.523817]  ? tc_del_tfilter+0x850/0x850
[  807.528198]  rtnetlink_rcv_msg+0x46a/0x6d0
[  807.532671]  ? rtnl_fdb_del+0x3f0/0x3f0
[  807.536878]  ? tc_del_tfilter+0x850/0x850
[  807.541280]  netlink_rcv_skb+0x18d/0x200
[  807.545570]  ? rtnl_fdb_del+0x3f0/0x3f0
[  807.549773]  ? netlink_ack+0x500/0x500
[  807.553913]  netlink_unicast+0x2d0/0x370
[  807.558212]  ? netlink_attachskb+0x340/0x340
[  807.562855]  ? _copy_from_iter_full+0xe9/0x3e0
[  807.567677]  ? import_iovec+0x11e/0x1c0
[  807.571890]  netlink_sendmsg+0x3b9/0x6a0
[  807.576192]  ? netlink_unicast+0x370/0x370
[  807.580684]  ? netlink_unicast+0x370/0x370
[  807.585154]  sock_sendmsg+0x6b/0x80
[  807.589015]  ___sys_sendmsg+0x4a1/0x520
[  807.593230]  ? copy_msghdr_from_user+0x210/0x210
[  807.598232]  ? do_wp_page+0x174/0x880
[  807.602276]  ? __handle_mm_fault+0x749/0x1c10
[  807.607021]  ? __handle_mm_fault+0x1046/0x1c10
[  807.611849]  ? __pmd_alloc+0x320/0x320
[  807.615973]  ? check_chain_key+0x140/0x1f0
[  807.620450]  ? check_chain_key+0x140/0x1f0
[  807.624929]  ? __fget_light+0xbc/0xd0
[  807.628970]  ? __sys_sendmsg+0xd7/0x150
[  807.633172]  __sys_sendmsg+0xd7/0x150
[  807.637201]  ? __ia32_sys_shutdown+0x30/0x30
[  807.641846]  ? up_read+0x53/0x90
[  807.645442]  ? __do_page_fault+0x484/0x780
[  807.649949]  ? do_syscall_64+0x1e/0x2c0
[  807.654164]  do_syscall_64+0x72/0x2c0
[  807.658198]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  807.663625] RIP: 0033:0x7f42e9870150
[  807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[  807.687328] RSP: 002b:00007ffdbf595b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  807.695564] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f42e9870150
[  807.703083] RDX: 0000000000000000 RSI: 00007ffdbf595b80 RDI: 0000000000000003
[  807.710605] RBP: 00007ffdbf599d90 R08: 0000000000679bc0 R09: 000000000000000f
[  807.718127] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdbf599d88
[  807.725651] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

[  807.735048] Allocated by task 2687:
[  807.738902]  kasan_kmalloc+0xa0/0xd0
[  807.742852]  __kmalloc+0x118/0x2d0
[  807.746615]  tcf_idr_create+0x44/0x320
[  807.750738]  tcf_nat_init+0x41e/0x530 [act_nat]
[  807.755638]  tcf_action_init_1+0x4e0/0x650
[  807.760104]  tcf_action_init+0x1ce/0x2d0
[  807.764395]  tcf_exts_validate+0x1d8/0x200
[  807.768861]  fl_change+0x55a/0x26b4 [cls_flower]
[  807.773845]  tc_new_tfilter+0x748/0xa20
[  807.778051]  rtnetlink_rcv_msg+0x56a/0x6d0
[  807.782517]  netlink_rcv_skb+0x18d/0x200
[  807.786804]  netlink_unicast+0x2d0/0x370
[  807.791095]  netlink_sendmsg+0x3b9/0x6a0
[  807.795387]  sock_sendmsg+0x6b/0x80
[  807.799240]  ___sys_sendmsg+0x4a1/0x520
[  807.803445]  __sys_sendmsg+0xd7/0x150
[  807.807473]  do_syscall_64+0x72/0x2c0
[  807.811506]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  807.818776] Freed by task 2728:
[  807.822283]  __kasan_slab_free+0x122/0x180
[  807.826752]  kfree+0xf4/0x2f0
[  807.830080]  __tcf_action_put+0x5a/0xb0
[  807.834281]  tcf_action_put_many+0x46/0x70
[  807.838747]  tca_action_gd+0x232/0xc40
[  807.842862]  tc_ctl_action+0x215/0x230
[  807.846977]  rtnetlink_rcv_msg+0x56a/0x6d0
[  807.851444]  netlink_rcv_skb+0x18d/0x200
[  807.855731]  netlink_unicast+0x2d0/0x370
[  807.860021]  netlink_sendmsg+0x3b9/0x6a0
[  807.864312]  sock_sendmsg+0x6b/0x80
[  807.868166]  ___sys_sendmsg+0x4a1/0x520
[  807.872372]  __sys_sendmsg+0xd7/0x150
[  807.876401]  do_syscall_64+0x72/0x2c0
[  807.880431]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  807.887704] The buggy address belongs to the object at ffff88033e636000
                which belongs to the cache kmalloc-256 of size 256
[  807.900909] The buggy address is located 0 bytes inside of
                256-byte region [ffff88033e636000, ffff88033e636100)
[  807.913155] The buggy address belongs to the page:
[  807.918322] page:ffffea000cf98d80 count:1 mapcount:0 mapping:ffff88036f80ee00 index:0x0 compound_mapcount: 0
[  807.928831] flags: 0x5fff8000008100(slab|head)
[  807.933647] raw: 005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00
[  807.942050] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
[  807.950456] page dumped because: kasan: bad access detected

[  807.958240] Memory state around the buggy address:
[  807.963405]  ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
[  807.971288]  ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[  807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  807.994882]                    ^
[  807.998477]  ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  808.006352]  ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[  808.014230] ==================================================================
[  808.022108] Disabling lock debugging due to kernel taint

Fixes: edfaf94fa705 ("net_sched: improve and refactor tcf_action_put_many()")
Signed-off-by: Vlad Buslov <[email protected]>
Acked-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: correct structure parameter comments for topsrv

Remove the following obsolete parameter comments of tipc_topsrv struct:
  @rcvbuf_cache
  @tipc_conn_new
  @tipc_conn_release
  @tipc_conn_recvmsg
  @imp
  @type

Add the comments for the missing parameters below of tipc_topsrv struct:
  @awork
  @listener

Remove the unused or duplicated parameter comments of tipc_conn struct:
  @outqueue_lock
  @rx_action

Signed-off-by: Zhenbo Gao <[email protected]>
Reviewed-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition

The _IOC_READ flag fits this ioctl request better, because the request
only writes to userspace and does not read from it.
See NOTEs in include/uapi/asm-generic/ioctl.h for more information.
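
As a rough illustration of the direction encoding, the sketch below declares
the same request with _IOR and with _IOW and extracts the direction bits.
The DEMO_* type byte and request number are made up for this example and are
not the real vhost values.

```c
#include <assert.h>
#include <linux/ioctl.h>
#include <linux/types.h>

/* Illustrative type byte and request number, not the real vhost values. */
#define DEMO_IOC_TYPE 'x'
#define DEMO_NR       0x01

/* Kernel fills in a __u64 for userspace to read: _IOR describes that. */
#define DEMO_GET_FEATURES      _IOR(DEMO_IOC_TYPE, DEMO_NR, __u64)
/* Same request declared with _IOW: claims userspace supplies the data. */
#define DEMO_GET_FEATURES_IOW  _IOW(DEMO_IOC_TYPE, DEMO_NR, __u64)

/* Extract the direction bits encoded in a request number. */
static unsigned int ioc_dir(unsigned long req)
{
    return _IOC_DIR(req);
}
```

The two macros produce different request numbers, so a direction change like
this one also changes the value userspace passes to ioctl().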

Fixes: 429711aec282 ("vhost: switch to use new message format")
Signed-off-by: Gleb Fotengauer-Malinovskiy <[email protected]>
Acked-by: Jason Wang <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

liquidio: Added delayed work for periodically updating the link statistics.

Signed-off-by: Pradeep Nalla <[email protected]>
Signed-off-by: Felix Manlunas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

r8169: add support for NCube 8168 network card

This card identifies itself as:
Ethernet controller [0200]: NCube Device [10ff:8168] (rev 06)
Subsystem: TP-LINK Technologies Co., Ltd. Device [7470:3468]

Adding a new entry to rtl8169_pci_tbl makes the card work.

Link: http://launchpad.net/bugs/1788730
Signed-off-by: Anthony Wong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ip6_tunnel: respect ttl inherit for ip6tnl

man ip-tunnel ttl section says:
0 is a special value meaning that packets inherit the TTL value.

IPv4 tunnels respect this in ip_tunnel_xmit(), but IPv6 tunnels do not
implement it yet. To make IPv6 tunnels behave consistently with IPv4 tunnels,
add TTL inherit support for IPv6 tunnels.
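
The inherit rule itself is small. A minimal sketch, assuming the inner
packet's TTL or hop limit is already known; outer_hop_limit() is a
hypothetical helper, not a kernel function:

```c
#include <assert.h>
#include <stdint.h>

/* ttl == 0 is the "inherit" marker described in the man page text. */
static uint8_t outer_hop_limit(uint8_t configured_ttl, uint8_t inner_ttl)
{
    if (configured_ttl == 0)
        return inner_ttl;       /* inherit from the encapsulated packet */
    return configured_ttl;      /* an explicit setting wins */
}
```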

Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mac80211: shorten the IBSS debug messages

When tracing is enabled, all the debug messages are recorded and must
not exceed MAX_MSG_LEN (100) columns. Longer debug messages trigger the
following warning:

WARNING: CPU: 3 PID: 32642 at /tmp/wifi-core-20180806094828/src/iwlwifi-stack-dev/net/mac80211/./trace_msg.h:32 trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Workqueue: phy1 ieee80211_iface_work [mac80211]
RIP: 0010:trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Call Trace:
  __sdata_dbg+0xbd/0x120 [mac80211]
  ieee80211_ibss_rx_queued_mgmt+0x15f/0x510 [mac80211]
  ieee80211_iface_work+0x21d/0x320 [mac80211]

Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: don't Tx a deauth frame if the AP forbade Tx

If the driver fails to properly prepare for the channel
switch, mac80211 will disconnect. If the CSA IE had mode
set to 1, it means that the clients are not allowed to send
any Tx on the current channel, and that includes the
deauthentication frame.

Make sure that we don't send the deauthentication frame in
this case.

In iwlwifi, this caused a failure to flush queues since the
firmware already closed the queues after having parsed the
CSA IE. Then mac80211 would wait until the deauthentication
frame would go out (drv_flush(drop=false)) and that would
never happen.

Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: Fix station bandwidth setting after channel switch

When performing a channel switch flow for a managed interface, the
flow did not update the bandwidth of the AP station and the rate
scale algorithm. In case of a channel width downgrade, this would
result in the rate scale algorithm using a bandwidth that does not
match the interface channel configuration.

Fix this by updating the AP station bandwidth and rate scaling algorithm
before the actual channel change in case of a bandwidth downgrade, or
after the actual channel change in case of a bandwidth upgrade.

Signed-off-by: Ilan Peer <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: fix a race between restart and CSA flows

We hit a problem with iwlwifi that was caused by a bug in
mac80211. A bug in iwlwifi caused the firmware to crash in
certain cases in channel switch. Because of that bug,
drv_pre_channel_switch would fail and trigger the restart
flow.
Now we had the hw restart worker which runs on the system's
workqueue and the csa_connection_drop_work worker that runs
on mac80211's workqueue that can run together. This is
obviously problematic since the restart work wants to
reconfigure the connection, while the csa_connection_drop_work
worker does the exact opposite: it tries to disconnect.

Fix this by cancelling the csa_connection_drop_work worker
in the restart worker.

Note that this can sound racy: we could have:

driver   iface_work   CSA_work   restart_work
+++++++++++++++++++++++++++++++++++++++++++++
              |
<--drv_cs ---|
<FW CRASH!>
-CS FAILED-->
              |                       |
              |                 cancel_work(CSA)
           schedule                   |
           CSA work                   |
                         |            |
                        Race between those 2

But this is not possible because we flush the workqueue
in the restart worker before we cancel the CSA worker.
That would be bullet proof if we could guarantee that
we schedule the CSA worker only from the iface_work
which runs on the workqueue (and not on the system's
workqueue), but unfortunately we do have an instance
in which we schedule the CSA work outside the context
of the workqueue (ieee80211_chswitch_done).

Note also that we should probably cancel other workers
like beacon_connection_loss_work and possibly others
for different types of interfaces, at the very least,
IBSS should suffer from the exact same problem, but for
now, do the minimum to fix the bug that was actually
experienced and reproduced.

Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: fix WMM TXOP calculation

In commit 9236c4523e5b ("mac80211: limit wmm params to comply
with ETSI requirements"), we limited the WMM parameters to
comply with the 802.11 and ETSI standards. However, the TXOP value
was calculated incorrectly. Fix it by taking the minimum of the
802.11 and ETSI values, so that neither limit is violated.

Fixes: e552af058148 ("mac80211: limit wmm params to comply with ETSI requirements")
Signed-off-by: Haim Dreyfuss <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

cfg80211: fix a type issue in ieee80211_chandef_to_operating_class()

The "chandef->center_freq1" variable is a u32 but "freq" is a u16 so we
are truncating away the high bits.  I noticed this bug because in commit
9cf0a0b4b64a ("cfg80211: Add support for 60GHz band channels 5 and 6")
we made "freq <= 56160 + 2160 * 6" a valid frequency when before it was
only "freq <= 56160 + 2160 * 4" that was valid.  It introduces a static
checker warning:

    net/wireless/util.c:1571 ieee80211_chandef_to_operating_class()
    warn: always true condition '(freq <= 56160 + 2160 * 6) => (0-u16max <= 69120)'

But really we probably shouldn't have been truncating the high bits
away to begin with.
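
A standalone sketch of the truncation, using fixed-width userspace types in
place of u16/u32. The lower bound FREQ_60GHZ_MIN (channel 1) is an assumption
made for this example; only the upper bound appears in the text above.

```c
#include <assert.h>
#include <stdint.h>

#define FREQ_60GHZ_MIN (56160 + 2160 * 1)   /* 58320 MHz, assumed lower bound */
#define FREQ_60GHZ_MAX (56160 + 2160 * 6)   /* 69120 MHz */

/* Buggy shape: a 32-bit frequency stored in a 16-bit variable. */
static int in_60ghz_u16(uint32_t center_freq1)
{
    uint16_t freq = (uint16_t)center_freq1;  /* keeps only the low 16 bits */

    /* The upper-bound test is always true: a u16 never exceeds 65535. */
    return freq >= FREQ_60GHZ_MIN && freq <= FREQ_60GHZ_MAX;
}

/* Fixed shape: keep the full 32-bit value. */
static int in_60ghz_u32(uint32_t center_freq1)
{
    uint32_t freq = center_freq1;

    return freq >= FREQ_60GHZ_MIN && freq <= FREQ_60GHZ_MAX;
}
```

With the 16-bit variable, 69120 wraps to 3584 and is rejected, while an
out-of-range value such as 125536 wraps to 60000 and is accepted.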

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

mac80211: fix an off-by-one issue in A-MSDU max_subframe computation

Initialize 'n' to 2 in order to take into account also the first
packet in the estimation of max_subframe limit for a given A-MSDU
since frag_tail pointer is NULL when ieee80211_amsdu_aggregate
routine analyzes the second frame.

Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

Merge tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
"A few fixes for the fallout of being a little more pedantic about dma
  masks"

* tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping:
  of/platform: initialise AMBA default DMA masks
  sparc: set a default 32-bit dma mask for OF devices
  kernel/dma/direct: take DMA offset into account in dma_direct_supported

Merge branch 'mlx5e-IPoIB-stats'

Tariq Toukan says:

====================
mlx5e IPoIB stats

This patchset by Feras contains statistics enhancements and NDO
implementation for the mlx5e IPoIB driver.

Series generated against net-next commit:
2d5c28859839 net: bgmac: remove set but not used variable 'err'
====================

Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: IPoIB, Use priv stats in completion rx flow

Since the RQs are shared between all pkey interfaces, the stats
should be taken from where the per-ring stats are stored instead
of the parent RQ.

Fixes: 4c6c615e3f30 ("net/mlx5e: IPoIB, Add PKEY child interface nic profile")
Signed-off-by: Feras Daoud <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: IPoIB, Add ndo stats support for IPoIB child devices

Expose RX and TX counters by implementing ndo_get_stats64 operation
for child devices.

Signed-off-by: Feras Daoud <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: IPoIB, Add ndo stats support for IPoIB netdevices

Expose RX and TX counters by implementing ndo_get_stats64 operation for
both parent devices.
After this change, all the relevant statistics can be retrieved using
ifconfig.

Signed-off-by: Feras Daoud <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: IPoIB, Initialize max_opened_tc in mlx5i_init flow

Enhanced ipoib does not initialize max_opened_tc causing wrong ethtool
statistics. As mlx5e_grp_sw_update_stats relies on this variable, without
this change, the TX statistics will not be updated.

Fixes: 05909babce53 ("net/mlx5e: Avoid reset netdev stats on configuration changes")
Signed-off-by: Feras Daoud <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'Full-phylink-support-for-mv88e6352'

Andrew Lunn says:

====================
Full phylink support for mv88e6352

These two patches implement full phylink support for the mv88e6352
family, when using an SFP connected to its SERDES interface. This adds
interrupt support to the SERDES, so that we get interrupts on link
up/down, and then call phydev_link_change().

The first patch is a minor bug fix, which does not seem to affect any
current features, so I'm not submitting it for stable. It is however
required for configuring SERDES interrupts.
====================

Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Add SERDES phydev_link_change for 6352

The 6352 family has one SERDES interface, which can be used by either
port 4 or port 5. Add interrupt support for the SERDES interface, and
report when the link status changes.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Fix writing to a PHY page.

After changing to the needed page, actually write the value to the
register!

Fixes: 09cb7dfd3f14 ("net: dsa: mv88e6xxx: describe PHY page and SerDes")
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

uapi: Fix linux/rds.h userspace compilation errors.

Include linux/in6.h for struct in6_addr.

/usr/include/linux/rds.h:156:18: error: field ‘laddr’ has incomplete type
  struct in6_addr laddr;
                  ^~~~~
/usr/include/linux/rds.h:157:18: error: field ‘faddr’ has incomplete type
  struct in6_addr faddr;
                  ^~~~~
/usr/include/linux/rds.h:178:18: error: field ‘laddr’ has incomplete type
  struct in6_addr laddr;
                  ^~~~~
/usr/include/linux/rds.h:179:18: error: field ‘faddr’ has incomplete type
  struct in6_addr faddr;
                  ^~~~~
/usr/include/linux/rds.h:198:18: error: field ‘bound_addr’ has incomplete type
  struct in6_addr bound_addr;
                  ^~~~~~~~~~
/usr/include/linux/rds.h:199:18: error: field ‘connected_addr’ has incomplete type
  struct in6_addr connected_addr;
                  ^~~~~~~~~~~~~~
/usr/include/linux/rds.h:219:18: error: field ‘local_addr’ has incomplete type
  struct in6_addr local_addr;
                  ^~~~~~~~~~
/usr/include/linux/rds.h:221:18: error: field ‘peer_addr’ has incomplete type
  struct in6_addr peer_addr;
                  ^~~~~~~~~
/usr/include/linux/rds.h:245:18: error: field ‘src_addr’ has incomplete type
  struct in6_addr src_addr;
                  ^~~~~~~~
/usr/include/linux/rds.h:246:18: error: field ‘dst_addr’ has incomplete type
  struct in6_addr dst_addr;
                  ^~~~~~~~

Fixes: b7ff8b1036f0 ("rds: Extend RDS API for IPv6 support")
Signed-off-by: Vinson Lee <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: remove useless add operation when init sysctl_max_tw_buckets

tcp_hashinfo.ehash_mask is always an odd number, which is set in function
alloc_large_system_hash(). See below:

    if (_hash_mask)
        *_hash_mask = (1 << log2qty) - 1; <<< always odd number

Hence the local variable 'cnt' is an even number, so the extra
increment here makes no difference.
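
The arithmetic can be checked with a short sketch. It assumes the removed
operation was an increment applied just before halving 'cnt', and that the
table has at least two buckets (log2qty >= 1). All function names here are
hypothetical.

```c
#include <assert.h>

/* Mask for a table of 2^log2qty buckets, as in the quoted line. */
static unsigned int hash_mask(unsigned int log2qty)
{
    assert(log2qty >= 1 && log2qty < 32);   /* assumption: at least 2 buckets */
    return (1u << log2qty) - 1;
}

/* Assumed shape before the change: increment, then halve. */
static unsigned int half_with_increment(unsigned int mask)
{
    unsigned int cnt = mask + 1;

    return (cnt + 1) / 2;
}

/* Assumed shape after the change: halve directly. */
static unsigned int half_plain(unsigned int mask)
{
    unsigned int cnt = mask + 1;

    return cnt / 2;
}

/* Returns 1 if both forms agree for every table size up to 2^max_log2. */
static int forms_agree(unsigned int max_log2)
{
    for (unsigned int q = 1; q <= max_log2; q++) {
        unsigned int mask = hash_mask(q);

        if (half_with_increment(mask) != half_plain(mask))
            return 0;
    }
    return 1;
}
```

Because cnt is even, integer division discards the extra one, so both forms
give the same result.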

Signed-off-by: Yafang Shao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: nixge: Fix Kconfig warning with OF_MDIO

Fix Kconfig warning with OF_MDIO where OF_MDIO was
selected unconditionally instead of only when
OF is actually enabled.

Fixes: 7e8d5755be0e ("net: nixge: Add support for 64-bit platforms")
Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Moritz Fischer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: cadence: Fix a sleep-in-atomic-context bug in macb_halt_tx()

The kernel module may sleep while holding a spinlock.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] usleep_range
drivers/net/ethernet/cadence/macb_main.c, 648:
usleep_range in macb_halt_tx
drivers/net/ethernet/cadence/macb_main.c, 730:
macb_halt_tx in macb_tx_error_task
drivers/net/ethernet/cadence/macb_main.c, 721:
_raw_spin_lock_irqsave in macb_tx_error_task

To fix this bug, usleep_range() is replaced with udelay().

This bug is found by my static analysis tool DSAC.

Signed-off-by: Jia-Ju Bai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-09-02

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data()
   when linearizing multiple scatterlist elements, from Tushar.

2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is
   found on map update where it would release existing ULP. syzbot found and
   triggered this couple of times now, fix from John.

3) Add missing xskmap type to bpftool so it will properly show the type
   on map dump, from Prashant.
====================

Signed-off-by: David S. Miller <[email protected]>

Linux 4.19-rc2

Merge branch 'mvneta-some-small-improvements'

Jisheng Zhang says:

====================
net: mvneta: some small improvements

patch1 removes our own NETIF_F_GRO check, because the net subsystem
will handle it for us.

patch2 enables NETIF_F_RXCSUM by default, since the driver and HW
support the feature.

patch3 is a small optimization that reduces the number of
smp_processor_id() calls in mvneta_tx_done_gbe.

since v1:
- based on net-next tree
- remove the fix patches, since they should be based on net branch.
- Add Gregory's Reviewed-by tag
====================

Signed-off-by: David S. Miller <[email protected]>

net: mvneta: reduce smp_processor_id() calling in mvneta_tx_done_gbe

In the loop of mvneta_tx_done_gbe(), we call smp_processor_id()
each time. Move the call out of the loop to optimize the code a bit.

Before the patch, the loop looks like this (under arm64):

        ldr     x1, [x29,#120]
        ...
        ldr     w24, [x1,#36]
        ...
        bl      0 <_raw_spin_lock>
        str     w24, [x27,#132]
        ...

After the patch, the loop looks like this (under arm64):

        ...
        bl      0 <_raw_spin_lock>
        str     w23, [x28,#132]
        ...
where w23 is loaded before the loop, so it is ready when the loop starts.

In addition, mvneta_tx_done_gbe() is called from mvneta_poll(),
which runs in non-preemptible context, so it's safe to call
smp_processor_id() only once.
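
The change amounts to hoisting a loop-invariant call. A userspace sketch with
a call counter follows; current_cpu() is a hypothetical stand-in for
smp_processor_id(), and struct txq is not the driver's structure.

```c
#include <assert.h>

static int cpu_id_calls;            /* counts calls to the stand-in below */

/* Hypothetical stand-in for smp_processor_id(); always reports CPU 0. */
static int current_cpu(void)
{
    cpu_id_calls++;
    return 0;
}

struct txq {
    int lock_owner;                 /* CPU recorded when the queue is locked */
};

/* Before: query the CPU id on every iteration. */
static void tx_done_each_iter(struct txq *q, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        q[i].lock_owner = current_cpu();
}

/* After: query once, reuse the value inside the loop. */
static void tx_done_hoisted(struct txq *q, int n)
{
    int cpu = current_cpu();

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        q[i].lock_owner = cpu;
}
```

Hoisting is only valid because the value cannot change during the loop, which
is what the non-preemptible context guarantees.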

Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Gregory CLEMENT <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvneta: enable NETIF_F_RXCSUM by default

The code and HW support NETIF_F_RXCSUM, so let's enable it by default.

Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Gregory CLEMENT <[email protected]>
Tested-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvneta: Don't check NETIF_F_GRO ourself

napi_gro_receive() checks the NETIF_F_GRO bit as well. If the bit is not
set, we go through GRO_NORMAL in napi_skb_finish() and fall back
to netif_receive_skb_internal(), so we don't need to check NETIF_F_GRO
ourselves.

Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Gregory CLEMENT <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv6: Only update MTU metric if it set

Jan reported a regression after an update to 4.18.5. In this case ipv6
default route is set up by systemd-networkd based on data from an RA. The
RA contains an MTU of 1492 which is used when the route is first inserted
but then systemd-networkd pushes down updates to the default route
without the mtu set.

Prior to the change to fib6_info, metrics such as MTU were held in the
dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
the value from the metrics was used if it was set, finally falling
back to the idev value.

After the fib6_info change metrics are contained in the fib6_info struct
and there is no equivalent to rt6i_pmtu. To maintain consistency with
the old behavior the new code should only reset the MTU in the metrics
if the route update has it set.

Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Jan Janssen <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: fix type of htb statistics

tokens and ctokens are defined as s64 in the htb_class structure,
and clamped to a 32-bit value during netlink dumps:

    cl->xstats.tokens = clamp_t(s64, PSCHED_NS2TICKS(cl->tokens),
                                INT_MIN, INT_MAX);

Defining it as u32 works, since userspace (tc) prints it as a
signed int, but a correct definition from the beginning is probably
better.
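
A small sketch of why the unsigned field still happened to work. It assumes a
32-bit int and a two's complement conversion when the unsigned value is read
back as signed, which is what common compilers do; clamp_s64() and the
to_*_field() helpers are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Plain-C equivalent of a signed 64-bit clamp. */
static int64_t clamp_s64(int64_t v, int64_t lo, int64_t hi)
{
    assert(lo <= hi);
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Field declared unsigned: negative values are stored as large positives. */
static uint32_t to_u32_field(int64_t tokens)
{
    return (uint32_t)clamp_s64(tokens, INT_MIN, INT_MAX);
}

/* Field declared signed: the clamped value is stored as-is. */
static int32_t to_s32_field(int64_t tokens)
{
    return (int32_t)clamp_s64(tokens, INT_MIN, INT_MAX);
}
```

Reading the unsigned field back as a signed int recovers the negative value,
but any consumer that takes the declared type at face value would see a large
positive number instead.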

At the same time, the 'giants' structure member has been unused for years,
so update the comment to mark it unused.

Signed-off-by: Florent Fourcot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: cpsw-phy-sel: prefer phandle for phy sel

The cpsw-phy-sel device is not a child of the cpsw interconnect target
module. It lives in the system control module.

Let's fix this issue by trying to use cpsw-phy-sel phandle first if it
exists and if not fall back to current usage of trying to find the
cpsw-phy-sel child. That way the phy sel driver can be a child of the
system control module where it belongs in the device tree.

Without this fix, we cannot have a proper interconnect target module
hierarchy in device tree for things like genpd.

Note that deferred probe is mostly not supported by cpsw and this patch
does not attempt to fix that. In case deferred probe support is needed,
this could be added to cpsw_slave_open() and phy_connect() so they start
handling and returning errors.

As for documentation, cpsw-phy-sel appears to be used by all cpsw device
tree nodes, but the related binding documentation is missing, so let's also
update the binding documentation accordingly.

Cc: [email protected]
Cc: Andrew Lunn <[email protected]>
Cc: Grygorii Strashko <[email protected]>
Cc: Ivan Khoronzhuk <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Murali Karicheri <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle

The current cpsw usage for cpsw-phy-sel is undocumented but is used for
all the boards using cpsw. And cpsw-phy-sel is not really a child of
the cpsw device, it lives in the system control module instead.

Let's document the existing usage, and improve it a bit where we prefer
to use a phandle instead of a child device for it. That way we can
properly describe the hardware in dts files for things like genpd.

Cc: [email protected]
Cc: Andrew Lunn <[email protected]>
Cc: Grygorii Strashko <[email protected]>
Cc: Ivan Khoronzhuk <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Murali Karicheri <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'igmp-fix-two-incorrect-unsolicit-report-count-issues'

Hangbin Liu says:

====================
igmp: fix two incorrect unsolicit report count issues

Just like the subject, fix two minor igmp unsolicit report count issues.
====================

Signed-off-by: David S. Miller <[email protected]>

igmp: fix incorrect unsolicit report count after link down and up

After link down and up, i.e. when ip_mc_up() is called, we don't initialize
im->unsolicit_count. So after igmp_timer_expire(), we will not start the
timer again and will end up sending only one unsolicited report.

Fix it by initializing im->unsolicit_count in igmp_group_added(), so
we can respect igmp robustness value.

Fixes: 24803f38a5c0b ("igmp: do not remove igmp souce list info when set link down")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

igmp: fix incorrect unsolicit report count when join group

We should not start the timer if im->unsolicit_count equals 0 after the
decrement. Otherwise we will send one more unsolicited report message than
intended, i.e. 3 instead of 2 by default.
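
The off-by-one can be reproduced with a simplified timer model. This is a
sketch of the behaviour described above, not the kernel code: it assumes each
timer expiry sends one report and that joining the group arms the timer once.
reports_sent() is a hypothetical helper.

```c
#include <assert.h>

/* Counts the reports sent for one group join in a simplified timer model.
 * fixed != 0 re-arms the timer only if the count is still nonzero after
 * the decrement. */
static int reports_sent(int unsolicit_count, int fixed)
{
    int sent = 0;
    int timer_armed = 1;            /* joining the group arms the timer once */

    assert(unsolicit_count >= 0);
    while (timer_armed) {
        timer_armed = 0;
        sent++;                     /* each expiry sends one report */
        if (fixed) {
            if (unsolicit_count && --unsolicit_count)
                timer_armed = 1;
        } else if (unsolicit_count) {
            unsolicit_count--;      /* re-arms even when this reaches 0 */
            timer_armed = 1;
        }
    }
    return sent;
}
```

Under this model, a starting count of 2 yields three reports with the old
check and two with the new one.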

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP

Currently we check sk_user_data is non NULL to determine if the sk
exists in a map. However, this is not sufficient to ensure the psock
or the ULP ops are not in use by another user, such as kcm or TLS. To
avoid this when adding a sock to a map also verify it is of the
correct ULP type. Additionally, when releasing a psock verify that
it is the TCP_ULP_BPF type before releasing the ULP. The error case
where we abort an update due to ULP collision can cause this error
path.

For example,

  __sock_map_ctx_update_elem()
     [...]
     err = tcp_set_ulp_id(sock, TCP_ULP_BPF) <- collides with TLS
     if (err)                                <- so err out here
        goto out_free
     [...]
  out_free:
     smap_release_sock() <- calling tcp_cleanup_ulp releases the
                            TLS ULP incorrectly.

Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

tools/bpf: bpftool, add xskmap in map types

When listing all maps, bpftool currently shows (null) for xskmap.
Add the xskmap type to map_type_name[] to show the correct type.

Signed-off-by: Prashant Bhole <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: Fix bpf_msg_pull_data()

Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
linearizing multiple scatterlist elements. Variable 'offset' is used
to find the first starting scatterlist element,
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset

Use a different variable name while linearizing multiple scatterlist
elements so that the value contained in variable 'offset' won't get
overwritten.
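
A toy model of the clobbering, assuming the scatterlist is reduced to an
array of element lengths. pull_data_pos() and its reuse_offset switch are
hypothetical and exist only to contrast the two behaviours; the real
helper copies data and rewrites the scatterlist.

```c
#include <stddef.h>

/*
 * Hypothetical model: len[] holds element lengths. Returns the position
 * of byte `start` inside the linearized buffer that begins at the first
 * element containing it, i.e. "start - offset" from the message above.
 */
static size_t pull_data_pos(const size_t *len, size_t n, size_t start,
			    size_t end, int reuse_offset)
{
	size_t offset = 0, copied = 0, need, first, i;

	/* Find the element containing `start`; offset is where it begins. */
	for (i = 0; i < n && offset + len[i] <= start; i++)
		offset += len[i];
	first = i;
	need = end - offset;	/* bytes to cover from that element on */

	if (reuse_offset) {
		/* Bug: the running copy position overwrites `offset`. */
		offset = 0;
		for (i = first; i < n && offset < need; i++)
			offset += len[i];
	} else {
		/* Fix: a separate variable tracks the copy position. */
		for (i = first; i < n && copied < need; i++)
			copied += len[i];
	}
	return start - offset;
}
```

For element lengths {100, 30, 60} with start = 110 and end = 150, the
separate-variable version yields position 10, while the version that
reuses 'offset' yields a different, wrong position.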

Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Tushar Dave <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

Merge tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
"A couple of new helper functions in preparation for some tree wide
  clean-ups.

  I'm sending these new helpers now for rc2 in order to simplify the
  dependencies on subsequent cleanups across the tree in 4.20"

* tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: Add device_type access helper functions
  of: add node name compare helper functions
  of: add helper to lookup compatible child node

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"First batch of fixes post-merge window:

   - A handful of devicetree changes for i.MX2{3,8} to change over to
     new panel bindings. The platforms were moved from legacy
     framebuffers to DRM and some development board panels hadn't yet
     been converted.

   - OMAP fixes related to ti-sysc driver conversion fallout, fixing
     some register offsets, no_console_suspend fixes, etc.

   - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.

   - Fixed 0755->0644 permissions on a newly added file.

   - Defconfig changes to make ARM Versatile more useful with QEMU
     (helps testing).

   - Enable defconfig options for new TI SoC platform that was merged
     this window (AM6)"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: defconfig: Enable TI's AM6 SoC platform
  ARM: defconfig: Update the ARM Versatile defconfig
  ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
  ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
  ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
  ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
  ARM: dts: imx23-evk: Convert to the new display bindings
  ARM: dts: imx23-evk: Move regulators outside simple-bus
  ARM: dts: imx28-evk: Convert to the new display bindings
  ARM: dts: imx28-evk: Move regulators outside simple-bus
  Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
  arm: dts: am4372: setup rtc as system-power-controller
  ARM: dts: omap4-droid4: fix vibrations on Droid 4
  bus: ti-sysc: Fix no_console_suspend handling
  bus: ti-sysc: Fix module register ioremap for larger offsets
  ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
  ARM: OMAP2+: Fix null hwmod for ti-sysc debug

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"Speculation:

   - Make the microcode check more robust

   - Make the L1TF memory limit depend on the internal cache physical
     address space and not on the CPUID advertised physical address
     space, which might be significantly smaller. This avoids disabling
     L1TF on machines which utilize the full physical address space.

   - Fix the GDT mapping for EFI calls on 32bit PTI

   - Fix the MCE nospec implementation to prevent #GP

  Fixes and robustness:

   - Use the proper operand order for LSL in the VDSO

   - Prevent NMI uaccess race against CR3 switching

   - Add a lockdep check to verify that text_mutex is held in
     text_poke() functions

   - Repair the fallout of giving native_restore_fl() a prototype

   - Prevent kernel memory dumps based on usermode RIP

   - Wipe KASAN shadow stack before rewinding the stack to prevent false
     positives

   - Move the ASM GOTO enforcement to the actual build stage to allow
     user API header extraction without a compiler

   - Fix a section mismatch introduced by the on demand VDSO mapping
     change

  Miscellaneous:

   - Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pti: Fix section mismatch warning/error
  x86/vdso: Fix lsl operand order
  x86/mce: Fix set_mce_nospec() to avoid #GP fault
  x86/efi: Load fixmap GDT in efi_call_phys_epilog()
  x86/nmi: Fix NMI uaccess race against CR3 switching
  x86: Allow generating user-space headers without a compiler
  x86/dumpstack: Don't dump kernel memory based on usermode RIP
  x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
  x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
  x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
  x86/irqflags: Mark native_restore_fl extern inline
  x86/build: Remove jump label quirk for GCC older than 4.5.2
  x86/Kconfig: Fix trivial typo
  x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
  x86/spectre: Add missing family 6 check to microcode check

Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CPU hotplug fix from Thomas Gleixner:
"Remove the stale skip_onerr member from the hotplug states"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Remove skip_onerr field from cpuhp_step structure

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Thomas Gleixner:
"A small set of updates for core code:

   - Prevent tracing in functions which are called from trace patching
     via stop_machine() to prevent executing half patched function trace
     entries.

   - Remove old GCC workarounds

   - Remove pointless includes of notifier.h"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Remove workaround for unreachable warnings from old GCC
  notifier: Remove notifier header file wherever not used
  watchdog: Mark watchdog touch functions as notrace

x86/pti: Fix section mismatch warning/error

Fix the section mismatch warning in arch/x86/mm/pti.c:

WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.

Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

of/platform: initialise AMBA default DMA masks

This addresses a v4.19-rc1 regression in the PL111 DRM driver in
drivers/gpu/pl111/*

The driver uses the CMA KMS helpers and will thus at some point call
down to dma_alloc_attrs() to allocate a chunk of contiguous DMA memory
for the framebuffer.

It appears that in v4.18, it was OK that this (and other DMA mastering
AMBA devices) left dev->coherent_dma_mask blank (zero).

In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in
dma_alloc_attrs() in include/linux/dma-mapping.h is triggered.  The
allocation later fails when get_coherent_dma_mask() is called from
__dma_alloc() and __dma_alloc() returns NULL:

drm-clcd-pl111 dev:20: coherent DMA mask is unset
drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
                Failed to set fbdev configuration

It turns out that in commit 4d8bde883bfb ("OF: Don't set default
coherent DMA mask") the OF core stops setting the default DMA mask on
new devices, specifically with these lines of the patch:

- if (!dev->coherent_dma_mask)
-               dev->coherent_dma_mask = DMA_BIT_MASK(32);

Robin Murphy solved a similar problem in a5516219b102 ("of/platform:
Initialise default DMA masks") by simply assigning dev.coherent_dma_mask
and making dev.dma_mask point to the same mask when creating devices
from the device tree. Introducing the same code into the code path
creating AMBA/PrimeCell devices solved my problem: graphics now come up.

The code simply assumes that the device can access all of the system
memory by setting the coherent DMA mask to 0xffffffff when creating a
device from the device tree, which is crude, but seems to be what kernel
v4.18 assumed.

The AMBA PrimeCells do not distinguish between coherent and streaming DMA,
so we can just assign the same value to both DMA masks.
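
A minimal sketch of that default, assuming a trimmed-down stand-in for
struct device. DMA_BIT_MASK() is written the way the kernel defines it;
struct dev_model and init_default_dma_masks() are hypothetical names used
only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical, trimmed-down stand-in for struct device. */
struct dev_model {
	uint64_t coherent_dma_mask;
	uint64_t *dma_mask;	/* streaming mask, may alias the coherent one */
};

/* Give a freshly created device the historic 32-bit defaults. */
static void init_default_dma_masks(struct dev_model *dev)
{
	dev->coherent_dma_mask = DMA_BIT_MASK(32);
	if (!dev->dma_mask)
		dev->dma_mask = &dev->coherent_dma_mask;
}
```

After the call both masks describe the same 0xffffffff range, which is
the crude "device can reach all of 32-bit memory" assumption described
above.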

Possibly drivers should augment their coherent DMA mask in accordance
with "dma-ranges" from the device tree if more fine-grained masking is
needed.

Reported-by: Russell King <[email protected]>
Fixes: 4d8bde883bfb ("OF: Don't set default coherent DMA mask")
Cc: Russell King <[email protected]>
Cc: Robin Murphy <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

sparc: set a default 32-bit dma mask for OF devices

This keeps the historic default behavior for devices without a DMA mask,
but removes the warning that is triggered when such a device does DMA
without a mask.

Reported-by: Meelis Roos <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Guenter Roeck <[email protected]>

net: bgmac: remove set but not used variable 'err'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/broadcom/bgmac.c: In function 'bgmac_dma_alloc':
drivers/net/ethernet/broadcom/bgmac.c:619:6: warning:
variable 'err' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: b53: Provide sensible defaults

The SRAB driver is the default way to communicate with the integrated
switch on iProc platforms and the MMAP driver is the way to communicate
with the integrated switch on DSL BCM63xx and CM BCM33xx.

Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

liquidio: remove set but not used variable 'irh'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/cavium/liquidio/request_manager.c: In function 'lio_process_iq_request_list':
drivers/net/ethernet/cavium/liquidio/request_manager.c:383:27: warning:
variable 'irh' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <[email protected]>
Acked-by: Felix Manlunas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed: Lower the severity of a dcbx log message.

The driver displays an error message for each unrecognized dcbx TLV that's
received from the peer or configured on the device. It is observed that
syslog will be flooded with such messages in certain scenarios, e.g.
frequent link-flaps/lldp-transactions. Change the severity of this
message to verbose level, as it is not an error scenario.

Signed-off-by: Sudarsana Reddy Kalluru <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rds: store socket timestamps as ktime_t

rds is the last in-kernel user of the old do_gettimeofday()
function. Convert it over to ktime_get_real() to make it
work more like the generic socket timestamps, and to let
us kill off do_gettimeofday().

A follow-up patch will have to change the user space interface
to deal better with 32-bit tasks, which may use an incompatible
layout for 'struct timespec'.

Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests/tls: Add test for recv(PEEK) spanning across multiple records

Add a test case that receives multiple records with a single recvmsg()
call with MSG_PEEK set.

Signed-off-by: David S. Miller <[email protected]>

net/tls: Add support for async decryption of tls records

When tls records are decrypted using asynchronous accelerators such as
the NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on
getting -EINPROGRESS, the tls record processing stops until the
crypto accelerator finishes and returns the result. This incurs a
context switch and is not an efficient way of accessing the crypto
accelerators. Crypto accelerators work efficiently when they are queued
with multiple crypto jobs without having to wait for the previous ones
to complete.

The patch submits multiple crypto requests without having to wait for
previous ones to complete. This has been implemented for records
which are decrypted in zero-copy mode. At the end of recvmsg(), we wait
for all the asynchronous decryption requests to complete.

The references to records which have been sent for async decryption are
dropped. For cases where record decryption is not possible in zero-copy
mode, asynchronous decryption is not used and we wait for decryption
crypto api to complete.

For crypto requests executing in async fashion, the memory for
aead_request, sglists and skb etc is freed from the decryption
completion handler. The decryption completion handler wakes up the
sleeping user context when recvmsg() flags that it has finished sending
all the decryption requests and there are no more decryption requests
pending completion.

Signed-off-by: Vakul Garg <[email protected]>
Reviewed-by: Dave Watson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omap variants against v4.19-rc1

These are mostly fixes related to using ti-sysc interconnect target module
driver for accessing right register offsets for sgx and cpsw and for
no_console_suspend regression.

There is also a droid4 emmc fix where emmc may not get detected for some
models, and vibrator dts mismerge fix.

And we have a file permission fix for am335x-osd3358-sm-red.dts that
just got added. And we must tag RTC as system-power-controller for
am437x for PMIC to shut down during poweroff.

* tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
  ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
  arm: dts: am4372: setup rtc as system-power-controller
  ARM: dts: omap4-droid4: fix vibrations on Droid 4
  bus: ti-sysc: Fix no_console_suspend handling
  bus: ti-sysc: Fix module register ioremap for larger offsets
  ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
  ARM: OMAP2+: Fix null hwmod for ti-sysc debug

Signed-off-by: Olof Johansson <[email protected]>

bnxt_en: remove set but not used variable 'rx_stats'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c: In function 'bnxt_vf_rep_rx':
drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:212:28: warning:
variable 'rx_stats' set but not used [-Wunused-but-set-variable]
struct bnxt_vf_rep_stats *rx_stats;

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: remove duplicated include from net_failover.c

Remove duplicated include.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: don't get lwtstate twice in ip6_rt_copy_init()

Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info")
partially fixed the kmemleak [1]. lwtstate can be copied from fib6_info
with ip6_rt_copy_init(), and this should be done only once there.

rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
ip6_rt_copy_init(), so there is no need to get it again at the end.

With this patch, lwtstate also isn't copied from RTF_REJECT routes.

[1]:
unreferenced object 0xffff880b6aaa14e0 (size 64):
  comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s)
  hex dump (first 32 bytes):
    01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000018664623>] lwtunnel_build_state+0x1bc/0x420
    [<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0
    [<00000000ee2c5d1f>] ip6_route_add+0x14/0x70
    [<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0
    [<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0
    [<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0
    [<000000004c893c76>] netlink_unicast+0x417/0x5a0
    [<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30
    [<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0
    [<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950
    [<000000001e7426c8>] __sys_sendmsg+0xde/0x170
    [<00000000fe411443>] do_syscall_64+0x9f/0x4a0
    [<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<000000006d21f353>] 0xffffffffffffffff

Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path")
Signed-off-by: Alexey Kodanev <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Add CBS support in XGMAC2

XGMAC2 uses the same CBS mechanism as GMAC5; only the register offsets
change. Let's use the same TC callbacks and implement the .config_cbs
callback in the XGMAC2 core.

Signed-off-by: Jose Abreu <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Joao Pinto <[email protected]>
Cc: Giuseppe Cavallaro <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'dpaa2-eth-Move-DPAA2-Ethernet-driver'

Ioana Radulescu says:

====================
dpaa2-eth: Move DPAA2 Ethernet driver

The Freescale/NXP DPAA2 Ethernet driver was first included in
drivers/staging, due to its dependencies on two components located
there at the time of its initial submission:
* the fsl-mc bus driver, which was moved to drivers/bus in kernel 4.17
* the dpio driver, which was moved to drivers/soc/fsl in kernel 4.18

More information on the DPAA2 architecture and the interactions
between the fsl-mc bus and the objects present on it can be found in:
Documentation/networking/dpaa2/overview.rst

For easier review, the patch is generated without the -M option,
although the driver files are moved without any code changes.

changes since v1[1]:
* remove RFC label, since dependencies have been merged on net-next
* add patch fixing a possible race at probe (reported by Andrew Lunn)

[1] https://lore.kernel.org/patchwork/patch/971333/
====================

Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Move DPAA2 Ethernet driver from staging to drivers/net

The DPAA2 Ethernet driver supports Freescale/NXP SoCs with DPAA2
(DataPath Acceleration Architecture v2). The driver manages
network objects discovered on the fsl-mc bus.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

staging: fsl-dpaa2/eth: Delay netdev_register() call

Only call netdev_register() at the end of the probe function,
once all other necessary bits and pieces are properly initialized.

We keep the rest of the netdevice initialization code in place,
at the earlier point of the probing sequence, including the
settings previously done in ndo_init.

Signed-off-by: Ioana Radulescu <[email protected]>
Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

x86/vdso: Fix lsl operand order

In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

Merge tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixlet from Wim Van Sebroeck:
"Document support for r8a774a1"

* tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"Two small fixes, one for the x86 Stoney SoC to get a more accurate clk
  frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX
  driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: x86: Set default parent to 48Mhz
  clk: npcm7xx: fix memory allocation

kernel/dma/direct: take DMA offset into account in dma_direct_supported

When a device has a DMA offset the dma capable result will change due
to the difference between the physical and DMA address. Take that into
account.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>

x86/mce: Fix set_mce_nospec() to avoid #GP fault

The trick with flipping bit 63 to avoid loading the address of the 1:1
mapping of the poisoned page while the 1:1 map is updated used to work when
unmapping the page. But it falls down horribly when attempting to directly
set the page as uncacheable.

The problem is that when the cache mode is changed to uncacheable, the
pages need to be flushed from the cache first. But the decoy address is
non-canonical due to bit 63 being flipped, and the CLFLUSH instruction
throws a #GP fault.

Add code to change_page_attr_set_clr() to fix the address before calling
flush.
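
The address arithmetic can be illustrated in user space. This sketch
assumes 48-bit virtual addresses, and all helper names are hypothetical;
it shows one portable way to undo the bit flip and is not the code added
to change_page_attr_set_clr().

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT63 (1ULL << 63)
#define BIT62 (1ULL << 62)

/* Canonical for 48-bit virtual addresses: bits 63..47 are all equal. */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}

/* Decoy used while the 1:1 map is updated: flip bit 63. */
static uint64_t make_decoy(uint64_t addr)
{
	return addr ^ BIT63;
}

/* Undo the flip by copying bit 62 into bit 63 before flushing. */
static uint64_t fix_addr_model(uint64_t addr)
{
	return (addr & ~BIT63) | ((addr & BIT62) << 1);
}
```

A canonical kernel address turns non-canonical once bit 63 is flipped,
and fix_addr_model() restores the original value, which is the form a
cache flush instruction can accept.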

Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Peter Anvin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: linux-edac <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Jiang <[email protected]>
Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk

ibmvnic: Include missing return code checks in reset function

Check the return codes of these functions and halt reset
in case of failure. The driver will remain in a dormant state
until the next reset event, when device initialization will be
re-attempted.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: pmtu: detect correct binary to ping ipv6 addresses

Some systems don't have the ping6 binary anymore, and use ping for
everything. Detect the absence of ping6 and try to use ping instead.

Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test")
Signed-off-by: Sabrina Dubroca <[email protected]>
Acked-by: Stefano Brivio <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: pmtu: maximum MTU for vti4 is 2^16-1-20

Since commit 82612de1c98e ("ip_tunnel: restore binding to ifaces with a
large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of
the mysterious constant 0xFFF8. This makes this selftest fail.
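
The new limit from the subject line works out as follows. The macro
IPV4_HDR_LEN and the function vti4_max_mtu() are hypothetical names for
this sketch; IP_MAX_MTU uses the kernel's value of 0xFFFF.

```c
/* IP_MAX_MTU as defined by the kernel: the largest IPv4 datagram size. */
#define IP_MAX_MTU	0xFFFFU
/* Size of an IPv4 header without options. */
#define IPV4_HDR_LEN	20U

/* Largest MTU the selftest can expect on a vti4 interface. */
static unsigned int vti4_max_mtu(void)
{
	return IP_MAX_MTU - IPV4_HDR_LEN;	/* 2^16 - 1 - 20 = 65515 */
}
```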

Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu")
Signed-off-by: Sabrina Dubroca <[email protected]>
Acked-by: Stefano Brivio <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: Switch to bitmap_zalloc()

Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that, it returns a pointer of bitmap type instead of an opaque
void *.
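
A user-space analogue of the idea, assuming calloc() in place of the
kernel allocator. bitmap_zalloc_model() is a hypothetical name and takes
no GFP flags, unlike the real helper.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG	 (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* User-space analogue of bitmap_zalloc(): zeroed storage for nbits bits. */
static unsigned long *bitmap_zalloc_model(unsigned int nbits)
{
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}
```

The caller states the number of bits and gets back an unsigned long
pointer, so neither the size computation nor a cast from void * appears
at the call site.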

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: do not restart timewait timer on rst reception

RFC 1337 says:
''Ignore RST segments in TIME-WAIT state.
   If the 2 minute MSL is enforced, this fix avoids all three hazards.''

So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
expire rather than removing it instantly when a reset is received.

However, Linux will also re-start the TIME-WAIT timer.

This causes connect to fail when trying to re-use ports, or causes very
long delays (until the SYN retry interval exceeds MSL).

packetdrill test case:
// Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
`sysctl net.ipv4.tcp_rfc1337=1`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

// Receive first segment
0.310 < P. 1:1001(1000) ack 1 win 46

// Send one ACK
0.310 > . 1:1(0) ack 1001

// read 1000 byte
0.310 read(4, ..., 1000) = 1000

// Application writes 100 bytes
0.350 write(4, ..., 100) = 100
0.350 > P. 1:101(100) ack 1001

// ACK
0.500 < . 1001:1001(0) ack 101 win 257

// close the connection
0.600 close(4) = 0
0.600 > F. 101:101(0) ack 1001 win 244

// Our side is in FIN_WAIT_1 & waits for ack to fin
0.7 < . 1001:1001(0) ack 102 win 244

// Our side is in FIN_WAIT_2 with no outstanding data.
0.8 < F. 1001:1001(0) ack 102 win 244
0.8 > . 102:102(0) ack 1002 win 244

// Our side is now in TIME_WAIT state, send ack for fin.
0.9 < F. 1002:1002(0) ack 102 win 244
0.9 > . 102:102(0) ack 1002 win 244

// Peer reopens with in-window SYN:
1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// Therefore, reply with ACK.
1.000 > . 102:102(0) ack 1002 win 244

// Peer sends RST for this ACK.  Normally this RST results
// in tw socket removal, but rfc1337=1 setting prevents this.
1.100 < R 1002:1002(0) win 244

// second syn. Due to rfc1337=1 expect another pure ACK.
31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
31.0 > . 102:102(0) ack 1002 win 244

// .. and another RST from peer.
31.1 < R 1002:1002(0) win 244
31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`

// third syn after one minute.  Time-Wait socket should have expired by now.
63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// so we expect a syn-ack & 3whs to proceed from here on.
63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>

Without this patch, 'ss' shows restarts of tw timer and last packet is
thus just another pure ack, more than one minute later.

This restores the original code from commit 283fd6cf0be690a83
("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .

For some reason the else branch was removed/lost in 1f28b683339f7
("Merge in TCP/UDP optimizations and [..]") and timer restart became
unconditional.
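
The restored decision can be modelled roughly as below. This is a
simplified sketch: the enum and function are hypothetical, and it ignores
the sequence and window checks the real TIME-WAIT processing performs
before reaching this point.

```c
#include <stdbool.h>

/* Hypothetical outcome of handling one segment in TIME-WAIT. */
enum tw_action { TW_KILL, TW_KEEP, TW_REARM };

/*
 * rst:      the segment carries the RST flag
 * rfc1337:  value of net.ipv4.tcp_rfc1337
 */
static enum tw_action tw_handle_segment(bool rst, bool rfc1337)
{
	if (rst) {
		if (!rfc1337)
			return TW_KILL;	/* default: RST removes the tw socket */
		return TW_KEEP;		/* rfc1337=1: ignore RST, timer untouched */
	}
	return TW_REARM;		/* the restored else branch */
}
```

With the timer restart unconditional, the RST case under rfc1337=1 would
behave like TW_REARM, which is the behaviour the packetdrill test above
exposes.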

Reported-by: Michal Tesar <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: RDS is not Radio Data System

Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is
easily confused with Radio Data System (which we may want to support
in kernel, too).

Signed-off-by: Pavel Machek <[email protected]>
Acked-by: Sowmini Varadhan <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Acked-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()

This patch fixes the race between netvsc_probe() and
rndis_set_subchannel(), which can cause a deadlock.

These are the related 3 paths which show the deadlock:

path #1:
    Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
    Call Trace:
     schedule
     schedule_preempt_disabled
     __mutex_lock
     __device_attach
     bus_probe_device
     device_add
     vmbus_device_register
     vmbus_onoffer
     vmbus_onmessage_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #2:
    schedule
     schedule_preempt_disabled
     __mutex_lock
     netvsc_probe
     vmbus_probe
     really_probe
     __driver_attach
     bus_for_each_dev
     driver_attach_async
     async_run_entry_fn
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #3:
    Workqueue: events netvsc_subchan_work [hv_netvsc]
    Call Trace:
     schedule
     rndis_set_subchannel
     netvsc_subchan_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

Before path #1 finishes, path #2 can start to run, because just before
the "bus_probe_device(dev);" in device_add() in path #1, there is a line
"kobject_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
immediately try to load hv_netvsc and hence path #2 can start to run.

Next, path #2 offloads the subchannel's initialization to a workqueue,
i.e. path #3, so we can end up in a deadlock situation like this:

Path #2 gets the device lock, and is trying to get the rtnl lock;
Path #3 gets the rtnl lock and is waiting for all the subchannel messages
to be processed;
Path #1 is trying to get the device lock, but since #2 is not releasing
the device lock, path #1 has to sleep; since the VMBus messages are
processed one by one, this means the sub-channel messages can't be
processed, so #3 has to sleep with the rtnl lock held, and finally #2
has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.

With the patch, we can make sure #2 gets both the device lock and the
rtnl lock together, gets its job done, and releases the locks, so #1
and #3 will not be blocked forever.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Dexuan Cui <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Share main switch IRQ

On some boards the interrupt can be shared between multiple devices.
For example on Turris Mox the interrupt is shared between all switches.

Signed-off-by: Marek Behun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv6: Do not reset nl_net in ip6_route_info_create

nl_net is set on entry to ip6_route_info_create. Only devices
within that namespace are considered so no need to reset it
before returning.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv4: Add extack message that dev is required for ONLINK

Make IPv4 consistent with IPv6 and return an extack message that the
ONLINK flag requires a nexthop device.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: change IPv6 flow-label upon receiving spurious retransmission

Currently a Linux IPv6 TCP sender will change the flow label upon
timeouts to potentially steer away from a data path that has gone
bad. However this does not help if the problem is on the ACK path
and the data path is healthy. In this case the receiver is likely
to receive repeated spurious retransmission because the sender
couldn't get the ACKs in time and has recurring timeouts.

This patch adds another feature to mitigate this problem. It
leverages the DSACK states in the receiver to change the flow
label of the ACKs to speculatively re-route the ACK packets.
In order to allow triggering on the second consecutive spurious
RTO, the receiver changes the flow label upon sending a second
consecutive DSACK for a sequence number below RCV.NXT.
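
A rough model of the receiver-side trigger, assuming that "consecutive"
means two DSACKs in a row with the same starting sequence number. The
struct, its fields and both functions are hypothetical, and the counter
stands in for actually picking a new flow label.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state; field names are illustrative only. */
struct rcv_model {
	uint32_t last_dsack_seq;	/* start seq of the previous DSACK */
	bool have_dsack;		/* previous event was a DSACK */
	unsigned int rehash_count;	/* times the flow label was changed */
};

/* Called when a segment below RCV.NXT triggers a DSACK. */
static void on_spurious_retrans(struct rcv_model *r, uint32_t seq)
{
	/* Second consecutive DSACK for the same data: try another path. */
	if (r->have_dsack && r->last_dsack_seq == seq)
		r->rehash_count++;
	r->last_dsack_seq = seq;
	r->have_dsack = true;
}

/* New in-order data ends the run of consecutive DSACKs. */
static void on_in_order_data(struct rcv_model *r)
{
	r->have_dsack = false;
}
```

The first spurious retransmission only records state; a repeat of the
same retransmission changes the label, so a single stray duplicate does
not move the ACKs to a different path.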

Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfp: wait for posted reconfigs when disabling the device

To avoid leaking a running timer we need to wait for the
posted reconfigs after the netdev is unregistered. In the common
case the process of deinitializing the device will perform
synchronous reconfigs which wait for posted requests, but
especially with VXLAN ports being actively added and removed
there can be a race condition leaving a timer running after
the adapter structure is freed, leading to a crash.

Add an explicit flush after deregistering and, for good
measure, a warning to check if the timer is running just before
the structures are freed.

Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Dirk van der Merwe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Revert "packet: switch kvzalloc to allocate memory"

This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0.

mmap()/munmap() cannot be backed by kmalloced pages:

We fault in:

    VM_BUG_ON_PAGE(PageSlab(page), page);

    unmap_single_vma+0x8a/0x110
    unmap_vmas+0x4b/0x90
    unmap_region+0xc9/0x140
    do_munmap+0x274/0x360
    vm_munmap+0x81/0xc0
    SyS_munmap+0x2b/0x40
    do_syscall_64+0x13e/0x1c0
    entry_SYSCALL_64_after_hwframe+0x42/0xb7

Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: John Sperbeck <[email protected]>
Bisected-by: John Sperbeck <[email protected]>
Cc: Zhang Yu <[email protected]>
Cc: Li RongQing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched: add missing tcf_lock for act_connmark

According to the new locking rule, we have to take tcf_lock
for both ->init() and ->dump(), as RTNL will be removed.
However, it is missing for act_connmark.

Cc: Vlad Buslov <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

veth: add software timestamping

Provide a software TX timestamp as well as the ethtool query interface
and report the software timestamp capabilities.

Tested with "ethtool -T" and two linuxptp instances each bound to a
tunnel endpoint.

Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Revert "net: sched: act: add extack for lookup callback"

This reverts commit 331a9295de23 ("net: sched: act: add extack for lookup callback").

This extack is never used after 6 months... In fact, it can be just
set in the caller, right after ->lookup().

Cc: Alexander Aring <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-09-01

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus.

2) BPF verifier improvements by giving each register its own liveness
   chain, which allows simplifying the code and getting rid of the
   skip_callee() logic, from Edward.

3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap
   and percpu lru hashmap. Also add generic percpu formatted print on
   bpftool so the same can be dumped there, from Yonghong.

4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and
   TCP_SAVED_SYN options to allow reflection of tos/tclass from received
   SYN packet, from Nikita.

5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2
   interaction and removal of incorrect shutdown() calls, from John.

6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant.
====================

Signed-off-by: David S. Miller <[email protected]>

xsk: i40e: get rid of useless struct xdp_umem_props

This commit gets rid of the structure xdp_umem_props. It was there to
be able to break a dependency at one point, but this is no longer
needed. The values in the struct are instead stored directly in the
xdp_umem structure. This simplifies the xsk code as well as af_xdp
zero-copy drivers and as a bonus gets rid of one internal header file.

The i40e driver is also adapted to the new interface in this commit.

Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

i40e: fix possible compiler warning in xsk TX path

With certain gcc versions, it was possible to get the warning
"'tx_desc' may be used uninitialized in this function" for
i40e_xmit_zc. The uninitialized use could not actually happen; however,
this commit simplifies the code path so that the warning is no longer
emitted.

Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: add selftest for bpf's (set|get)_sockopt for SAVE_SYN

Add a selftest for the feature introduced in commit 9452048c79404 ("bpf:
add TCP_SAVE_SYN/TCP_SAVED_SYN options for bpf_(set|get)sockopt").

Signed-off-by: Nikita V. Shirokov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

samples/bpf: xdpsock, minor fixes

- xsks_map size was fixed to 4; change it to MAX_SOCKS
- Remove redundant definition of MAX_SOCKS in xdpsock_user.c
- In dump_stats(), add NULL check for xsks[i]

Signed-off-by: Prashant Bhole <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

xsk: remove unnecessary assignment

When xdp_umem_query() was added, one assignment of bpf.command was
missed in the cleanup. Remove the assignment statement.

Fixes: 84c6b86875e01a0 ("xsk: don't allow umem replace at stack level")
Signed-off-by: Prashant Bhole <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample program

Sample program which shows a TCP_SAVE_SYN/TCP_SAVED_SYN usage example:
a bpf program which does TOS/TCLASS reflection (the server replies
with the same TOS/TCLASS as the client).

Signed-off-by: Nikita V. Shirokov <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN options for bpf_(set|get)sockopt

Add support for two new bpf get/set sockopts: TCP_SAVE_SYN (set)
and TCP_SAVED_SYN (get). This allows a bpf program to build
logic based on data from the ingress SYN packet (e.g. doing TCP's
tos/tclass reflection (see sample prog)) and to do it transparently
from the userspace program's point of view.
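As a rough illustration of the tos/tclass reflection idea, the
following plain C sketch extracts the TOS (IPv4) or traffic class
(IPv6) byte from a saved SYN header buffer. It is not taken from the
sample program: syn_tos_or_tclass() is a hypothetical helper, and it
assumes the buffer returned for TCP_SAVED_SYN starts at the IP header.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the IPv4 TOS or IPv6 traffic class
 * (0..255) found in a saved SYN header, or -1 if the buffer is too
 * short or the IP version is not recognised.
 * Assumption: hdr points at the first byte of the IP header.
 */
static int syn_tos_or_tclass(const uint8_t *hdr, size_t len)
{
	if (hdr == NULL || len < 2)
		return -1;

	switch (hdr[0] >> 4) {
	case 4:
		/* IPv4: the TOS field is the second byte of the header. */
		return hdr[1];
	case 6:
		/*
		 * IPv6: the traffic class is the 8 bits that follow the
		 * 4-bit version, so it straddles the first two bytes.
		 */
		return ((hdr[0] & 0x0F) << 4) | (hdr[1] >> 4);
	default:
		return -1;
	}
}
```

The value obtained this way could then be applied to the replying
socket (IP_TOS or IPV6_TCLASS), which is the reflection step the text
above refers to.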

Signed-off-by: Nikita V. Shirokov <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

xdp: remove redundant variable 'headroom'

Variable 'headroom' is being assigned but is never used hence it is
redundant and can be removed.

Cleans up clang warning:
variable ‘headroom’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-08-30

This series contains updates to i40e, i40evf and virtchnl.

Jake implements helper functions that use an array to handle the queue
stats, which reduces the boilerplate code and keeps the complexity
localized to a few functions.

Paweł adds the ability to change a VF's MAC address from the host side
without having to reload the VF driver on the guest side.

Paul adds a check to ensure that the number of queues that the PF sends
to the VF is equal to or less than the maximum number of queues the VF
can support.

Mitch fixes an issue caught by GCC 8, where we need to not include the
terminating null in the length of the string for strncpy().

Lihong fixes a VF issue to ensure that it does not enter into
promiscuous mode when macvlan is added to the VF.  Fixed a potential
crash after a VF is removed, since the workqueue sync for the adminq
task was not being cancelled.

Harshitha fixes the type for field_flags in the virtchnl_filter struct.

Martyna removes an unnecessary check in a conditional if statement.

Björn fixes an issue reported by Jesper Dangaard Brouer, where the
driver was reporting incorrect statistics when XDP was enabled.

Jan fixes the potential reporting of incorrect speed settings.

Patryk fixed an issue where the flag
I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING was getting set when any offload is
set via ethtool.  Resolved by only setting this flag when VLAN offload
is enabled.  Also ensure we hold the rtnl lock when we are clearing the
interrupt scheme.  Added a check when deleting the MAC address from the
VF to ensure that the MAC address was not set by the PF and if it was,
do not delete it.

v2: updated patch 2 in the series based on community feedback from David
    Miller to inline a function
====================

Signed-off-by: David S. Miller <[email protected]>
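
The strncpy() length issue mentioned in the series above can be
illustrated with a small userspace C sketch. This is not the driver
change itself: copy_name() is a hypothetical helper, shown only to
make the pattern concrete (pass one less than the destination size and
terminate explicitly).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, which has room for dst_size
 * bytes, and always leave dst NUL-terminated. The length passed to
 * strncpy() excludes the byte reserved for the terminating null.
 */
static void copy_name(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return;

	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';
}
```

Passing the full destination size instead would let strncpy() fill the
whole buffer without a terminator when src is too long, which is the
kind of truncation newer GCC versions warn about.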

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"A few arm64 fixes came in this week, specifically fixing some nasty
  truncation of return values from firmware calls and resolving a
  VM_BUG_ON due to accessing uninitialised struct pages corresponding to
  NOMAP pages.

  Summary:

   - Fix typos in SVE documentation

   - Fix type-checking and implicit truncation for SMCCC calls

   - Force CONFIG_HOLES_IN_ZONE=y so that SLAB doesn't fall over NOMAP
     regions"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: always enable CONFIG_HOLES_IN_ZONE
  arm/arm64: smccc-1.1: Handle function result as parameters
  arm/arm64: smccc-1.1: Make return values unsigned long
  Documentation/arm64/sve: Couple of improvements and typos

x86/efi: Load fixmap GDT in efi_call_phys_epilog()

When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap
for the simple reason that this address is also mapped for user-space.

The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT
to call EFI runtime services and switch back to the kernel GDT when they
return. But the switch-back uses the writable GDT, not the fixmap GDT.

When that happens and the CPU returns to user-space it switches to the
user %cr3 and tries to restore user segment registers. This fails because
the writable GDT is not mapped in the user page-table, and without a GDT
the fault handlers also can't be launched. The result is a triple fault and
reboot of the machine.

Fix that by restoring the GDT back to the fixmap GDT which is also mapped
in the user page-table.

Fixes: 7757d607c6b3 ("x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32")
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

Merge tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- minor cleanup avoiding a warning when building with new gcc

- a patch to add a new sysfs node for Xen frontend/backend drivers to
   make it easier to obtain the state of a pv device

- two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
   PTEs

* tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: remove redundant variable save_pud
  xen: export device state to sysfs
  x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  x86/xen: don't write ptes directly in 32-bit PV guests

Merge tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:
"Just a single fix for a bug introduced during the merge window: fix
wrong date and time on PMU-based Macs"

* tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/mac: Use correct PMU response format