Git Repo - linux.git/log

devlink: Avoid overwriting port attributes of registered port

Cited commit in fixes tag overwrites the port attributes for the
registered port.

Avoid such error by checking registered flag before setting attributes.

Fixes: 71ad8d55f8e5 ("devlink: Replace devlink_port_attrs_set parameters with a struct")
Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

vrf: Fix fast path output packet handling with async Netfilter rules

VRF devices use an optimized direct path on output if a default qdisc
is involved, calling Netfilter hooks directly. This path, however, does
not consider Netfilter rules completing asynchronously, such as with
NFQUEUE. The Netfilter okfn() is called for asynchronously accepted
packets, but the VRF never passes that packet down the stack to send
it out over the slave device. Using the slower redirect path for this
seems not feasible, as we do not know beforehand if a Netfilter hook
has asynchronously completing rules.

Fix the use of asynchronously completing Netfilter rules in OUTPUT and
POSTROUTING by using a special completion function that additionally
calls dst_output() to pass the packet down the stack. Also, slightly
adjust the use of nf_reset_ct() so that is called in the asynchronous
case, too.

Fixes: dcdd43c41e60 ("net: vrf: performance improvements for IPv4")
Fixes: a9ec54d1b0cd ("net: vrf: performance improvements for IPv6")
Signed-off-by: Martin Willi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

NFS: Remove unnecessary inode lock in nfs_fsync_dir()

nfs_inc_stats() is already thread-safe, and there are no other reasons
to hold the inode lock here.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

NFS: Remove unnecessary inode locking in nfs_llseek_dir()

Remove the contentious inode lock, and instead provide thread safety
using the file->f_lock spinlock.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

NFS: Fix listxattr receive buffer size

Certain NFSv4.2/RDMA tests fail with v5.9-rc1.

rpcrdma_convert_kvec() runs off the end of the rl_segments array
because rq_rcv_buf.tail[0].iov_len holds a very large positive
value. The resultant kernel memory corruption is enough to crash
the client system.

Callers of rpc_prepare_reply_pages() must reserve an extra XDR_UNIT
in the maximum decode size for a possible XDR pad of the contents
of the xdr_buf's pages. That guarantees the allocated receive buffer
will be large enough to accommodate the usual contents plus that XDR
pad word.

encode_op_hdr() cannot add that extra word. If it does,
xdr_inline_pages() underruns the length of the tail iovec.

Fixes: 3e1f02123fba ("NFSv4.2: add client side XDR handling for extended attributes")
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

NFSv4.2: fix failure to unregister shrinker

We forgot to unregister the nfs4_xattr_large_entry_shrinker.

That leaves the global list of shrinkers corrupted after unload of the
nfs module, after which possibly unrelated code that calls
register_shrinker() or unregister_shrinker() gets a BUG() with
"supervisor write access in kernel mode".

And similarly for the nfs4_xattr_large_entry_lru.

Reported-by: Kris Karas <[email protected]>
Tested-By: Kris Karas <[email protected]>
Fixes: 95ad37f90c33 "NFSv4.2: add client side xattr caching."
Signed-off-by: J. Bruce Fields <[email protected]>
CC: [email protected]
Signed-off-by: Anna Schumaker <[email protected]>

Merge branches 'acpi-scan', 'acpi-misc', 'acpi-button' and 'acpi-dptf'

* acpi-scan:
  ACPI: scan: Fix acpi_dma_configure_id() kerneldoc name

* acpi-misc:
  ACPI: GED: fix -Wformat
  ACPI: Fix whitespace inconsistencies

* acpi-button:
  ACPI: button: Add DMI quirk for Medion Akoya E2228T

* acpi-dptf:
  ACPI: DPTF: Support Alder Lake

gfs2: fix possible reference leak in gfs2_check_blk_type

In the fail path of gfs2_check_blk_type, forgetting to call
gfs2_glock_dq_uninit will result in rgd_gh reference leak.

Signed-off-by: Zhang Qilong <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>

fscrypt: fix inline encryption not used on new files

The new helper function fscrypt_prepare_new_inode() runs before
S_ENCRYPTED has been set on the new inode. This accidentally made
fscrypt_select_encryption_impl() never enable inline encryption on newly
created files, due to its use of fscrypt_needs_contents_encryption()
which only returns true when S_ENCRYPTED is set.

Fix this by using S_ISREG() directly instead of
fscrypt_needs_contents_encryption(), analogous to what
select_encryption_mode() does.

I didn't notice this earlier because by design, the user-visible
behavior is the same (other than performance, potentially) regardless of
whether inline encryption is used or not.

Fixes: a992b20cd4ee ("fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()")
Reviewed-by: Satya Tangirala <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>

cosa: Add missing kfree in error path of cosa_write

If memory allocation for 'kbuf' succeed, cosa_write() doesn't have a
corresponding kfree() in exception handling. Thus add kfree() for this
function implementation.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Acked-by: Jan "Yenya" Kasprzak <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: switch to the kernel.org patchwork instance

Move to the kernel.org patchwork instance, it has significantly
lower latency for accessing from Europe and the US. Other quirks
include the reply bot.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'cxgb4-ch_ktls-fixes-in-nic-tls-code'

Rohit Maheshwari says:

====================
cxgb4/ch_ktls: Fixes in nic tls code

This series helps in fixing multiple nic ktls issues. Series is broken
into 12 patches.

Patch 1 avoids deciding tls packet based on decrypted bit. If its a
retransmit packet which has tls handshake and finish (for encryption),
decrypted bit won't be set there, and so we can't rely on decrypted
bit.

Patch 2 helps supporting linear skb. SKBs were assumed non-linear.
Corrected the length extraction.

Patch 3 fixes the checksum offload update in WR.

Patch 4 fixes kernel panic happening due to creating new skb for each
record. As part of fix driver will use same skb to send out one tls
record (partial data) of the same SKB.

Patch 5 fixes the problem of skb data length smaller than remaining data
of the record.

Patch 6 fixes the handling of SKBs which has tls header alone pkt, but
not starting from beginning.

Patch 7 avoids sending extra data which is used to make a record 16 byte
aligned. We don't need to retransmit those extra few bytes.

Patch 8 handles the cases where retransmit packet has tls starting
exchanges which are prior to tls start marker.

Patch 9 fixes the problem os skb free before HW knows about tcp FIN.

Patch 10 handles the small packet case which has partial TAG bytes only.
HW can't handle those, hence using sw crypto for such pkts.

Patch 11 corrects the potential tcb update problem.

Patch 12 stops the queue if queue reaches threshold value.

v1->v2:
- Corrected fixes tag issue.
- Marked chcr_ktls_sw_fallback() static.

v2->v3:
- Replaced GFP_KERNEL with GFP_ATOMIC.
- Removed mixed fixes.

v3->v4:
- Corrected fixes tag issue.

v4->v5:
- Separated mixed fixes from patch 4.

v5-v6:
- Fixes tag should be at the end.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: stop the txq if reaches threshold

Stop the queue and ask for the credits if queue reaches to
threashold.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: tcb update fails sometimes

context id and port id should be filled while sending tcb update.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls/cxgb4: handle partial tag alone SKBs

If TCP congestion caused a very small packets which only has some
part fo the TAG, and that too is not till the end. HW can't handle
such case, so falling back to sw crypto in such cases.

v1->v2:
- Marked chcr_ktls_sw_fallback() static.

Fixes: dc05f3df8fac ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: don't free skb before sending FIN

If its a last packet and fin is set. Make sure FIN is informed
to HW before skb gets freed.

Fixes: 429765a149f1 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: packet handling prior to start marker

There could be a case where ACK for tls exchanges prior to start
marker is missed out, and by the time tls is offloaded. This pkt
should not be discarded and handled carefully. It could be
plaintext alone or plaintext + finish as well.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: Correction in middle record handling

If a record starts in middle, reset TCB UNA so that we could
avoid sending out extra packet which is needed to make it 16
byte aligned to start AES CTR.
Check also considers prev_seq, which should be what is
actually sent, not the skb data length.
Avoid updating partial TAG to HW at any point of time, that's
why we need to check if remaining part is smaller than TAG
size, then reset TX_MAX to be TAG starting sequence number.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: missing handling of header alone

If an skb has only header part which doesn't start from
beginning, is not being handled properly.

Fixes: dc05f3df8fac ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: Correction in trimmed_len calculation

trimmed length calculation goes wrong if skb has only tag part
to send. It should be zero if there is no data bytes apart from
TAG.

Fixes: dc05f3df8fac ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

cxgb4/ch_ktls: creating skbs causes panic

Creating SKB per tls record and freeing the original one causes
panic. There will be race if connection reset is requested. By
freeing original skb, refcnt will be decremented and that means,
there is no pending record to send, and so tls_dev_del will be
requested in control path while SKB of related connection is in
queue.
Better approach is to use same SKB to send one record (partial
data) at a time. We still have to create a new SKB when partial
last part of a record is requested.
This fix introduces new API cxgb4_write_partial_sgl() to send
partial part of skb. Present cxgb4_write_sgl can only provide
feasibility to start from an offset which limits to header only
and it can write sgls for the whole skb len. But this new API
will help in both. It can start from any offset and can end
writing in middle of the skb.

v4->v5:
- Removed extra changes.

Fixes: 429765a149f1 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: Update cheksum information

Checksum update was missing in the WR.

Fixes: 429765a149f1 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

ch_ktls: Correction in finding correct length

There is a possibility of linear skbs coming in. Correcting
the length extraction logic.

v2->v3:
- Separated un-related changes from this patch.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

cxgb4/ch_ktls: decrypted bit is not enough

If skb has retransmit data starting before start marker, e.g. ccs,
decrypted bit won't be set for that, and if it has some data to
encrypt, then it must be given to crypto ULD. So in place of
decrypted, check if socket is tls offloaded. Also, unless skb has
some data to encrypt, no need to give it for tls offload handling.

v2->v3:
- Removed ifdef.

Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/x25: Fix null-ptr-deref in x25_connect

This fixes a regression for blocking connects introduced by commit
4becb7ee5b3d ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect").

The x25->neighbour is already set to "NULL" by x25_disconnect() now,
while a blocking connect is waiting in
x25_wait_for_connection_establishment(). Therefore x25->neighbour must
not be accessed here again and x25->state is also already set to
X25_STATE_0 by x25_disconnect().

Fixes: 4becb7ee5b3d ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect")
Signed-off-by: Martin Schiller <[email protected]>
Reviewed-by: Xie He <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

arm64: dts: fsl-ls1028a-kontron-sl28: specify in-band mode for ENETC

Since commit 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") the
network port of the Kontron sl28 board is broken. After the migration to
phylink the device tree has to specify the in-band-mode property. Add
it.

Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX")
Suggested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Michael Walle <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tipc: fix memory leak in tipc_topsrv_start()

kmemleak report a memory leak as follows:

unreferenced object 0xffff88810a596800 (size 512):
  comm "ip", pid 21558, jiffies 4297568990 (age 112.120s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 00 83 60 b0 ff ff ff ff  ..........`.....
  backtrace:
    [<0000000022bbe21f>] tipc_topsrv_init_net+0x1f3/0xa70
    [<00000000fe15ddf7>] ops_init+0xa8/0x3c0
    [<00000000138af6f2>] setup_net+0x2de/0x7e0
    [<000000008c6807a3>] copy_net_ns+0x27d/0x530
    [<000000006b21adbd>] create_new_namespaces+0x382/0xa30
    [<00000000bb169746>] unshare_nsproxy_namespaces+0xa1/0x1d0
    [<00000000fe2e42bc>] ksys_unshare+0x39c/0x780
    [<0000000009ba3b19>] __x64_sys_unshare+0x2d/0x40
    [<00000000614ad866>] do_syscall_64+0x56/0xa0
    [<00000000a1b5ca3c>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

'srv' is malloced in tipc_topsrv_start() but not free before
leaving from the error handling cases. We need to free it.

Fixes: 5c45ab24ac77 ("tipc: make struct tipc_server private for server.c")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb

Pull swiotlb fixes from Konrad Rzeszutek Wilk:
"Two tiny fixes for issues that make drivers under Xen unhappy under
  certain conditions"

* 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
  swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"

Merge branch 'net-iucv-fixes-2020-11-09'

Julian Wiedmann says:

====================
net/iucv: fixes 2020-11-09

One fix in the shutdown path for af_iucv sockets. This is relevant for
stable as well.
Also sending along an update for the Maintainers file.

v1 -> v2: use the correct Fixes tag in patch 1 (Jakub)
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

MAINTAINERS: remove Ursula Braun as s390 network maintainer

I am retiring soon. Thus this patch removes myself from the
MAINTAINERS file (s390 network).

Signed-off-by: Ursula Braun <[email protected]>
[jwi: fix up the subject]
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/af_iucv: fix null pointer dereference on shutdown

syzbot reported the following KASAN finding:

BUG: KASAN: nullptr-dereference in iucv_send_ctrl+0x390/0x3f0 net/iucv/af_iucv.c:385
Read of size 2 at addr 000000000000021e by task syz-executor907/519

CPU: 0 PID: 519 Comm: syz-executor907 Not tainted 5.9.0-syzkaller-07043-gbcf9877ad213 #0
Hardware name: IBM 3906 M04 701 (KVM/Linux)
Call Trace:
[<00000000c576af60>] unwind_start arch/s390/include/asm/unwind.h:65 [inline]
[<00000000c576af60>] show_stack+0x180/0x228 arch/s390/kernel/dumpstack.c:135
[<00000000c9dcd1f8>] __dump_stack lib/dump_stack.c:77 [inline]
[<00000000c9dcd1f8>] dump_stack+0x268/0x2f0 lib/dump_stack.c:118
[<00000000c5fed016>] print_address_description.constprop.0+0x5e/0x218 mm/kasan/report.c:383
[<00000000c5fec82a>] __kasan_report mm/kasan/report.c:517 [inline]
[<00000000c5fec82a>] kasan_report+0x11a/0x168 mm/kasan/report.c:534
[<00000000c98b5b60>] iucv_send_ctrl+0x390/0x3f0 net/iucv/af_iucv.c:385
[<00000000c98b6262>] iucv_sock_shutdown+0x44a/0x4c0 net/iucv/af_iucv.c:1457
[<00000000c89d3a54>] __sys_shutdown+0x12c/0x1c8 net/socket.c:2204
[<00000000c89d3b70>] __do_sys_shutdown net/socket.c:2212 [inline]
[<00000000c89d3b70>] __s390x_sys_shutdown+0x38/0x48 net/socket.c:2210
[<00000000c9e36eac>] system_call+0xe0/0x28c arch/s390/kernel/entry.S:415

There is nothing to shutdown if a connection has never been established.
Besides that iucv->hs_dev is not yet initialized if a socket is in
IUCV_OPEN state and iucv->path is not yet initialized if socket is in
IUCV_BOUND state.
So, just skip the shutdown calls for a socket in these states.

Fixes: eac3731bd04c ("[S390]: Add AF_IUCV socket support")
Fixes: 82492a355fac ("af_iucv: add shutdown for HS transport")
Reviewed-by: Vasily Gorbik <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
[jwi: correct one Fixes tag]
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

lan743x: fix "BUG: invalid wait context" when setting rx mode

In the net core, the struct net_device_ops -> ndo_set_rx_mode()
callback is called with the dev->addr_list_lock spinlock held.

However, this driver's ndo_set_rx_mode callback eventually calls
lan743x_dp_write(), which acquires a mutex. Mutex acquisition
may sleep, and this is not allowed when holding a spinlock.

Fix by removing the dp_lock mutex entirely. Its purpose is to
prevent concurrent accesses to the data port. No concurrent
accesses are possible, because the dev->addr_list_lock
spinlock in the core only lets through one thread at a time.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Sven Van Asbroeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: mv88e6xxx: Fix memleak in mv88e6xxx_region_atu_snapshot

When mv88e6xxx_fid_map return error, we lost free the table.

Fix it.

Fixes: bfb255428966 ("net: dsa: mv88e6xxx: Add devlink regions")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: zhangxiaoxu <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: Update window_clamp if SOCK_RCVBUF is set

When net.ipv4.tcp_syncookies=1 and syn flood is happened,
cookie_v4_check or cookie_v6_check tries to redo what
tcp_v4_send_synack or tcp_v6_send_synack did,
rsk_window_clamp will be changed if SOCK_RCVBUF is set,
which will make rcv_wscale is different, the client
still operates with initial window scale and can overshot
granted window, the client use the initial scale but local
server use new scale to advertise window value, and session
work abnormally.

Fixes: e88c64f0a425 ("tcp: allow effective reduction of TCP's rcv-buffer via setsockopt")
Signed-off-by: Mao Wenan <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: realtek: support paged operations on RTL8201CP

The RTL8401-internal PHY identifies as RTL8201CP, and the init
sequence in r8169, copied from vendor driver r8168, uses paged
operations. Therefore set the same paged operation callbacks as
for the other Realtek PHY's.

Fixes: cdafdc29ef75 ("r8169: sync support for RTL8401 with vendor driver")
Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

lan743x: correctly handle chips with internal PHY

Commit 6f197fb63850 ("lan743x: Added fixed link and RGMII support")
assumes that chips with an internal PHY will never have a devicetree
entry. This is incorrect: even for these chips, a devicetree entry
can be useful e.g. to pass the mac address from bootloader to chip:

    &pcie {
            status = "okay";

            host@0 {
                    reg = <0 0 0 0 0>;

                    #address-cells = <3>;
                    #size-cells = <2>;

                    lan7430: ethernet@0 {
                            /* LAN7430 with internal PHY */
                            compatible = "microchip,lan743x";
                            status = "okay";
                            reg = <0 0 0 0 0>;
                            /* filled in by bootloader */
                            local-mac-address = [00 00 00 00 00 00];
                    };
            };
    };

If a devicetree entry is present, the driver will not attach the chip
to its internal phy, and the chip will be non-operational.

Fix by tweaking the phy connection algorithm:
- first try to connect to a phy specified in the devicetree
  (could be 'real' phy, or just a 'fixed-link')
- if that doesn't succeed, try to connect to an internal phy, even
  if the chip has a devnode

Tested on a LAN7430 with internal PHY. I cannot test a device using
fixed-link, as I do not have access to one.

Fixes: 6f197fb63850 ("lan743x: Added fixed link and RGMII support")
Tested-by: Sven Van Asbroeck <[email protected]> # lan7430
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Sven Van Asbroeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

netlabel: fix our progress tracking in netlbl_unlabel_staticlist()

The current NetLabel code doesn't correctly keep track of the netlink
dump state in some cases, in particular when multiple interfaces with
large configurations are loaded. The problem manifests itself by not
reporting the full configuration to userspace, even though it is
loaded and active in the kernel. This patch fixes this by ensuring
that the dump state is properly reset when necessary inside the
netlbl_unlabel_staticlist() function.

Fixes: 8cc44579d1bd ("NetLabel: Introduce static network labels for unlabeled connections")
Signed-off-by: Paul Moore <[email protected]>
Link: https://lore.kernel.org/r/160484450633.3752.16512718263560813473.stgit@sifl
Signed-off-by: Jakub Kicinski <[email protected]>

MAINTAINERS: Update repositories for Intel Ethernet Drivers

Update Intel Ethernet Drivers repositories to new locations.

Signed-off-by: Tony Nguyen <[email protected]>

igc: Fix returning wrong statistics

'igc_update_stats()' was not updating 'netdev->stats', so the returned
statistics, for example, requested by:

$ ip -s link show dev enp3s0

were not being updated and were always zero.

Fix by returning a set of statistics that are actually being
updated (adapter->stats64).

Fixes: c9a11c23ceb6 ("igc: Add netdev")
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

i40e, xsk: uninitialized variable in i40e_clean_rx_irq_zc()

The "failure" variable is used without being initialized. It should be
set to false.

Fixes: 8cbf74149903 ("i40e, xsk: move buffer allocation out of the Rx processing loop")
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Tested-by: George Kuruvinakunnel <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

i40e: Fix MAC address setting for a VF via Host/VM

Fix MAC setting flow for the PF driver.

Update the unicast VF's MAC address in VF structure if it is
a new setting in i40e_vc_add_mac_addr_msg.

When unicast MAC address gets deleted, record that and
set the new unicast MAC address that is already waiting in the filter
list. This logic is based on the order of messages arriving to
the PF driver.

Without this change the MAC address setting was interpreted
incorrectly in the following use cases:
1) Print incorrect VF MAC or zero MAC
ip link show dev $pf
2) Don't preserve MAC between driver reload
rmmod iavf; modprobe iavf
3) Update VF MAC when macvlan was set
ip link add link $vf address $mac $vf.1 type macvlan
4) Failed to update mac address when VF was trusted
ip link set dev $vf address $mac

This includes all other configurations including above commands.

Fixes: f657a6e1313b ("i40e: Fix VF driver MAC address configuration")
Signed-off-by: Slawomir Laba <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

selftest: fix flower terse dump tests

Iproute2 tc classifier terse dump has been accepted with modified syntax.
Update the tests accordingly.

Signed-off-by: Vlad Buslov <[email protected]>
Fixes: e7534fd42a99 ("selftests: implement flower classifier terse dump tests")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull core dump fix from Al Viro:
"Fix for multithreaded coredump playing fast and loose with getting
  registers of secondary threads; if a secondary gets caught in the
  middle of exit(2), the conditition it will be stopped in for dumper to
  examine might be unusual enough for things to go wrong.

  Quite a few architectures are fine with that, but some are not."

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  don't dump the threads that had been already exiting when zapped.

Merge tag 'for-5.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"A handful of minor fixes and updates:

   - handle missing device replace item on mount (syzbot report)

   - fix space reservation calculation when finishing relocation

   - fix memory leak on error path in ref-verify (debugging feature)

   - fix potential overflow during defrag on 32bit arches

   - minor code update to silence smatch warning

   - minor error message updates"

* tag 'for-5.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod
  btrfs: dev-replace: fail mount if we don't have replace item with target device
  btrfs: scrub: update message regarding read-only status
  btrfs: clean up NULL checks in qgroup_unreserve_range()
  btrfs: fix min reserved size calculation in merge_reloc_root
  btrfs: print the block rsv type when we fail our reservation
  btrfs: fix potential overflow in cluster_pages_for_defrag on 32bit arch

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt fix from Eric Biggers:
"Fix a regression where a new WARN_ON() was reachable when using
  FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 on ext4, causing xfstest
  generic/602 to sometimes fail on ext4"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: remove reachable WARN in fscrypt_setup_iv_ino_lblk_32_key()

Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:
"Update update to version 20.09.30, one kernel side fix"

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: update version number
  powercap: restrict energy meter to root access
  tools/power turbostat: harden against cpu hotplug
  tools/power turbostat: adjust for temperature offset
  tools/power turbostat: Build with _FILE_OFFSET_BITS=64
  tools/power turbostat: Support AMD Family 19h
  tools/power turbostat: Remove empty columns for Jacobsville
  tools/power turbostat: Add a new GFXAMHz column that exposes gt_act_freq_mhz.
  tools/power x86_energy_perf_policy: Input/output error in a VM
  tools/power turbostat: Skip pc8, pc9, pc10 columns, if they are disabled
  tools/power turbostat: Support additional CPU model numbers
  tools/power turbostat: Fix output formatting for ACPI CST enumeration
  tools/power turbostat: Replace HTTP links with HTTPS ones: TURBOSTAT UTILITY
  tools/power turbostat: Use sched_getcpu() instead of hardcoded cpu 0
  tools/power turbostat: Enable accumulate RAPL display
  tools/power turbostat: Introduce functions to accumulate RAPL consumption
  tools/power turbostat: Make the energy variable to be 64 bit
  tools/power turbostat: Always print idle in the system configuration header
  tools/power turbostat: Print /dev/cpu_dma_latency

ACPI: DPTF: Support Alder Lake

Add Alder Lake ACPI IDs for DPTF devices.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Documentation: ACPI: fix spelling mistakes

Signed-off-by: Flavio Suligoi <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: intel_pstate: Take CPUFREQ_GOV_STRICT_TARGET into account

Make intel_pstate take the new CPUFREQ_GOV_STRICT_TARGET governor
flag into account when it operates in the passive mode with HWP
enabled, so as to fix the "powersave" governor behavior in that
case (currently, HWP is allowed to scale the performance all the
way up to the policy max limit when the "powersave" governor is
used, but it should be constrained to the policy min limit then).

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: 5.9+ <[email protected]> # 5.9+: 9a2a9ebc0a75 cpufreq: Introduce governor flags
Cc: 5.9+ <[email protected]> # 5.9+: 218f66870181 cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET
Cc: 5.9+ <[email protected]> # 5.9+: ea9364bbadf1 cpufreq: Add strict_target to struct cpufreq_policy

cpufreq: Add strict_target to struct cpufreq_policy

Add a new field to be set when the CPUFREQ_GOV_STRICT_TARGET flag is
set for the current governor to struct cpufreq_policy, so that the
drivers needing to check CPUFREQ_GOV_STRICT_TARGET do not have to
access the governor object during every frequency transition.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET

Introduce a new governor flag, CPUFREQ_GOV_STRICT_TARGET, for the
governors that want the target frequency to be set exactly to the
given value without leaving any room for adjustments on the hardware
side and set this flag for the powersave and performance governors.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

cpufreq: Introduce governor flags

A new cpufreq governor flag will be added subsequently, so replace
the bool dynamic_switching fleid in struct cpufreq_governor with a
flags field and introduce CPUFREQ_GOV_DYNAMIC_SWITCHING to set for
the "dynamic switching" governors instead of it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

tools/power turbostat: update version number

goodbye summer...

Signed-off-by: Len Brown <[email protected]>

powercap: restrict energy meter to root access

Remove non-privileged user access to power data contained in
/sys/class/powercap/intel-rapl*/*/energy_uj

Non-privileged users currently have read access to power data and can
use this data to form a security attack. Some privileged
drivers/applications need read access to this data, but don't expose it
to non-privileged users.

For example, thermald uses this data to ensure that power management
works correctly. Thus removing non-privileged access is preferred over
completely disabling this power reporting capability with
CONFIG_INTEL_RAPL=n.

Fixes: 95677a9a3847 ("PowerCap: Fix mode for energy counter")
Signed-off-by: Len Brown <[email protected]>
Cc: [email protected]

mptcp: provide rmem[0] limit

The mptcp proto struct currently does not provide the
required limit for forward memory scheduling. Under
pressure sk_rmem_schedule() will unconditionally try
to use such field and will oops.

Address the issue inheriting the tcp limit, as we already
do for the wmem one.

Fixes: 9c3f94e1681b ("mptcp: add missing memory scheduling in the rx path")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/37af798bd46f402fb7c79f57ebbdd00614f5d7fa.1604861097.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>

docs: networking: phy: s/2.5 times faster/2.5 times as fast/

2.5 times faster would be 3.5 Gbps (4.375 Gbaud after 8b/10b encoding).

Signed-off-by: Jonathan Neuschäfer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ethtool: netlink: add missing netdev_features_change() call

After updating userspace Ethtool from 5.7 to 5.9, I noticed that
NETDEV_FEAT_CHANGE is no more raised when changing netdev features
through Ethtool.
That's because the old Ethtool ioctl interface always calls
netdev_features_change() at the end of user request processing to
inform the kernel that our netdevice has some features changed, but
the new Netlink interface does not. Instead, it just notifies itself
with ETHTOOL_MSG_FEATURES_NTF.
Replace this ethtool_notify() call with netdev_features_change(), so
the kernel will be aware of any features changes, just like in case
with the ioctl interface. This does not omit Ethtool notifications,
as Ethtool itself listens to NETDEV_FEAT_CHANGE and drops
ETHTOOL_MSG_FEATURES_NTF on it
(net/ethtool/netlink.c:ethnl_netdev_event()).

From v1 [1]:
- dropped extra new line as advised by Jakub;
- no functional changes.

[1] https://lore.kernel.org/netdev/[email protected]

Fixes: 0980bfcd6954 ("ethtool: set netdev features with FEATURES_SET request")
Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Michal Kubecek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies

Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6
packets over a link with a PMTU estimation of exactly 1350 bytes,
won't trigger ICMPv6 Packet Too Big replies when the encapsulated
datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of
overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU
would be legitimate and expected.

This comes from an off-by-one error I introduced in checks added
as part of commit 4cb47a8644cc ("tunnels: PMTU discovery support
for directly bridged IP packets"), whose purpose was to prevent
sending ICMPv6 Packet Too Big messages with an MTU lower than the
smallest permissible IPv6 link MTU, i.e. 1280 bytes.

In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if
the advertised MTU would be less than, and not equal to, 1280 bytes.

Also fix the analogous comparison for IPv4, that is, skip the ICMP
reply only if the resulting MTU is strictly less than 576 bytes.

This becomes apparent while running the net/pmtu.sh bridged VXLAN
or GENEVE selftests with adjusted lower-link MTU values. Using
e.g. GENEVE, setting ll_mtu to the values reported below, in the
test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test
function, we can see failures on the following tests:

             test                | ll_mtu
  -------------------------------|--------
  pmtu_ipv4_br_geneve4_exception |   626
  pmtu_ipv6_br_geneve4_exception |  1330
  pmtu_ipv6_br_geneve6_exception |  1350

owing to the different tunneling overheads implied by the
corresponding configurations.

Reported-by: Jianlin Shi <[email protected]>
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Stefano Brivio <[email protected]>
Link: https://lore.kernel.org/r/4f5fc2f33bfdf8409549fafd4f952b008bf04d63.1604681709.git.sbrivio@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>

IPv6: Set SIT tunnel hard_header_len to zero

Due to the legacy usage of hard_header_len for SIT tunnels while
already using infrastructure from net/ipv4/ip_tunnel.c the
calculation of the path MTU in tnl_update_pmtu is incorrect.
This leads to unnecessary creation of MTU exceptions for any
flow going over a SIT tunnel.

As SIT tunnels do not have a header themsevles other than their
transport (L3, L2) headers we're leaving hard_header_len set to zero
as tnl_update_pmtu is already taking care of the transport headers
sizes.

This will also help avoiding unnecessary IPv6 GC runs and spinlock
contention seen when using SIT tunnels and for more than
net.ipv6.route.gc_thresh flows.

Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Oliver Herms <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/r/20201103104133.GA1573211@tws
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:
   - fix compilation error when PMD and PUD are folded
   - fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1
   - add aarch64 get-reg-list test

  x86:
   - fix semantic conflict between two series merged for 5.10
   - fix (and test) enforcement of paravirtual cpuid features

  selftests:
   - various cleanups to memory management selftests
   - new selftests testcase for performance of dirty logging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
  KVM: selftests: allow two iterations of dirty_log_perf_test
  KVM: selftests: Introduce the dirty log perf test
  KVM: selftests: Make the number of vcpus global
  KVM: selftests: Make the per vcpu memory size global
  KVM: selftests: Drop pointless vm_create wrapper
  KVM: selftests: Add wrfract to common guest code
  KVM: selftests: Simplify demand_paging_test with timespec_diff_now
  KVM: selftests: Remove address rounding in guest code
  KVM: selftests: Factor code out of demand_paging_test
  KVM: selftests: Use a single binary for dirty/clear log test
  KVM: selftests: Always clear dirty bitmap after iteration
  KVM: selftests: Add blessed SVE registers to get-reg-list
  KVM: selftests: Add aarch64 get-reg-list test
  selftests: kvm: test enforcement of paravirtual cpuid features
  selftests: kvm: Add exception handling to selftests
  selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  selftests: kvm: Fix the segment descriptor layout to match the actual layout
  KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrs
  kvm: x86: request masterclock update any time guest uses different msr
  kvm: x86: ensure pv_cpuid.features is initialized when enabling cap
  ...

Merge tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
"This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc
  refactoring"

* tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux:
  net/sunrpc: fix useless comparison in proc_do_xprt()
  net/sunrpc: return 0 on attempt to write to "transports"
  NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy
  NFSD: Fix use-after-free warning when doing inter-server copy
  NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL
  SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow()
  NFSD: NFSv3 PATHCONF Reply is improperly formed

Merge tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes and cleanups from Ted Ts'o:
"More fixes and cleanups for the new fast_commit features, but also a
  few other miscellaneous bug fixes and a cleanup for the MAINTAINERS
  file"

* tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits)
  jbd2: fix up sparse warnings in checkpoint code
  ext4: fix sparse warnings in fast_commit code
  ext4: cleanup fast commit mount options
  jbd2: don't start fast commit on aborted journal
  ext4: make s_mount_flags modifications atomic
  ext4: issue fsdev cache flush before starting fast commit
  ext4: disable fast commit with data journalling
  ext4: fix inode dirty check in case of fast commits
  ext4: remove unnecessary fast commit calls from ext4_file_mmap
  ext4: mark buf dirty before submitting fast commit buffer
  ext4: fix code documentatioon
  ext4: dedpulicate the code to wait on inode that's being committed
  jbd2: don't read journal->j_commit_sequence without taking a lock
  jbd2: don't touch buffer state until it is filled
  jbd2: add todo for a fast commit performance optimization
  jbd2: don't pass tid to jbd2_fc_end_commit_fallback()
  jbd2: don't use state lock during commit path
  jbd2: drop jbd2_fc_init documentation
  ext4: clean up the JBD2 API that initializes fast commits
  jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs
  ...

Merge tag 'erofs-for-5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
"A week ago, Vladimir reported an issue that the kernel log would
  become polluted if the page allocation debug option is enabled. I also
  found this when I cleaned up magical page->mapping and originally
  planned to submit these all for 5.11 but it seems the impact can be
  noticed so submit the fix in advance.

  In addition, nl6720 also reported that atime is empty although EROFS
  has the only one on-disk timestamp as a practical consideration for
  now but it's better to derive it as what we did for the other
  timestamps.

  Summary:

   - fix setting up pcluster improperly for temporary pages

   - derive atime instead of leaving it empty"

* tag 'erofs-for-5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix setting up pcluster for temporary pages
  erofs: derive atime instead of leaving it empty

ACPI: button: Add DMI quirk for Medion Akoya E2228T

The Medion Akoya E2228T's ACPI _LID implementation is quite broken,
it has the same issues as the one from the Medion Akoya E2215T:

1. For notifications it uses an ActiveLow Edge GpioInt, rather then
an ActiveBoth one, meaning that the device is only notified when the
lid is closed, not when it is opened.

2. Matching with this its _LID method simply always returns 0 (closed)

In order for the Linux LID code to work properly with this implementation,
the lid_init_state selection needs to be set to ACPI_BUTTON_LID_INIT_OPEN,
add a DMI quirk for this.

While working on this I also found out that the MD60### part of the model
number differs per country/batch while all of the E2215T and E2228T models
have this issue, so also remove the " MD60198" part from the E2215T quirk.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

ACPI: GED: fix -Wformat

Clang is more aggressive about -Wformat warnings when the format flag
specifies a type smaller than the parameter. It turns out that gsi is an
int. Fixes:

drivers/acpi/evged.c:105:48: warning: format specifies type 'unsigned
char' but the argument has type 'unsigned int' [-Wformat]
trigger == ACPI_EDGE_SENSITIVE ? 'E' : 'L', gsi);
^~~

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Fixes: ea6f3af4c5e6 ("ACPI: GED: add support for _Exx / _Lxx handler methods")
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

ACPI: Fix whitespace inconsistencies

Replaces spaces with tabs where spaces have been (inconsistently) used
for indentation and removes trailing whitespaces.

Signed-off-by: Maximilian Luz <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

ACPI: scan: Fix acpi_dma_configure_id() kerneldoc name

For some reason building with W=1 doesn't pick up on this, but the
kerneldoc name for acpi_dma_configure_id() is not right, so fix it up.

Signed-off-by: John Garry <[email protected]>
Acked-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Documentation: firmware-guide: gpio-properties: Clarify initial output state

GpioIo() doesn't provide an explicit state for an output pin.
Linux tries to be smart and uses a common sense based on other
parameters. Document how it looks like in the code.

Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Documentation: firmware-guide: gpio-properties: active_low only for GpioIo()

It appears that people may misinterpret active_low field in _DSD
for GpioInt() resource. Add a paragraph to clarify this.

Reported-by: Ricardo Ribalda <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Reviewed-by: Ricardo Ribalda <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Documentation: firmware-guide: gpio-properties: Fix factual mistakes

Fix factual mistakes and style issues in GPIO properties document.
This converts IoRestriction from InputOnly to OutputOnly as pins
in the example are used as outputs.

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

KVM: selftests: allow two iterations of dirty_log_perf_test

Even though one iteration is not enough for the dirty log performance
test (due to the cost of building page tables, zeroing memory etc.)
two is okay and it is the default. Without this patch,
"./dirty_log_perf_test" without any further arguments fails.

Cc: Ben Gardon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Linux 5.10-rc3

net/sunrpc: fix useless comparison in proc_do_xprt()

In the original code, the "if (*lenp < 0)" check didn't work because
"*lenp" is unsigned. Fortunately, the memory_read_from_buffer() call
will never fail in this context so it doesn't affect runtime.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

Merge tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core documentation fixes from Greg KH:
"Some small Documentation fixes that were fallout from the larger
  documentation update we did in 5.10-rc2.

  Nothing major here at all, but all of these have been in linux-next
  and resolve build warnings when building the documentation files"

* tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation: remove mic/index from misc-devices/index.rst
  scripts: get_api.pl: Add sub-titles to ABI output
  scripts: get_abi.pl: Don't let ABI files to create subtitles
  docs: leds: index.rst: add a missing file
  docs: ABI: sysfs-class-net: fix a typo
  docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys

Merge tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are a small number of small tty and serial fixes for some
  reported problems for the tty core, vt code, and some serial drivers.

  They include fixes for:

   - a buggy and obsolete vt font ioctl removal

   - 8250_mtk serial baudrate runtime warnings

   - imx serial earlycon build configuration fix

   - txx9 serial driver error path cleanup issues

   - tty core fix in release_tty that can be triggered by trying to bind
     an invalid serial port name to a speakup console device

  Almost all of these have been in linux-next without any problems, the
  only one that hasn't, just deletes code :)"

* tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt: Disable KD_FONT_OP_COPY
  tty: fix crash in release_tty if tty->port is not set
  serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init
  tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled
  serial: 8250_mtk: Fix uart_get_baud_rate warning

Merge tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB fixes and new device ids:

   - USB gadget fixes for some reported issues

   - Fixes for the ever-troublesome apple fastcharge driver, hopefully
     we finally have it right.

   - More USB core quirks for odd devices

   - USB serial driver fixes for some long-standing issues that were
     recently found

   - some new USB serial driver device ids

  All have been in linux-next with no reported issues"

* tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property
  usb: mtu3: fix panic in mtu3_gadget_stop()
  USB: serial: option: add Telit FN980 composition 0x1055
  USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
  USB: serial: cyberjack: fix write-URB completion race
  USB: Add NO_LPM quirk for Kingston flash drive
  USB: serial: option: add Quectel EC200T module support
  usb: raw-gadget: fix memory leak in gadget_setup
  usb: dwc2: Avoid leaving the error_debugfs label unused
  usb: dwc3: ep0: Fix delay status handling
  usb: gadget: fsl: fix null pointer checking
  usb: gadget: goku_udc: fix potential crashes in probe
  usb: dwc3: pci: add support for the Intel Alder Lake-S

fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent

current->group_leader->exit_signal may change during copy_process() if
current->real_parent exits.

Move the assignment inside tasklist_lock to avoid the race.

Signed-off-by: Eddy Wu <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

vt: Disable KD_FONT_OP_COPY

It's buggy:

On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ).  The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use  KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.

Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
  and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
  not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016

Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken

https://github.com/systemd/systemd/pull/3651

So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.

Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.

Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:

/* copy font from active VT, where the font was uploaded to */
cfo.op = KD_FONT_OP_COPY;
cfo.height = vcs.v_active-1; /* tty1 == index 0 */
(void) ioctl(vcfd, KDFONTOP, &cfo);

Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.

v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.

Acked-by: Peilin Ye <[email protected]>
Reported-by: Minh Yuan <[email protected]>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <[email protected]>
Cc: Peilin Ye <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

- Fix an uninitialized struct problem

- Fix an iomap problem zeroing unwritten EOF blocks

- Fix some clumsy error handling when writeback fails on filesystems
   with blocksize < pagesize

- Fix a retry loop not resetting loop variables properly

- Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel
   actually does permit that combination

- Fix excessive page cache flushing when unsharing part of a file

* tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: only flush the unshared range in xfs_reflink_unshare
  xfs: fix scrub flagging rtinherit even if there is no rt device
  xfs: fix missing CoW blocks writeback conversion retry
  iomap: clean up writeback state logic on writepage error
  iomap: support partial page discard on writeback block mapping failure
  xfs: flush new eof page on truncate to avoid post-eof corruption
  xfs: set xefi_discard when creating a deferred agfl free log intent item

Merge branch 'hch' (patches from Christoph)

Merge procfs splice read fixes from Christoph Hellwig:
"Greg reported a problem due to the fact that Android tests use procfs
  files to test splice, which stopped working with the changes for
  set_fs() removal.

  This series adds read_iter support for seq_file, and uses those for
  various proc files using seq_file to restore splice read support"

[ Side note: Christoph initially had a scripted "move everything over"
  patch, which looks fine, but I personally would prefer us to actively
  discourage splice() on random files.  So this does just the minimal
  basic core set of proc file op conversions.

  For completeness, and in case people care, that script was

     sed -i -e 's/\.proc_read$\s*=\s*$seq_read/\.proc_read_iter\1seq_read_iter/g'

  but I'll wait and see if somebody has a strong argument for using
  splice on random small /proc files before I'd run it on the whole
  kernel.   - Linus ]

* emailed patches from Christoph Hellwig <[email protected]>:
  proc "seq files": switch to ->read_iter
  proc "single files": switch to ->read_iter
  proc/stat: switch to ->read_iter
  proc/cpuinfo: switch to ->read_iter
  proc: wire up generic_file_splice_read for iter ops
  seq_file: add seq_read_iter

Merge tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes:

   - Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a
     combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs
     integrated assembler upset

   - Correct the mitigation selection logic which prevented the related
     prctl to work correctly

   - Make the UV5 hubless system work correctly by fixing up the
     malformed table entries and adding the missing ones"

* tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Recognize UV5 hubless system identifier
  x86/platform/uv: Remove spaces from OEM IDs
  x86/platform/uv: Fix missing OEM_TABLE_ID
  x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
  x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S

Merge tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fix from Thomas Gleixner:
"A single fix for the perf core plugging a memory leak in the address
filter parser"

* tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix a memory leak in perf_event_parse_addr_filter()

Merge tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull futex fix from Thomas Gleixner:
"A single fix for the futex code where an intermediate state in the
  underlying RT mutex was not handled correctly and triggering a BUG()
  instead of treating it as another variant of retry condition"

* tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Handle transient "ownerless" rtmutex state correctly

Merge tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"A set of fixes for interrupt chip drivers:

   - Fix the fallout of the IPI as interrupt conversion in Kconfig and
     the BCM2836 interrupt chip driver

   - Fixes for interrupt affinity setting and the handling of
     hierarchical irq domains in the SiFive PLIC driver

   - Make the unmapped event handling in the TI SCI driver work
     correctly

   - A few minor fixes and cleanups in various chip drivers and Kconfig"

* tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dt-bindings: irqchip: ti, sci-inta: Fix diagram indentation for unmapped events
  irqchip/ti-sci-inta: Add support for unmapped event handling
  dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
  irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm
  irqchip/sifive-plic: Fix chip_data access within a hierarchy
  irqchip/sifive-plic: Fix broken irq_set_affinity() callback
  irqchip/stm32-exti: Add all LP timer exti direct events support
  irqchip/bcm2836: Fix missing __init annotation
  irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY
  irqchip/mst: Make mst_intc_of_init static
  irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
  genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY

Merge tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull entry code fix from Thomas Gleixner:
"A single fix for the generic entry code to correct the wrong
  assumption that the lockdep interrupt state needs not to be
  established before calling the RCU check"

* tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry: Fix the incorrect ordering of lockdep and RCU check

Merge tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

- fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user()

- fix for an RCU splat at boot caused by a recent lockdep change

- fix for a possible deadlock in our EEH debugfs code

- several fixes for handling of _PAGE_ACCESSED on 32-bit platforms

- build fix when CONFIG_NUMA=n

Thanks to Andreas Schwab, Christophe Leroy, Oliver O'Halloran, Qian Cai,
and Scott Cheloha.

* tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/numa: Fix build when CONFIG_NUMA=n
  powerpc/8xx: Manage _PAGE_ACCESSED through APG bits in L1 entry
  powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
  powerpc/40x: Always fault when _PAGE_ACCESSED is not set
  powerpc/603: Always fault when _PAGE_ACCESSED is not set
  powerpc: Use asm_goto_volatile for put_user()
  powerpc/smp: Call rcu_cpu_starting() earlier
  powerpc/eeh_cache: Fix a possible debugfs deadlock

KVM: selftests: Introduce the dirty log perf test

The dirty log perf test will time verious dirty logging operations
(enabling dirty logging, dirtying memory, getting the dirty log,
clearing the dirty log, and disabling dirty logging) in order to
quantify dirty logging performance. This test can be used to inform
future performance improvements to KVM's dirty logging infrastructure.

This series was tested by running the following invocations on an Intel
Skylake machine:
dirty_log_perf_test -b 20m -i 100 -v 64
dirty_log_perf_test -b 20g -i 5 -v 4
dirty_log_perf_test -b 4g -i 5 -v 32
demand_paging_test -b 20m -v 64
demand_paging_test -b 20g -v 4
demand_paging_test -b 4g -v 32
All behaved as expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20201027233733.1484855 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Make the number of vcpus global

We also check the input number of vcpus against the maximum supported.

Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201104212357 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Make the per vcpu memory size global

Rename vcpu_memory_bytes to something with "percpu" in it
in order to be less ambiguous. Also make it global to
simplify things.

Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201104212357 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Drop pointless vm_create wrapper

Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201104212357 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Add wrfract to common guest code

Wrfract will be used by the dirty logging perf test introduced later in
this series to dirty memory sparsely.

This series was tested by running the following invocations on an Intel
Skylake machine:
dirty_log_perf_test -b 20m -i 100 -v 64
dirty_log_perf_test -b 20g -i 5 -v 4
dirty_log_perf_test -b 4g -i 5 -v 32
demand_paging_test -b 20m -v 64
demand_paging_test -b 20g -v 4
demand_paging_test -b 4g -v 32
All behaved as expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20201027233733.1484855 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Simplify demand_paging_test with timespec_diff_now

Add a helper function to get the current time and return the time since
a given start time. Use that function to simplify the timekeeping in the
demand paging test.

This series was tested by running the following invocations on an Intel
Skylake machine:
dirty_log_perf_test -b 20m -i 100 -v 64
dirty_log_perf_test -b 20g -i 5 -v 4
dirty_log_perf_test -b 4g -i 5 -v 32
demand_paging_test -b 20m -v 64
demand_paging_test -b 20g -v 4
demand_paging_test -b 4g -v 32
All behaved as expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20201027233733.1484855 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Remove address rounding in guest code

Rounding the address the guest writes to a host page boundary
will only have an effect if the host page size is larger than the guest
page size, but in that case the guest write would still go to the same
host page. There's no reason to round the address down, so remove the
rounding to simplify the demand paging test.

This series was tested by running the following invocations on an Intel
Skylake machine:
dirty_log_perf_test -b 20m -i 100 -v 64
dirty_log_perf_test -b 20g -i 5 -v 4
dirty_log_perf_test -b 4g -i 5 -v 32
demand_paging_test -b 20m -v 64
demand_paging_test -b 20g -v 4
demand_paging_test -b 4g -v 32
All behaved as expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20201027233733.1484855 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Factor code out of demand_paging_test

Much of the code in demand_paging_test can be reused by other, similar
multi-vCPU-memory-touching-perfromance-tests. Factor that common code
out for reuse.

No functional change expected.

This series was tested by running the following invocations on an Intel
Skylake machine:
dirty_log_perf_test -b 20m -i 100 -v 64
dirty_log_perf_test -b 20g -i 5 -v 4
dirty_log_perf_test -b 4g -i 5 -v 32
demand_paging_test -b 20m -v 64
demand_paging_test -b 20g -v 4
demand_paging_test -b 4g -v 32
All behaved as expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <20201027233733.1484855 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Use a single binary for dirty/clear log test

Remove the clear_dirty_log test, instead merge it into the existing
dirty_log_test. It should be cleaner to use this single binary to do
both tests, also it's a preparation for the upcoming dirty ring test.

The default behavior will run all the modes in sequence.

Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20201001012233 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Always clear dirty bitmap after iteration

We used not to clear the dirty bitmap before because KVM_GET_DIRTY_LOG
would overwrite it the next time it copies the dirty log onto it.
In the upcoming dirty ring tests we'll start to fetch dirty pages from
a ring buffer, so no one is going to clear the dirty bitmap for us.

Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20201001012228 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Add blessed SVE registers to get-reg-list

Add support for the SVE registers to get-reg-list and create a
new test, get-reg-list-sve, which tests them when running on a
machine with SVE support.

Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201029201703 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: selftests: Add aarch64 get-reg-list test

Check for KVM_GET_REG_LIST regressions. The blessed list was
created by running on v4.15 with the --core-reg-fixup option.
The following script was also used in order to annotate system
registers with their names when possible. When new system
registers are added the names can just be added manually using
the same grep.

while read reg; do
if [[ ! $reg =~ ARM64_SYS_REG ]]; then
printf "\t$reg\n"
continue
fi
encoding=$(echo "$reg" | sed "s/ARM64_SYS_REG(//;s/),//")
if ! name=$(grep "$encoding" ../../../../arch/arm64/include/asm/sysreg.h); then
printf "\t$reg\n"
continue
fi
name=$(echo "$name" | sed "s/.*SYS_//;s/[\t ]*sys_reg($encoding)$//")
printf "\t$reg\t/* $name */\n"
done < <(aarch64/get-reg-list --core-reg-fixup --list)

Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20201029201703 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

selftests: kvm: test enforcement of paravirtual cpuid features

Add a set of tests that ensure the guest cannot access paravirtual msrs
and hypercalls that have been disabled in the KVM_CPUID_FEATURES leaf.
Expect a #GP in the case of msr accesses and -KVM_ENOSYS from
hypercalls.

Cc: Jim Mattson <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>
Reviewed-by: Peter Shier <[email protected]>
Reviewed-by: Aaron Lewis <[email protected]>
Message-Id: <20201027231044 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

selftests: kvm: Add exception handling to selftests

Add the infrastructure needed to enable exception handling in selftests.
This allows any of the exception and interrupt vectors to be overridden
in the guest.

Signed-off-by: Aaron Lewis <[email protected]>
Reviewed-by: Alexander Graf <[email protected]>
Message-Id: <20201012194716.3950330 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>