powerpc: Don't hide eh field of lwarx behind a macro
The eh field must remain 0 for PPC32 and is only used
by PPC64.
Don't hide that behind a macro, just leave the responsibility
to the user.
At the time being, the only users of PPC_RAW_L{WDQ}ARX are
setting the eh field to 0, so the special handling of __PPC_EH
is useless. Just take the value given by the caller.
Same for DEFINE_TESTOP(), don't do special handling in that
macro, ensure the caller hands over the proper eh value.
Commit 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of
PPC_LWARX/LDARX macros") properly handled the eh field of lwarx
in asm/bitops.h but failed to clear it for PPC32 in
asm/simple_spinlock.h
So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64
but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which
returns 1 when CONFIG_PPC64 is set and 0 otherwise.
====================
Do not use RT_TOS for IPv6 flowlabel
According to Guillaume Nault RT_TOS should never be used for IPv6.
Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.
But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
====================
Matthias May [Fri, 5 Aug 2022 19:19:06 +0000 (21:19 +0200)]
ipv6: do not use RT_TOS for IPv6 flowlabel
According to Guillaume Nault RT_TOS should never be used for IPv6.
Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.
But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Acked-by: Guillaume Nault <[email protected]> Signed-off-by: Matthias May <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Matthias May [Fri, 5 Aug 2022 19:19:05 +0000 (21:19 +0200)]
mlx5: do not use RT_TOS for IPv6 flowlabel
According to Guillaume Nault RT_TOS should never be used for IPv6.
Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.
But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
Fixes: ce99f6b97fcd ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels") Acked-by: Guillaume Nault <[email protected]> Signed-off-by: Matthias May <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Matthias May [Fri, 5 Aug 2022 19:19:04 +0000 (21:19 +0200)]
vxlan: do not use RT_TOS for IPv6 flowlabel
According to Guillaume Nault RT_TOS should never be used for IPv6.
Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.
But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
Matthias May [Fri, 5 Aug 2022 19:19:03 +0000 (21:19 +0200)]
geneve: do not use RT_TOS for IPv6 flowlabel
According to Guillaume Nault RT_TOS should never be used for IPv6.
Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.
But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
Fixes: 3a56f86f1be6 ("geneve: handle ipv6 priority like ipv4 tos") Acked-by: Guillaume Nault <[email protected]> Signed-off-by: Matthias May <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Matthias May [Fri, 5 Aug 2022 19:00:06 +0000 (21:00 +0200)]
geneve: fix TOS inheriting for ipv4
The current code retrieves the TOS field after the lookup
on the ipv4 routing table. The routing process currently
only allows routing based on the original 3 TOS bits, and
not on the full 6 DSCP bits.
As a result the retrieved TOS is cut to the 3 bits.
However for inheriting purposes the full 6 bits should be used.
Extract the full 6 bits before the route lookup and use
that instead of the cut off 3 TOS bits.
net: atlantic: fix aq_vec index out of range error
The final update statement of the for loop exceeds the array range, the
dereference of self->aq_vec[i] is not checked and then leads to the
index out of range error.
Also fixed this kind of coding style in other for loop.
The following patchset contains Netfilter fixes for net:
1) Harden set element field checks to avoid out-of-bound memory access,
this patch also fixes the type of issue described in 7e6bc1f6cabc
("netfilter: nf_tables: stricter validation of element data") in a
broader way.
2) Patches to restrict the chain, set, and rule id lookup in the
transaction to the corresponding top-level table, patches from
Thadeu Lima de Souza Cascardo.
3) Fix incorrect comment in ip6t_LOG.h
4) nft_data_init() performs upfront validation of the expected data.
struct nft_data_desc is used to describe the expected data to be
received from userspace. The .size field represents the maximum size
that can be stored, for bound checks. Then, .len is an input/output field
which stores the expected length as input (this is optional, to restrict
the checks), as output it stores the real length received from userspace
(if it was not specified as input). This patch comes in response to 7e6bc1f6cabc ("netfilter: nf_tables: stricter validation of element data")
to address this type of issue in a more generic way by avoid opencoded
data validation. Next patch requires this as a dependency.
5) Disallow jump to implicit chain from set element, this configuration
is invalid. Only allow jump to chain via immediate expression is
supported at this stage.
6) Fix possible null-pointer derefence in the error path of table updates,
if memory allocation of the transaction fails. From Florian Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fix null deref due to zeroed list head
netfilter: nf_tables: disallow jump to implicit chain from set element
netfilter: nf_tables: upfront validation of data via nft_data_init()
netfilter: ip6t_LOG: Fix a typo in a comment
netfilter: nf_tables: do not allow RULE_ID to refer to another chain
netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
netfilter: nf_tables: do not allow SET_ID to refer to another table
netfilter: nf_tables: validate variable length element extension
====================
* Expand commit log to include summary of the discussion with Yonghong
* Make lru_bug selftest serial to not mess up refcount for map_kptr test
====================
Add a regression test to check against invalid check_and_init_map_value
call inside prealloc_lru_pop.
The kptr should not be reset to NULL once we set it after deleting the
map element. Hence, we trigger a program that updates the element
causing its reuse, and checks whether the unref kptr is reset or not.
If it is, prealloc_lru_pop does an incorrect check_and_init_map_value
call and the test fails.
The LRU map that is preallocated may have its elements reused while
another program holds a pointer to it from bpf_map_lookup_elem. Hence,
only check_and_free_fields is appropriate when the element is being
deleted, as it ensures proper synchronization against concurrent access
of the map value. After that, we cannot call check_and_init_map_value
again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while
they can be concurrently accessed from a BPF program.
This is safe to do as when the map entry is deleted, concurrent access
is protected against by check_and_free_fields, i.e. an existing timer
would be freed, and any existing kptr will be released by it. The
program can create further timers and kptrs after check_and_free_fields,
but they will eventually be released once the preallocated items are
freed on map destruction, even if the item is never reused again. Hence,
the deleted item sitting in the free list can still have resources
attached to it, and they would never leak.
With spin_lock, we never touch the field at all on delete or update, as
we may end up modifying the state of the lock. Since the verifier
ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock
call, the program will eventually release the lock so that on reuse the
new user of the value can take the lock.
Essentially, for the preallocated case, we must assume that the map
value may always be in use by the program, even when it is sitting in
the freelist, and handle things accordingly, i.e. use proper
synchronization inside check_and_free_fields, and never reinitialize the
special fields when it is reused on update.
Mikulas Patocka [Mon, 8 Aug 2022 14:50:10 +0000 (10:50 -0400)]
dm writecache: fix smatch warning about invalid return from writecache_map
There's a smatch warning "inconsistent returns '&wc->lock'" in
dm-writecache. The reason for the warning is that writecache_map()
doesn't drop the lock on the impossible path.
Fix this warning by adding wc_unlock() after the BUG statement (so
that it will be compiled-away anyway).
Fixes: df699cc16ea5e ("dm writecache: report invalid return from writecache_map helpers") Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
Mike Snitzer [Tue, 9 Aug 2022 22:07:28 +0000 (18:07 -0400)]
dm verity: fix verity_parse_opt_args parsing
Commit df326e7a0699 ("dm verity: allow optional args to alter primary
args handling") introduced a bug where verity_parse_opt_args() wouldn't
properly shift past an optional argument's additional params (by
ignoring them).
Fix this by avoiding returning with error if an unknown argument is
encountered when @only_modifier_opts=true is passed to
verity_parse_opt_args().
In practice this regressed the cryptsetup testsuite's FEC testing
because unknown optional arguments were encountered, wherey
short-circuiting ever testing FEC mode. With this fix all of the
cryptsetup testsuite's verity FEC tests pass.
Fixes: df326e7a0699 ("dm verity: allow optional args to alter primary args handling") Reported-by: Milan Broz <[email protected]>> Signed-off-by: Mike Snitzer <[email protected]>
Mike Snitzer [Tue, 9 Aug 2022 21:33:12 +0000 (17:33 -0400)]
dm verity: fix DM_VERITY_OPTS_MAX value yet again
Must account for the possibility that "try_verify_in_tasklet" is used.
This is the same issue that was fixed with commit 160f99db94322 -- it
is far too easy to miss that additional a new argument(s) require
bumping DM_VERITY_OPTS_MAX accordingly.
Historically none of the bufio code runs in interrupt context but with
the use of DM_BUFIO_CLIENT_NO_SLEEP a bufio client can, see: commit 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
That said, the new tasklet usecase still doesn't require interrupts be
disabled by bufio (let alone conditionally restore them).
Yet with PREEMPT_RT, and falling back from tasklet to workqueue, care
must be taken to properly synchronize between softirq and process
context, otherwise ABBA deadlock may occur. While it is unnecessary to
disable bottom-half preemption within a tasklet, we must consistently do
so in process context to ensure locking is in the proper order.
Fix these issues by switching from spin_lock_irq{save,restore} to using
spin_{lock,unlock}_bh instead. Also remove the 'spinlock_flags' member
in dm_bufio_client struct (that can be used unsafely if bufio must
recurse on behalf of some caller, e.g. block layer's submit_bio).
Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.
Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).
Linus Torvalds [Tue, 9 Aug 2022 21:56:49 +0000 (14:56 -0700)]
Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"Work on 'courteous server', which was introduced in 5.19, continues
apace. This release introduces a more flexible limit on the number of
NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in
courtesy state long after the lease expiration timeout. The client
limit is adjusted based on the physical memory size of the server.
The NFSD filecache is a cache of files held open by NFSv4 clients or
recently touched by NFSv2 or NFSv3 clients. This cache had some
significant scalability constraints that have been relieved in this
release. Thanks to all who contributed to this work.
A data corruption bug found during the most recent NFS bake-a-thon
that involves NFSv3 and NFSv4 clients writing the same file has been
addressed in this release.
This release includes several improvements in CPU scalability for
NFSv4 operations. In addition, Neil Brown provided patches that
simplify locking during file lookup, creation, rename, and removal
that enables subsequent work on making these operations more scalable.
We expect to see that work materialize in the next release.
There are also numerous single-patch fixes, clean-ups, and the usual
improvements in observability"
* tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits)
lockd: detect and reject lock arguments that overflow
NFSD: discard fh_locked flag and fh_lock/fh_unlock
NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
NFSD: use explicit lock/unlock for directory ops
NFSD: reduce locking in nfsd_lookup()
NFSD: only call fh_unlock() once in nfsd_link()
NFSD: always drop directory lock in nfsd_unlink()
NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
NFSD: add posix ACLs to struct nfsd_attrs
NFSD: add security label to struct nfsd_attrs
NFSD: set attributes when creating symlinks
NFSD: introduce struct nfsd_attrs
NFSD: verify the opened dentry after setting a delegation
NFSD: drop fh argument from alloc_init_deleg
NFSD: Move copy offload callback arguments into a separate structure
NFSD: Add nfsd4_send_cb_offload()
NFSD: Remove kmalloc from nfsd4_do_async_copy()
NFSD: Refactor nfsd4_do_copy()
NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
...
Dan Carpenter [Mon, 1 Aug 2022 10:17:32 +0000 (13:17 +0300)]
NTB: EPF: Tidy up some bounds checks
This sscanf() is reading from the filename which was set by the kernel
so it should be trust worthy. Although the data is likely trust worthy
there is some bounds checking but unfortunately, it is not complete or
consistent. Additionally, the Smatch static checker marks everything
that comes from sscanf() as tainted and so Smatch complains that this
code can lead to an out of bounds issue. Let's clean things up and make
Smatch happy.
The first problem is that there is no bounds checking in the _show()
functions. The _store() and _show() functions are very similar so make
the bounds checking the same in both.
The second issue is that if "win_no" is zero it leads to an array
underflow so add an if (win_no <= 0) check for that.
Dan Carpenter [Mon, 1 Aug 2022 10:15:25 +0000 (13:15 +0300)]
NTB: EPF: Fix error code in epf_ntb_bind()
Return an error code if pci_register_driver() fails. Don't return
success.
Fixes: da51fd247424 ("NTB: EPF: support NTB transfer between PCI RC and EP connection") Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Souptick Joarder (HPE) <[email protected]> Signed-off-by: Jon Mason <[email protected]>
Tom Rix [Mon, 4 Jul 2022 13:25:59 +0000 (09:25 -0400)]
PCI: endpoint: pci-epf-vntb: reduce several globals to statics
sparse reports
drivers/pci/endpoint/functions/pci-epf-vntb.c:975:5: warning: symbol 'pci_read' was not declared. Should it be static?
drivers/pci/endpoint/functions/pci-epf-vntb.c:984:5: warning: symbol 'pci_write' was not declared. Should it be static?
drivers/pci/endpoint/functions/pci-epf-vntb.c:989:16: warning: symbol 'vpci_ops' was not declared. Should it be static?
These functions and variables are only used in pci-epf-vntb.c, so their storage
class specifiers should be static.
Fixes: ff32fac00d97 ("NTB: EPF: support NTB transfer between PCI RC and EP connection") Signed-off-by: Tom Rix <[email protected]> Acked-by: Frank Li <[email protected]> Signed-off-by: Jon Mason <[email protected]>
Yang Yingliang [Sat, 25 Jun 2022 02:15:16 +0000 (10:15 +0800)]
PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()
In error case of epf_ntb_mw_bar_init(), memory window BARs should be
cleared, so add 'num_mws' parameter in epf_ntb_mw_bar_clear() and
calling it in error path to clear the BARs. Also add missing error
code when pci_epc_mem_alloc_addr() fails.
Fixes: ff32fac00d97 ("NTB: EPF: support NTB transfer between PCI RC and EP connection") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: Jon Mason <[email protected]>
Ren Zhijie [Fri, 24 Jun 2022 01:19:11 +0000 (09:19 +0800)]
PCI: endpoint: Fix Kconfig dependency
If CONFIG_NTB is not set and CONFIG_PCI_EPF_VNTB is y.
make ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu-, will be failed, like this:
drivers/pci/endpoint/functions/pci-epf-vntb.o: In function `epf_ntb_cmd_handler':
pci-epf-vntb.c:(.text+0x95e): undefined reference to `ntb_db_event'
pci-epf-vntb.c:(.text+0xa1f): undefined reference to `ntb_link_event'
pci-epf-vntb.c:(.text+0xa42): undefined reference to `ntb_link_event'
drivers/pci/endpoint/functions/pci-epf-vntb.o: In function `pci_vntb_probe':
pci-epf-vntb.c:(.text+0x1250): undefined reference to `ntb_register_device'
The functions ntb_*() are defined in drivers/ntb/core.c, which need CONFIG_NTB setting y to be build-in.
To fix this build error, add depends on NTB.
Fix the warning by using code-block directive to format the diagram
inside code block, as in other diagrams in Documentation/. While at it,
unindent the preceeding text.
Frank Li [Tue, 22 Feb 2022 16:23:55 +0000 (10:23 -0600)]
Documentation: PCI: Add specification for the PCI vNTB function device
Add specification for the PCI vNTB function device. The endpoint function
driver and the host PCI driver should be created based on this
specification.
Frank Li [Tue, 22 Feb 2022 16:23:53 +0000 (10:23 -0600)]
NTB: epf: Allow more flexibility in the memory BAR map method
Support the below BAR configuration methods for epf NTB.
BAR 0: config and scratchpad
BAR 2: doorbell
BAR 4: memory map windows
Set difference BAR number information into struct ntb_epf_data. So difference
VID/PID can choose different BAR configurations. There are difference
BAR map method between epf NTB and epf vNTB Endpoint function.
ntb_mw_set_trans() will set memory map window after endpoint function
driver bind. The inbound map address need be updated dynamically when
using NTB by PCIe Root Port and PCIe Endpoint connection.
Checking if iatu already assigned to the BAR, if yes, using assigned iatu
number to update inbound address map and skip set BAR's register.
Replace existing limited example with proper code for Qualcomm Resource
Power Manager (RPM) over SMD based on MSM8916. This also fixes the
example's indentation.
The child node of smd is an SMD edge representing remote subsystem.
Bring back missing reference from previously sent patch (disappeared
when applying).
dt-bindings: mmc: sdhci-msm: Fix 'operating-points-v2 was unexpected' issue
As Rob reported in [1], there is one more issue present
in the 'sdhci-msm' dt-binding which shows up when a fix for
'unevaluatedProperties' handling is applied:
Documentation/devicetree/bindings/mmc/sdhci-msm.example.dtb:
mmc@8804000: Unevaluated properties are not allowed
('operating-points-v2' was unexpected)
dt-bindings: display: simple-framebuffer: Drop Bartlomiej Zolnierkiewicz
Bartlomiej's Samsung email address is not working since around last
year and there was no follow up patch take over of the drivers, so drop
the email from maintainers.
Sebastian Würl [Thu, 4 Aug 2022 08:14:11 +0000 (10:14 +0200)]
can: mcp251x: Fix race condition on receive interrupt
The mcp251x driver uses both receiving mailboxes of the CAN controller
chips. For retrieving the CAN frames from the controller via SPI, it checks
once per interrupt which mailboxes have been filled and will retrieve the
messages accordingly.
This introduces a race condition, as another CAN frame can enter mailbox 1
while mailbox 0 is emptied. If now another CAN frame enters mailbox 0 until
the interrupt handler is called next, mailbox 0 is emptied before
mailbox 1, leading to out-of-order CAN frames in the network device.
This is fixed by checking the interrupt flags once again after freeing
mailbox 0, to correctly also empty mailbox 1 before leaving the handler.
For reproducing the bug I created the following setup:
- Two CAN devices, one Raspberry Pi with MCP2515, the other can be any.
- Setup CAN to 1 MHz
- Spam bursts of 5 CAN-messages with increasing CAN-ids
- Continue sending the bursts while sleeping a second between the bursts
- Check on the RPi whether the received messages have increasing CAN-ids
- Without this patch, every burst of messages will contain a flipped pair
The issue seems similar to commit 90b3b339364c ("net: hisilicon: Fix a BUG
trigered by wrong bytes_compl") and potentially introduced by commit b38c83dd0866 ("bgmac: simplify tx ring index handling").
If there is an RX interrupt between setting ring->end
and netdev_sent_queue() we can hit the BUG_ON as bgmac_dma_tx_free()
can miscalculate the queue size while called from bgmac_poll().
The machine which triggered the BUG runs a v4.14 RT kernel - but the issue
seems present in mainline too.
Vladimir Oltean [Mon, 8 Aug 2022 12:51:27 +0000 (15:51 +0300)]
net: dsa: felix: suppress non-changes to the tagging protocol
The way in which dsa_tree_change_tag_proto() works is that when
dsa_tree_notify() fails, it doesn't know whether the operation failed
mid way in a multi-switch tree, or it failed for a single-switch tree.
So even though drivers need to fail cleanly in
ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
again, to restore the old tag protocol for potential switches in the
tree where the change did succeeed (before failing for others).
This means for the felix driver that if we report an error in
felix_change_tag_protocol(), we'll get another call where proto_ops ==
old_proto_ops. If we proceed to act upon that, we may do unexpected
things. For example, we will call dsa_tag_8021q_register() twice in a
row, without any dsa_tag_8021q_unregister() in between. Then we will
actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
which (if it manages to run at all, after walking through corrupted data
structures) will leave the ports inoperational anyway.
The bug can be readily reproduced if we force an error while in
tag_8021q mode; this crashes the kernel.
netfilter: nf_tables: fix null deref due to zeroed list head
In nf_tables_updtable, if nf_tables_table_enable returns an error,
nft_trans_destroy is called to free the transaction object.
nft_trans_destroy() calls list_del(), but the transaction was never
placed on a list -- the list head is all zeroes, this results in
a null dereference:
BUG: KASAN: null-ptr-deref in nft_trans_destroy+0x26/0x59
Call Trace:
nft_trans_destroy+0x26/0x59
nf_tables_newtable+0x4bc/0x9bc
[..]
Its sane to assume that nft_trans_destroy() can be called
on the transaction object returned by nft_trans_alloc(), so
make sure the list head is initialised.
Fixes: 55dd6f93076b ("netfilter: nf_tables: use new transaction infrastructure to handle table") Reported-by: mingi cho <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: upfront validation of data via nft_data_init()
Instead of parsing the data and then validate that type and length are
correct, pass a description of the expected data so it can be validated
upfront before parsing it to bail out earlier.
This patch adds a new .size field to specify the maximum size of the
data area. The .len field is optional and it is used as an input/output
field, it provides the specific length of the expected data in the input
path. If then .len field is not specified, then obtained length from the
netlink attribute is stored. This is required by cmp, bitwise, range and
immediate, which provide no netlink attribute that describes the data
length. The immediate expression uses the destination register type to
infer the expected data type.
Relying on opencoded validation of the expected data might lead to
subtle bugs as described in 7e6bc1f6cabc ("netfilter: nf_tables:
stricter validation of element data").
Trond Myklebust [Tue, 9 Aug 2022 16:50:28 +0000 (12:50 -0400)]
NFS: Improve write error tracing
Don't leak request pointers, but use the "device:inode" labelling that
is used by all the other trace points. Furthermore, replace use of page
indexes with an offset, again in order to align behaviour with other
NFS trace points.
posix-cpu-timers: Cleanup CPU timers before freeing them during exec
Commit 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a
task") started looking up tasks by PID when deleting a CPU timer.
When a non-leader thread calls execve, it will switch PIDs with the leader
process. Then, as it calls exit_itimers, posix_cpu_timer_del cannot find
the task because the timer still points out to the old PID.
That means that armed timers won't be disarmed, that is, they won't be
removed from the timerqueue_list. exit_itimers will still release their
memory, and when that list is later processed, it leads to a
use-after-free.
Clean up the timers from the de-threaded task before freeing them. This
prevents a reported use-after-free.
Youngmin Nam [Tue, 12 Jul 2022 09:47:15 +0000 (18:47 +0900)]
time: Correct the prototype of ns_to_kernel_old_timeval and ns_to_timespec64
In ns_to_kernel_old_timeval() definition, the function argument is defined
with const identifier in kernel/time/time.c, but the prototype in
include/linux/time32.h looks different.
- The function is defined in kernel/time/time.c as below:
struct __kernel_old_timeval ns_to_kernel_old_timeval(const s64 nsec)
- The function is decalared in include/linux/time32.h as below:
extern struct __kernel_old_timeval ns_to_kernel_old_timeval(s64 nsec);
Because the variable of arithmethic types isn't modified in the calling scope,
there's no need to mark arguments as const, which was already mentioned during
review (Link[1) of the original patch.
Likewise remove the "const" keyword in both definition and declaration of
ns_to_timespec64() as requested by Arnd (Link[2]).
netfilter: nf_tables: do not allow RULE_ID to refer to another chain
When doing lookups for rules on the same batch by using its ID, a rule from
a different chain can be used. If a rule is added to a chain but tries to
be positioned next to a rule from a different chain, it will be linked to
chain2, but the use counter on chain1 would be the one to be incremented.
When looking for rules by ID, use the chain that was used for the lookup by
name. The chain used in the context copied to the transaction needs to
match that same chain. That way, struct nft_rule does not need to get
enlarged with another member.
Fixes: 1a94e38d254b ("netfilter: nf_tables: add NFTA_RULE_ID attribute") Fixes: 75dd48e2e420 ("netfilter: nf_tables: Support RULE_ID reference in new rule") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Cc: <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
When doing lookups for chains on the same batch by using its ID, a chain
from a different table can be used. If a rule is added to a table but
refers to a chain in a different table, it will be linked to the chain in
table2, but would have expressions referring to objects in table1.
Then, when table1 is removed, the rule will not be removed as its linked to
a chain in table2. When expressions in the rule are processed or removed,
that will lead to a use-after-free.
When looking for chains by ID, use the table that was used for the lookup
by name, and only return chains belonging to that same table.
Fixes: 837830a4b439 ("netfilter: nf_tables: add NFTA_RULE_CHAIN_ID attribute") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Cc: <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: do not allow SET_ID to refer to another table
When doing lookups for sets on the same batch by using its ID, a set from a
different table can be used.
Then, when the table is removed, a reference to the set may be kept after
the set is freed, leading to a potential use-after-free.
When looking for sets by ID, use the table that was used for the lookup by
name, and only return sets belonging to that same table.
This fixes CVE-2022-2586, also reported as ZDI-CAN-17470.
Reported-by: Team Orca of Sea Security (@seasecresponse) Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Cc: <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: validate variable length element extension
Update template to validate variable length extensions. This patch adds
a new .ext_len[id] field to the template to store the expected extension
length. This is used to sanity check the initialization of the variable
length extension.
Use PTR_ERR() in nft_set_elem_init() to report errors since, after this
update, there are two reason why this might fail, either because of
ENOMEM or insufficient room in the extension field (EINVAL).
Kernels up until 7e6bc1f6cabc ("netfilter: nf_tables: stricter
validation of element data") allowed to copy more data to the extension
than was allocated. This ext_len field allows to validate if the
destination has the correct size as additional check.
Sakari Ailus [Mon, 8 Aug 2022 21:12:13 +0000 (00:12 +0300)]
ACPI: property: Fix error handling in acpi_init_properties()
buf.pointer, memory for storing _DSD data and nodes, was released if either
parsing properties or, as recently added, attaching data node tags failed.
Alas, properties were still left pointing to this memory if parsing
properties were successful but attaching data node tags failed.
Fix this by separating error handling for the two, and leaving properties
intact if data nodes cannot be tagged for a reason or another.
Reported-by: kernel test robot <[email protected]> Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles") Signed-off-by: Sakari Ailus <[email protected]>
[ rjw: Drop unrelated white space change ] Signed-off-by: Rafael J. Wysocki <[email protected]>
Linus Torvalds [Tue, 9 Aug 2022 17:11:56 +0000 (10:11 -0700)]
Merge tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache updates from David Howells:
- Fix a cookie access ref leak if a cookie is invalidated a second time
before the first invalidation is actually processed.
- Add a tracepoint to log cookie lookup failure
* tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
fscache: add tracepoint when failing cookie
fscache: don't leak cookie access refs if invalidation is in progress or failed
Linus Torvalds [Tue, 9 Aug 2022 17:08:08 +0000 (10:08 -0700)]
Merge tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
"Fix AFS refcount handling.
The first patch converts afs to use refcount_t for its refcounts and
the second patch fixes afs_put_call() and afs_put_server() to save the
values they're going to log in the tracepoint before decrementing the
refcount"
* tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix access after dec in put functions
afs: Use refcount_t rather than atomic_t
Linus Torvalds [Tue, 9 Aug 2022 16:52:28 +0000 (09:52 -0700)]
Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull setgid updates from Christian Brauner:
"This contains the work to move setgid stripping out of individual
filesystems and into the VFS itself.
Creating files that have both the S_IXGRP and S_ISGID bit raised in
directories that themselves have the S_ISGID bit set requires
additional privileges to avoid security issues.
When a filesystem creates a new inode it needs to take care that the
caller is either in the group of the newly created inode or they have
CAP_FSETID in their current user namespace and are privileged over the
parent directory of the new inode. If any of these two conditions is
true then the S_ISGID bit can be raised for an S_IXGRP file and if not
it needs to be stripped.
However, there are several key issues with the current implementation:
- S_ISGID stripping logic is entangled with umask stripping.
For example, if the umask removes the S_IXGRP bit from the file
about to be created then the S_ISGID bit will be kept.
The inode_init_owner() helper is responsible for S_ISGID stripping
and is called before posix_acl_create(). So we can end up with two
different orderings:
1. FS without POSIX ACL support
First strip umask then strip S_ISGID in inode_init_owner().
In other words, if a filesystem doesn't support or enable POSIX
ACLs then umask stripping is done directly in the vfs before
calling into the filesystem:
2. FS with POSIX ACL support
First strip S_ISGID in inode_init_owner() then strip umask in
posix_acl_create().
In other words, if the filesystem does support POSIX ACLs then
unmask stripping may be done in the filesystem itself when
calling posix_acl_create().
Note that technically filesystems are free to impose their own
ordering between posix_acl_create() and inode_init_owner() meaning
that there's additional ordering issues that influence S_ISGID
inheritance.
(Note that the commit message of commit 1639a49ccdce ("fs: move
S_ISGID stripping into the vfs_*() helpers") gets the ordering
between inode_init_owner() and posix_acl_create() the wrong way
around. I realized this too late.)
- Filesystems that don't rely on inode_init_owner() don't get S_ISGID
stripping logic.
While that may be intentional (e.g. network filesystems might just
defer setgid stripping to a server) it is often just a security
issue.
Note that mandating the use of inode_init_owner() was proposed as
an alternative solution but that wouldn't fix the ordering issues
and there are examples such as afs where the use of
inode_init_owner() isn't possible.
In any case, we should also try the cleaner and generalized
solution first before resorting to this approach.
- We still have S_ISGID inheritance bugs years after the initial
round of S_ISGID inheritance fixes:
e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories")
All of this led us to conclude that the current state is too messy.
While we won't be able to make it completely clean as
posix_acl_create() is still a filesystem specific call we can improve
the S_SIGD stripping situation quite a bit by hoisting it out of
inode_init_owner() and into the respective vfs creation operations.
The obvious advantage is that we don't need to rely on individual
filesystems getting S_ISGID stripping right and instead can
standardize the ordering between S_ISGID and umask stripping directly
in the VFS.
A few short implementation notes:
- The stripping logic needs to happen in vfs_*() helpers for the sake
of stacking filesystems such as overlayfs that rely on these
helpers taking care of S_ISGID stripping.
- Security hooks have never seen the mode as it is ultimately seen by
the filesystem because of the ordering issue we mentioned. Nothing
is changed for them. We simply continue to strip the umask before
passing the mode down to the security hooks.
- The following filesystems use inode_init_owner() and thus relied on
S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs,
hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs,
overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs,
bpf, tmpfs.
We've audited all callchains as best as we could. More details can
be found in the commit message to 1639a49ccdce ("fs: move S_ISGID
stripping into the vfs_*() helpers")"
* tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
ceph: rely on vfs for setgid stripping
fs: move S_ISGID stripping into the vfs_*() helpers
fs: Add missing umask strip in vfs_tmpfile
fs: add mode_strip_sgid() helper
Linus Torvalds [Tue, 9 Aug 2022 16:48:30 +0000 (09:48 -0700)]
Merge tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
- An optimization in memblock_add_range() to reduce array traversals
- Improvements to the memblock test suite
* tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock test: Modify the obsolete description in README
memblock tests: fix compilation errors
memblock tests: change build options to run-time options
memblock tests: remove completed TODO items
memblock tests: set memblock_debug to enable memblock_dbg() messages
memblock tests: add verbose output to memblock tests
memblock tests: Makefile: add arguments to control verbosity
memblock: avoid some repeat when add new range
Dmitry Osipenko [Thu, 30 Jun 2022 20:04:04 +0000 (23:04 +0300)]
drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
Use ww_acquire_fini() in the error code paths. Otherwise lockdep
thinks that lock is held when lock's memory is freed after the
drm_gem_lock_reservations() error. The ww_acquire_context needs to be
annotated as "released", which fixes the noisy "WARNING: held lock freed!"
splat of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep.
Linus Torvalds [Tue, 9 Aug 2022 16:39:25 +0000 (09:39 -0700)]
Merge tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu fixes from Greg Ungerer:
- spelling in comment
- compilation when flexcan driver enabled
- sparse warning
* tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: Fix syntax errors in comments
m68k: coldfire: make symbol m523x_clk_lookup static
m68k: coldfire/device.c: protect FLEXCAN blocks
Linus Torvalds [Tue, 9 Aug 2022 16:29:07 +0000 (09:29 -0700)]
Merge tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 eIBRS fixes from Borislav Petkov:
"More from the CPU vulnerability nightmares front:
Intel eIBRS machines do not sufficiently mitigate against RET
mispredictions when doing a VM Exit therefore an additional RSB,
one-entry stuffing is needed"
* tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Add LFENCE to RSB fill sequence
x86/speculation: Add RSB VM Exit protections
Dmitry Osipenko [Thu, 30 Jun 2022 20:00:57 +0000 (23:00 +0300)]
drm/shmem-helper: Add missing vunmap on error
The vmapping of dma-buf may succeed, but DRM SHMEM rejects the IOMEM
mapping, and thus, drm_gem_shmem_vmap_locked() should unvmap the IOMEM
before erroring out.
Dan Carpenter [Wed, 20 Jul 2022 18:28:18 +0000 (21:28 +0300)]
NTB: ntb_tool: uninitialized heap data in tool_fn_write()
The call to:
ret = simple_write_to_buffer(buf, size, offp, ubuf, size);
will return success if it is able to write even one byte to "buf".
The value of "*offp" controls which byte. This could result in
reading uninitialized data when we do the sscanf() on the next line.
This code is not really desigined to handle partial writes where
*offp is non-zero and the "buf" is preserved and re-used between writes.
Just ban partial writes and replace the simple_write_to_buffer() with
copy_from_user().
Fixes: 578b881ba9c4 ("NTB: Add tool test client") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Jon Mason <[email protected]>
When building with Clang we encounter these warnings:
| drivers/ntb/hw/idt/ntb_hw_idt.c:2409:28: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat]
| "\t%hhu-%hhu.\t", idx + cnt - 1);
-
| drivers/ntb/hw/idt/ntb_hw_idt.c:2438:29: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat]
| "\t%hhu-%hhu.\t", idx + cnt - 1);
-
| drivers/ntb/hw/idt/ntb_hw_idt.c:2484:15: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat], src);
For the first two warnings the format specifier used is `%hhu` which
describes a u8. Both `idx` and `cnt` are u8 as well. However, the
expression as a whole is promoted to an int as you cannot get
smaller-than-int from addition. Therefore, to fix the warning, use the
promoted-to-type's format specifier -- in this case `%d`.
example:
``
uint8_t a = 4, b = 7;
int size = sizeof(a + b - 1);
printf("%d\n", size);
// output: 4
```
For the last warning, src is of type `int` while the format specifier
describes a u8. The fix here is just to use the proper specifier `%d`.
See more:
(https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules)
"Integer types smaller than int are promoted when an operation is
performed on them. If all values of the original type can be represented
as an int, the value of the smaller type is converted to an int;
otherwise, it is converted to an unsigned int."
Jeff Layton [Fri, 5 Aug 2022 10:42:45 +0000 (06:42 -0400)]
fscache: don't leak cookie access refs if invalidation is in progress or failed
It's possible for a request to invalidate a fscache_cookie will come in
while we're already processing an invalidation. If that happens we
currently take an extra access reference that will leak. Only call
__fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit
was previously clear.
Also, ensure that we attempt to clear the bit when the cookie is
"FAILED" and put the reference to avoid an access leak.
Fixes: 85e4ea1049c7 ("fscache: Fix invalidation/lookup race") Suggested-by: David Howells <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: David Howells <[email protected]>
Takashi Iwai [Tue, 9 Aug 2022 07:32:59 +0000 (09:32 +0200)]
ALSA: usb-audio: More comprehensive mixer map for ASUS ROG Zenith II
ASUS ROG Zenith II has two USB interfaces, one for the front headphone
and another for the rest I/O. Currently we provided the mixer mapping
for the latter but with an incomplete form.
This patch corrects and provides more comprehensive mixer mapping, as
well as providing the proper device names for both the front headphone
and main audio.
ALSA: scarlett2: Add Focusrite Clarett+ 8Pre support
The Focusrite Clarett+ 8Pre uses the same protocol as the Scarlett Gen
2 and Gen 3 product range. This patch adds support for the Clarett+
8Pre by adding appropriate entries to the scarlett2 driver.
The Clarett+ 2Pre and 4Pre, and the Clarett USB product line
presumably use the same protocol as well, so support for them can
easily be added if someone can test.
clang emits a -Wunaligned-access warning on struct __packed
ems_cpc_msg.
The reason is that the anonymous union msg (not declared as packed) is
being packed right after some non naturally aligned variables (3*8
bits + 2*32) inside a packed struct:
| struct __packed ems_cpc_msg {
| u8 type; /* type of message */
| u8 length; /* length of data within union 'msg' */
| u8 msgid; /* confirmation handle */
| __le32 ts_sec; /* timestamp in seconds */
| __le32 ts_nsec; /* timestamp in nano seconds */
| /* ^ not naturally aligned */
|
| union {
| /* ^ not declared as packed */
| u8 generic[64];
| struct cpc_can_msg can_msg;
| struct cpc_can_params can_params;
| struct cpc_confirm confirmation;
| struct cpc_overrun overrun;
| struct cpc_can_error error;
| struct cpc_can_err_counter err_counter;
| u8 can_state;
| } msg;
| };
Starting from LLVM 14, having an unpacked struct nested in a packed
struct triggers a warning. c.f. [1].
Fix the warning by marking the anonymous union as packed.
Fedor Pchelkin [Fri, 5 Aug 2022 15:02:16 +0000 (18:02 +0300)]
can: j1939: j1939_session_destroy(): fix memory leak of skbs
We need to drop skb references taken in j1939_session_skb_queue() when
destroying a session in j1939_session_destroy(). Otherwise those skbs
would be lost.
Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
can: j1939: j1939_sk_queue_activate_next_locked(): replace WARN_ON_ONCE with netdev_warn_once()
We should warn user-space that it is doing something wrong when trying
to activate sessions with identical parameters but WARN_ON_ONCE macro
can not be used here as it serves a different purpose.
So it would be good to replace it with netdev_warn_once() message.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Jakub Kicinski [Tue, 9 Aug 2022 03:59:07 +0000 (20:59 -0700)]
Merge tag 'for-net-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fixes various issues related to ISO channel/socket support
- Fixes issues when building with C=1
- Fix cancel uninitilized work which blocks syzbot to run
* tag 'for-net-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: ISO: Fix not using the correct QoS
Bluetooth: don't try to cancel uninitialized works at mgmt_index_removed()
Bluetooth: ISO: Fix iso_sock_getsockopt for BT_DEFER_SETUP
Bluetooth: MGMT: Fixes build warnings with C=1
Bluetooth: hci_event: Fix build warning with C=1
Bluetooth: ISO: Fix memory corruption
Bluetooth: Fix null pointer deref on unexpected status event
Bluetooth: ISO: Fix info leak in iso_sock_getsockopt()
Bluetooth: hci_conn: Fix updating ISO QoS PHY
Bluetooth: ISO: unlock on error path in iso_sock_setsockopt()
Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression
====================
Since
commit e6e771b3d897 ("s390/qeth: detach netdevice while card is offline")
there was a timing window during recovery, that qeth_query_card_info could
be sent to the card, even before it was ready for it, leading to a failing
card recovery. There is evidence that this window was hit, as not all
callers of get_link_ksettings() check for netif_device_present.
Use cached values in qeth_get_link_ksettings(), instead of calling
qeth_query_card_info() and falling back to default values in case it
fails. Link info is already updated when the card goes online, e.g. after
STARTLAN (physical link up). Set the link info to default values, when the
card goes offline or at STOPLAN (physical link down). A follow-on patch
will improve values reported for link down.
Nikita Shubin [Fri, 5 Aug 2022 08:48:43 +0000 (11:48 +0300)]
net: phy: dp83867: fix get nvmem cell fail
If CONFIG_NVMEM is not set of_nvmem_cell_get, of_nvmem_device_get
functions will return ERR_PTR(-EOPNOTSUPP) and "failed to get nvmem
cell io_impedance_ctrl" error would be reported despite "io_impedance_ctrl"
is completely missing in Device Tree and we should use default values.
Check -EOPNOTSUPP togather with -ENOENT to avoid this situation.
Duoming Zhou [Fri, 5 Aug 2022 07:00:08 +0000 (15:00 +0800)]
atm: idt77252: fix use-after-free bugs caused by tst_timer
There are use-after-free bugs caused by tst_timer. The root cause
is that there are no functions to stop tst_timer in idt77252_exit().
One of the possible race conditions is shown below:
min_gate_len[0] and min_gate_len[1] should be 200000, while
min_gate_len[2-7] should be 0.
However what happens is that min_gate_len[0] is 200000, but
min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the
point where the logic detects the gate close event for TC 1).
The problem is that the code considers a "gate close" event whenever it
sees that there is a 0 for that TC (essentially it's level rather than
edge triggered). By doing that, any time a gate is seen as closed
without having been open prior, gate_len, which is 0, will be written
into min_gate_len. Once min_gate_len becomes 0, it's impossible for it
to track anything higher than that (the length of actually open
intervals).
To fix this, we make the writing to min_gate_len[tc] be edge-triggered,
which avoids writes for gates that are closed in consecutive intervals.
However what this does is it makes us need to special-case the
permanently closed gates at the end.
Martin Schiller [Fri, 5 Aug 2022 06:18:10 +0000 (08:18 +0200)]
net/x25: fix call timeouts in blocking connects
When a userspace application starts a blocking connect(), a CALL REQUEST
is sent, the t21 timer is started and the connect is waiting in
x25_wait_for_connection_establishment(). If then for some reason the t21
timer expires before any reaction on the assigned logical channel (e.g.
CALL ACCEPT, CLEAR REQUEST), there is sent a CLEAR REQUEST and timer
t23 is started waiting for a CLEAR confirmation. If we now receive a
CLEAR CONFIRMATION from the peer, x25_disconnect() is called in
x25_state2_machine() with reason "0", which means "normal" call
clearing. This is ok, but the parameter "reason" is used as sk->sk_err
in x25_disconnect() and sock_error(sk) is evaluated in
x25_wait_for_connection_establishment() to check if the call is still
pending. As "0" is not rated as an error, the connect will stuck here
forever.
To fix this situation, also check if the sk->sk_state changed form
TCP_SYN_SENT to TCP_CLOSE in the meantime, which is also done by
x25_disconnect().
If tsnep_tx_map() fails, then tsnep_tx_unmap() shall start at the write
index like tsnep_tx_map(). This is different to the normal operation.
Thus, add an additional parameter to tsnep_tx_unmap() to enable start at
different positions for successful TX and failed TX.
Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver") Signed-off-by: Gerhard Engleder <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
drivers/net/ethernet/engleder/tsnep_main.c:1254:34: warning:
'tsnep_of_match' defined but not used [-Wunused-const-variable=]
of_match_ptr() compiles into NULL if CONFIG_OF is disabled.
tsnep_of_match exists always so use of of_match_ptr() is useless.
Fix warning by dropping of_match_ptr().
Linus Torvalds [Tue, 9 Aug 2022 03:15:13 +0000 (20:15 -0700)]
Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull ksmbd updates from Steve French:
- fixes for memory access bugs (out of bounds access, oops, leak)
- multichannel fixes
- session disconnect performance improvement, and session register
improvement
- cleanup
* tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix heap-based overflow in set_ntacl_dacl()
ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
ksmbd: prevent out of bound read for SMB2_WRITE
ksmbd: fix use-after-free bug in smb2_tree_disconect
ksmbd: fix memory leak in smb2_handle_negotiate
ksmbd: fix racy issue while destroying session on multichannel
ksmbd: use wait_event instead of schedule_timeout()
ksmbd: fix kernel oops from idr_remove()
ksmbd: add channel rwlock
ksmbd: replace sessions list in connection with xarray
MAINTAINERS: ksmbd: add entry for documentation
ksmbd: remove unused ksmbd_share_configs_cleanup function
Linus Torvalds [Tue, 9 Aug 2022 03:04:35 +0000 (20:04 -0700)]
Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more iov_iter updates from Al Viro:
- more new_sync_{read,write}() speedups - ITER_UBUF introduction
- ITER_PIPE cleanups
- unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
switching them to advancing semantics
- making ITER_PIPE take high-order pages without splitting them
- handling copy_page_from_iter() for high-order pages properly
* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
fix copy_page_from_iter() for compound destinations
hugetlbfs: copy_page_to_iter() can deal with compound pages
copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
expand those iov_iter_advance()...
pipe_get_pages(): switch to append_pipe()
get rid of non-advancing variants
ceph: switch the last caller of iov_iter_get_pages_alloc()
9p: convert to advancing variant of iov_iter_get_pages_alloc()
af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
iov_iter: saner helper for page array allocation
fold __pipe_get_pages() into pipe_get_pages()
ITER_XARRAY: don't open-code DIV_ROUND_UP()
unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
unify xarray_get_pages() and xarray_get_pages_alloc()
unify pipe_get_pages() and pipe_get_pages_alloc()
iov_iter_get_pages(): sanity-check arguments
iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
...