Jon Maloy [Wed, 10 Oct 2018 15:34:01 +0000 (17:34 +0200)]
tipc: set link tolerance correctly in broadcast link
In the patch referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
affected links.
However, the 'tolerance' field of the broadcast link is never set, and
remains at zero. This renders the whole commit without the intended
improving effect, but luckily also with no negative effect.
In this commit we add the missing initialization.
Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet retransmission") Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
====================
net: ipv4: fixes for PMTU when link MTU changes
The first patch adapts the changes that commit e9fa1495d738 ("ipv6:
Reflect MTU changes on PMTU of exceptions for MTU-less routes") did in
IPv6 to IPv4: lower PMTU when the first hop's MTU drops below it, and
raise PMTU when the first hop was limiting PMTU discovery and its MTU
is increased.
The second patch fixes bugs introduced in commit d52e5a7e7ca4 ("ipv4:
lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu") that
only appear once the first patch is applied.
Selftests for these cases were introduced in net-next commit e44e428f59e4 ("selftests: pmtu: add basic IPv4 and IPv6 PMTU tests")
v2: add cover letter, and fix a few small things in patch 1
====================
Sabrina Dubroca [Tue, 9 Oct 2018 15:48:15 +0000 (17:48 +0200)]
net: ipv4: don't let PMTU updates increase route MTU
When an MTU update with PMTU smaller than net.ipv4.route.min_pmtu is
received, we must clamp its value. However, we can receive a PMTU
exception with PMTU < old_mtu < ip_rt_min_pmtu, which would lead to an
increase in PMTU.
To fix this, take the smallest of the old MTU and ip_rt_min_pmtu.
Before this patch, in case of an update, the exception's MTU would
always change. Now, an exception can have only its lock flag updated,
but not the MTU, so we need to add a check on locking to the following
"is this exception getting updated, or close to expiring?" test.
Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu") Signed-off-by: Sabrina Dubroca <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Sabrina Dubroca [Tue, 9 Oct 2018 15:48:14 +0000 (17:48 +0200)]
net: ipv4: update fnhe_pmtu when first hop's MTU changes
Since commit 5aad1de5ea2c ("ipv4: use separate genid for next hop
exceptions"), exceptions get deprecated separately from cached
routes. In particular, administrative changes don't clear PMTU anymore.
As Stefano described in commit e9fa1495d738 ("ipv6: Reflect MTU changes
on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
the local MTU change can become stale:
- if the local MTU is now lower than the PMTU, that PMTU is now
incorrect
- if the local MTU was the lowest value in the path, and is increased,
we might discover a higher PMTU
Similarly to what commit e9fa1495d738 did for IPv6, update PMTU in those
cases.
If the exception was locked, the discovered PMTU was smaller than the
minimal accepted PMTU. In that case, if the new local MTU is smaller
than the current PMTU, let PMTU discovery figure out if locking of the
exception is still needed.
To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
notifier. By the time the notifier is called, dev->mtu has been
changed. This patch adds the old MTU as additional information in the
notifier structure, and a new call_netdevice_notifiers_u32() function.
Mike Rapoport [Tue, 9 Oct 2018 04:02:01 +0000 (07:02 +0300)]
net/ipv6: stop leaking percpu memory in fib6 info
The fib6_info_alloc() function allocates percpu memory to hold per CPU
pointers to rt6_info, but this memory is never freed. Fix it.
Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers") Signed-off-by: Mike Rapoport <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Here are a set of patches that prepares for and fix problems in rxrpc's
package reception code. There serious problems are:
(A) There's a window between binding the socket and setting the data_ready
hook in which packets can find their way into the UDP socket's receive
queues.
(B) The skb_recv_udp() will return an error (and clear the error state) if
there was an error on the Tx side. rxrpc doesn't handle this.
(C) The rxrpc data_ready handler doesn't fully drain the UDP receive
queue.
(D) The rxrpc data_ready handler assumes it is called in a non-reentrant
state.
The second patch fixes (A) - (C); the third patch renders (B) and (C)
non-issues by using the recap_rcv hook instead of data_ready - and the
final patch fixes (D). That last is the most complex.
The preparatory patches are:
(1) Fix some places that are doing things in the wrong net namespace.
(2) Stop taking the rcu read lock as it's held by the IP input routine in
the call chain.
(3) Only end the Tx phase if *we* rotated the final packet out of the Tx
buffer.
(4) Don't assume that the call state won't change after dropping the
call_state lock.
(5) Only take receive window and MTU suze parameters from an ACK packet if
it's the latest ACK packet.
(6) Record connection-level abort information correctly.
(7) Fix a trace line.
And then there are three main patches - note that these are mixed in with
the preparatory patches somewhat:
(1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
drainage (C).
(2) Switch to using the encap_rcv instead of data_ready to cut out the
effects of the UDP read queues and get the packets delivered directly.
(3) Add more locking into the various packet input paths to defend against
re-entrance (D).
====================
Ka-Cheong Poon [Mon, 8 Oct 2018 16:17:11 +0000 (09:17 -0700)]
rds: RDS (tcp) hangs on sendto() to unresponding address
In rds_send_mprds_hash(), if the calculated hash value is non-zero and
the MPRDS connections are not yet up, it will wait. But it should not
wait if the send is non-blocking. In this case, it should just use the
base c_path for sending the message.
Merge tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Dave writes:
"xfs: fixes for 4.19-rc7
Update for 4.19-rc7 to fix numerous file clone and deduplication issues."
* tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix data corruption w/ unaligned reflink ranges
xfs: fix data corruption w/ unaligned dedupe ranges
xfs: update ctime and remove suid before cloning files
xfs: zero posteof blocks when cloning above eof
xfs: refactor clonerange preparation into a separate helper
The dm-linear target is independent of the dm-zoned target. For code
requiring support for zoned block devices, use CONFIG_BLK_DEV_ZONED
instead of CONFIG_DM_ZONED.
While at it, similarly to dm linear, also enable the DM_TARGET_ZONED_HM
feature in dm-flakey only if CONFIG_BLK_DEV_ZONED is defined.
Fixes: beb9caac211c1 ("dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled") Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: [email protected] Signed-off-by: Damien Le Moal <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
Tariq Toukan [Tue, 21 Aug 2018 11:41:41 +0000 (14:41 +0300)]
net/mlx5: WQ, fixes for fragmented WQ buffers API
mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than a PAGE_SIZE, (broken by default on
PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new
calls/API.
Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.
This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()
The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.
New calculation is also more efficient, and improves performance
as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps
After: 17,085,793 pps
3.7% improvement
Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Merge tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Mike writes:
"device mapper fixes for 4.19 final
- Fix a DM cache module init error path bug that doesn't properly
cleanup a KMEM_CACHE if target registration fails.
- Two stable@ fixes for DM zoned target; 4.20 will have changes that
eliminate this code entirely but <= 4.19 needs these changes."
* tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled
dm: fix report zone remapping to account for partition offset
dm cache: destroy migration_cache if cache target registration failed
Merge tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Steven writes:
"vsprint fix:
It was reported that trace_printk() was not reporting properly
values that came after a dereference pointer.
trace_printk() utilizes vbin_printf() and bstr_printf() to keep the
overhead of tracing down. vbin_printf() does not do any conversions
and just stors the string format and the raw arguments into the
buffer. bstr_printf() is used to read the buffer and does the
conversions to complete the printf() output.
This can be troublesome with dereferenced pointers because the
reference may be different from the time vbin_printf() is called to
the time bstr_printf() is called. To fix this, a prior commit changed
vbin_printf() to convert dereferenced pointers into strings and load
the converted string into the buffer. But the change to bstr_printf()
had an off-by-one error and didn't account for the nul character at
the end of the string and this corrupted the rest of the values in
the format that came after a dereferenced pointer."
* tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
Merge tag 'devicetree-fixes-for-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Rob writes:
"Devicetree fixes for 4.19, part 3:
- Fix DT unittest on Oldworld MAC systems"
* tag 'devicetree-fixes-for-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: unittest: Disable interrupt node tests for old world MAC systems
Valentine Fatiev [Wed, 10 Oct 2018 06:56:25 +0000 (09:56 +0300)]
IB/mlx5: Unmap DMA addr from HCA before IOMMU
The function that puts back the MR in cache also removes the DMA address
from the HCA. Therefore we need to call this function before we remove
the DMA mapping from MMU. Otherwise the HCA may access a memory that
is no longer DMA mapped.
Root cause is the following check in skb_partial_csum_set()
if (unlikely(start > skb_headlen(skb)) ||
unlikely((int)start + off > skb_headlen(skb) - 2))
return false;
If skb_headlen(skb) is 1, then (skb_headlen(skb) - 2) becomes 0xffffffff
and the check fails to detect that ((int)start + off) is off the limit,
since the compare is unsigned.
When we fix that, then the first condition (start > skb_headlen(skb))
becomes obsolete.
Then we should also check that (skb_headroom(skb) + start) wont
overflow 16bit field.
David S. Miller [Wed, 10 Oct 2018 17:19:10 +0000 (10:19 -0700)]
Merge branch 'devlink-param-type-string-fixes'
Moshe Shemesh says:
====================
devlink param type string fixes
This patchset fixes devlink param infrastructure for string param type.
The devlink param infrastructure doesn't handle copying the string data
correctly. The first two patches fix it and the third patch adds helper
function to safely copy string value without exceeding
DEVLINK_PARAM_MAX_STRING_VALUE.
====================
Moshe Shemesh [Wed, 10 Oct 2018 13:09:27 +0000 (16:09 +0300)]
devlink: Add helper function for safely copy string param
Devlink string param buffer is allocated at the size of
DEVLINK_PARAM_MAX_STRING_VALUE. Add helper function which makes sure
this size is not exceeded.
Renamed DEVLINK_PARAM_MAX_STRING_VALUE to
__DEVLINK_PARAM_MAX_STRING_VALUE to emphasize that it should be used by
devlink only. The driver should use the helper function instead to
verify it doesn't exceed the allowed length.
Moshe Shemesh [Wed, 10 Oct 2018 13:09:26 +0000 (16:09 +0300)]
devlink: Fix param cmode driverinit for string type
Driverinit configuration mode value is held by devlink to enable the
driver fetch the value after reload command. In case the param type is
string devlink should copy the value from driver string buffer to
devlink string buffer on devlink_param_driverinit_value_set() and
vice-versa on devlink_param_driverinit_value_get().
Fixes: ec01aeb1803e ("devlink: Add support for get/set driverinit value") Signed-off-by: Moshe Shemesh <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Moshe Shemesh [Wed, 10 Oct 2018 13:09:25 +0000 (16:09 +0300)]
devlink: Fix param set handling for string type
In case devlink param type is string, it needs to copy the string value
it got from the input to devlink_param_value.
Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Some samples require headers installation, so commit 3fca1700c4c3
("kbuild: make samples really depend on headers_install") added
such dependency in the top Makefile. However, UML fails to build
with CONFIG_SAMPLES=y because UML does not support headers_install.
Fixes: 3fca1700c4c3 ("kbuild: make samples really depend on headers_install") Reported-by: Kees Cook <[email protected]> Cc: David Howells <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
Mike Snitzer [Wed, 10 Oct 2018 16:01:55 +0000 (12:01 -0400)]
dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled
It is best to avoid any extra overhead associated with bio completion.
DM core will indirectly call a DM target's .end_io if it is defined.
In the case of DM linear, there is no need to do so (for every bio that
completes) if CONFIG_DM_ZONED is not enabled.
Avoiding an extra indirect call for every bio completion is very
important for ensuring DM linear doesn't incur more overhead that
further widens the performance gap between dm-linear and raw block
devices.
Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: [email protected] Signed-off-by: Mike Snitzer <[email protected]>
Marco Felsch [Tue, 2 Oct 2018 08:06:46 +0000 (10:06 +0200)]
pinctrl: mcp23s08: fix irq and irqchip setup order
Since 'commit 02e389e63e35 ("pinctrl: mcp23s08: fix irq setup order")' the
irq request isn't the last devm_* allocation. Without a deeper look at
the irq and testing this isn't a good solution. Since this driver relies
on the devm mechanism, requesting a interrupt should be the last thing
to avoid memory corruptions during unbinding.
'Commit 02e389e63e35 ("pinctrl: mcp23s08: fix irq setup order")' fixed the
order for the interrupt-controller use case only. The
mcp23s08_irq_setup() must be split into two to fix it for the
interrupt-controller use case and to register the irq at last. So the
irq will be freed first during unbind.
Stephen Boyd [Mon, 8 Oct 2018 16:32:13 +0000 (09:32 -0700)]
gpio: Assign gpio_irq_chip::parents to non-stack pointer
gpiochip_set_cascaded_irqchip() is passed 'parent_irq' as an argument
and then the address of that argument is assigned to the gpio chips
gpio_irq_chip 'parents' pointer shortly thereafter. This can't ever
work, because we've just assigned some stack address to a pointer that
we plan to dereference later in gpiochip_irq_map(). I ran into this
issue with the KASAN report below when gpiochip_irq_map() tried to setup
the parent irq with a total junk pointer for the 'parents' array.
BUG: KASAN: stack-out-of-bounds in gpiochip_irq_map+0x228/0x248
Read of size 4 at addr ffffffc0dde472e0 by task swapper/0/1
Let's leave around one unsigned int in the gpio_irq_chip struct for the
single parent irq case and repoint the 'parents' array at it. This way
code is left mostly intact to setup parents and we waste an extra few
bytes per structure of which there should be only a handful in a system.
Daniel Mack [Mon, 8 Oct 2018 20:03:57 +0000 (22:03 +0200)]
libertas: call into generic suspend code before turning off power
When powering down a SDIO connected card during suspend, make sure to call
into the generic lbs_suspend() function before pulling the plug. This will
make sure the card is successfully deregistered from the system to avoid
communication to the card starving out.
of: unittest: Disable interrupt node tests for old world MAC systems
On systems with OF_IMAP_OLDWORLD_MAC set in of_irq_workarounds, the
devicetree interrupt parsing code is different, causing unit tests of
devicetree interrupt nodes to fail. Due to a bug in unittest code, which
tries to dereference an uninitialized pointer, this results in a crash.
Merge tag 'tag-chrome-platform-fixes-for-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
Benson writes:
"chrome-platform fix for v4.19-rc8
This contains a fix to 57e94c8b974d ("mfd: cros-ec: Increase maximum
mkbp event size"), which caused cros_ec based chromebooks to truncate
an entire column of their built-in keyboard."
* tag 'tag-chrome-platform-fixes-for-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
mfd: cros-ec: copy the whole event in get_next_event_xfer
Merge branch 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Dennis writes:
"percpu fixes for-4.19-rc8
The new percpu allocator introduced in 4.14 had a missing free for
the percpu metadata. This caused a memory leak when percpu memory is
being churned resulting in the allocation and deallocation of percpu
memory chunks"
* 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: stop leaking bitmap metadata blocks
Merge tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Martin writes:
"s390 fixes for 4.19-rc8
Four more patches for 4.19:
- Fix resume after suspend-to-disk if resume-CPU != suspend-CPU
- Fix vfio-ccw check for pinned pages
- Two patches to avoid a usercopy-whitelist warning in vfio-ccw"
* tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: Fix how vfio-ccw checks pinned pages
s390/cio: Refactor alloc of ccw_io_region
s390/cio: Convert ccw_io_region to pointer
s390/hibernate: fix error handling when suspend cpu != resume cpu
Merge tag 'mips_fixes_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Paul writes:
"A few MIPS fixes for 4.19:
- Avoid suboptimal placement of our VDSO when using the legacy mmap
layout, which can prevent statically linked programs that were able
to allocate large amounts of memory using the brk syscall prior to
the introduction of our VDSO from functioning correctly.
- Fix up CONFIG_CMDLINE handling for platforms which ought to ignore
DT arguments but have incorrectly used them & lost other arguments
since v3.16.
- Fix a path in MAINTAINERS to use valid wildcards.
- Fixup a regression from v4.17 in memset() for systems using
CPU_DADDI_WORKAROUNDS."
* tag 'mips_fixes_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: memset: Fix CPU_DADDI_WORKAROUNDS `small_fixup' regression
MAINTAINERS: MIPS/LOONGSON2 ARCHITECTURE - Use the normal wildcard style
MIPS: Fix CONFIG_CMDLINE handling
MIPS: VDSO: Always map near top of user memory
Emil Karlson [Wed, 3 Oct 2018 18:43:18 +0000 (21:43 +0300)]
mfd: cros-ec: copy the whole event in get_next_event_xfer
Commit 57e94c8b974db2d83c60e1139c89a70806abbea0 caused cros-ec keyboard events
be truncated on many chromebooks so that Left and Right keys on Column 12 were
always 0. Use ret as memcpy len to fix this.
The old code was using ec_dev->event_size, which is the event payload/data size
excluding event_type header, for the length of the memcpy operation. Use ret
as memcpy length to avoid the off by one and copy the whole msg->data.
when mprotect(2) gets used on DAX mappings. Also there is a wide variety
of other failures that can result from the missing _PAGE_DEVMAP flag
when the area gets used by get_user_pages() later.
Fix the problem by including _PAGE_DEVMAP in a set of flags that get
preserved by mprotect(2).
Damien Le Moal [Tue, 9 Oct 2018 05:24:31 +0000 (14:24 +0900)]
dm: fix report zone remapping to account for partition offset
If dm-linear or dm-flakey are layered on top of a partition of a zoned
block device, remapping of the start sector and write pointer position
of the zones reported by a report zones BIO must be modified to account
for the target table entry mapping (start offset within the device and
entry mapping with the dm device). If the target's backing device is a
partition of a whole disk, the start sector on the physical device of
the partition must also be accounted for when modifying the zone
information. However, dm_remap_zone_report() was not considering this
last case, resulting in incorrect zone information remapping with
targets using disk partitions.
Fix this by calculating the target backing device start sector using
the position of the completed report zones BIO and the unchanged
position and size of the original report zone BIO. With this value
calculated, the start sector and write pointer position of the target
zones can be correctly remapped.
Shenghui Wang [Sun, 7 Oct 2018 06:45:41 +0000 (14:45 +0800)]
dm cache: destroy migration_cache if cache target registration failed
Commit 7e6358d244e47 ("dm: fix various targets to dm_register_target
after module __init resources created") inadvertently introduced this
bug when it moved dm_register_target() after the call to KMEM_CACHE().
Fixes: 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created") Cc: [email protected] Signed-off-by: Shenghui Wang <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
David S. Miller [Tue, 9 Oct 2018 17:49:50 +0000 (10:49 -0700)]
Merge branch 'ena-fixes'
Arthur Kiyanovski says:
====================
minor bug fixes for ENA Ethernet driver
Arthur Kiyanovski (4):
net: ena: fix warning in rmmod caused by double iounmap
net: ena: fix rare bug when failed restart/resume is followed by
driver removal
net: ena: fix NULL dereference due to untimely napi initialization
net: ena: fix auto casting to boolean
====================
Eliminate potential auto casting compilation error.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net: ena: fix NULL dereference due to untimely napi initialization
napi poll functions should be initialized before running request_irq(),
to handle a rare condition where there is a pending interrupt, causing
the ISR to fire immediately while the poll function wasn't set yet,
causing a NULL dereference.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net: ena: fix rare bug when failed restart/resume is followed by driver removal
In a rare scenario when ena_device_restore() fails, followed by device
remove, an FLR will not be issued. In this case, the device will keep
sending asynchronous AENQ keep-alive events, even after driver removal,
leading to memory corruption.
Fixes: 8c5c7abdeb2d ("net: ena: add power management ops to the ENA driver") Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net: ena: fix warning in rmmod caused by double iounmap
Memory mapped with devm_ioremap is automatically freed when the driver
is disconnected from the device. Therefore there is no need to
explicitly call devm_iounmap.
Fixes: 0857d92f71b6 ("net: ena: add missing unmap bars on device removal") Fixes: 411838e7b41c ("net: ena: fix rare kernel crash when bar memory remap fails") Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Paolo Bonzini [Tue, 9 Oct 2018 16:35:29 +0000 (18:35 +0200)]
KVM: x86: support CONFIG_KVM_AMD=y with CONFIG_CRYPTO_DEV_CCP_DD=m
SEV requires access to the AMD cryptographic device APIs, and this
does not work when KVM is builtin and the crypto driver is a module.
Actually the Kconfig conditions for CONFIG_KVM_AMD_SEV try to disable
SEV in that case, but it does not work because the actual crypto
calls are not culled, only sev_hardware_setup() is.
This patch adds two CONFIG_KVM_AMD_SEV checks that gate all the remaining
SEV code; it fixes this particular configuration, and drops 5 KiB of
code when CONFIG_KVM_AMD_SEV=n.
gfs2: Fix iomap buffered write support for journaled files
Commit 64bc06bb32ee broke buffered writes to journaled files (chattr
+j): we'll try to journal the buffer heads of the page being written to
in gfs2_iomap_journaled_page_done. However, the iomap code no longer
creates buffer heads, so we'll BUG() in gfs2_page_add_databufs. Fix
that by creating buffer heads ourself when needed.
cdc-acm: correct counting of UART states in serial state notification
The usb standard ("Universal Serial Bus Class Definitions for Communication
Devices") distiguishes between "consistent signals" (DSR, DCD), and
"irregular signals" (break, ring, parity error, framing error, overrun).
The bits of "irregular signals" are set, if this error/event occurred on
the device side and are immeadeatly unset, if the serial state notification
was sent.
Like other drivers of real serial ports do, just the occurence of those
events should be counted in serial_icounter_struct (but no 1->0
transitions).
cdc-acm: do not reset notification buffer index upon urb unlinking
Resetting the write index of the notification buffer on urb unlink (e.g.
closing a cdc-acm device from userspace) may lead to wrong interpretation
of further received notifications, in case the index is not 0 when urb
unlink happens (i.e. when parts of a notification already have been
transferred). On the device side there is no "reset" of the notification
transimission and thus we would get out of sync with the device.
usb: usbip: Fix BUG: KASAN: slab-out-of-bounds in vhci_hub_control()
vhci_hub_control() accesses port_status array with out of bounds port
value. Fix it to reference port_status[] only with a valid rhport value
when invalid_rhport flag is true.
The invalid_rhport flag is set early on after detecting in port value
is within the bounds or not.
The following is used reproduce the problem and verify the fix:
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14ed8ab6400000
Michael reported an issue with oversized terms values assignment
and I noticed there was actually a misunderstanding of the max
value check in the past.
The above commit's changelog says:
If bit 21 is set, there is parsing issues as below.
$ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/
event syntax error: '..pi_0/event=0x200002,umask=0x8/'
\___ value too big for format, maximum is 511
But there's no issue there, because the event value is distributed
along the value defined by the format. Even if the format defines
separated bit, the value is treated as a continual number, which
should follow the format definition.
In above case it's 9-bit value with last bit separated:
$ cat uncore_qpi_0/format/event
config:0-7,21
Hence the value 0x200002 is correctly reported as format violation,
because it exceeds 9 bits. It should have been 0x102 instead, which
sets the 9th bit - the bit 21 of the format.
$ perf stat -vv -a -e uncore_qpi_0/event=0x102,umask=0x8/
Using CPUID GenuineIntel-6-2D
...
------------------------------------------------------------
perf_event_attr:
type 10
size 112
config 0x200802
sample_type IDENTIFIER
...
Merge tag 'arc-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Vineet writes:
"ARC updates for 4.19-rc8
- Fix clone syscall to update Thread pointer register
- Make/build updates (needed for AGL/OE builds) [Alexey]
- Typo fix [Colin Ian King]"
* tag 'arc-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: clone syscall to setp r25 as thread pointer
ARC: build: Don't set CROSS_COMPILE in arch's Makefile
ARC: fix spelling mistake "entires" -> "entries"
ARC: build: Get rid of toolchain check
ARCv2: build: use mcpu=hs38 iso generic mcpu=archs
Reinette Chatre [Thu, 4 Oct 2018 21:05:23 +0000 (14:05 -0700)]
x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
While the DOC at the beginning of lib/bitmap.c explicitly states that
"The number of valid bits in a given bitmap does _not_ need to be an
exact multiple of BITS_PER_LONG.", some of the bitmap operations do
indeed access BITS_PER_LONG portions of the provided bitmap no matter
the size of the provided bitmap. For example, if bitmap_intersects()
is provided with an 8 bit bitmap the operation will access
BITS_PER_LONG bits from the provided bitmap. While the operation
ensures that these extra bits do not affect the result, the memory
is still accessed.
The capacity bitmasks (CBMs) are typically stored in u32 since they
can never exceed 32 bits. A few instances exist where a bitmap_*
operation is performed on a CBM by simply pointing the bitmap operation
to the stored u32 value.
The consequence of this pattern is that some bitmap_* operations will
access out-of-bounds memory when interacting with the provided CBM. This
is confirmed with a KASAN test that reports:
BUG: KASAN: stack-out-of-bounds in __bitmap_intersects+0xa2/0x100
and
BUG: KASAN: stack-out-of-bounds in __bitmap_weight+0x58/0x90
Fix this by moving any CBM provided to a bitmap operation needing
BITS_PER_LONG to an 'unsigned long' variable.
[ tglx: Changed related function arguments to unsigned long and got rid
of the _cbm extra step ]
David Howells [Mon, 8 Oct 2018 14:46:25 +0000 (15:46 +0100)]
rxrpc: Fix the packet reception routine
The rxrpc_input_packet() function and its call tree was built around the
assumption that data_ready() handler called from UDP to inform a kernel
service that there is data to be had was non-reentrant. This means that
certain locking could be dispensed with.
This, however, turns out not to be the case with a multi-queue network card
that can deliver packets to multiple cpus simultaneously. Each of those
cpus can be in the rxrpc_input_packet() function at the same time.
Fix by adding or changing some structure members:
(1) Add peer->rtt_input_lock to serialise access to the RTT buffer.
(2) Make conn->service_id into a 32-bit variable so that it can be
cmpxchg'd on all arches.
(3) Add call->input_lock to serialise access to the Rx/Tx state. Note
that although the Rx and Tx states are (almost) entirely separate,
there's no point completing the separation and having separate locks
since it's a bi-phasal RPC protocol rather than a bi-direction
streaming protocol. Data transmission and data reception do not take
place simultaneously on any particular call.
and making the following functional changes:
(1) In rxrpc_input_data(), hold call->input_lock around the core to
prevent simultaneous producing of packets into the Rx ring and
updating of tracking state for a particular call.
(2) In rxrpc_input_ping_response(), only read call->ping_serial once, and
check it before checking RXRPC_CALL_PINGING as that's a cheaper test.
The bit test and bit clear can then be combined. No further locking
is needed here.
(3) In rxrpc_input_ack(), take call->input_lock after we've parsed much of
the ACK packet. The superseded ACK check is then done both before and
after the lock is taken.
The handing of ackinfo data is split, parsing before the lock is taken
and processing with it held. This is keyed on rxMTU being non-zero.
Congestion management is also done within the locked section.
(4) In rxrpc_input_ackall(), take call->input_lock around the Tx window
rotation. The ACKALL packet carries no information and is only really
useful after all packets have been transmitted since it's imprecise.
(5) In rxrpc_input_implicit_end_call(), we use rx->incoming_lock to
prevent calls being simultaneously implicitly ended on two cpus and
also to prevent any races with incoming call setup.
(6) In rxrpc_input_packet(), use cmpxchg() to effect the service upgrade
on a connection. It is only permitted to happen once for a
connection.
(7) In rxrpc_new_incoming_call(), we have to recheck the routing inside
rx->incoming_lock to see if someone else set up the call, connection
or peer whilst we were getting there. We can't trust the values from
the earlier routing check unless we pin refs on them - which we want
to avoid.
Further, we need to allow for an incoming call to have its state
changed on another CPU between us making it live and us adjusting it
because the conn is now in the RXRPC_CONN_SERVICE state.
(8) In rxrpc_peer_add_rtt(), take peer->rtt_input_lock around the access
to the RTT buffer. Don't need to lock around setting peer->rtt.
For reference, the inventory of state-accessing or state-altering functions
used by the packet input procedure is:
* CONNECTION-LEVEL PROCESSING
- Service upgrade
- Can only happen once per conn
! Changed to use cmpxchg
> rxrpc_post_packet_to_conn()
- Setting conn->hi_serial
- Probably safe not using locks
- Maybe use cmpxchg
* CALL-LEVEL PROCESSING
> Old-call checking
> rxrpc_input_implicit_end_call()
> rxrpc_call_completed()
> rxrpc_queue_call()
! Need to take rx->incoming_lock
> __rxrpc_disconnect_call()
> rxrpc_notify_socket()
> rxrpc_new_incoming_call()
- Uses rx->incoming_lock for the entire process
- Might be able to drop this earlier in favour of the call lock
> rxrpc_incoming_call()
! Conflicts with rxrpc_input_implicit_end_call()
> rxrpc_send_ping()
- Don't need locks to check rtt state
> rxrpc_propose_ACK
* PACKET DISTRIBUTION
> rxrpc_input_call_packet()
> rxrpc_input_data()
* QUEUE DATA PACKET ON CALL
> rxrpc_reduce_call_timer()
- Uses timer_reduce()
! Needs call->input_lock()
> rxrpc_receiving_reply()
! Needs locking around ack state
> rxrpc_rotate_tx_window()
> rxrpc_end_tx_phase()
> rxrpc_proto_abort()
> rxrpc_input_dup_data()
- Fills the Rx buffer
- rxrpc_propose_ACK()
- rxrpc_notify_socket()
> rxrpc_input_ack()
* APPLY ACK PACKET TO CALL AND DISCARD PACKET
> rxrpc_input_ping_response()
- Probably doesn't need any extra locking
! Need READ_ONCE() on call->ping_serial
> rxrpc_input_check_for_lost_ack()
- Takes call->lock to consult Tx buffer
> rxrpc_peer_add_rtt()
! Needs to take a lock (peer->rtt_input_lock)
! Could perhaps manage with cmpxchg() and xadd() instead
> rxrpc_input_requested_ack
- Consults Tx buffer
! Probably needs a lock
> rxrpc_peer_add_rtt()
> rxrpc_propose_ack()
> rxrpc_input_ackinfo()
- Changes call->tx_winsize
! Use cmpxchg to handle change
! Should perhaps track serial number
- Uses peer->lock to record MTU specification changes
> rxrpc_proto_abort()
! Need to take call->input_lock
> rxrpc_rotate_tx_window()
> rxrpc_end_tx_phase()
> rxrpc_input_soft_acks()
- Consults the Tx buffer
> rxrpc_congestion_management()
- Modifies the Tx annotations
! Needs call->input_lock()
> rxrpc_queue_call()
> rxrpc_input_abort()
* APPLY ABORT PACKET TO CALL AND DISCARD PACKET
> rxrpc_set_call_completion()
> rxrpc_notify_socket()
> rxrpc_input_ackall()
* APPLY ACKALL PACKET TO CALL AND DISCARD PACKET
! Need to take call->input_lock
> rxrpc_rotate_tx_window()
> rxrpc_end_tx_phase()
> rxrpc_reject_packet()
There are some functions used by the above that queue the packet, after
which the procedure is terminated:
- rxrpc_post_packet_to_local()
- local->event_queue is an sk_buff_head
- local->processor is a work_struct
- rxrpc_post_packet_to_conn()
- conn->rx_queue is an sk_buff_head
- conn->processor is a work_struct
- rxrpc_reject_packet()
- local->reject_queue is an sk_buff_head
- local->processor is a work_struct
And some that offload processing to process context:
- rxrpc_notify_socket()
- Uses RCU lock
- Uses call->notify_lock to call call->notify_rx
- Uses call->recvmsg_lock to queue recvmsg side
- rxrpc_queue_call()
- call->processor is a work_struct
- rxrpc_propose_ACK()
- Uses call->lock to wrap __rxrpc_propose_ACK()
And a bunch that complete a call, all of which use call->state_lock to
protect the call state:
David Howells [Mon, 8 Oct 2018 14:46:17 +0000 (15:46 +0100)]
rxrpc: Fix connection-level abort handling
Fix connection-level abort handling to cache the abort and error codes
properly so that a new incoming call can be properly aborted if it races
with the parent connection being aborted by another CPU.
The abort_code and error parameters can then be dropped from
rxrpc_abort_calls().
Fixes: f5c17aaeb2ae ("rxrpc: Calls should only have one terminal state") Signed-off-by: David Howells <[email protected]>
David Howells [Mon, 8 Oct 2018 14:46:11 +0000 (15:46 +0100)]
rxrpc: Only take the rwind and mtu values from latest ACK
Move the out-of-order and duplicate ACK packet check to before the call to
rxrpc_input_ackinfo() so that the receive window size and MTU size are only
checked in the latest ACK packet and don't regress.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <[email protected]>
In the presence of multi-order entries the typical
pagevec_lookup_entries() pattern may loop forever:
while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
min(end - index, (pgoff_t)PAGEVEC_SIZE),
indices)) {
...
for (i = 0; i < pagevec_count(&pvec); i++) {
index = indices[i];
...
}
index++; /* BUG */
}
The loop updates 'index' for each index found and then increments to the
next possible page to continue the lookup. However, if the last entry in
the pagevec is multi-order then the next possible page index is more
than 1 page away. Fix this locally for the filesystem-dax case by
checking for dax-multi-order entries. Going forward new users of
multi-order entries need to be similarly careful, or we need a generic
way to report the page increment in the radix iterator.
Kees Cook [Mon, 8 Oct 2018 15:46:51 +0000 (08:46 -0700)]
sunvdc: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this moves
the math for cookies calculation into macros and allocates a fixed size
array for the maximum number of cookies and adds a runtime sanity check.
(Note that the size was always fixed, but just hidden from the compiler.)
6fbbde9a1969 ("KVM: x86: Control guest reads of MSR_PLATFORM_INFO")
That is not yet used in tools such as 'perf trace'.
The type of the change in this file, a simple integer parameter to the
KVM_CHECK_EXTENSION ioctl should be easier to implement tho, adding to
the libbeauty TODO list.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
d1766202779e ("x86/kvm/lapic: always disable MMIO interface in x2APIC mode")
That at this time will not generate changes in tools such as 'perf trace',
that still needs more work in tools/perf/examples/bpf/augmented_syscalls.c
to need such id -> string tables.
This silences the following perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
David Howells [Mon, 8 Oct 2018 14:46:05 +0000 (15:46 +0100)]
rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window()
Carry the call state out of the locked section in rxrpc_rotate_tx_window()
rather than sampling it afterwards. This is only used to select tracepoint
data, but could have changed by the time we do the tracepoint.
David Howells [Mon, 8 Oct 2018 14:46:01 +0000 (15:46 +0100)]
rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
We should only call the function to end a call's Tx phase if we rotated the
marked-last packet out of the transmission buffer.
Make rxrpc_rotate_tx_window() return an indication of whether it just
rotated the packet marked as the last out of the transmit buffer, carrying
the information out of the locked section in that function.
We can then check the return value instead of examining RXRPC_CALL_TX_LAST.
Fixes: 70790dbe3f66 ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: David Howells <[email protected]>
David Howells [Mon, 8 Oct 2018 14:45:56 +0000 (15:45 +0100)]
rxrpc: Don't need to take the RCU read lock in the packet receiver
We don't need to take the RCU read lock in the rxrpc packet receive
function because it's held further up the stack in the IP input routine
around the UDP receive routines.
Fix this by dropping the RCU read lock calls from rxrpc_input_packet().
This simplifies the code.
Fixes: 70790dbe3f66 ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: David Howells <[email protected]>
David Howells [Thu, 4 Oct 2018 10:10:51 +0000 (11:10 +0100)]
rxrpc: Use the UDP encap_rcv hook
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception
in which a packet is placed onto the UDP receive queue and then immediately
removed again by rxrpc. Going via the queue in this manner seems like it
should be unnecessary.
This does, however, require the invention of a value to place in encap_type
as that's one of the conditions to switch packets out to the encap_rcv
hook. Possibly the value doesn't actually matter for anything other than
sockopts on the UDP socket, which aren't accessible outside of rxrpc
anyway.
This seems to cut a bit of time out of the time elapsed between each
sk_buff being timestamped and turning up in rxrpc (the final number in the
following trace excerpts). I measured this by making the rxrpc_rx_packet
trace point print the time elapsed between the skb being timestamped and
the current time (in ns), e.g.:
... 424.278721: rxrpc_rx_packet: ... ACK 25026
So doing a 512MiB DIO read from my test server, with an unmodified kernel:
N min max sum mean stddev
27605 2626 7581 7.83992e+07 2840.04 181.029
and with the patch applied:
N min max sum mean stddev
27547 1895 12165 6.77461e+07 2459.29 255.02
Keith Busch [Fri, 5 Oct 2018 14:57:06 +0000 (08:57 -0600)]
nvme: remove ns sibling before clearing path
The code had been clearing a namespace being deleted as the current
path while that namespace was still in the path siblings list. It is
possible a new IO could set that namespace back to the current path
since it appeared to be an eligable path to select, which may result in
a use-after-free error.
This patch ensures a namespace being removed is not eligable to be reset
as a current path prior to clearing it as the current path.
In the quest to remove all stack VLA usage from the kernel[1], this
allocates a fixed size array for the maximum number of cookies and
adds a runtime sanity check.
Rob Herring [Wed, 29 Aug 2018 20:03:37 +0000 (15:03 -0500)]
sbus: Use of_get_child_by_name helper
Use the of_get_child_by_name() helper instead of open coding searching
for the '/options' node. This removes directly accessing the name
pointer as well.
Mikulas Patocka [Fri, 17 Aug 2018 19:19:37 +0000 (15:19 -0400)]
mach64: detect the dot clock divider correctly on sparc
On Sun Ultra 5, it happens that the dot clock is not set up properly for
some videomodes. For example, if we set the videomode "r1024x768x60" in
the firmware, Linux would incorrectly set a videomode with refresh rate
180Hz when booting (suprisingly, my LCD monitor can display it, although
display quality is very low).
The reason is this: Older mach64 cards set the divider in the register
VCLK_POST_DIV. The register has four 2-bit fields (the field that is
actually used is specified in the lowest two bits of the register
CLOCK_CNTL). The 2 bits select divider "1, 2, 4, 8". On newer mach64 cards,
there's another bit added - the top four bits of PLL_EXT_CNTL extend the
divider selection, so we have possible dividers "1, 2, 4, 8, 3, 5, 6, 12".
The Linux driver clears the top four bits of PLL_EXT_CNTL and never sets
them, so it can work regardless if the card supports them. However, the
sparc64 firmware may set these extended dividers during boot - and the
mach64 driver detects incorrect dot clock in this case.
This patch makes the driver read the additional divider bit from
PLL_EXT_CNTL and calculate the initial refresh rate properly.
These two patches correct some userspace-affecting issues introduced
during 4.19 development cycle, specifically:
* New structure "struct smcd_diag_dmbinfo" has been defined in a way
that would lead to different layout of the structure on most 32-bit
ABIs in comparison with layout on 64-bit ABIs;
* One of the commits renamed an UAPI-exposed field name.
Changes since v1:
* Managed not to forget to add --cover-letter.
* Commit ID format in commit message has been changed in accordance
with Sergei Shtylyov's recommendations.
====================
Commit c601171d7a60 ("net/smc: provide smc mode in smc_diag.c") changed
the name of diag_fallback field of struct smc_diag_msg structure
to diag_mode. However, this structure is a part of UAPI, and this change
breaks user space applications that use it ([1], for example). Since
the new name is more suitable, convert the field to a union that provides
access to the data via both the new and the old name.
Fixes: c601171d7a60 ("net/smc: provide smc mode in smc_diag.c") Signed-off-by: Eugene Syromiatnikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net/smc: use __aligned_u64 for 64-bit smc_diag fields
Commit 4b1b7d3b30a6 ("net/smc: add SMC-D diag support") introduced
new UAPI-exposed structure, struct smcd_diag_dmbinfo. However,
it's not usable by compat binaries, as it has different layout there.
Probably, the most straightforward fix that will avoid similar issues
in the future is to use __aligned_u64 for 64-bit fields.
Fixes: 4b1b7d3b30a6 ("net/smc: add SMC-D diag support") Signed-off-by: Eugene Syromiatnikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Al Viro [Sun, 7 Oct 2018 11:40:17 +0000 (07:40 -0400)]
net: sched: cls_u32: fix hnode refcounting
cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references
via ->hlist and via ->tp_root together. u32_destroy() drops the former
and, in case when there had been links, leaves the sucker on the list.
As the result, there's nothing to protect it from getting freed once links
are dropped.
That also makes the "is it busy" check incapable of catching the root
hnode - it *is* busy (there's a reference from tp), but we don't see it as
something separate. "Is it our root?" check partially covers that, but
the problem exists for others' roots as well.
AFAICS, the minimal fix preserving the existing behaviour (where it doesn't
include oopsen, that is) would be this:
* count tp->root and tp_c->hlist as separate references. I.e.
have u32_init() set refcount to 2, not 1.
* in u32_destroy() we always drop the former;
in u32_destroy_hnode() - the latter.
That way we have *all* references contributing to refcount. List
removal happens in u32_destroy_hnode() (called only when ->refcnt is 1)
an in u32_destroy() in case of tc_u_common going away, along with
everything reachable from it. IOW, that way we know that
u32_destroy_key() won't free something still on the list (or pointed to by
someone's ->root).
Reproducer:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: \
u32 divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: \
u32 divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 100 \
handle 1:0:11 u32 ht 1: link 801: offset at 0 mask 0f00 shift 6 \
plus 0 eat match ip protocol 6 ff
tc filter delete dev eth0 parent ffff: protocol ip prio 200
tc filter change dev eth0 parent ffff: protocol ip prio 100 \
handle 1:0:11 u32 ht 1: link 0: offset at 0 mask 0f00 shift 6 plus 0 \
eat match ip protocol 6 ff
tc filter delete dev eth0 parent ffff: protocol ip prio 100
Jiri Kosina [Thu, 4 Oct 2018 11:37:32 +0000 (13:37 +0200)]
udp: Unbreak modules that rely on external __skb_recv_udp() availability
Commit 2276f58ac589 ("udp: use a separate rx queue for packet reception")
turned static inline __skb_recv_udp() from being a trivial helper around
__skb_recv_datagram() into a UDP specific implementaion, making it
EXPORT_SYMBOL_GPL() at the same time.
There are external modules that got broken by __skb_recv_udp() not being
visible to them. Let's unbreak them by making __skb_recv_udp EXPORT_SYMBOL().
Rationale (one of those) why this is actually "technically correct" thing
to do: __skb_recv_udp() used to be an inline wrapper around
__skb_recv_datagram(), which itself (still, and correctly so, I believe)
is EXPORT_SYMBOL().
Mike Rapoport [Sun, 7 Oct 2018 08:31:51 +0000 (11:31 +0300)]
percpu: stop leaking bitmap metadata blocks
The commit ca460b3c9627 ("percpu: introduce bitmap metadata blocks")
introduced bitmap metadata blocks. These metadata blocks are allocated
whenever a new chunk is created, but they are never freed. Fix it.
All of these have been in linux-next with no reported issues."
* tag 'char-misc-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
thunderbolt: Initialize after IOMMUs
thunderbolt: Do not handle ICM events after domain is stopped
firmware: Always initialize the fw_priv list object
docs: fpga: document fpga manager flags
fpga: bridge: fix obvious function documentation error
tools: hv: fcopy: set 'error' in case an unknown operation was requested
fpga: do not access region struct after fpga_region_unregister
Drivers: hv: vmbus: Use get/put_cpu() in vmbus_connect()
Merge tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
I wrote:
"USB fixes for 4.19-rc7
Here are some small USB fixes for 4.19-rc7
These include:
- the usual xhci bugfixes for reported issues
- some new serial driver device ids
- bugfix for the option serial driver for some devices
- bugfix for the cdc_acm driver that has been there for a long time.
All of these have been in linux-next for a while with no reported
issues."
* tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: xhci-mtk: resume USB3 roothub first
xhci: Add missing CAS workaround for Intel Sunrise Point xHCI
usb: cdc_acm: Do not leak URB buffers
USB: serial: simple: add Motorola Tetra MTP6550 id
USB: serial: option: add two-endpoints device-id flag
USB: serial: option: improve Quectel EP06 detection
Merge tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Michael writes:
"powerpc fixes for 4.19 #4
Four regression fixes.
A fix for a change to lib/xz which broke our zImage loader when
building with XZ compression. OK'ed by Herbert who merged the
original patch.
The recent fix we did to avoid patching __init text broke some 32-bit
machines, fix that.
Our show_user_instructions() could be tricked into printing kernel
memory, add a check to avoid that.
And a fix for a change to our NUMA initialisation logic, which causes
crashes in some kdump configurations.
Thanks to:
Christophe Leroy, Hari Bathini, Jann Horn, Joel Stanley, Meelis
Roos, Murilo Opsfelder Araujo, Srikar Dronamraju."
* tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/numa: Skip onlining a offline node in kdump path
powerpc: Don't print kernel instructions in show_user_instructions()
powerpc/lib: fix book3s/32 boot failure due to code patching
lib/xz: Put CRC32_POLY_LE in xz_private.h
1) Fix truncation of 32-bit right shift in bpf, from Jann Horn.
2) Fix memory leak in wireless wext compat, from Stefan Seyfried.
3) Use after free in cfg80211's reg_process_hint(), from Yu Zhao.
4) Need to cancel pending work when unbinding in smsc75xx otherwise
we oops, also from Yu Zhao.
5) Don't allow enslaving a team device to itself, from Ido Schimmel.
6) Fix backwards compat with older userspace for rtnetlink FDB dumps.
From Mauricio Faria.
7) Add validation of tc policy netlink attributes, from David Ahern.
8) Fix RCU locking in rawv6_send_hdrinc(), from Wei Wang."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
ipv6: take rcu lock in rawv6_send_hdrinc()
net: sched: Add policy validation for tc attributes
rtnetlink: fix rtnl_fdb_dump() for ndmsg header
yam: fix a missing-check bug
net: bpfilter: Fix type cast and pointer warnings
net: cxgb3_main: fix a missing-check bug
bpf: 32-bit RSH verification must truncate input before the ALU op
net: phy: phylink: fix SFP interface autodetection
be2net: don't flip hw_features when VXLANs are added/deleted
net/packet: fix packet drop as of virtio gso
net: dsa: b53: Keep CPU port as tagged in all VLANs
openvswitch: load NAT helper
bnxt_en: get the reduced max_irqs by the ones used by RDMA
bnxt_en: free hwrm resources, if driver probe fails.
bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
bnxt_en: Fix VNIC reservations on the PF.
team: Forbid enslaving team device to itself
net/usb: cancel pending work when unbinding smsc75xx
mlxsw: spectrum: Delete RIF when VLAN device is removed
...