Linus Torvalds [Sat, 31 Oct 2020 21:31:28 +0000 (14:31 -0700)]
Merge tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull more flexible-array member conversions from Gustavo A. R. Silva:
"Replace zero-length arrays with flexible-array members"
* tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
printk: ringbuffer: Replace zero-length array with flexible-array member
net/smc: Replace zero-length array with flexible-array member
net/mlx5: Replace zero-length array with flexible-array member
mei: hw: Replace zero-length array with flexible-array member
gve: Replace zero-length array with flexible-array member
Bluetooth: btintel: Replace zero-length array with flexible-array member
scsi: target: tcmu: Replace zero-length array with flexible-array member
ima: Replace zero-length array with flexible-array member
enetc: Replace zero-length array with flexible-array member
fs: Replace zero-length array with flexible-array member
Bluetooth: Replace zero-length array with flexible-array member
params: Replace zero-length array with flexible-array member
tracepoint: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
====================
IPv6: reply ICMP error if fragment doesn't contain all headers
When our Engineer run latest IPv6 Core Conformance test, test v6LC.1.3.6:
First Fragment Doesn’t Contain All Headers[1] failed. The test purpose is to
verify that the node (Linux for example) should properly process IPv6 packets
that don’t include all the headers through the Upper-Layer header.
Based on RFC 8200, Section 4.5 Fragment Header
- If the first fragment does not include all headers through an
Upper-Layer header, then that fragment should be discarded and
an ICMP Parameter Problem, Code 3, message should be sent to
the source of the fragment, with the Pointer field set to zero.
The first patch add a definition for ICMPv6 Parameter Problem, code 3.
The second patch add a check for the 1st fragment packet to make sure
Upper-Layer header exist.
[1] Page 68, v6LC.1.3.6: First Fragment Doesn’t Contain All Headers part A, B,
C and D at https://ipv6ready.org/docs/Core_Conformance_5_0_0.pdf
[2] My reproducer:
Hangbin Liu [Tue, 27 Oct 2020 12:33:13 +0000 (20:33 +0800)]
IPv6: reply ICMP error if the first fragment don't include all headers
Based on RFC 8200, Section 4.5 Fragment Header:
- If the first fragment does not include all headers through an
Upper-Layer header, then that fragment should be discarded and
an ICMP Parameter Problem, Code 3, message should be sent to
the source of the fragment, with the Pointer field set to zero.
Checking each packet header in IPv6 fast path will have performance impact,
so I put the checking in ipv6_frag_rcv().
As the packet may be any kind of L4 protocol, I only checked some common
protocols' header length and handle others by (offset + 1) > skb->len.
Also use !(frag_off & htons(IP6_OFFSET)) to catch atomic fragments
(fragmented packet with only one fragment).
When send ICMP error message, if the 1st truncated fragment is ICMP message,
icmp6_send() will break as is_ineligible() return true. So I added a check
in is_ineligible() to let fragment packet with nexthdr ICMP but no ICMP header
return false.
Colin Ian King [Tue, 27 Oct 2020 11:49:25 +0000 (11:49 +0000)]
net: atm: fix update of position index in lec_seq_next
The position index in leq_seq_next is not updated when the next
entry is fetched an no more entries are available. This causes
seq_file to report the following error:
"seq_file: buggy .next function lec_seq_next [lec] did not update
position index"
Fix this by always updating the position index.
[ Note: this is an ancient 2002 bug, the sha is from the
tglx/history repo ]
Linus Torvalds [Sat, 31 Oct 2020 19:21:04 +0000 (12:21 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four driver fixes and one core fix.
The core fix closes a race window where we could kick off a second
asynchronous scan because the test and set of the variable preventing
it isn't atomic"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Stop using queue #0 always for v2 hw
scsi: ibmvscsi: Fix potential race after loss of transport
scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove()
scsi: qla2xxx: Return EBUSY on fcport deletion
scsi: core: Don't start concurrent async scan on same host
Andrew Jones [Thu, 29 Oct 2020 20:17:00 +0000 (21:17 +0100)]
KVM: selftests: Don't require THP to run tests
Unless we want to test with THP, then we shouldn't require it to be
configured by the host kernel. Unfortunately, even advising with
MADV_NOHUGEPAGE does require it, so check for THP first in order
to avoid madvise failing with EINVAL.
Vitaly Kuznetsov [Wed, 14 Oct 2020 14:33:46 +0000 (16:33 +0200)]
KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
It was noticed that evmcs_sanitize_exec_ctrls() is not being executed
nowadays despite the code checking 'enable_evmcs' static key looking
correct. Turns out, static key magic doesn't work in '__init' section
(and it is unclear when things changed) but setup_vmcs_config() is called
only once per CPU so we don't really need it to. Switch to checking
'enlightened_vmcs' instead, it is supposed to be in sync with
'enable_evmcs'.
Opportunistically make evmcs_sanitize_exec_ctrls '__init' and drop unneeded
extra newline from it.
Jim Mattson [Mon, 26 Oct 2020 18:09:22 +0000 (11:09 -0700)]
KVM: selftests: test behavior of unmapped L2 APIC-access address
Add a regression test for commit 671ddc700fd0 ("KVM: nVMX: Don't leak
L1 MMIO regions to L2").
First, check to see that an L2 guest can be launched with a valid
APIC-access address that is backed by a page of L1 physical memory.
Next, set the APIC-access address to a (valid) L1 physical address
that is not backed by memory. KVM can't handle this situation, so
resuming L2 should result in a KVM exit for internal error
(emulation).
Stefano Brivio [Thu, 29 Oct 2020 15:39:46 +0000 (16:39 +0100)]
netfilter: ipset: Update byte and packet counters regardless of whether they match
In ip_set_match_extensions(), for sets with counters, we take care of
updating counters themselves by calling ip_set_update_counter(), and of
checking if the given comparison and values match, by calling
ip_set_match_counter() if needed.
However, if a given comparison on counters doesn't match the configured
values, that doesn't mean the set entry itself isn't matching.
This fix restores the behaviour we had before commit 4750005a85f7
("netfilter: ipset: Fix "don't update counters" mode when counters used
at the matching"), without reintroducing the issue fixed there: back
then, mtype_data_match() first updated counters in any case, and then
took care of matching on counters.
Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set,
ip_set_update_counter() will anyway skip counter updates if desired.
The issue observed is illustrated by this reproducer:
ipset create c hash:ip counters
ipset add c 192.0.2.1
iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
if we now send packets from 192.0.2.1, bytes and packets counters
for the entry as shown by 'ipset list' are always zero, and, no
matter how many bytes we send, the rule will never match, because
counters themselves are not updated.
Reported-by: Mithil Mhatre <[email protected]> Fixes: 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching") Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
Linus Torvalds [Fri, 30 Oct 2020 22:02:49 +0000 (15:02 -0700)]
Merge tag 'block-5.10-2020-10-30' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- null_blk zone fixes (Damien, Kanchan)
- NVMe pull request from Christoph:
- improve zone revalidation (Keith Busch)
- gracefully handle zero length messages in nvme-rdma (zhenwei pi)
- nvme-fc error handling fixes (James Smart)
- nvmet tracing NULL pointer dereference fix (Chaitanya Kulkarni)"
- xsysace platform fixes (Andy)
- scatterlist type cleanup (David)
- blk-cgroup memory fixes (Gabriel)
- nbd block size update fix (Ming)
- Flush completion state fix (Ming)
- bio_add_hw_page() iteration fix (Naohiro)
* tag 'block-5.10-2020-10-30' of git://git.kernel.dk/linux-block:
blk-mq: mark flush request as IDLE in flush_end_io()
lib/scatterlist: use consistent sg_copy_buffer() return type
xsysace: use platform_get_resource() and platform_get_irq_optional()
null_blk: Fix locking in zoned mode
null_blk: Fix zone reset all tracing
nbd: don't update block size after device is started
block: advance iov_iter on bio_add_hw_page failure
null_blk: synchronization fix for zoned device
nvmet: fix a NULL pointer dereference when tracing the flush command
nvme-fc: remove nvme_fc_terminate_io()
nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery
nvme-fc: remove err_work work item
nvme-fc: track error_recovery while connecting
nvme-rdma: handle unexpected nvme completion data length
nvme: ignore zone validate errors on subsequent scans
blk-cgroup: Pre-allocate tree node on blkg_conf_prep
blk-cgroup: Fix memleak on error path
printk: ringbuffer: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
net/smc: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
net/mlx5: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
mei: hw: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
gve: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The
older style of one-element or zero-length arrays should no longer be
used[2].
Refactor the code according to the use of a flexible-array member in
struct gve_stats_report, instead of a zero-length array, and use the
struct_size() helper to calculate the size for the resource allocation.
Bluetooth: btintel: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
Linus Torvalds [Fri, 30 Oct 2020 21:55:36 +0000 (14:55 -0700)]
Merge tag 'io_uring-5.10-2020-10-30' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Fixes for linked timeouts (Pavel)
- Set IO_WQ_WORK_CONCURRENT early for async offload (Pavel)
- Two minor simplifications that make the code easier to read and
follow (Pavel)
* tag 'io_uring-5.10-2020-10-30' of git://git.kernel.dk/linux-block:
io_uring: use type appropriate io_kiocb handler for double poll
io_uring: simplify __io_queue_sqe()
io_uring: simplify nxt propagation in io_queue_sqe
io_uring: don't miss setting IO_WQ_WORK_CONCURRENT
io_uring: don't defer put of cancelled ltimeout
io_uring: always clear LINK_TIMEOUT after cancel
io_uring: don't adjust LINK_HEAD in cancel ltimeout
io_uring: remove opcode check on ltimeout kill
Rajat Jain [Wed, 28 Oct 2020 23:15:45 +0000 (16:15 -0700)]
PCI: Always enable ACS even if no ACS Capability
Some devices support ACS functionality even though they don't have a
spec-compliant ACS Capability; pci_enable_acs() has a quirk mechanism to
handle them.
We want to enable ACS whenever possible, but 52fbf5bdeeef ("PCI: Cache ACS
capability offset in device") inadvertently broke this by calling
pci_enable_acs() only if we find an ACS Capability.
This resulted in ACS not being enabled for these non-compliant devices,
which means devices can't be separated into different IOMMU groups, which
in turn means we may not be able to pass those devices through to VMs, as
reported by Boris V:
Linus Torvalds [Fri, 30 Oct 2020 20:29:49 +0000 (13:29 -0700)]
Merge tag 'for-5.10-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- lockdep fixes:
- drop path locks before manipulating sysfs objects or qgroups
- preliminary fixes before tree locks get switched to rwsem
- use annotated seqlock
- build warning fixes (printk format)
- fix relocation vs fallocate race
- tree checker properly validates number of stripes and parity
- readahead vs device replace fixes
- iomap dio fix for unnecessary buffered io fallback
* tag 'for-5.10-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: convert data_seqcount to seqcount_mutex_t
btrfs: don't fallback to buffered read if we don't need to
btrfs: add a helper to read the tree_root commit root for backref lookup
btrfs: drop the path before adding qgroup items when enabling qgroups
btrfs: fix readahead hang and use-after-free after removing a device
btrfs: fix use-after-free on readahead extent after failure to create it
btrfs: tree-checker: validate number of chunk stripes and parity
btrfs: tree-checker: fix incorrect printk format
btrfs: drop the path before adding block group sysfs files
btrfs: fix relocation failure due to race with fallocate
Linus Torvalds [Fri, 30 Oct 2020 20:16:03 +0000 (13:16 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The diffstat is a bit spread out thanks to an invasive CPU erratum
workaround which missed the merge window and also a bunch of fixes to
the recently added MTE selftests.
- Fixes to MTE kselftests
- Fix return code from KVM Spectre-v2 hypercall
- Build fixes for ld.lld and Clang's infamous integrated assembler
- Ensure RCU is up and running before we use printk()
- Fix linker warnings from unexpected ELF sections
- Ensure PE/COFF sections are 64k aligned"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Change .weak to SYM_FUNC_START_WEAK_PI for arch/arm64/lib/mem*.S
arm64/smp: Move rcu_cpu_starting() earlier
arm64: Add workaround for Arm Cortex-A77 erratum 1508412
arm64: Add part number for Arm Cortex-A77
arm64: mte: Document that user PSTATE.TCO is ignored by kernel uaccess
module: use hidden visibility for weak symbol references
arm64: efi: increase EFI PE/COFF header padding to 64 KB
arm64: vmlinux.lds: account for spurious empty .igot.plt sections
kselftest/arm64: Fix check_user_mem test
kselftest/arm64: Fix check_ksm_options test
kselftest/arm64: Fix check_mmap_options test
kselftest/arm64: Fix check_child_memory test
kselftest/arm64: Fix check_tags_inclusion test
kselftest/arm64: Fix check_buffer_fill test
arm64: avoid -Woverride-init warning
KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED
arm64: vdso32: Allow ld.lld to properly link the VDSO
Linus Torvalds [Fri, 30 Oct 2020 20:11:46 +0000 (13:11 -0700)]
Merge tag 'asm-generic-fixes-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fix from Arnd Bergmann:
"One small bugfix, fixing a build regression for RISC-V"
* tag 'asm-generic-fixes-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: mark __{get,put}_user_fn as __always_inline
Linus Torvalds [Fri, 30 Oct 2020 20:06:07 +0000 (13:06 -0700)]
Merge tag 'arm-soc-fixes-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is a fairly large set of bug fixes on top of -rc1, as most of
them were ready but didn't quite make it into the last-minute pull
requests for the merge window.
Allwinner:
- fix for incorrect CPU overtemperature limit
Amlogic:
- multiple smaller DT bugfixes, and missing device nodes
Marvell EBU:
- add missing aliases for ethernet switch ports on espressobin board
Marvell MMP:
- DTC warning fix
- bugfix for camera interface power-down
NXP i.MX:
- re-enable the GPIO driver on all defconfigs
ST STM32MP1:
- fix random crashes from incorrect voltage settings
Synaptics Berlin:
- enable the correct hardware timer driver
Texas Instruments K2G:
- fix a boot regression in the power domain code
TEE drivers:
- fix regression in TEE "login" method
SCMI drivers:
- multiple code fixes for corner cases in newly added code
MAINTAINERS file:
- move Kukjin Kim and Sangbeom Kim to credits (used to work on
Samsung Exynos)
- Masahiro Yamada is stepping down as Uniphier maintainer
I did not include a series of patches that work around a regression
caused by a bugfix in an ethernet phy driver that resulted in an
inadvertent DT binding change. This is still under discussion"
* tag 'arm-soc-fixes-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits)
soc: ti: ti_sci_pm_domains: check for proper args count in xlate
ARM: dts: stm32: Describe Vin power supply on stm32mp157c-edx board
ARM: dts: stm32: Describe Vin power supply on stm32mp15xx-dkx board
ARM: multi_v5_defconfig: Select CONFIG_GPIO_MXC
ARM: imx_v4_v5_defconfig: Select CONFIG_GPIO_MXC
ARM: dts: mmp2-olpc-xo-1-75: Use plural form of "-gpios"
ARM: dts: mmp3: Add power domain for the camera
arm64: berlin: Select DW_APB_TIMER_OF
dt-bindings: sram: sunxi-sram: add V3s compatible string
MAINTAINERS: Move Sangbeom Kim to credits
MAINTAINERS: Move Kukjin Kim to credits
MAINTAINERS: step down as maintainer of UniPhier SoCs and Denali driver
ARM: multi_v7_defconfig: Build in CONFIG_GPIO_MXC by default
ARM: imx_v6_v7_defconfig: Build in CONFIG_GPIO_MXC by default
arm64: defconfig: Build in CONFIG_GPIO_MXC by default
arm64: dts: meson: odroid-n2 plus: fix vddcpu_a pwm
ARM: dts: meson8: remove two invalid interrupt lines from the GPU node
arm64: dts: amlogic: add missing ethernet reset ID
firmware: arm_scmi: Fix duplicate workqueue name
firmware: arm_scmi: Fix locking in notifications
...
Linus Torvalds [Fri, 30 Oct 2020 20:00:10 +0000 (13:00 -0700)]
Merge tag 'pnp-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull PNP fix from Rafael Wysocki:
"Make function names in kerneldoc comments match the actual names of
the functions that they correspond to (Mauro Carvalho Chehab)"
* tag 'pnp-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PNP: fix kernel-doc markups
Linus Torvalds [Fri, 30 Oct 2020 19:53:49 +0000 (12:53 -0700)]
Merge tag 'devprop-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework fixes from Rafael Wysocki:
"Fix the secondary firmware node handling while manipulating the
primary firmware node for a given device (Andy Shevchenko)"
* tag 'devprop-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Don't clear secondary pointer for shared primary firmware node
device property: Keep secondary firmware node secondary by type
Linus Torvalds [Fri, 30 Oct 2020 19:45:04 +0000 (12:45 -0700)]
Merge tag 'pm-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a few issues related to running intel_pstate in the passive
mode with HWP enabled, correct the handling of the max_cstate module
parameter in intel_idle and make a few janitorial changes.
Specifics:
- Modify Kconfig to prevent configuring either the "conservative" or
the "ondemand" governor as the default cpufreq governor if
intel_pstate is selected, in which case "schedutil" is the default
choice for the default governor setting (Rafael Wysocki).
- Modify the cpufreq core, intel_pstate and the schedutil governor to
avoid missing updates of the HWP max limit when intel_pstate
operates in the passive mode with HWP enabled (Rafael Wysocki).
- Fix max_cstate module parameter handling in intel_idle for
processor models with C-state tables coming from ACPI (Chen Yu).
- Clean up assorted pieces of power management code (Jackie Zamow,
Tom Rix, Zhang Qilong)"
* tag 'pm-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: schedutil: Always call driver if CPUFREQ_NEED_UPDATE_LIMITS is set
cpufreq: Introduce cpufreq_driver_test_flags()
cpufreq: speedstep: remove unneeded semicolon
PM: sleep: fix typo in kernel/power/process.c
intel_idle: Fix max_cstate for processor models without C-state tables
cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode
cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag
cpufreq: Avoid configuring old governors as default with intel_pstate
cpufreq: e_powersaver: remove unreachable break
Linus Torvalds [Fri, 30 Oct 2020 18:04:11 +0000 (11:04 -0700)]
Merge tag 'mmc-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- sdhci: Fix performance regression with auto CMD auto select
- sdhci-of-esdhc: Fix initialization for eMMC HS400 mode
- sdhci-of-esdhc: Fix timeout bug for tuning commands
* tag 'mmc-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-esdhc: make sure delay chain locked for HS400
mmc: sdhci-of-esdhc: set timeout to max before tuning
mmc: sdhci: Use Auto CMD Auto Select only when v4_mode is true
Linus Torvalds [Fri, 30 Oct 2020 17:54:44 +0000 (10:54 -0700)]
Merge tag 'drm-fixes-2020-10-30-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"A busier rc2 than normal, have larger sets of fixes for amdgpu +
nouveau, along with some i915, docs, core, panel, sun4i, v3d, vc4
fixes.
Nothing spooky though or pumpkin related.
docs:
- kernel doc fixes
core:
- fix shmem helpers dma-buf mmap bug
amdgpu:
- Add new navi1x PCI ID
- GPUVM reserved area fixes
- Misc display fixes
- Fix bad interactions between display code and CONFIG_KGDB
- Fixes for SMU manual fan control and i2c
* tag 'drm-fixes-2020-10-30-1' of git://anongit.freedesktop.org/drm/drm: (42 commits)
drm/nouveau/kms/nv50-: Fix clock checking algorithm in nv50_dp_mode_valid()
drm/nouveau/kms/nv50-: Get rid of bogus nouveau_conn_mode_valid()
drm/nouveau/device: fix changing endianess code to work on older GPUs
drm/nouveau/gem: fix "refcount_t: underflow; use-after-free"
drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps
drm/nouveau/nouveau: fix the start/end range for migration
drm/i915: Reject 90/270 degree rotated initial fbs
drm/i915: Restore ILK-M RPS support
drm/i915/region: fix max size calculation
drm/vc4: Rework the structure conversion functions
drm/vc4: hdmi: Add a name to the codec DAI component
drm/shme-helpers: Fix dma_buf_mmap forwarding bug
drm/vc4: hdmi: Avoid sleeping in atomic context
drm/amdgpu/pm: fix the fan speed in fan1_input in manual mode for navi1x
drm/amd/pm: fix the wrong fan speed in fan1_input
drm/amdgpu/swsmu: drop smu i2c bus on navi1x
drm/vc4: drv: Add error handding for bind
drm: drm_print.h: fix kernel-doc markups
drm: kernel-doc: drm_dp_helper.h: fix a typo
drm: kernel-doc: add description for a new function parameter
...
Takashi Iwai [Fri, 30 Oct 2020 15:14:14 +0000 (16:14 +0100)]
KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
The newly introduced kvm_msr_ignored_check() tries to print error or
debug messages via vcpu_*() macros, but those may cause Oops when NULL
vcpu is passed for KVM_GET_MSRS ioctl.
Fix it by replacing the print calls with kvm_*() macros.
(Note that this will leave vcpu argument completely unused in the
function, but I didn't touch it to make the fix as small as
possible. A clean up may be applied later.)
Paolo Bonzini [Fri, 30 Oct 2020 17:39:55 +0000 (13:39 -0400)]
KVM: x86: replace static const variables with macros
Even though the compiler is able to replace static const variables with
their value, it will warn about them being unused when Linux is built with W=1.
Use good old macros instead, this is not C++.
Paolo Bonzini [Fri, 30 Oct 2020 17:25:09 +0000 (13:25 -0400)]
Merge tag 'kvmarm-fixes-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.10, take #1
- Force PTE mapping on device pages provided via VFIO
- Fix detection of cacheable mapping at S2
- Fallback to PMD/PTE mappings for composite huge pages
- Fix accounting of Stage-2 PGD allocation
- Fix AArch32 handling of some of the debug registers
- Simplify host HYP entry
- Fix stray pointer conversion on nVHE TLB invalidation
- Fix initialization of the nVHE code
- Simplify handling of capabilities exposed to HYP
- Nuke VCPUs caught using a forbidden AArch32 EL0
Since commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user
input parsing bits"), ECC are broken in FMC2 driver in case of
nand-ecc-step-size and nand-ecc-strength are not set in the device tree.
To avoid this issue, the default settings are now set in
stm32_fmc2_nfc_attach_chip function.
Marek Szyprowski [Thu, 29 Oct 2020 18:50:11 +0000 (19:50 +0100)]
net: stmmac: Fix channel lock initialization
Commit 0366f7e06a6b ("net: stmmac: add ethtool support for get/set
channels") refactored channel initialization, but during that operation,
the spinlock initialization got lost. Fix this. This fixes the following
lockdep warning:
meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 1 PID: 331 Comm: kworker/1:2H Not tainted 5.9.0-rc3+ #1858
Hardware name: Hardkernel ODROID-N2 (DT)
Workqueue: kblockd blk_mq_run_work_fn
Call trace:
dump_backtrace+0x0/0x1d0
show_stack+0x14/0x20
dump_stack+0xe8/0x154
register_lock_class+0x58c/0x590
__lock_acquire+0x7c/0x1790
lock_acquire+0xf4/0x440
_raw_spin_lock_irqsave+0x80/0xb0
stmmac_tx_timer+0x4c/0xb0 [stmmac]
call_timer_fn+0xc4/0x3e8
run_timer_softirq+0x2b8/0x6c0
efi_header_end+0x114/0x5f8
irq_exit+0x104/0x110
__handle_domain_irq+0x60/0xb8
gic_handle_irq+0x58/0xb0
el1_irq+0xbc/0x180
_raw_spin_unlock_irqrestore+0x48/0x90
mmc_blk_rw_wait+0x70/0x160
mmc_blk_mq_issue_rq+0x510/0x830
mmc_mq_queue_rq+0x13c/0x278
blk_mq_dispatch_rq_list+0x2a0/0x698
__blk_mq_do_dispatch_sched+0x254/0x288
__blk_mq_sched_dispatch_requests+0x190/0x1d8
blk_mq_sched_dispatch_requests+0x34/0x70
__blk_mq_run_hw_queue+0xcc/0x148
blk_mq_run_work_fn+0x20/0x28
process_one_work+0x2a8/0x718
worker_thread+0x48/0x460
kthread+0x134/0x160
ret_from_fork+0x10/0x1c
Wong Vee Khee [Thu, 29 Oct 2020 09:32:28 +0000 (17:32 +0800)]
stmmac: intel: Fix kernel panic on pci probe
The commit "stmmac: intel: Adding ref clock 1us tic for LPI cntr"
introduced a regression which leads to the kernel panic duing loading
of the dwmac_intel module.
Move the code block after pci resources is obtained.
Claudiu Manoil [Tue, 20 Oct 2020 17:36:05 +0000 (20:36 +0300)]
gianfar: Account for Tx PTP timestamp in the skb headroom
When PTP timestamping is enabled on Tx, the controller
inserts the Tx timestamp at the beginning of the frame
buffer, between SFD and the L2 frame header. This means
that the skb provided by the stack is required to have
enough headroom otherwise a new skb needs to be created
by the driver to accommodate the timestamp inserted by h/w.
Up until now the driver was relying on the second option,
using skb_realloc_headroom() to create a new skb to accommodate
PTP frames. Turns out that this method is not reliable, as
reallocation of skbs for PTP frames along with the required
overhead (skb_set_owner_w, consume_skb) is causing random
crashes in subsequent skb_*() calls, when multiple concurrent
TCP streams are run at the same time on the same device
(as seen in James' report).
Note that these crashes don't occur with a single TCP stream,
nor with multiple concurrent UDP streams, but only when multiple
TCP streams are run concurrently with the PTP packet flow
(doing skb reallocation).
This patch enforces the first method, by requesting enough
headroom from the stack to accommodate PTP frames, and so avoiding
skb_realloc_headroom() & co, and the crashes no longer occur.
There's no reason not to set needed_headroom to a large enough
value to accommodate PTP frames, so in this regard this patch
is a fix.
Claudiu Manoil [Thu, 29 Oct 2020 08:10:56 +0000 (10:10 +0200)]
gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
When PTP timestamping is enabled on Tx, the controller
inserts the Tx timestamp at the beginning of the frame
buffer, between SFD and the L2 frame header. This means
that the skb provided by the stack is required to have
enough headroom otherwise a new skb needs to be created
by the driver to accommodate the timestamp inserted by h/w.
Up until now the driver was relying on skb_realloc_headroom()
to create new skbs to accommodate PTP frames. Turns out that
this method is not reliable in this context at least, as
skb_realloc_headroom() for PTP frames can cause random crashes,
mostly in subsequent skb_*() calls, when multiple concurrent
TCP streams are run at the same time with the PTP flow
on the same device (as seen in James' report). I also noticed
that when the system is loaded by sending multiple TCP streams,
the driver receives cloned skbs in large numbers.
skb_cow_head() instead proves to be stable in this scenario,
and not only handles cloned skbs too but it's also more efficient
and widely used in other drivers.
The commit introducing skb_realloc_headroom in the driver
goes back to 2009, commit 93c1285c5d92
("gianfar: reallocate skb when headroom is not enough for fcb").
For practical purposes I'm referencing a newer commit (from 2012)
that brings the code to its current structure (and fixes the PTP
case).
Peter Zijlstra [Tue, 27 Oct 2020 12:48:34 +0000 (13:48 +0100)]
lockdep: Fix nr_unused_locks accounting
Chris reported that commit 24d5a3bffef1 ("lockdep: Fix
usage_traceoverflow") breaks the nr_unused_locks validation code
triggered by /proc/lockdep_stats.
By fully splitting LOCK_USED and LOCK_USED_READ it becomes a bad
indicator for accounting nr_unused_locks; simplyfy by using any first
bit.
Peter Zijlstra [Mon, 26 Oct 2020 15:22:56 +0000 (16:22 +0100)]
locking/lockdep: Remove more raw_cpu_read() usage
I initially thought raw_cpu_read() was OK, since if it is !0 we have
IRQs disabled and can't get migrated, so if we get migrated both CPUs
must have 0 and it doesn't matter which 0 we read.
And while that is true; it isn't the whole store, on pretty much all
architectures (except x86) this can result in computing the address for
one CPU, getting migrated, the old CPU continuing execution with another
task (possibly setting recursion) and then the new CPU reading the value
of the old CPU, which is no longer 0.
Similer to:
baffd723e44d ("lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"")
Qais Yousef [Tue, 27 Oct 2020 21:51:13 +0000 (21:51 +0000)]
KVM: arm64: Handle Asymmetric AArch32 systems
On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on a core without AArch32. Add an extra
check so that if we catch the guest doing that, then we prevent it from
running again by resetting vcpu->arch.target and return
ARM_EXCEPTION_IL.
We try to catch this misbehaviour as early as possible and not rely on
an illegal exception occuring to signal the problem. Attempting to run a
32bit app in the guest will produce an error from QEMU if the guest
exits while running in AArch32 EL0.
Tested on Juno by instrumenting the host to fake asym aarch32 and
instrumenting KVM to make the asymmetry visible to the guest.
Greg Ungerer [Wed, 28 Oct 2020 05:22:32 +0000 (15:22 +1000)]
net: fec: fix MDIO probing for some FEC hardware blocks
Some (apparently older) versions of the FEC hardware block do not like
the MMFR register being cleared to avoid generation of MII events at
initialization time. The action of clearing this register results in no
future MII events being generated at all on the problem block. This means
the probing of the MDIO bus will find no PHYs.
Create a quirk that can be checked at the FECs MII init time so that
the right thing is done. The quirk is set as appropriate for the FEC
hardware blocks that are known to need this.
ip6_tunnel: set inner ipproto before ip6_tnl_encap
ip6_tnl_encap assigns to proto transport protocol which
encapsulates inner packet, but we must pass to set_inner_ipproto
protocol of that inner packet.
Calling set_inner_ipproto after ip6_tnl_encap might break gso.
For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
incorrect calling sequence of gso functions:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
instead of:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
Ming Lei [Fri, 30 Oct 2020 02:42:53 +0000 (10:42 +0800)]
blk-mq: mark flush request as IDLE in flush_end_io()
Mark flush request as IDLE in its .end_io(), aligning it with how normal
requests behave. The flush request stays in in-flight tags if we're not
using an IO scheduler, so we need to change its state into IDLE.
Otherwise, we will hang in blk_mq_tagset_wait_completed_request() during
error recovery because flush the request state is kept as COMPLETED.
Merge tag 'icc-5.10-rc2' of https://git.linaro.org/people/georgi.djakov/linux into char-misc-linus
Georgi writes:
interconnect fixes for v5.10
This contains one core fix and a few driver fixes.
- Fix the core to perform also aggregation when setting the initial
bandwidth with sync_state.
- Fixes in some drivers to make sure the correct sequence is used for
initialization when we use sync_state.
- Fix in the sdm845 driver to prevent a board hang that was hit when
bandwidth scaling for display and multimedia was enabled.
Signed-off-by: Georgi Djakov <[email protected]>
* tag 'icc-5.10-rc2' of https://git.linaro.org/people/georgi.djakov/linux:
interconnect: qcom: use icc_sync state for sm8[12]50
interconnect: qcom: Ensure that the floor bandwidth value is enforced
interconnect: qcom: sc7180: Init BCMs before creating the nodes
interconnect: qcom: sdm845: Init BCMs before creating the nodes
interconnect: Aggregate before setting initial bandwidth
interconnect: qcom: sdm845: Enable keepalive for the MM1 BCM
The ABI files are supposed to be unique. Yet,
in the specific case of hw_pattern, there are some duplicated
entries as warned by scripts/get_abi.pl:
Warning: /sys/class/leds/<led>/hw_pattern is defined 3 times: Documentation/ABI/testing/sysfs-class-led-trigger-pattern:14 Documentation/ABI/testing/sysfs-class-led-driver-sc27xx:0 Documentation/ABI/testing/sysfs-class-led-driver-el15203000:0
Drop the duplication from the ABI files, moving the specific
definitions to files inside Documentation/leds.
docs: ABI: sysfs-class-backlight: unify ABI documentation
Both adp8860 and adp8870 define some extensions to the
backlight class. This causes warnings:
Warning: /sys/class/backlight/<backlight>/ambient_light_level is defined 2 times: /sys/class/backlight/<backlight>/ambient_light_level:8 /sys/class/backlight/<backlight>/ambient_light_level:30
Warning: /sys/class/backlight/<backlight>/ambient_light_zone is defined 2 times: /sys/class/backlight/<backlight>/ambient_light_zone:18 /sys/class/backlight/<backlight>/ambient_light_zone:40
As ABI definitions shouldn't be duplicated.
Unfortunately, the ABI is dependent on the specific device
features. As such, ambient_light_level range is somewhat
different among the supported devices.
The ambient_light_zone is even worse: the meanings of each
preset are different, and there's no ABI to retrieve
the supported types nor their meanins. Unfortunately,
it is too late to fix it without causing regressions,
as this has been used since Kernel v2.6.35.
Rewrite those ABI documentation using the current documentation
as a reference, and double-checking at the datasheets:
docs: ABI: sysfs-c2port: remove a duplicated entry
As warned by scripts/get_abi.pl:
Warning: /sys/class/c2port/c2portX/flash_erase is defined 2 times: Documentation/ABI/testing/sysfs-c2port:60 Documentation/ABI/testing/sysfs-c2port:68
This entry was added twice at the same patch. Probalby a
cut-and paste issue.
The ABI is not supposed to have duplicated entries, as warned
by get_abi.pl:
$ ./scripts/get_abi.pl validate 2>&1|grep sysfs-class-power
Warning: /sys/class/power_supply/<supply_name>/current_avg is defined 2 times: Documentation/ABI/testing/sysfs-class-power:108 Documentation/ABI/testing/sysfs-class-power:391
Warning: /sys/class/power_supply/<supply_name>/current_max is defined 2 times: Documentation/ABI/testing/sysfs-class-power:121 Documentation/ABI/testing/sysfs-class-power:404
Warning: /sys/class/power_supply/<supply_name>/current_now is defined 2 times: Documentation/ABI/testing/sysfs-class-power:130 Documentation/ABI/testing/sysfs-class-power:414
Warning: /sys/class/power_supply/<supply_name>/temp is defined 2 times: Documentation/ABI/testing/sysfs-class-power:281 Documentation/ABI/testing/sysfs-class-power:493
Warning: /sys/class/power_supply/<supply_name>/temp_alert_max is defined 2 times: Documentation/ABI/testing/sysfs-class-power:291 Documentation/ABI/testing/sysfs-class-power:505
Warning: /sys/class/power_supply/<supply_name>/temp_alert_min is defined 2 times: Documentation/ABI/testing/sysfs-class-power:306 Documentation/ABI/testing/sysfs-class-power:521
Warning: /sys/class/power_supply/<supply_name>/temp_max is defined 2 times: Documentation/ABI/testing/sysfs-class-power:322 Documentation/ABI/testing/sysfs-class-power:537
Warning: /sys/class/power_supply/<supply_name>/temp_min is defined 2 times: Documentation/ABI/testing/sysfs-class-power:333 Documentation/ABI/testing/sysfs-class-power:547
Warning: /sys/class/power_supply/<supply_name>/voltage_max is defined 2 times: Documentation/ABI/testing/sysfs-class-power:356 Documentation/ABI/testing/sysfs-class-power:571
Warning: /sys/class/power_supply/<supply_name>/voltage_min is defined 2 times: Documentation/ABI/testing/sysfs-class-power:367 Documentation/ABI/testing/sysfs-class-power:581
Warning: /sys/class/power_supply/<supply_name>/voltage_now is defined 2 times: Documentation/ABI/testing/sysfs-class-power:378 Documentation/ABI/testing/sysfs-class-power:591
Yet, both USB and Battery share a common set of charging-related
properties.
Unify the entries for such properties in order to avoid
duplication, while preserving the battery and USB-specific
data properly documented.
Unfortunately, (R) and (W) are valid markups for enumerated
lists, as described at:
https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#enumerated-lists
So, we ned to replace them by:
(R) -> (Read)
(W) -> (Write)
As otherwise, (R) will be displayed as R., with is not what
it is desired.
docs: Kconfig/Makefile: add a check for broken ABI files
The files under Documentation/ABI should follow the syntax
as defined at Documentation/ABI/README.
Allow checking if they're following the syntax by running
the ABI parser script on COMPILE_TEST.
With that, when there's a problem with a file under
Documentation/ABI, it would produce a warning like:
Warning: file ./Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats#14:
What '/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor' doesn't have a description
Warning: file ./Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats#21:
What '/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal' doesn't have a description
docs: ABI: make it parse ABI/stable as ReST-compatible files
Now that the stable ABI files are compatible with ReST,
parse them without converting complex descriptions as literal
blocks nor escaping special characters.
Please notice that escaping special characters will probably
be needed at descriptions, at least for the asterisk character.
docs: ABI: README: specify that files should be ReST compatible
As we plan to remove the escaping code from the scripts/get_abi.pl,
specify at the ABI README file that the content of the file should
be ReST compatible.
docs: kernel_abi.py: Handle with a lazy Sphinx parser
The Sphinx docutils parser is lazy: if the content is bigger than
a certain number of lines, it silenlty stops parsing it,
producing an incomplete content. This seems to be worse on newer
Sphinx versions, like 2.0.
So, change the logic to parse the contents per input file.
scripts: get_abi.pl: detect duplicated ABI definitions
The ABI should define only once each What. The current script
logic assumes that.
However, that's not the case, currently: there are several
symbols with a generic definition, and per-driver ones.
Better handle such cases, by preserving the cross-references
with the files that define them, but also track such
cases, producing warnings, as they should be fixed.
scripts: get_abi.pl: improve its parser to better catch up indentation
The original parser for indentation were relying on having
just one description for each "what". However, that's not
the case: there are a number of ABI symbols that got defined
multiple times.
Improve the parser for it to better handle descriptions
if entries are duplicated.
scripts: get_abi.pl: Allow optionally record from where a line came from
The get_abi.pl reads a lot of files and can join them on a
single output file. Store where each "What:" output came from,
in order to be able to optionally display it.
This is useful for the Sphinx extension, with can now be
able to blame what ABI file has issues, and on what line
the What: description with problems begin.
When the source ABI file is using ReST notation, the script
should handle whitespaces and lines with care, as otherwise
the file won't be properly recognized.
Address the bugs that are on such part of the script.
netfilter: nf_tables: missing validation from the abort path
If userspace does not include the trailing end of batch message, then
nfnetlink aborts the transaction. This allows to check that ruleset
updates trigger no errors.
After this patch, invoking this command from the prerouting chain:
# nft -c add rule x y fib saddr . oif type local
fails since oif is not supported there.
This patch fixes the lack of rule validation from the abort/check path
to catch configuration errors such as the one above.
netfilter: use actual socket sk rather than skb sk when routing harder
If netfilter changes the packet mark when mangling, the packet is
rerouted using the route_me_harder set of functions. Prior to this
commit, there's one big difference between route_me_harder and the
ordinary initial routing functions, described in the comment above
__ip_queue_xmit():
/* Note: skb->sk can be different from sk, in case of tunnels */
int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
That function goes on to correctly make use of sk->sk_bound_dev_if,
rather than skb->sk->sk_bound_dev_if. And indeed the comment is true: a
tunnel will receive a packet in ndo_start_xmit with an initial skb->sk.
It will make some transformations to that packet, and then it will send
the encapsulated packet out of a *new* socket. That new socket will
basically always have a different sk_bound_dev_if (otherwise there'd be
a routing loop). So for the purposes of routing the encapsulated packet,
the routing information as it pertains to the socket should come from
that socket's sk, rather than the packet's original skb->sk. For that
reason __ip_queue_xmit() and related functions all do the right thing.
One might argue that all tunnels should just call skb_orphan(skb) before
transmitting the encapsulated packet into the new socket. But tunnels do
*not* do this -- and this is wisely avoided in skb_scrub_packet() too --
because features like TSQ rely on skb->destructor() being called when
that buffer space is truely available again. Calling skb_orphan(skb) too
early would result in buffers filling up unnecessarily and accounting
info being all wrong. Instead, additional routing must take into account
the new sk, just as __ip_queue_xmit() notes.
So, this commit addresses the problem by fishing the correct sk out of
state->sk -- it's already set properly in the call to nf_hook() in
__ip_local_out(), which receives the sk as part of its normal
functionality. So we make sure to plumb state->sk through the various
route_me_harder functions, and then make correct use of it following the
example of __ip_queue_xmit().
wireguard: selftests: check that route_me_harder packets use the right sk
If netfilter changes the packet mark, the packet is rerouted. The
ip_route_me_harder family of functions fails to use the right sk, opting
to instead use skb->sk, resulting in a routing loop when used with
tunnels. With the next change fixing this issue in netfilter, test for
the relevant condition inside our test suite, since wireguard was where
the bug was discovered.
Merge tag 'usb-v5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus
Peter writes:
4 bug fixes for Cadence 3 driver.
The biggest fix is changing endpoint configuration method to
avoid FIFO overflow at multiple endpoints situation.
* tag 'usb-v5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb:
usb: cdns3: gadget: own the lock wrongly at the suspend routine
usb: cdns3: Fix on-chip memory overflow issue
usb: cdns3: gadget: suspicious implicit sign extension
usb: cdns3: Variable 'length' set but not used
Johannes Berg [Tue, 13 Oct 2020 12:01:57 +0000 (14:01 +0200)]
mac80211: don't require VHT elements for HE on 2.4 GHz
After the previous similar bugfix there was another bug here,
if no VHT elements were found we also disabled HE. Fix this to
disable HE only on the 5 GHz band; on 6 GHz it was already not
disabled, and on 2.4 GHz there need (should) not be any VHT.
Johannes Berg [Fri, 9 Oct 2020 12:17:11 +0000 (14:17 +0200)]
mac80211: always wind down STA state
When (for example) an IBSS station is pre-moved to AUTHORIZED
before it's inserted, and then the insertion fails, we don't
clean up the fast RX/TX states that might already have been
created, since we don't go through all the state transitions
again on the way down.
Do that, if it hasn't been done already, when the station is
freed. I considered only freeing the fast TX/RX state there,
but we might add more state so it's more robust to wind down
the state properly.
Note that we warn if the station was ever inserted, it should
have been properly cleaned up in that case, and the driver
will probably not like things happening out of order.
Johannes Berg [Fri, 9 Oct 2020 11:58:22 +0000 (13:58 +0200)]
cfg80211: initialize wdev data earlier
There's a race condition in the netdev registration in that
NETDEV_REGISTER actually happens after the netdev is available,
and so if we initialize things only there, we might get called
with an uninitialized wdev through nl80211 - not using a wdev
but using a netdev interface index.
I found this while looking into a syzbot report, but it doesn't
really seem to be related, and unfortunately there's no repro
for it (yet). I can't (yet) explain how it managed to get into
cfg80211_release_pmsr() from nl80211_netlink_notify() without
the wdev having been initialized, as the latter only iterates
the wdevs that are linked into the rdev, which even without the
change here happened after init.
However, looking at this, it seems fairly clear that the init
needs to be done earlier, otherwise we might even re-init on a
netns move, when data might still be pending.
Johannes Berg [Fri, 9 Oct 2020 11:25:41 +0000 (13:25 +0200)]
mac80211: fix use of skb payload instead of header
When ieee80211_skb_resize() is called from ieee80211_build_hdr()
the skb has no 802.11 header yet, in fact it consist only of the
payload as the ethernet frame is removed. As such, we're using
the payload data for ieee80211_is_mgmt(), which is of course
completely wrong. This didn't really hurt us because these are
always data frames, so we could only have added more tailroom
than we needed if we determined it was a management frame and
sdata->crypto_tx_tailroom_needed_cnt was false.
However, syzbot found that of course there need not be any payload,
so we're using at best uninitialized memory for the check.
Fix this to pass explicitly the kind of frame that we have instead
of checking there, by replacing the "bool may_encrypt" argument
with an argument that can carry the three possible states - it's
not going to be encrypted, it's a management frame, or it's a data
frame (and then we check sdata->crypto_tx_tailroom_needed_cnt).