Aya Levin [Wed, 12 Aug 2020 07:44:36 +0000 (10:44 +0300)]
net/mlx5e: Fix return status when setting unsupported FEC mode
Verify the configured FEC mode is supported by at least a single link
mode before applying the command. Otherwise fail the command and return
"Operation not supported".
Prior to this patch, the command was successful, yet it falsely set all
link modes to FEC auto mode - like configuring FEC mode to auto. Auto
mode is the default configuration if a link mode doesn't support the
configured FEC mode.
Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Aya Levin [Sun, 9 Aug 2020 09:34:21 +0000 (12:34 +0300)]
net/mlx5e: Fix driver's declaration to support GRE offload
Declare GRE offload support with respect to the inner protocol. Add a
list of supported inner protocols on which the driver can offload
checksum and GSO. For other protocols, inform the stack to do the needed
operations. There is no noticeable impact on GRE performance.
Fixes: 2729984149e6 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
The cited commit introduced the following coverity issue at function
mlx5_tc_ct_rule_to_tuple_nat:
- Memory - corruptions (OVERRUN)
Overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte
elements at element index 7 (byte offset 31) using index
"ip6_offset" (which evaluates to 7).
In case of IPv6 destination address rewrite, ip6_offset values are
between 4 to 7, which will cause memory overrun of array
"tuple->ip.src_v6.in6_u.u6_addr32" to array
"tuple->ip.dst_v6.in6_u.u6_addr32".
Fixed by writing the value directly to array
"tuple->ip.dst_v6.in6_u.u6_addr32" in case ip6_offset values are
between 4 to 7.
Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Aya Levin [Mon, 20 Jul 2020 13:53:18 +0000 (16:53 +0300)]
net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU
Prior to this fix, in Striding RQ mode the driver was vulnerable when
receiving packets in the range (stride size - headroom, stride size].
Where stride size is calculated by mtu+headroom+tailroom aligned to the
closest power of 2.
Usually, this filtering is performed by the HW, except for a few cases:
- Between 2 VFs over the same PF with different MTUs
- On bluefield, when the host physical function sets a larger MTU than
the ARM has configured on its representor and uplink representor.
When the HW filtering is not present, packets that are larger than MTU
might be harmful for the RQ's integrity, in the following impacts:
1) Overflow from one WQE to the next, causing a memory corruption that
in most cases is unharmful: as the write happens to the headroom of next
packet, which will be overwritten by build_skb(). In very rare cases,
high stress/load, this is harmful. When the next WQE is not yet reposted
and points to existing SKB head.
2) Each oversize packet overflows to the headroom of the next WQE. On
the last WQE of the WQ, where addresses wrap-around, the address of the
remainder headroom does not belong to the next WQE, but it is out of the
memory region range. This results in a HW CQE error that moves the RQ
into an error state.
Solution:
Add a page buffer at the end of each WQE to absorb the leak. Actually
the maximal overflow size is headroom but since all memory units must be
of the same size, we use page size to comply with UMR WQEs. The increase
in memory consumption is of a single page per RQ. Initialize the mkey
with all MTTs pointing to a default page. When the channels are
activated, UMR WQEs will redirect the RX WQEs to the actual memory from
the RQ's pool, while the overflow MTTs remain mapped to the default page.
Fixes: 24163189da48 ("net/mlx5: Separate IRQ request/free from EQ life cycle") Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible
In case of pci is offline reclaim_pages_cmd() will still try to call
the FW to release FW pages, cmd_exec() in this case will return a silent
success without actually calling the FW.
This is wrong and will cause page leaks, what we should do is to detect
pci offline or command interface un-available before tying to access the
FW and manually release the FW pages in the driver.
In this patch we share the code to check for FW command interface
availability and we call it in sensitive places e.g. reclaim_pages_cmd().
Alternative fix:
1. Remove MLX5_CMD_OP_MANAGE_PAGES form mlx5_internal_err_ret_value,
command success simulation list.
2. Always Release FW pages even if cmd_exec fails in reclaim_pages_cmd().
Eran Ben Elisha [Mon, 31 Aug 2020 12:04:35 +0000 (15:04 +0300)]
net/mlx5: Add retry mechanism to the command entry index allocation
It is possible that new command entry index allocation will temporarily
fail. The new command holds the semaphore, so it means that a free entry
should be ready soon. Add one second retry mechanism before returning an
error.
Patch "net/mlx5: Avoid possible free of command entry while timeout comp
handler" increase the possibility to bump into this temporarily failure
as it delays the entry index release for non-callback commands.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Eran Ben Elisha [Tue, 21 Jul 2020 07:25:52 +0000 (10:25 +0300)]
net/mlx5: poll cmd EQ in case of command timeout
Once driver detects a command interface command timeout, it warns the
user and returns timeout error to the caller. In such case, the entry of
the command is not evacuated (because only real event interrupt is allowed
to clear command interface entry). If the HW event interrupt
of this entry will never arrive, this entry will be left unused forever.
Command interface entries are limited and eventually we can end up without
the ability to post a new command.
In addition, if driver will not consume the EQE of the lost interrupt and
rearm the EQ, no new interrupts will arrive for other commands.
Add a resiliency mechanism for manually polling the command EQ in case of
a command timeout. In case resiliency mechanism will find non-handled EQE,
it will consume it, and the command interface will be fully functional
again. Once the resiliency flow finished, wait another 5 seconds for the
command interface to complete for this command entry.
Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow.
Add an async EQ spinlock to avoid races between resiliency flows and real
interrupts that might run simultaneously.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Eran Ben Elisha [Tue, 4 Aug 2020 07:40:21 +0000 (10:40 +0300)]
net/mlx5: Avoid possible free of command entry while timeout comp handler
Upon command completion timeout, driver simulates a forced command
completion. In a rare case where real interrupt for that command arrives
simultaneously, it might release the command entry while the forced
handler might still access it.
Fix that by adding an entry refcount, to track current amount of allowed
handlers. Command entry to be released only when this refcount is
decremented to zero.
Command refcount is always initialized to one. For callback commands,
command completion handler is the symmetric flow to decrement it. For
non-callback commands, it is wait_func().
Before ringing the doorbell, increment the refcount for the real completion
handler. Once the real completion handler is called, it will decrement it.
For callback commands, once the delayed work is scheduled, increment the
refcount. Upon callback command completion handler, we will try to cancel
the timeout callback. In case of success, we need to decrement the callback
refcount as it will never run.
In addition, gather the entry index free and the entry free into a one
flow for all command types release.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Eran Ben Elisha [Thu, 13 Aug 2020 13:55:20 +0000 (16:55 +0300)]
net/mlx5: Fix a race when moving command interface to polling mode
As part of driver unload, it destroys the commands EQ (via FW command).
As the commands EQ is destroyed, FW will not generate EQEs for any command
that driver sends afterwards. Driver should poll for later commands status.
Driver commands mode metadata is updated before the commands EQ is
actually destroyed. This can lead for double completion handle by the
driver (polling and interrupt), if a command is executed and completed by
FW after the mode was changed, but before the EQ was destroyed.
Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only DESTROY_EQ command can be executed during this time period.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Linus Torvalds [Fri, 2 Oct 2020 17:37:08 +0000 (10:37 -0700)]
Merge branch 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull epoll fixes from Al Viro:
"Several race fixes in epoll"
* 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ep_create_wakeup_source(): dentry name can change under you...
epoll: EPOLL_CTL_ADD: close the race in decision to take fast path
epoll: replace ->visited/visited_list with generation count
epoll: do not insert into poll queues until all sanity checks are done
Linus Torvalds [Fri, 2 Oct 2020 17:13:05 +0000 (10:13 -0700)]
Merge tag 'riscv-for-linus-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"Two fixes for this week:
- The addition of a symbol export for clint_time_val, which has been
inlined into some timex functions and can be used by drivers.
- A fix to avoid calling get_cycles() before the timers have been
probed.
These both only effect !MMU systems"
* tag 'riscv-for-linus-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Check clint_time_val before use
clocksource: clint: Export clint_time_val for modules
Linus Torvalds [Fri, 2 Oct 2020 17:09:40 +0000 (10:09 -0700)]
Merge tag 'for-5.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Two more fixes.
One is for a lockdep warning/lockup (also caught by syzbot), that one
has been seen in practice. Regarding the other syzbot reports
mentioned last time, they don't seem to be urgent and reliably
reproducible so they'll be fixed later.
The second fix is for a potential corruption when device replace
finishes and the in-memory state of trim is not copied to the new
device"
* tag 'for-5.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix filesystem corruption after a device replace
btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks
btrfs: move btrfs_scratch_superblocks into btrfs_dev_replace_finishing
Linus Torvalds [Fri, 2 Oct 2020 17:05:56 +0000 (10:05 -0700)]
Merge tag 'pm-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix one more issue related to the recent RCU-lockdep changes, a
typo in documentation and add a missing return statement to
intel_pstate.
Specifics:
- Fix up RCU usage for cpuidle on the ARM imx6q platform (Ulf
Hansson)
- Fix typo in the PM documentation (Yoann Congal)
- Add return statement that is missing after recent changes in the
intel_pstate driver (Zhang Rui)"
* tag 'pm-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ARM: imx6q: Fixup RCU usage for cpuidle
Documentation: PM: Fix a reStructuredText syntax error
cpufreq: intel_pstate: Fix missing return statement
Linus Torvalds [Fri, 2 Oct 2020 17:01:00 +0000 (10:01 -0700)]
Merge tag 'staging-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO fixes from Greg KH:
"Here are two small IIO driver fixes for 5.9-rc8 that resolve some
reported issues:
- driver name fixed in one driver
- device name typo fixed
Both have been in linux-next for a while with no reported problems"
* tag 'staging-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: qcom-spmi-adc5: fix driver name
iio: adc: ad7124: Fix typo in device name
Linus Torvalds [Fri, 2 Oct 2020 16:51:42 +0000 (09:51 -0700)]
Merge tag 'gpio-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Some late GPIO fixes for the v5.9 series:
- Fix compiler warnings on the OMAP when PM is disabled
- Clear the interrupt when setting edge sensitivity on the Spreadtrum
driver.
- Fix up spurious interrupts on the TC35894.
- Support threaded interrupts on the Siox controller.
- Fix resource leaks on the mockup driver.
- Fix line event handling in syscall compatible mode for the
character device.
- Fix an unitialized variable in the PCA953A driver.
- Fix access to all GPIO IRQs on the Aspeed AST2600.
- Fix line direction on the AMD FCH driver.
- Use the bitmap API instead of compiler intrinsics for bit
manipulation in the PCA953x driver"
* tag 'gpio-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957x
gpio: pca953x: Use bitmap API over implicit GCC extension
gpio: amd-fch: correct logic of GPIO_LINE_DIRECTION
gpio: aspeed: fix ast2600 bank properties
gpio/aspeed-sgpio: don't enable all interrupts by default
gpio/aspeed-sgpio: enable access to all 80 input & output sgpios
gpio: pca953x: Fix uninitialized pending variable
gpiolib: Fix line event handling in syscall compatible mode
gpio: mockup: fix resource leak in error path
gpio: siox: explicitly support only threaded irqs
gpio: tc35894: fix up tc35894 interrupt configuration
gpio: sprd: Clear interrupt when setting the type as edge
gpio: omap: Fix warnings if PM is disabled
Linus Torvalds [Fri, 2 Oct 2020 16:40:09 +0000 (09:40 -0700)]
Merge tag 'mmc-v5.9-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- Fix deadlock when removing MEMSTICK host
- Workaround broken CMDQ on Intel GLK based IRBIS models
* tag 'mmc-v5.9-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models
memstick: Skip allocating card when removing host
random32: Restore __latent_entropy attribute on net_rand_state
Commit f227e3ec3b5c ("random32: update the net random state on interrupt
and activity") broke compilation and was temporarily fixed by Linus in 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy
gcc plugin") by entirely moving net_rand_state out of the things handled
by the latent_entropy GCC plugin.
From what I understand when reading the plugin code, using the
__latent_entropy attribute on a declaration was the wrong part and
simply keeping the __latent_entropy attribute on the variable definition
was the correct fix.
Roman Gushchin [Thu, 1 Oct 2020 20:07:49 +0000 (13:07 -0700)]
mm: memcg/slab: fix slab statistics in !SMP configuration
Since commit ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat
items") the write side of slab counters accepts a value in bytes and
converts it to pages. It happens in __mod_node_page_state().
However a non-SMP version of __mod_node_page_state() doesn't perform
this conversion. It leads to incorrect (unrealistically high) slab
counters values. Fix this by adding a similar conversion to the non-SMP
version of __mod_node_page_state().
Hans de Goede [Fri, 25 Sep 2020 06:58:12 +0000 (08:58 +0200)]
MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
Darren Hart and Andy Shevchenko lately have not had enough time to
maintain the x86 platform drivers, dropping their status to:
"Odd Fixes".
Mark Gross and Hans de Goede will take over maintainership of
the x86 platform drivers. Replace Darren and Andy's entries with
theirs and change the status to "Maintained".
Hans de Goede [Wed, 30 Sep 2020 13:19:05 +0000 (15:19 +0200)]
platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
2 recent commits: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE
on the 9 / "Laptop" chasis-type") 1fac39fd0316 ("platform/x86: intel-vbtn: Also handle tablet-mode switch on
"Detachable" and "Portable" chassis-types")
Enabled reporting of SW_TABLET_MODE on more devices since the vbtn ACPI
interface is used by the firmware on some of those devices to report this.
Testing has shown that unconditionally enabling SW_TABLET_MODE reporting
on all devices with a chassis type of 8 ("Portable") or 10 ("Notebook")
which support the VGBS method is a very bad idea.
Many of these devices are normal laptops (non 2-in-1) models with a VGBS
which always returns 0, which we translate to SW_TABLET_MODE=1. This in
turn causes userspace (libinput) to suppress events from the builtin
keyboard and touchpad, making the laptop essentially unusable.
Since the problem of wrongly reporting SW_TABLET_MODE=1 in combination
with libinput, leads to a non-usable system. Where as OTOH many people will
not even notice when SW_TABLET_MODE is not being reported, this commit
changes intel_vbtn_has_switches() to use a DMI based allow-list.
The new DMI based allow-list matches on the 31 ("Convertible") and
32 ("Detachable") chassis-types, as these clearly are 2-in-1s and
so far if they support the intel-vbtn ACPI interface they all have
properly working SW_TABLET_MODE reporting.
Besides these 2 generic matches, it also contains model specific matches
for 2-in-1 models which use a different chassis-type and which are known
to have properly working SW_TABLET_MODE reporting.
This has been tested on the following 2-in-1 devices:
Dell Venue 11 Pro 7130 vPro
HP Pavilion X2 10-p002nd
HP Stream x360 Convertible PC 11
Medion E1239T
Linus Torvalds [Fri, 2 Oct 2020 02:14:36 +0000 (19:14 -0700)]
pipe: remove pipe_wait() and fix wakeup race with splice
The pipe splice code still used the old model of waiting for pipe IO by
using a non-specific "pipe_wait()" that waited for any pipe event to
happen, which depended on all pipe IO being entirely serialized by the
pipe lock. So by checking the state you were waiting for, and then
adding yourself to the wait queue before dropping the lock, you were
guaranteed to see all the wakeups.
Strictly speaking, the actual wakeups were not done under the lock, but
the pipe_wait() model still worked, because since the waiter held the
lock when checking whether it should sleep, it would always see the
current state, and the wakeup was always done after updating the state.
However, commit 0ddad21d3e99 ("pipe: use exclusive waits when reading or
writing") split the single wait-queue into two, and in the process also
made the "wait for event" code wait for _two_ wait queues, and that then
showed a race with the wakers that were not serialized by the pipe lock.
It's only splice that used that "pipe_wait()" model, so the problem
wasn't obvious, but Josef Bacik reports:
"I hit a hang with fstest btrfs/187, which does a btrfs send into
/dev/null. This works by creating a pipe, the write side is given to
the kernel to write into, and the read side is handed to a thread that
splices into a file, in this case /dev/null.
The box that was hung had the write side stuck here [pipe_write] and
the read side stuck here [splice_from_pipe_next -> pipe_wait].
[ more details about pipe_wait() scenario ]
The problem is we're doing the prepare_to_wait, which sets our state
each time, however we can be woken up either with reads or writes. In
the case above we race with the WRITER waking us up, and re-set our
state to INTERRUPTIBLE, and thus never break out of schedule"
Josef had a patch that avoided the issue in pipe_wait() by just making
it set the state only once, but the deeper problem is that pipe_wait()
depends on a level of synchonization by the pipe mutex that it really
shouldn't. And the whole "wait for any pipe state change" model really
isn't very good to begin with.
So rather than trying to work around things in pipe_wait(), remove that
legacy model of "wait for arbitrary pipe event" entirely, and actually
create functions that wait for the pipe actually being readable or
writable, and can do so without depending on the pipe lock serializing
everything.
Linus Torvalds [Thu, 1 Oct 2020 19:59:36 +0000 (12:59 -0700)]
Merge tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Fix a device reference counting bug in the Exynos IOMMU driver.
- Lockdep fix for the Intel VT-d driver.
- Fix a bug in the AMD IOMMU driver which caused corruption of the IVRS
ACPI table and caused IOMMU driver initialization failures in kdump
kernels.
* tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()
iommu/amd: Fix the overwritten field in IVMD header
iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()
Heiner Kallweit [Thu, 1 Oct 2020 07:23:02 +0000 (09:23 +0200)]
r8169: fix data corruption issue on RTL8402
Petr reported that after resume from suspend RTL8402 partially
truncates incoming packets, and re-initializing register RxConfig
before the actual chip re-initialization sequence is needed to avoid
the issue.
Heiner Kallweit [Thu, 1 Oct 2020 06:44:19 +0000 (08:44 +0200)]
r8169: fix handling ether_clk
Petr reported that system freezes on r8169 driver load on a system
using ether_clk. The original change was done under the assumption
that the clock isn't needed for basic operations like chip register
access. But obviously that was wrong.
Therefore effectively revert the original change, and in addition
leave the clock active when suspending and WoL is enabled. Chip may
not be able to process incoming packets otherwise.
Linus Torvalds [Thu, 1 Oct 2020 18:49:01 +0000 (11:49 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"A previous commit to prevent AML memory opregions from accessing the
kernel memory turned out to be too restrictive. Relax the permission
check to permit the ACPI core to map kernel memory used for table
overrides"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: permit ACPI core to map kernel memory used for table overrides
Linus Torvalds [Thu, 1 Oct 2020 16:45:37 +0000 (09:45 -0700)]
Merge tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"AMD and vmwgfx fixes.
Just dequeuing these a bit early as the AMD ones are bit larger than
I'd prefer, but Alex missed last week so it's a double set of fixes.
The larger ones are just register header fixes for the new chips that
were just introduced in rc1 along with some new PCI IDs for new hw.
Otherwise it is usual fixes.
The vmwgfx fix was due to some testing I was doing and found we
weren't booting properly, vmware had the fix internally so hurried it
vmwgfx:
- fix a regression due to TTM refactor
amdgpu:
- Fix potential double free in userptr handling
- Sienna Cichlid and Navy Flounder udpates
- Add Sienna Cichlid PCI IDs
- Drop experimental flag for navi12
- Raven fixes
- Renoir fixes
- HDCP fix
- DCN3 fix for clang and older versions of gcc
- Fix a runtime pm refcount issue"
* tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: disable gfxoff temporarily for navy_flounder
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/vmwgfx: Fix error handling in get_node
drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()
drm/amdgpu/swsmu/smu12: fix force clock handling for mclk
drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
drm/amdgpu/display: fix CFLAGS setup for DCN30
drm/amd/display: fix return value check for hdcp_work
drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.
drm/amd/pm: Removed fixed clock in auto mode DPM
drm/amdgpu: remove experimental flag from navi12
drm/amdgpu: add device ID for sienna_cichlid (v2)
drm/amdgpu: use the AV1 defines for VCN 3.0
drm/amdgpu: add VCN 3.0 AV1 registers
drm/amdgpu: add the GC 10.3 VRS registers
drm/amdgpu: prevent double kfree ttm->sg
Linus Torvalds [Thu, 1 Oct 2020 16:41:02 +0000 (09:41 -0700)]
Merge tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Two tracing fixes:
- Fix temp buffer accounting that caused a WARNING for
ftrace_dump_on_opps()
- Move the recursion check in one of the function callback helpers to
the beginning of the function, as if the rcu_is_watching() gets
traced, it will cause a recursive loop that will crash the kernel"
* tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Move RCU is watching check after recursion check
tracing: Fix trace_find_next_entry() accounting of temp buffer size
This error can be triggered when config has CONFIG_NET enabled
but CONFIG_INET disabled. In this case, there is no user of
istructs inet_timewait_sock and tcp_timewait_sock and hence
vmlinux BTF types are not generated for these two structures.
To fix the problem, let us force BTF generation for these two
structures with BTF_TYPE_EMIT.
Lu Baolu [Sun, 27 Sep 2020 06:24:28 +0000 (14:24 +0800)]
iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()
Lock(&iommu->lock) without disabling irq causes lockdep warnings.
[ 12.703950] ========================================================
[ 12.703962] WARNING: possible irq lock inversion dependency detected
[ 12.703975] 5.9.0-rc6+ #659 Not tainted
[ 12.703983] --------------------------------------------------------
[ 12.703995] systemd-udevd/284 just changed the state of lock:
[ 12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at:
iommu_flush_dev_iotlb.part.57+0x2e/0x90
[ 12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past:
[ 12.704043] (&iommu->lock){+.+.}-{2:2}
[ 12.704045]
and interrupts could create inverse lock ordering between
them.
[ 12.704073]
other info that might help us debug this:
[ 12.704085] Possible interrupt unsafe locking scenario:
Since commit c330fb1ddc0a ("XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.")
Xen is using the chip_data pointer for storing IRQ specific data. When
running as a HVM domain this can result in problems for legacy IRQs, as
those might use chip_data for their own purposes.
Use a local array for this purpose in case of legacy IRQs, avoiding the
double use.
Adrian Huang [Sat, 26 Sep 2020 10:26:02 +0000 (18:26 +0800)]
iommu/amd: Fix the overwritten field in IVMD header
Commit 387caf0b759a ("iommu/amd: Treat per-device exclusion
ranges as r/w unity-mapped regions") accidentally overwrites
the 'flags' field in IVMD (struct ivmd_header) when the I/O
virtualization memory definition is associated with the
exclusion range entry. This leads to the corrupted IVMD table
(incorrect checksum). The kdump kernel reports the invalid checksum:
ACPI BIOS Warning (bug): Incorrect checksum in table [IVRS] - 0x5C, should be 0x60 (20200717/tbprint-177)
AMD-Vi: [Firmware Bug]: IVRS invalid checksum
Fix the above-mentioned issue by modifying the 'struct unity_map_entry'
member instead of the IVMD header.
Cleanup: The *exclusion_range* functions are not used anymore, so
get rid of them.
Marc Zyngier [Mon, 13 Jul 2020 14:15:14 +0000 (15:15 +0100)]
KVM: arm64: Restore missing ISB on nVHE __tlb_switch_to_guest
Commit a0e50aa3f4a8 ("KVM: arm64: Factor out stage 2 page table
data from struct kvm") dropped the ISB after __load_guest_stage2(),
only leaving the one that is required when the speculative AT
workaround is in effect.
As Andrew points it: "This alternative is 'backwards' to avoid a
double ISB as there is one in __load_guest_stage2 when the workaround
is active."
Restore the missing ISB, conditionned on the AT workaround not being
active.
Fixes: a0e50aa3f4a8 ("KVM: arm64: Factor out stage 2 page table data from struct kvm") Reported-by: Andrew Scull <[email protected]> Reported-by: Thomas Tai <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
Andy Shevchenko [Wed, 30 Sep 2020 14:20:13 +0000 (17:20 +0300)]
gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957x
When driver has been converted to the bitmap API the non-bitmap functions
started behaving differently on 32-bit BE architectures since the bytes in
two consequent unsigned longs are in different order in comparison to byte
array. Hence if the chip had had more than 32 lines the memset() call over
it would have not set up upper lines correctly.
Although it's currently a theoretical case (no supported chips of this type
has 32+ lines), it's better to provide a clean code to avoid people thinking
this is okay and potentially producing not fully working things.
Andy Shevchenko [Wed, 30 Sep 2020 14:20:12 +0000 (17:20 +0300)]
gpio: pca953x: Use bitmap API over implicit GCC extension
In IRQ handler we have to clear bitmap before use. Currently
the GCC extension has been used for that. For sake of the consistency
switch to bitmap API. As expected bloat-o-meter shows no difference
in the object size.
check mtk_is_virt_gpio input parameter,
virtual gpio need to support eint mode.
add error handler for the ko case
to fix this boot fail:
pc : mtk_is_virt_gpio+0x20/0x38 [pinctrl_mtk_common_v2]
lr : mtk_gpio_get_direction+0x44/0xb0 [pinctrl_paris]
====================
Fix bugs in Octeontx2 netdev driver
In existing Octeontx2 network drivers code has issues
like stale entries in broadcast replication list, missing
L3TYPE for IPv6 frames, running tx queues on error and
race condition in mbox reset.
This patch set fixes the above issues.
====================
Mbox implementation in octeontx2 driver has three states
alloc, send and reset in mbox response. VF allocate and
sends message to PF for processing, PF ACKs them back and
reset the mbox memory. In some case we see synchronization
issue where after msgs_acked is incremented and before
mbox_reset API is called, if current execution is scheduled
out and a different thread is scheduled in which checks for
msgs_acked. Since the new thread sees msgs_acked == msgs_sent
it will try to allocate a new message and to send a new mbox
message to PF.Now if mbox_reset is scheduled in, PF will see
'0' in msgs_send.
This patch fixes the issue by calling mbox_reset before
incrementing msgs_acked flag for last processing message and
checks for valid message size.
Fixes: d424b6c02 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Currently in otx2_open on failure of nix_lf_start
transmit queues are not stopped which are already
started in link_event. Since the tx queues are not
stopped network stack still try's to send the packets
leading to driver crash while access the device resources.
Fixes: 50fe6c02e ("octeontx2-pf: Register and handle link notifications") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2-pf: Fix TCP/UDP checksum offload for IPv6 frames
For TCP/UDP checksum offload feature in Octeontx2
expects L3TYPE to be set irrespective of IP header
checksum is being offloaded or not. Currently for
IPv6 frames L3TYPE is not being set resulting in
packet drop with checksum error. This patch fixes
this issue.
Fixes: 3ca6c4c88 ("octeontx2-pf: Add packet transmission support") Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2-af: Fix enable/disable of default NPC entries
Packet replication feature present in Octeontx2
is a hardware linked list of PF and its VF
interfaces so that broadcast packets are sent
to all interfaces present in the list. It is
driver job to add and delete a PF/VF interface
to/from the list when the interface is brought
up and down. This patch fixes the
npc_enadis_default_entries function to handle
broadcast replication properly if packet replication
feature is present.
Fixes: 40df309e4166 ("octeontx2-af: Support to enable/disable default MCAM entries") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Wed, 30 Sep 2020 22:01:09 +0000 (15:01 -0700)]
Merge branch '100GbE' of https://github.com/anguy11/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2020-09-30
This series contains updates to ice driver only.
Jake increases the wait time for firmware response as it can take longer
than the current wait time. Preserves the NVM capabilities of the device in
safe mode so the device reports its NVM update capabilities properly
when in this state.
arm64: permit ACPI core to map kernel memory used for table overrides
Jonathan reports that the strict policy for memory mapped by the
ACPI core breaks the use case of passing ACPI table overrides via
initramfs. This is due to the fact that the memory type used for
loading the initramfs in memory is not recognized as a memory type
that is typically used by firmware to pass firmware tables.
Since the purpose of the strict policy is to ensure that no AML or
other ACPI code can manipulate any memory that is used by the kernel
to keep its internal state or the state of user tasks, we can relax
the permission check, and allow mappings of memory that is reserved
and marked as NOMAP via memblock, and therefore not covered by the
linear mapping to begin with.
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Another batch of clk driver fixes:
- Make sure DRAM and ChipID region doesn't get disabled on Exynos
- Fix a SATA failure on Tegra
- Fix the emac_ptp clk divider on stratix10"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk
clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED
clk: tegra: Fix missing prototype for tegra210_clk_register_emc()
clk: tegra: Always program PLL_E when enabled
clk: tegra: Capitalization fixes
clk: samsung: Keep top BPLL mux on Exynos542x enabled
The NoMMU kernel is broken for QEMU virt machine from Linux-5.9-rc6
because clint_time_val is used even before CLINT driver is probed
at following places:
1. rand_initialize() calls get_cycles() which in-turn uses
clint_time_val
2. boot_init_stack_canary() calls get_cycles() which in-turn
uses clint_time_val
The issue#1 (above) is fixed by providing custom random_get_entropy()
for RISC-V NoMMU kernel. For issue#2 (above), we remove dependency of
boot_init_stack_canary() on get_cycles() and this is aligned with the
boot_init_stack_canary() implementations of ARM, ARM64 and MIPS kernel.
Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation for M-mode systems") Signed-off-by: Anup Patel <[email protected]> Tested-by: Damien Le Moal <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
Filipe Manana [Wed, 23 Sep 2020 14:30:16 +0000 (15:30 +0100)]
btrfs: fix filesystem corruption after a device replace
We use a device's allocation state tree to track ranges in a device used
for allocated chunks, and we set ranges in this tree when allocating a new
chunk. However after a device replace operation, we were not setting the
allocated ranges in the new device's allocation state tree, so that tree
is empty after a device replace.
This means that a fitrim operation after a device replace will trim the
device ranges that have allocated chunks and extents, as we trim every
range for which there is not a range marked in the device's allocation
state tree. It is also important during chunk allocation, since the
device's allocation state is used to determine if a range is already
allocated when allocating a new chunk.
This is trivial to reproduce and the following script triggers the bug:
$ cat reproducer.sh
#!/bin/bash
DEV1="/dev/sdg"
DEV2="/dev/sdh"
DEV3="/dev/sdi"
wipefs -a $DEV1 $DEV2 $DEV3 &> /dev/null
# Create a raid1 test fs on 2 devices.
mkfs.btrfs -f -m raid1 -d raid1 $DEV1 $DEV2 > /dev/null
mount $DEV1 /mnt/btrfs
echo "Starting to replace $DEV1 with $DEV3"
btrfs replace start -B $DEV1 $DEV3 /mnt/btrfs
echo
echo "Running fstrim"
fstrim /mnt/btrfs
echo
echo "Unmounting filesystem"
umount /mnt/btrfs
echo "Mounting filesystem in degraded mode using $DEV3 only"
wipefs -a $DEV1 $DEV2 &> /dev/null
mount -o degraded $DEV3 /mnt/btrfs
if [ $? -ne 0 ]; then
dmesg | tail
echo
echo "Failed to mount in degraded mode"
exit 1
fi
echo
echo "File foo data (expected all bytes = 0xab):"
od -A d -t x1 /mnt/btrfs/foo
umount /mnt/btrfs
When running the reproducer:
$ ./replace-test.sh
wrote 10485760/10485760 bytes at offset 0
10 MiB, 2560 ops; 0.0901 sec (110.877 MiB/sec and 28384.5216 ops/sec)
Starting to replace /dev/sdg with /dev/sdi
Running fstrim
Unmounting filesystem
Mounting filesystem in degraded mode using /dev/sdi only
mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/sdi, missing codepage or helper program, or other error.
[19581.748641] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi started
[19581.803842] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi finished
[19582.208293] BTRFS info (device sdi): allowing degraded mounts
[19582.208298] BTRFS info (device sdi): disk space caching is enabled
[19582.208301] BTRFS info (device sdi): has skinny extents
[19582.212853] BTRFS warning (device sdi): devid 2 uuid 1f731f47-e1bb-4f00-bfbb-9e5a0cb4ba9f is missing
[19582.213904] btree_readpage_end_io_hook: 25839 callbacks suppressed
[19582.213907] BTRFS error (device sdi): bad tree block start, want 30490624 have 0
[19582.214780] BTRFS warning (device sdi): failed to read root (objectid=7): -5
[19582.231576] BTRFS error (device sdi): open_ctree failed
Failed to mount in degraded mode
So fix by setting all allocated ranges in the replace target device when
the replace operation is finishing, when we are holding the chunk mutex
and we can not race with new chunk allocations.
Josef Bacik [Thu, 20 Aug 2020 15:18:27 +0000 (11:18 -0400)]
btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks
When closing and freeing the source device we could end up doing our
final blkdev_put() on the bdev, which will grab the bd_mutex. As such
we want to be holding as few locks as possible, so move this call
outside of the dev_replace->lock_finishing_cancel_unmount lock. Since
we're modifying the fs_devices we need to make sure we're holding the
uuid_mutex here, so take that as well.
There's a report from syzbot probably hitting one of the cases where
the bd_mutex and device_list_mutex are taken in the wrong order, however
it's not with device replace, like this patch fixes. As there's no
reproducer available so far, we can't verify the fix.
https://lore.kernel.org/lkml/000000000000fc04d105afcf86d7@google.com/ link: https://syzkaller.appspot.com/bug?extid=84a0634dc5d21d488419
WARNING: possible circular locking dependency detected
5.9.0-rc5-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.0/6878 is trying to acquire lock: ffff88804c17d780 (&bdev->bd_mutex){+.+.}-{3:3}, at: blkdev_put+0x30/0x520 fs/block_dev.c:1804
but task is already holding lock: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
The commit eb1f00237aca ("lockdep,trace: Expose tracepoints"), started to
expose us for tracepoints. For imx6q cpuidle, this leads to an RCU splat
according to below.
[6.870684] [<c0db7690>] (_raw_spin_lock) from [<c011f6a4>] (imx6q_enter_wait+0x18/0x9c)
[6.878846] [<c011f6a4>] (imx6q_enter_wait) from [<c09abfb0>] (cpuidle_enter_state+0x168/0x5e4)
To fix the problem, let's assign the corresponding idlestate->flags the
CPUIDLE_FLAG_RCU_IDLE bit, which enables us to call rcu_idle_enter|exit()
at the proper point.
Documentation: PM: Fix a reStructuredText syntax error
Fix a reStructuredText syntax error in the cpuidle PM admin-guide
documentation: the ``...'' quotation marks are parsed as partial ''...''
reStructuredText markup and break the output formatting.
Jacob Keller [Wed, 2 Sep 2020 15:53:42 +0000 (08:53 -0700)]
ice: preserve NVM capabilities in safe mode
If the driver initializes in safe mode, it will call
ice_set_safe_mode_caps. This results in clearing the capabilities
structures, in order to set them up for operating in safe mode, ensuring
many features are disabled.
This has a side effect of also clearing the capability bits that relate
to NVM update. The result is that the device driver will not indicate
support for unified update, even if the firmware is capable.
Fix this by adding the relevant capability fields to the list of values
we preserve. To simplify the code, use a common_cap structure instead of
a handful of local variables. To reduce some duplication of the
capability name, introduce a couple of macros used to restore the
capabilities values from the cached copy.
Fixes: de9b277ee032 ("ice: Add support for unified NVM update flow capability") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Brijesh Behera <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
Jacob Keller [Wed, 26 Aug 2020 18:30:34 +0000 (11:30 -0700)]
ice: increase maximum wait time for flash write commands
The ice driver needs to wait for a firmware response to each command to
write a block of data to the scratch area used to update the device
firmware. The driver currently waits for up to 1 second for this to be
returned.
It turns out that firmware might take longer than 1 second to return
a completion in some cases. If this happens, the flash update will fail
to complete.
Fix this by increasing the maximum time that the driver will wait for
both writing a block of data, and for activating the new NVM bank. The
timeout for an erase command is already several minutes, as the firmware
had to erase the entire bank which was already expected to take a minute
or more in the worst case.
In the case where firmware really won't respond, we will now take longer
to fail. However, this ensures that if the firmware is simply slow to
respond, the flash update can still complete. This new maximum timeout
should not adversely increase the update time, as the implementation for
wait_event_interruptible_timeout, and should wake very soon after we get
a completion event. It is better for a flash update be slow but still
succeed than to fail because we gave up too quickly.
Fixes: d69ea414c9b4 ("ice: implement device flash update via devlink") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Brijesh Behera <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
The fbdev driver is used by Android's FVP configuration. Using the
DRM driver together with DRM's fbdev emulation results in a failure
to boot Android. The root cause is that Android's generic fbdev
userspace driver relies on the ability to set the pixel format via
FBIOPUT_VSCREENINFO, which is not supported by fbdev emulation.
There have been other less critical behavioral differences identified
between the fbdev driver and the DRM driver with fbdev emulation. The
DRM driver exposes different values for the panel's width, height and
refresh rate, and the DRM driver fails a FBIOPUT_VSCREENINFO syscall
with yres_virtual greater than the maximum supported value instead
of letting the syscall succeed and setting yres_virtual based on yres.
Evan Quan [Wed, 30 Sep 2020 04:07:56 +0000 (12:07 +0800)]
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
As the dpm clock table is needed during DC HW initialization.
And that (DC HW initialization) comes before smu_late_init()
where current APU dpm clock table setup is performed. So, NULL
pointer dereference will be triggered. By moving APU dpm clock
table setup to smu_hw_init(), this can be avoided.
clocksource: clint: Export clint_time_val for modules
clint_time_val will soon be used by the RISC-V implementation of
random_get_entropy(), which is a static inline function that may be used by
modules (at least CRYPTO_JITTERENTROPY=m).
ttm_mem_type_manager_func.get_node was changed to return -ENOSPC
instead of setting the node pointer to NULL. Unfortunately
vmwgfx still had two places where it was explicitly converting
-ENOSPC to 0 causing regressions. This fixes those spots by
allowing -ENOSPC to be returned. That seems to fix recent
regressions with vmwgfx.
Mark Mielke [Mon, 28 Sep 2020 04:33:29 +0000 (00:33 -0400)]
scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()
The kernel may fail to boot or devices may fail to come up when
initializing iscsi_tcp devices starting with Linux 5.8.
Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param
libiscsi function") introduced getpeername() within the session spinlock.
Commit 1b66d253610c ("bpf: Add get{peer, sock}name attach types for
sock_addr") introduced BPF_CGROUP_RUN_SA_PROG_LOCK() within getpeername(),
which acquires a mutex and when used from iscsi_tcp devices can now lead to
"BUG: scheduling while atomic:" and subsequent damage.
Ensure that the spinlock is released before calling getpeername() or
getsockname(). sock_hold() and sock_put() are used to ensure that the
socket reference is preserved until after the getpeername() or
getsockname() complete.
David S. Miller [Wed, 30 Sep 2020 01:16:09 +0000 (18:16 -0700)]
Merge branch 'mptcp-Fix-for-32-bit-DATA_FIN'
Mat Martineau says:
====================
mptcp: Fix for 32-bit DATA_FIN
The main fix is contained in patch 2, and that commit message explains
the issue with not properly converting truncated DATA_FIN sequence
numbers sent by the peer.
With patch 2 adding an unlocked read of msk->ack_seq, patch 1 cleans up
access to that data with READ_ONCE/WRITE_ONCE.
This does introduce two merge conflicts with net-next, but both have
straightforward resolution. Patch 1 modifies a line that got removed in
net-next so the modification can be dropped when merging. Patch 2 will
require a trivial conflict resolution for a modified function
declaration.
====================
Mat Martineau [Tue, 29 Sep 2020 22:08:20 +0000 (15:08 -0700)]
mptcp: Handle incoming 32-bit DATA_FIN values
The peer may send a DATA_FIN mapping with either a 32-bit or 64-bit
sequence number. When a 32-bit sequence number is received for the
DATA_FIN, it must be expanded to 64 bits before comparing it to the
last acked sequence number. This expansion was missing.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/93 Fixes: 3721b9b64676 ("mptcp: Track received DATA_FIN sequence number and add related helpers") Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Merge tag 'devicetree-fixes-for-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix handling of HOST_EXTRACFLAGS for dtc
- Several warning fixes for DT bindings
* tag 'devicetree-fixes-for-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting
dt-bindings: Fix 'reg' size issues in zynqmp examples
ARM: dts: bcm2835: Change firmware compatible from simple-bus to simple-mfd
dt-bindings: leds: cznic,turris-omnia-leds: fix error in binding
dt-bindings: crypto: sa2ul: fix a DT binding check warning
autofs: use __kernel_write() for the autofs pipe writing
autofs got broken in some configurations by commit 13c164b1a186
("autofs: switch to kernel_write") because there is now an extra LSM
permission check done by security_file_permission() in rw_verify_area().
autofs is one if the few places that really does want the much more
limited __kernel_write(), because the write is an internal kernel one
that shouldn't do any user permission checks (it also doesn't need the
file_start_write/file_end_write logic, since it's just a pipe).
There are a couple of other cases like that - accounting, core dumping,
and splice - but autofs stands out because it can be built as a module.
As a result, we need to export this internal __kernel_write() function
again.
We really don't want any other module to use this, but we don't have a
"EXPORT_SYMBOL_FOR_AUTOFS_ONLY()". But we can mark it GPL-only to at
least approximate that "internal use only" for licensing.
While in this area, make autofs pass in NULL for the file position
pointer, since it's always a pipe, and we now use a NULL file pointer
for streaming file descriptors (see file_ppos() and commit 438ab720c675:
"vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files")
This effectively reverts commits 9db977522449 ("fs: unexport
__kernel_write") and 13c164b1a186 ("autofs: switch to kernel_write").
====================
via-rhine: Resume fix and other maintenance work
I use via-rhine based Ethernet regularly, and the Ethernet dying
after resume was really annoying me. I decided to take the
matter into my own hands, and came up with a fix for the Ethernet
disappearing after resume. I will also want to take over the code
maintenance work for via-rhine. The patches apply to the latest
code, but they should be backported to older kernels as well.
====================
Kevin Brace [Tue, 29 Sep 2020 20:09:41 +0000 (13:09 -0700)]
via-rhine: VTunknown1 device is really VT8251 South Bridge
The VIA Technologies VT8251 South Bridge's integrated Rhine-II
Ethernet MAC comes has a PCI revision value of 0x7c. This was
verified on ASUS P5V800-VM mainboard.
Kevin Brace [Tue, 29 Sep 2020 20:09:40 +0000 (13:09 -0700)]
via-rhine: Fix for the hardware having a reset failure after resume
In rhine_resume() and rhine_suspend(), the code calls netif_running()
to see if the network interface is down or not. If it is down (i.e.,
netif_running() returning false), they will skip any housekeeping work
within the function relating to the hardware. This becomes a problem
when the hardware resumes from a standby since it is counting on
rhine_resume() to map its MMIO and power up rest of the hardware.
Not getting its MMIO remapped and rest of the hardware powered
up lead to a soft reset failure and hardware disappearance. The
solution is to map its MMIO and power up rest of the hardware inside
rhine_open() before soft reset is to be performed. This solution was
verified on ASUS P5V800-VM mainboard's integrated Rhine-II Ethernet
MAC inside VIA Technologies VT8251 South Bridge.
drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()
Commit 78fe9f63947a2b ("drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions")
added a call to rn_vbios_smu_get_smu_version() to set clk_mgr->smu_ver.
That field is initialized prior to the if-statement, already.
Jean Delvare [Mon, 28 Sep 2020 09:10:37 +0000 (11:10 +0200)]
drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.
Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.
scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting
When building with
$ HOST_EXTRACFLAGS=-g make
the expectation is that host tools are built with debug informations.
This however doesn't happen if the Makefile assigns a new value to the
HOST_EXTRACFLAGS instead of appending to it. So use += instead of := for
the first assignment.
Fixes: e3fd9b5384f3 ("scripts/dtc: consolidate include path options in Makefile") Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Rob Herring <[email protected]>
Rob Herring [Mon, 28 Sep 2020 15:22:56 +0000 (10:22 -0500)]
dt-bindings: Fix 'reg' size issues in zynqmp examples
The default sizes in examples for 'reg' are 1 cell each. Fix the
incorrect sizes in zynqmp examples:
Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.example.dt.yaml: example-0: dma-controller@fd4c0000:reg:0: [0, 4249616384, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:0: [0, 4249485312, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:1: [0, 4249526272, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:2: [0, 4249530368, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml: example-0: display@fd4a0000:reg:3: [0, 4249534464, 0, 4096] is too long
From schema: /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
Vladimir Oltean [Tue, 29 Sep 2020 11:20:24 +0000 (14:20 +0300)]
net: dsa: felix: fix incorrect action offsets for VCAP IS2
The port mask width was larger than the actual number of ports, and
therefore, all fields following this one were also shifted by the number
of excess bits. But the driver doesn't use the REW_OP, SMAC_REPLACE_ENA
or ACL_ID bits from the action vector, so the bug was inconsequential.
There are two chip pins named TXDLY and RXDLY which actually adds the 2ns
delays to TXC and RXC for TXD/RXD latching. These two pins can config via
4.7k-ohm resistor to 3.3V hw setting, but also config via software setting
(extension page 0xa4 register 0x1c bit13 12 and 11).
This table describes how to config these hw pins via external pull-high or pull-
low resistor.
It is a misunderstanding that mapping it as register bits below:
8:6 = PHY Address
5:4 = Auto-Negotiation
3 = Interface Mode Select
2 = RX Delay
1 = TX Delay
0 = SELRGV
So I removed these descriptions above and add related settings as below:
14 = reserved
13 = force Tx RX Delay controlled by bit12 bit11
12 = Tx Delay
11 = Rx Delay
10:0 = Test && debug settings reserved by realtek
Test && debug settings are not recommend to modify by default.
Fixes: f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx delays config") Signed-off-by: Willy Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
virtio-net: don't disable guest csum when disable LRO
Open vSwitch and Linux bridge will disable LRO of the interface
when this interface added to them. Now when disable the LRO, the
virtio-net csum is disable too. That drops the forwarding performance.
Merge tag 'dmaengine-fix-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fix from Vinod Koul:
"Fix dmatest for misconfigured channel"
* tag 'dmaengine-fix-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: dmatest: Prevent to run on misconfigured channel
ftrace: Move RCU is watching check after recursion check
The first thing that the ftrace function callback helper functions should do
is to check for recursion. Peter Zijlstra found that when
"rcu_is_watching()" had its notrace removed, it caused perf function tracing
to crash. This is because the call of rcu_is_watching() is tested before
function recursion is checked and and if it is traced, it will cause an
infinite recursion loop.
rcu_is_watching() should still stay notrace, but to prevent this should
never had crashed in the first place. The recursion prevention must be the
first thing done in callback functions.
tracing: Fix trace_find_next_entry() accounting of temp buffer size
The temp buffer size variable for trace_find_next_entry() was incorrectly
being updated when the size did not change. The temp buffer size should only
be updated when it is reallocated.
This is mostly an issue when used with ftrace_dump(). That's because
ftrace_dump() can not allocate a new buffer, and instead uses a temporary
buffer with a fix size. But the variable that keeps track of that size is
incorrectly updated with each call, and it could fall into the path that
would try to reallocate the buffer and produce a warning.
Not only fix the updating of the temp buffer, but also do not free the temp
buffer before a new buffer is allocated (there's no reason to not continue
to use the current temp buffer if an allocation fails).
Cc: [email protected] Fixes: 8e99cf91b99bb ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic") Reported-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
He Zhe [Mon, 28 Sep 2020 09:00:23 +0000 (17:00 +0800)]
bpf, powerpc: Fix misuse of fallthrough in bpf_jit_comp()
The user defined label following "fallthrough" is not considered by GCC
and causes build failure.
kernel-source/include/linux/compiler_attributes.h:208:41: error: attribute
'fallthrough' not preceding a case label or default label [-Werror]
208 define fallthrough _attribute((fallthrough_))
^~~~~~~~~~~~~