Git Repo - linux.git/log

net: dsa: bcm_sf2: Fix array overrun in bcm_sf2_num_active_ports()

After d12e1c464988 ("net: dsa: b53: Set correct number of ports in the
DSA struct") we stopped setting dsa_switch::num_ports to DSA_MAX_PORTS,
which created an off by one error between the statically allocated
bcm_sf2_priv::port_sts array (of size DSA_MAX_PORTS). When
dsa_is_cpu_port() is used, we end-up accessing an out of bounds member
and causing a NPD.

Fix this by iterating with the appropriate port count using
ds->num_ports.

Fixes: d12e1c464988 ("net: dsa: b53: Set correct number of ports in the DSA struct")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: update NXP copyright text

NXP Legal insists that the following are not fine:

- Saying "NXP Semiconductors" instead of "NXP", since the company's
registered name is "NXP"

- Putting a "(c)" sign in the copyright string

- Putting a comma in the copyright string

The only accepted copyright string format is "Copyright <year-range> NXP".

This patch changes the copyright headers in the networking files that
were sent by me, or derived from code sent by me.

Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

regulator: max14577: Revert "regulator: max14577: Add proper module aliases strings"

This reverts commit 0da6736ecd10b45e535b100acd58df2db4c099d8.

The MODULE_DEVICE_TABLE already creates proper alias.  Having another
MODULE_ALIAS causes the alias to be duplicated:

  $ modinfo max14577-regulator.ko

  alias:          platform:max77836-regulator
  alias:          platform:max14577-regulator
  description:    Maxim 14577/77836 regulator driver
  alias:          platform:max77836-regulator
  alias:          platform:max14577-regulator

Cc: Marek Szyprowski <[email protected]>
Fixes: 0da6736ecd10 ("regulator: max14577: Add proper module aliases strings")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

mm: Fully initialize invalidate_lock, amend lock class later

The function __init_rwsem() is not part of the official API, it just a helper
function used by init_rwsem().
Changing the lock's class and name should be done by using
lockdep_set_class_and_name() after the has been fully initialized. The overhead
of the additional class struct and setting it twice is negligible and it works
across all locks.

Fully initialize the lock with init_rwsem() and then set the custom class and
name for the lock.

Fixes: 730633f0b7f95 ("mm: Protect operations adding pages to page cache with invalidate_lock")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Jan Kara <[email protected]>

net: hso: fix muxed tty registration

If resource allocation and registration fail for a muxed tty device
(e.g. if there are no more minor numbers) the driver should not try to
deregister the never-registered (or already-deregistered) tty.

Fix up the error handling to avoid dereferencing a NULL pointer when
attempting to remove the character device.

Fixes: 72dc1c096c70 ("HSO: add option hso driver")
Cc: [email protected] # 2.6.27
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dt-bindings: arm: mediatek: mmsys: update mediatek,mmsys.yaml reference

Changeset cba3c40d1f97 ("dt-bindings: arm: mediatek: mmsys: convert to YAML format")
renamed: Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt
to: Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml.

Update its cross-reference accordingly.

Fixes: cba3c40d1f97 ("dt-bindings: arm: mediatek: mmsys: convert to YAML format")
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/a87eb079a73e8ab41cdf6e40e80b1d1f868da6bd.1631785820.git.mchehab+huawei@kernel.org

dt-bindings: net: dsa: sja1105: update nxp,sja1105.yaml reference

Changeset 62568bdbe6f6 ("dt-bindings: net: dsa: sja1105: convert to YAML schema")
renamed: Documentation/devicetree/bindings/net/dsa/sja1105.txt
to: Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml.

Update its cross-reference accordingly.

Fixes: 62568bdbe6f6 ("dt-bindings: net: dsa: sja1105: convert to YAML schema")
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/994ce6c6358746ff600459822b9f6e336db933c9.1631785820.git.mchehab+huawei@kernel.org

fpga: dfl: Avoid reads to AFU CSRs during enumeration

CSR address space for Accelerator Functional Units (AFU) is not available
during the early Device Feature List (DFL) enumeration. Early access
to this space results in invalid data and port errors. This change adds
a condition to prevent an early read from the AFU CSR space.

Fixes: 1604986c3e6b ("fpga: dfl: expose feature revision from struct dfl_device")
Cc: [email protected]
Signed-off-by: Russ Weight <[email protected]>
Signed-off-by: Moritz Fischer <[email protected]>

Merge tag 'drm-fixes-2021-09-17' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Slightly busier than usual rc2, but mostly scattered amdgpu fixes,
  some i915 and etnaviv resolves an MMU/runtime PM blowup.

  amdgpu:
   - UBSAN fix
   - Powerplay table update fix
   - Fix use after free in BO moves
   - Debugfs init fixes
   - vblank workqueue fixes for headless devices
   - FPU fixes
   - sysfs_emit fixes
   - SMU updates for cyan skillfish
   - Backlight fixes when DMCU is not initialized
   - DP MST fixes
   - HDCP compliance fix
   - Link training fix
   - Runtime pm fix
   - Panel orientation fixes
   - Display GPUVM fix for yellow carp
   - Add missing license

  amdkfd:
   - Drop PCI atomics requirement if proper firmware is available
   - Suspend/resume fixes for IOMMUv2 cases

  radeon:
   - AGP fix

  i915:
   - Propagate DP link training error returns
   - Use max link params for eDP 1.3 and earlier
   - Build warning fixes
   - Gem selftest fixes
   - Ensure wakeref is held before hardware access

  etnaviv:
   - MMU context vs runtime PM fix"

* tag 'drm-fixes-2021-09-17' of git://anongit.freedesktop.org/drm/drm: (44 commits)
  drm/amdgpu/display: add a proper license to dc_link_dp.c
  drm/amd/display: Fix white screen page fault for gpuvm
  amd/display: enable panel orientation quirks
  drm/amdgpu: Demote TMZ unsupported log message from warning to info
  drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
  drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver
  drm/radeon: pass drm dev radeon_agp_head_init directly
  drm/amdgpu: move iommu_resume before ip init/resume
  drm/amdgpu: add amdgpu_amdkfd_resume_iommu
  drm/amdkfd: separate kfd_iommu_resume from kfd_resume
  drm/amd/display: Link training retry fix for abort case
  drm/amd/display: Fix unstable HPCP compliance on Chrome Barcelo
  drm/amd/display: dsc mst 2 4K displays go dark with 2 lane HBR3
  drm/amd/display: Get backlight from PWM if DMCU is not initialized
  drm/amdkfd: make needs_pcie_atomics FW-version dependent
  drm/amdgpu: add manual sclk/vddc setting support for cyan skilfish(v3)
  drm/amdgpu: add some pptable funcs for cyan skilfish(v3)
  drm/amdgpu: update SMU driver interface for cyan skilfish(v3)
  drm/amdgpu: update SMU PPSMC for cyan skilfish
  drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)
  ...

Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...

Merge tag 'amd-drm-fixes-5.15-2021-09-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.15-2021-09-16:

amdgpu:
- UBSAN fix
- Powerplay table update fix
- Fix use after free in BO moves
- Debugfs init fixes
- vblank workqueue fixes for headless devices
- FPU fixes
- sysfs_emit fixes
- SMU updates for cyan skillfish
- Backlight fixes when DMCU is not initialized
- DP MST fixes
- HDCP compliance fix
- Link training fix
- Runtime pm fix
- Panel orientation fixes
- Display GPUVM fix for yellow carp
- Add missing license

amdkfd:
- Drop PCI atomics requirement if proper firmware is available
- Suspend/resume fixes for IOMMUv2 cases

radeon:
- AGP fix

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'drm-intel-fixes-2021-09-16' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.15-rc2:
- Propagate DP link training error returns
- Use max link params for eDP 1.3 and earlier
- Build warning fixes
- Gem selftest fixes
- Ensure wakeref is held before hardware access

Signed-off-by: Dave Airlie <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

net: 6pack: Fix tx timeout and slot time

tx timeout and slot time are currently specified in units of HZ.  On
Alpha, HZ is defined as 1024.  When building alpha:allmodconfig, this
results in the following error message.

  drivers/net/hamradio/6pack.c: In function 'sixpack_open':
  drivers/net/hamradio/6pack.c:71:41: error:
   unsigned conversion from 'int' to 'unsigned char'
   changes value from '256' to '0'

In the 6PACK protocol, tx timeout is specified in units of 10 ms and
transmitted over the wire:

    https://www.linux-ax25.org/wiki/6PACK

Defining a value dependent on HZ doesn't really make sense, and
presumably comes from the (very historical) situation where HZ was
originally 100.

Note that the SIXP_SLOTTIME use explicitly is about 10ms granularity:

        mod_timer(&sp->tx_t, jiffies + ((when + 1) * HZ) / 100);

and the SIXP_TXDELAY walue is sent as a byte over the wire.

Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes

Fixes a very annoying issue where the driver view of the MMU state gets
out of sync with the actual hardware state across a runtime PM cycle,
so we end up restarting the GPU with the wrong (potentially already
freed) MMU context. Hilarity ensues.

Signed-off-by: Dave Airlie <[email protected]>
From: Lucas Stach <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/rockchip: cdn-dp-core: Make cdn_dp_core_resume __maybe_unused

With the new static annotation, the compiler warns when the functions
are actually unused:

   drivers/gpu/drm/rockchip/cdn-dp-core.c:1123:12: error: 'cdn_dp_resume' defined but not used [-Werror=unused-function]
    1123 | static int cdn_dp_resume(struct device *dev)
         |            ^~~~~~~~~~~~~

Mark them __maybe_unused to suppress that warning as well.

[ Not so 'new' static annotations any more, and I removed the part of
  the patch that added __maybe_unused to cdn_dp_suspend(), because it's
  used by the shutdown/remove code.

  So only the resume function ends up possibly unused if CONFIG_PM isn't
  set     - Linus ]

Fixes: 7c49abb4c2f8 ("drm/rockchip: cdn-dp-core: Make cdn_dp_core_suspend/resume static")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Enric Balletbo i Serra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

selftests: kvm: fix get_run_delay() ignoring fscanf() return warn

Fix get_run_delay() to check fscanf() return value to get rid of the
following warning. When fscanf() fails return MIN_RUN_DELAY_NS from
get_run_delay(). Move MIN_RUN_DELAY_NS from steal_time.c to test_util.h
so get_run_delay() and steal_time.c can use it.

lib/test_util.c: In function ‘get_run_delay’:
lib/test_util.c:316:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
316 | fscanf(fp, "%ld %ld ", &val[0], &val[1]);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests: kvm: move get_run_delay() into lib/test_util

get_run_delay() is defined static in xen_shinfo_test and steal_time test.
Move it to lib and remove code duplication.

Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn

Fix get_trans_hugepagesz() to check fscanf() return value to get rid
of the following warning:

lib/test_util.c: In function ‘get_trans_hugepagesz’:
lib/test_util.c:138:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
138 | fscanf(f, "%ld", &size);
| ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn

Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:

x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
85 | fscanf(f, "%d", &warnings);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

cpufreq: vexpress: Drop unused variable

arm:allmodconfig fails to build with the following error.

drivers/cpufreq/vexpress-spc-cpufreq.c:454:13: error:
unused variable 'cur_cluster'

Remove the unused variable.

Fixes: bb8c26d9387f ("cpufreq: vexpress: Set CPUFREQ_IS_COOLING_DEV flag")
Cc: Viresh Kumar <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

alpha: Declare virt_to_phys and virt_to_bus parameter as pointer to volatile

Some drivers pass a pointer to volatile data to virt_to_bus() and
virt_to_phys(), and that works fine.  One exception is alpha.  This
results in a number of compile errors such as

  drivers/net/wan/lmc/lmc_main.c: In function 'lmc_softreset':
  drivers/net/wan/lmc/lmc_main.c:1782:50: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type

  drivers/atm/ambassador.c: In function 'do_loader_command':
  drivers/atm/ambassador.c:1747:58: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type

Declare the parameter of virt_to_phys and virt_to_bus as pointer to
volatile to fix the problem.

Signed-off-by: Guenter Roeck <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

3com 3c515: make it compile on 64-bit architectures

This driver isn't enabled most places because of the ISA config
dependency, but alpha still has it.  And I think the 'Jensen' actually
did have an ISA slot.

However, it doesn't build cleanly, because the "Vortex bus master" code
just casts the skb->data pointer to 'int':

        outl((int) (skb->data), ioaddr + Wn7_MasterAddr);

which is all kinds of broken.  Even on a good old traditional PC/AT it
would be broken because the high bits will be random kernel address
bits, but presumably the hardware ignores those bits.  I mean, it's ISA.
We're talking 16MB dma limits. The "good old days".

Make the build happy with this kind of craziness by using the proper
isa_virt_to_bus() handling that the full bus master code uses anyway
(the Vortex bus mastering is a limited special case).

Who knows, this might even work.

Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'for-5.15/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fix from Helge Deller:
"Fix a build warning when using the PAGE0 pointer"

* tag 'for-5.15/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Use absolute_pointer() to define PAGE0

Merge tag 'm68k-for-v5.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fixes from Geert Uytterhoeven:

- Warning fixes to mitigate CONFIG_WERROR=y

* tag 'm68k-for-v5.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: mvme: Remove overdue #warnings in RTC handling
m68k: Double cast io functions to unsigned long

arm64: Mark __stack_chk_guard as __ro_after_init

__stack_chk_guard is setup once while init stage and never changed
after that.

Although the modification of this variable at runtime will usually
cause the kernel to crash (so does the attacker), it should be marked
as __ro_after_init, and it should not affect performance if it is
placed in the ro_after_init section.

Signed-off-by: Dan Li <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>

arm64/kernel: remove duplicate include in process.c

Remove all but the first include of linux/sched.h from process.c

Reported-by: Zeal Robot <[email protected]>
Signed-off-by: Lv Ruyi <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>

arm64/sve: Use correct size when reinitialising SVE state

When we need a buffer for SVE register state we call sve_alloc() to make
sure that one is there. In order to avoid repeated allocations and frees
we keep the buffer around unless we change vector length and just memset()
it to ensure a clean register state. The function that deals with this
takes the task to operate on as an argument, however in the case where we
do a memset() we initialise using the SVE state size for the current task
rather than the task passed as an argument.

This is only an issue in the case where we are setting the register state
for a task via ptrace and the task being configured has a different vector
length to the task tracing it. In the case where the buffer is larger in
the traced process we will leak old state from the traced process to
itself, in the case where the buffer is smaller in the traced process we
will overflow the buffer and corrupt memory.

Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Cc: <[email protected]> # 4.15.x
Signed-off-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>

dt-bindings: ufs: Add bindings for Samsung ufs host

This patch adds DT bindings for Samsung ufs hci

Signed-off-by: Alim Akhtar <[email protected]>
Signed-off-by: Chanho Park <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>

drm/amdgpu/display: add a proper license to dc_link_dp.c

Was missing.

Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix white screen page fault for gpuvm

[Why]
The "base_addr_is_mc_addr" field was added for dcn3.1 support but
pa_config was never updated to set it to false.

Uninitialized memory causes it to be set to true which results in
address mistranslation and white screen.

[How]
Use memset to ensure all fields are initialized to 0 by default.

Fixes: 64b1d0e8d500 ("drm/amd/display: Add DCN3.1 HWSEQ")
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Acked-by: Aaron Liu <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

amd/display: enable panel orientation quirks

This patch allows panel orientation quirks from DRM core to be
used. They attach a DRM connector property "panel orientation"
which indicates in which direction the panel has been mounted.
Some machines have the internal screen mounted with a rotation.

Since the panel orientation quirks need the native mode from the
EDID, check for it in amdgpu_dm_connector_ddc_get_modes.

Signed-off-by: Simon Ser <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Demote TMZ unsupported log message from warning to info

As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count

This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
                 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
1985 |         max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)"
Reviewed-by: Lyude Paul <[email protected]>
Signed-off-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

Current RUNPM mechanism relies on PMFW to master the timing for BACO
in/exit. And that needs cooperation from sound driver for dstate
change notification for function 1(audio). Otherwise(on sound driver
missing), BACO cannot be kicked in correctly and hang will be observed
on RUNPM exit.

By switching back to legacy message way on sound driver missing,
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Signed-off-by: Evan Quan <[email protected]>
Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/radeon: pass drm dev radeon_agp_head_init directly

Pass drm dev directly as rdev->ddev gets initialized later on
at radeon_device_init().

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214375
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amdgpu: move iommu_resume before ip init/resume

Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amdgpu: add amdgpu_amdkfd_resume_iommu

Add amdgpu_amdkfd_resume_iommu for amdgpu.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amdkfd: separate kfd_iommu_resume from kfd_resume

Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amd/display: Link training retry fix for abort case

[Why]
If link training is aborted, it shall be retried if sink is present.

[How]
Check hpd status to find out whether sink is present or not. If sink is
present, then link training shall be tried again with same settings.
Otherwise, link training shall be aborted.

Reviewed-by: Jimmy Kizito <[email protected]>
Acked-by: Mikita Lipski <[email protected]>
Signed-off-by: Meenakshikumar Somasundaram <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix unstable HPCP compliance on Chrome Barcelo

[Why]
Intermittently, there presents two occurrences of 0 stream
commits in a single HPD event. Current HDCP sequence does
not consider such scenerio, and will thus disable HDCP.

[How]
Add condition check to include stream remove and re-enable
case for HDCP enable.

Reviewed-by: Bhawanpreet Lakha <[email protected]>
Acked-by: Mikita Lipski <[email protected]>
Signed-off-by: Qingqing Zhuo <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: dsc mst 2 4K displays go dark with 2 lane HBR3

[Why]
call stack of amdgpu dsc mst pbn, slot num calculation is as below:
-compute_bpp_x16_from_target_bandwidth
-decide_dsc_target_bpp_x16
-setup_dsc_config
-dc_dsc_compute_bandwidth_range
-compute_mst_dsc_configs_for_link
-compute_mst_dsc_configs_for_state

from pbn -> dsc target bpp_x16

bpp_x16 is calulated by compute_bpp_x16_from_target_bandwidth.
Beside pixel clock and bpp, num_slices_h and bpp_increment_div
will also affect bpp_x16.

from dsc target bpp_x16 -> pbn

within dm_update_mst_vcpi_slots_for_dsc,
pbn = drm_dp_calc_pbn_mode(clock, bpp_x16, true);

drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc)
{
return DIV_ROUND_UP_ULL(mul_u32_u32(clock * (bpp / 16), 64 * 1006),
8 * 54 * 1000 * 1000);
}

bpp / 16 trunc digits after decimal point. This will cause calculation
delta. drm_dp_calc_pbn_mode does not have other informations,
like num_slices_h, bpp_increment_div. therefore, it does not do revese
calcuation properly from bpp_x16 to pbn.

pbn from drm_dp_calc_pbn_mode is less than pbn from
compute_mst_dsc_configs_for_state. This cause not enough mst slot
allocated to display. display could not visually light up.

[How]
pass pbn from compute_mst_dsc_configs_for_state to
dm_update_mst_vcpi_slots_for_dsc

Cc: [email protected]
Reviewed-by: Scott Foster <[email protected]>
Acked-by: Mikita Lipski <[email protected]>
Signed-off-by: Hersen Wu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Get backlight from PWM if DMCU is not initialized

On Carrizo/Stoney systems we set backlight through panel_cntl, i.e.
directly via the PWM registers, if DMCU is not initialized. We
always read it back through ABM registers which leads to a
mismatch and forces atomic_commit to program the backlight
each time.

Instead make sure we use the same logic for backlight readback,
i.e. read it from panel_cntl if DMCU is not initialized.

We also need to remove some extraneous and incorrect calculations
at the end of dce_get_16_bit_backlight_from_pwm.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1666
Cc: [email protected]
Reviewed-by: Josip Pavic <[email protected]>
Acked-by: Mikita Lipski <[email protected]>
Signed-off-by: Harry Wentland <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: make needs_pcie_atomics FW-version dependent

On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.

Move PCIe atomic detection from kgd2kfd_probe into kgd2kfd_device_init
because the MEC firmware is not loaded yet at the probe stage.

Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add manual sclk/vddc setting support for cyan skilfish(v3)

Add manual sclk/vddc setting supoort via pp_od_clk_voltage sysfs
to maintain consistency with other asics. As cyan skillfish doesn't
support DPM, there is only a single frequency and voltage to adjust.

v2: maintain consistency and add command guide.
v3: adjust user settings storage and coding style.

Command guide:
echo vc point sclk vddc > pp_od_clk_voltage
"vc"    - sclk voltage curve
"point" - must be 0
"sclk"  - target value of sclk(MHz), should be in safe range
"vddc"  - target value of vddc(mV), a 6.25(mV) stepping is
  recommended and should be in safe range (the real
  vddc is an approximation of target value)
echo c > pp_od_clk_voltage
"c" - commit the changes of sclk and vddc, only after
  the commit command, the target values set by "vc"
  command will take effect
echo r > pp_od_clk_voltage
"r" - reset sclk and vddc to default value, a subsequent
  commit command is needed to take effect

Example:
1) Check default sclk and vddc
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 862mV *
OD_RANGE:
SCLK:    1000Mhz       2000Mhz
VDDC:     700mV        1129mV
2) Set sclk to 1500MHz and vddc to 700mV
$ echo vc 0 1500 700 > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1500Mhz *
OD_VDDC:
0: 693mV *
OD_RANGE:
SCLK:    1000Mhz       2000Mhz
VDDC:     700mV        1129mV
3) Reset sclk and vddc to default
$ echo r > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 874mV *
OD_RANGE:
SCLK:    1000Mhz       2000Mhz
VDDC:     700mV        1129mV
NOTE:
We don't specify an explicit safe range, you can set any values
between min and max at your own risk. Enjoy!

Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add some pptable funcs for cyan skilfish(v3)

Add print_clk_levels and read_sensor pptable funcs for
cyan skilfish.

v2: keep consitency and add get_gpu_metrics callback.
v3: use sysfs_emit_at() in sysfs show function.

Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: update SMU driver interface for cyan skilfish(v3)

Add SmuMetrics_t definition for cyan skilfish.

v2: update SmuMetrics_t definition.
v3: cleanup and rearrange the order of fields.

Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: update SMU PPSMC for cyan skilfish

Add some PPSMC MSGs for cyan skilfish.

Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)

sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!

v2: use an inline function.

Warning Log:
[  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0
[  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 sysfs_emit_at+0x4a/0xa0
[  492.654805] Call Trace:
[  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
[  492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu]
[  492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
[  492.659733]  ? preempt_schedule_common+0x18/0x30
[  492.660713]  smu_print_ppclk_levels+0x65/0x90 [amdgpu]
[  492.662107]  amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu]
[  492.663620]  dev_attr_show+0x1d/0x40

Signed-off-by: Lang Yu <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: dc_assert_fp_enabled assert only if FPU is not enabled

Assert only when FPU is not enabled.

Fixes: 0ea7ee821701 ("drm/amd/display: Add DC_FP helper to check FPU state")
Signed-off-by: Anson Jacob <[email protected]>
Cc: Christian König <[email protected]>
Cc: Hersen Wu <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Rodrigo Siqueira <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Add NULL checks for vblank workqueue

[Why]
If we're running a headless config with 0 links then the vblank
workqueue will be NULL - causing a NULL pointer exception during
any commit.

[How]
Guard access to the workqueue if it's NULL and don't queue or flush
work if it is.

Reported-by: Mike Lothian <[email protected]>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1700
Fixes: 58aa1c50e5a231 ("drm/amd/display: Use vblank control events for PSR enable/disable")
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

mlxbf_gige: clear valid_polarity upon open

The network interface managed by the mlxbf_gige driver can
get into a problem state where traffic does not flow.
In this state, the interface will be up and enabled, but
will stop processing received packets.  This problem state
will happen if three specific conditions occur:
    1) driver has received more than (N * RxRingSize) packets but
       less than (N+1 * RxRingSize) packets, where N is an odd number
       Note: the command "ethtool -g <interface>" will display the
       current receive ring size, which currently defaults to 128
    2) the driver's interface was disabled via "ifconfig oob_net0 down"
       during the window described in #1.
    3) the driver's interface is re-enabled via "ifconfig oob_net0 up"

This patch ensures that the driver's "valid_polarity" field is
cleared during the open() method so that it always matches the
receive polarity used by hardware.  Without this fix, the driver
needs to be unloaded and reloaded to correct this problem state.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <[email protected]>
Signed-off-by: David Thompson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

igc: fix tunnel offloading

Checking tunnel offloading, it turns out that offloading doesn't work
as expected.  The following script allows to reproduce the issue.
Call it as `testscript DEVICE LOCALIP REMOTEIP NETMASK'

=== SNIP ===
if [ $# -ne 4 ]
then
  echo "Usage $0 DEVICE LOCALIP REMOTEIP NETMASK"
  exit 1
fi
DEVICE="$1"
LOCAL_ADDRESS="$2"
REMOTE_ADDRESS="$3"
NWMASK="$4"
echo "Driver: $(ethtool -i ${DEVICE} | awk '/^driver:/{print $2}') "
ethtool -k "${DEVICE}" | grep tx-udp
echo
echo "Set up NIC and tunnel..."
ip addr add "${LOCAL_ADDRESS}/${NWMASK}" dev "${DEVICE}"
ip link set "${DEVICE}" up
sleep 2
ip link add vxlan1 type vxlan id 42 \
   remote "${REMOTE_ADDRESS}" \
   local "${LOCAL_ADDRESS}" \
   dstport 0 \
   dev "${DEVICE}"
ip addr add fc00::1/64 dev vxlan1
ip link set vxlan1 up
sleep 2
rm -f vxlan.pcap
echo "Running tcpdump and iperf3..."
( nohup tcpdump -i any -w vxlan.pcap >/dev/null 2>&1 ) &
sleep 2
iperf3 -c fc00::2 >/dev/null
pkill tcpdump
echo
echo -n "Max. Paket Size: "
tcpdump -r vxlan.pcap -nnle 2>/dev/null \
| grep "${LOCAL_ADDRESS}.*> ${REMOTE_ADDRESS}.*OTV" \
| awk '{print $8}' | awk -F ':' '{print $1}' \
| sort -n | tail -1
echo
ip link del vxlan1
ip addr del ${LOCAL_ADDRESS}/${NWMASK} dev "${DEVICE}"
=== SNAP ===

The expected outcome is

  Max. Paket Size: 64904

This is what you see on igb, the code igc has been taken from.
However, on igc the output is

  Max. Paket Size: 1516

so the GSO aggregate packets are segmented by the kernel before calling
igc_xmit_frame.  Inside the subsequent call to igc_tso, the check for
skb_is_gso(skb) fails and the function returns prematurely.

It turns out that this occurs because the feature flags aren't set
entirely correctly in igc_probe.  In contrast to the original code
from igb_probe, igc_probe neglects to set the flags required to allow
tunnel offloading.

Setting the same flags as igb fixes the issue on igc.

Fixes: 34428dff3679 ("igc: Add GSO partial support")
Signed-off-by: Paolo Abeni <[email protected]>
Tested-by: Corinna Vinschen <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Nechama Kraus <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert

Remove the assert from the callback priv lookup function since it does
not require RTNL lock and is already protected by flow_indr_block_lock.

This will avoid warnings from being emitted to dmesg if the driver
registers its callback after an ingress qdisc was created for a
netdevice.

The warnings started after the following patch was merged:
commit 74fc4f828769 ("net: Fix offloading indirect devices dependency on qdisc order creation")

Signed-off-by: Eli Cohen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: wan: wanxl: define CROSS_COMPILE_M68K

It was used but never set. The hardcoded value from before the dawn of
time was non-standard; the usual name for cross-tools is $TRIPLET-$TOOL

Signed-off-by: Adam Borowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: nci: replace unsigned int with int

Should not use comparison of unsigned expressions < 0.

Signed-off-by: Xiang wangx <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

software node: balance refcount for managed software nodes

software_node_notify(), on KOBJ_REMOVE drops the refcount twice on managed
software nodes, thus leading to underflow errors. Balance the refcount by
bumping it in the device_create_managed_software_node() function.

The error [1] was encountered after adding a .shutdown() op to our
fsl-mc-bus driver.

[1]
pc : refcount_warn_saturate+0xf8/0x150
lr : refcount_warn_saturate+0xf8/0x150
sp : ffff80001009b920
x29: ffff80001009b920 x28: ffff1a2420318000 x27: 0000000000000000
x26: ffffccac15e7a038 x25: 0000000000000008 x24: ffffccac168e0030
x23: ffff1a2428a82000 x22: 0000000000080000 x21: ffff1a24287b5000
x20: 0000000000000001 x19: ffff1a24261f4400 x18: ffffffffffffffff
x17: 6f72645f726f7272 x16: 0000000000000000 x15: ffff80009009b607
x14: 0000000000000000 x13: ffffccac16602670 x12: 0000000000000a17
x11: 000000000000035d x10: ffffccac16602670 x9 : ffffccac16602670
x8 : 00000000ffffefff x7 : ffffccac1665a670 x6 : ffffccac1665a670
x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff1a2420318000
Call trace:
refcount_warn_saturate+0xf8/0x150
kobject_put+0x10c/0x120
software_node_notify+0xd8/0x140
device_platform_notify+0x4c/0xb4
device_del+0x188/0x424
fsl_mc_device_remove+0x2c/0x4c
rebofind sp.c__fsl_mc_device_remove+0x14/0x2c
device_for_each_child+0x5c/0xac
dprc_remove+0x9c/0xc0
fsl_mc_driver_remove+0x28/0x64
__device_release_driver+0x188/0x22c
device_release_driver+0x30/0x50
bus_remove_device+0x128/0x134
device_del+0x16c/0x424
fsl_mc_bus_remove+0x8c/0x114
fsl_mc_bus_shutdown+0x14/0x20
platform_shutdown+0x28/0x40
device_shutdown+0x15c/0x330
__do_sys_reboot+0x218/0x2a0
__arm64_sys_reboot+0x28/0x34
invoke_syscall+0x48/0x114
el0_svc_common+0x40/0xdc
do_el0_svc+0x2c/0x94
el0_svc+0x2c/0x54
el0t_64_sync_handler+0xa8/0x12c
el0t_64_sync+0x198/0x19c
---[ end trace 32eb1c71c7d86821 ]---

Fixes: 151f6ff78cdf ("software node: Provide replacement for device_add_properties()")
Reported-by: Jon Nettleton <[email protected]>
Suggested-by: Heikki Krogerus <[email protected]>
Reviewed-by: Heikki Krogerus <[email protected]>
Signed-off-by: Laurentiu Tudor <[email protected]>
Cc: 5.12+ <[email protected]> # 5.12+
[ rjw: Fix up the software_node_notify() invocation ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

EDAC/dmc520: Assign the proper type to dimm->edac_mode

dimm->edac_mode contains values of type enum edac_type - not the
corresponding capability flags. Fix that.

Fixes: 1088750d7839 ("EDAC: Add EDAC driver for DMC520")
Signed-off-by: Borislav Petkov <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

s390/bpf: Fix optimizing out zero-extensions

Currently the JIT completely removes things like `reg32 += 0`,
however, the BPF_ALU semantics requires the target register to be
zero-extended in such cases.

Fix by optimizing out only the arithmetic operation, but not the
subsequent zero-extension.

Reported-by: Johan Almbladh <[email protected]>
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant

The JIT uses agfi for subtracting constants, but -(-0x80000000) cannot
be represented as a 32-bit signed binary integer. Fix by using algfi in
this particular case.

Reported-by: Johan Almbladh <[email protected]>
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/bpf: Fix branch shortening during codegen pass

EMIT6_PCREL() macro assumes that the previous pass generated 6 bytes
of code, which is not the case if branch shortening took place. Fix by
using jit->prg, like all the other EMIT6_PCREL_*() macros.

Reported-by: Johan Almbladh <[email protected]>
Fixes: 4e9b4a6883dd ("s390/bpf: Use relative long branches")
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

drm/etnaviv: add missing MMU context put when reaping MMU mapping

When we forcefully evict a mapping from the the address space and thus the
MMU context, the MMU context is leaked, as the mapping no longer points to
it, so it doesn't get freed when the GEM object is destroyed. Add the
mssing context put to fix the leak.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: reference MMU context when setting up hardware state

Move the refcount manipulation of the MMU context to the point where the
hardware state is programmed. At that point it is also known if a previous
MMU state is still there, or the state needs to be reprogrammed with a
potentially different context.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: fix MMU context leak on GPU reset

After a reset the GPU is no longer using the MMU context and may be
restarted with a different context. While the mmu_state proeprly was
cleared, the context wasn't unreferenced, leading to a memory leak.

Cc: [email protected] # 5.4
Reported-by: Michael Walle <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: exec and MMU state is lost when resetting the GPU

When the GPU is reset both the current exec state, as well as all MMU
state is lost. Move the driver side state tracking into the reset function
to keep hardware and software state from diverging.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: keep MMU context across runtime suspend/resume

The MMU state may be kept across a runtime suspend/resume cycle, as we
avoid a full hardware reset to keep the latency of the runtime PM small.

Don't pretend that the MMU state is lost in driver state. The MMU
context is pushed out when new HW jobs with a different context are
coming in. The only exception to this is when the GPU is unbound, in
which case we need to make sure to also free the last active context.

Cc: [email protected] # 5.4
Reported-by: Michael Walle <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: stop abusing mmu_context as FE running marker

While the DMA frontend can only be active when the MMU context is set, the
reverse isn't necessarily true, as the frontend can be stopped while the
MMU state is kept. Stop treating mmu_context being set as a indication that
the frontend is running and instead add a explicit property.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: put submit prev MMU context when it exists

The prev context is the MMU context at the time of the job
queueing in hardware. As a job might be queued multiple times
due to recovery after a GPU hang, we need to make sure to put
the stale prev MMU context from a prior queuing, to avoid the
reference and thus the MMU context leaking.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

drm/etnaviv: return context from etnaviv_iommu_context_get

Being able to have the refcount manipulation in an assignment makes
it much easier to parse the code.

Cc: [email protected] # 5.4
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Michael Walle <[email protected]>
Tested-by: Marek Vasut <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>

EDAC/synopsys: Fix wrong value type assignment for edac_mode

dimm->edac_mode contains values of type enum edac_type - not the
corresponding capability flags. Fix that.

Issue caught by Coverity check "enumerated type mixed with another
type."

[ bp: Rewrite commit message, add tags. ]

Fixes: ae9b56e3996d ("EDAC, synps: Add EDAC support for zynq ddr ecc controller")
Signed-off-by: Sai Krishna Potthuri <[email protected]>
Signed-off-by: Shubhrajyoti Datta <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

parisc: Use absolute_pointer() to define PAGE0

Use absolute_pointer() wrapper for PAGE0 to avoid this compiler warning:

arch/parisc/kernel/setup.c: In function 'start_parisc':
error: '__builtin_memcmp_eq' specified bound 8 exceeds source size 0

Signed-off-by: Helge Deller <[email protected]>
Co-Developed-by: Guenter Roeck <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>

Merge tag 'hyperv-fixes-signed-20210915' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

- Fix kernel crash caused by uio driver (Vitaly Kuznetsov)

- Remove on-stack cpumask from HV APIC code (Wei Liu)

* tag 'hyperv-fixes-signed-20210915' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: remove on-stack cpumask from hv_send_ipi_mask_allbutself
  asm-generic/hyperv: provide cpumask_to_vpset_noself
  Drivers: hv: vmbus: Fix kernel crash upon unbinding a device from uio_hv_generic driver

Merge tag 'rtc-5.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC fix from Alexandre Belloni:
"Fix a locking issue in the cmos rtc driver"

* tag 'rtc-5.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: cmos: Disable irq around direct invocation of cmos_interrupt()

net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports

Sometimes when unbinding the mv88e6xxx driver on Turris MOX, these error
messages appear:

mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 0 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 100 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 0 from fdb: -2

(and similarly for other ports)

What happens is that DSA has a policy "even if there are bugs, let's at
least not leak memory" and dsa_port_teardown() clears the dp->fdbs and
dp->mdbs lists, which are supposed to be empty.

But deleting that cleanup code, the warnings go away.

=> the FDB and MDB lists (used for refcounting on shared ports, aka CPU
and DSA ports) will eventually be empty, but are not empty by the time
we tear down those ports. Aka we are deleting them too soon.

The addresses that DSA complains about are host-trapped addresses: the
local addresses of the ports, and the MAC address of the bridge device.

The problem is that offloading those entries happens from a deferred
work item scheduled by the SWITCHDEV_FDB_DEL_TO_DEVICE handler, and this
races with the teardown of the CPU and DSA ports where the refcounting
is kept.

In fact, not only it races, but fundamentally speaking, if we iterate
through the port list linearly, we might end up tearing down the shared
ports even before we delete a DSA user port which has a bridge upper.

So as it turns out, we need to first tear down the user ports (and the
unused ones, for no better place of doing that), then the shared ports
(the CPU and DSA ports). In between, we need to ensure that all work
items scheduled by our switchdev handlers (which only run for user
ports, hence the reason why we tear them down first) have finished.

Fixes: 161ca59d39e9 ("net: dsa: reference count the MDB entries at the cross-chip notifier level")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Revert "net: phy: Uniform PHY driver access"

This reverts commit 3ac8eed62596387214869319379c1fcba264d8c6, which did
more than it said on the box, and not only it replaced to_phy_driver
with phydev->drv, but it also removed the "!drv" check, without actually
explaining why that is fine.

That patch in fact breaks suspend/resume on any system which has PHY
devices with no drivers bound.

The stack trace is:

Unable to handle kernel NULL pointer dereference at virtual address 00000000000000e8
pc : mdio_bus_phy_suspend+0xd8/0xec
lr : dpm_run_callback+0x38/0x90
Call trace:
mdio_bus_phy_suspend+0xd8/0xec
dpm_run_callback+0x38/0x90
__device_suspend+0x108/0x3cc
dpm_suspend+0x140/0x210
dpm_suspend_start+0x7c/0xa0
suspend_devices_and_enter+0x13c/0x540
pm_suspend+0x2a4/0x330

Examples why that assumption is not fine:

- There is an MDIO bus with a PHY device that doesn't have a specific
  PHY driver loaded, because mdiobus_register() automatically creates a
  PHY device for it but there is no specific PHY driver in the system.
  Normally under those circumstances, the generic PHY driver will be
  bound lazily to it (at phy_attach_direct time). But some Ethernet
  drivers attach to their PHY at .ndo_open time. Until then it, the
  to-be-driven-by-genphy PHY device will not have a driver. The blamed
  patch amounts to saying "you need to open all net devices before the
  system can suspend, to avoid the NULL pointer dereference".

- There is any raw MDIO device which has 'plausible' values in the PHY
  ID registers 2 and 3, which is located on an MDIO bus whose driver
  does not set bus->phy_mask = ~0 (which prevents auto-scanning of PHY
  devices). An example could be a MAC's internal MDIO bus with PCS
  devices on it, for serial links such as SGMII. PHY devices will get
  created for those PCSes too, due to that MDIO bus auto-scanning, and
  although those PHY devices are not used, they do not bother anybody
  either. PCS devices are usually managed in Linux as raw MDIO devices.
  Nonetheless, they do not have a PHY driver, nor does anybody attempt
  to connect to them (because they are not a PHY), and therefore this
  patch breaks that.

The goal itself of the patch is questionable, so I am going for a
straight revert. to_phy_driver does not seem to have a need to be
replaced by phydev->drv, in fact that might even trigger code paths
which were not given too deep of a thought.

For instance:

phy_probe populates phydev->drv at the beginning, but does not clean it
up on any error (including EPROBE_DEFER). So if the phydev driver
requests probe deferral, phydev->drv will remain populated despite there
being no driver bound.

If a system suspend starts in between the initial probe deferral request
and the subsequent probe retry, we will be calling the phydev->drv->suspend
method, but _before_ any phydev->drv->probe call has succeeded.

That is to say, if the phydev->drv is allocating any driver-private data
structure in ->probe, it pretty much expects that data structure to be
available in ->suspend. But it may not. That is a pretty insane
environment to present to PHY drivers.

In the code structure before the blamed patch, mdio_bus_phy_may_suspend
would just say "no, don't suspend" to any PHY device which does not have
a driver pointer _in_the_device_structure_ (not the phydev->drv). That
would essentially ensure that ->suspend will never get called for a
device that has not yet successfully completed probe. This is the code
structure the patch is returning to, via the revert.

Fixes: 3ac8eed62596 ("net: phy: Uniform PHY driver access")
Signed-off-by: Vladimir Oltean <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup

DSA supports connecting to a phy-handle, and has a fallback to a non-OF
based method of connecting to an internal PHY on the switch's own MDIO
bus, if no phy-handle and no fixed-link nodes were present.

The -ENODEV error code from the first attempt (phylink_of_phy_connect)
is what triggers the second attempt (phylink_connect_phy).

However, when the first attempt returns a different error code than
-ENODEV, this results in an unbalance of calls to phylink_create and
phylink_destroy by the time we exit the function. The phylink instance
has leaked.

There are many other error codes that can be returned by
phylink_of_phy_connect. For example, phylink_validate returns -EINVAL.
So this is a practical issue too.

Fixes: aab9c4067d23 ("net: dsa: Plug in PHYLINK support")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

MAINTAINERS: Add Nirmal Patel as VMD maintainer

Change my email to my unaffiliated address and move me to reviewer status.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jon Derrick <[email protected]>
Signed-off-by: Jon Derrick <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Nirmal Patel <[email protected]>

PCI: Add AMD GPU multi-function power dependencies

Some AMD GPUs have built-in USB xHCI and USB Type-C UCSI controllers with
power dependencies between the GPU and the other functions as in
6d2e369f0d4c ("PCI: Add NVIDIA GPU multi-function power dependencies").

Add device link support for the AMD integrated USB xHCI and USB Type-C UCSI
controllers.

Without this, runtime power management, including GPU resume and temp and
fan sensors don't work correctly.

Reported-at: https://gitlab.freedesktop.org/drm/amd/-/issues/1704
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Evan Quan <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: [email protected]

PCI/ACPI: Don't reset a fwnode set by OF

Commit 375553a93201 ("PCI: Setup ACPI fwnode early and at the same time
with OF") added a call to pci_set_acpi_fwnode() in pci_setup_device(),
which unconditionally clears any fwnode previously set by
pci_set_of_node().

pci_set_acpi_fwnode() looks for ACPI_COMPANION(), which only returns the
existing fwnode if it was set by ACPI_COMPANION_SET(). If it was set by
OF instead, ACPI_COMPANION() returns NULL and pci_set_acpi_fwnode()
accidentally clears the fwnode. To fix this, look for any fwnode instead
of just ACPI companions.

Fixes a virtio-iommu boot regression in v5.15-rc1.

Fixes: 375553a93201 ("PCI: Setup ACPI fwnode early and at the same time with OF")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Rob Herring <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>

PCI/VPD: Defer VPD sizing until first access

7bac54497c3e ("PCI/VPD: Determine VPD size in pci_vpd_init()") reads VPD at
enumeration-time to find the size.  But this is quite slow, and we don't
need the size until we actually need data from VPD.  Dave reported a boot
slowdown of more than two minutes [1].

Defer the VPD sizing until a driver or the user (via sysfs) requests
information from VPD.

If devices are quirked because VPD is known not to work, don't bother even
looking for the VPD capability.  The VPD will not be accessible at all.

[1] https://lore.kernel.org/r/20210913141818 [email protected]/
Link: https://lore.kernel.org/r/20210914215543.GA1437800@bjorn-Precision-5520
Fixes: 7bac54497c3e ("PCI/VPD: Determine VPD size in pci_vpd_init()")
Signed-off-by: Bjorn Helgaas <[email protected]>

fpga: machxo2-spi: Fix missing error code in machxo2_write_complete()

The error code is missing in this code scenario, add the error code
'-EINVAL' to the return value 'ret'.

Eliminate the follow smatch warning:

drivers/fpga/machxo2-spi.c:341 machxo2_write_complete()
warn: missing error code 'ret'.

[[email protected]: Reworded commit message]
Fixes: 88fb3a002330 ("fpga: lattice machxo2: Add Lattice MachXO2 support")
Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Jiapeng Chong <[email protected]>
Signed-off-by: Moritz Fischer <[email protected]>

fpga: machxo2-spi: Return an error on failure

Earlier successes leave 'ret' in a non error state, so these errors are
not reported. Set ret to -EINVAL before going to the error handler.

This addresses two issues reported by smatch:
drivers/fpga/machxo2-spi.c:229 machxo2_write_init()
warn: missing error code 'ret'

drivers/fpga/machxo2-spi.c:316 machxo2_write_complete()
warn: missing error code 'ret'

[[email protected]: Reworded commit message]
Fixes: 88fb3a002330 ("fpga: lattice machxo2: Add Lattice MachXO2 support")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Moritz Fischer <[email protected]>

qnx4: avoid stringop-overread errors

The qnx4 directory entries are 64-byte blocks that have different
contents depending on the a status byte that is in the last byte of the
block.

In particular, a directory entry can be either a "link info" entry with
a 48-byte name and pointers to the real inode information, or an "inode
entry" with a smaller 16-byte name and the full inode information.

But the code was written to always just treat the directory name as if
it was part of that "inode entry", and just extend the name to the
longer case if the status byte said it was a link entry.

That work just fine and gives the right results, but now that gcc is
tracking data structure accesses much more, the code can trigger a
compiler error about using up to 48 bytes (the long name) in a structure
that only has that shorter name in it:

   fs/qnx4/dir.c: In function ‘qnx4_readdir’:
   fs/qnx4/dir.c:51:32: error: ‘strnlen’ specified bound 48 exceeds source size 16 [-Werror=stringop-overread]
      51 |                         size = strnlen(de->di_fname, size);
         |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from fs/qnx4/qnx4.h:3,
                    from fs/qnx4/dir.c:16:
   include/uapi/linux/qnx4_fs.h:45:25: note: source object declared here
      45 |         char            di_fname[QNX4_SHORT_NAME_MAX];
         |                         ^~~~~~~~

which is because the source code doesn't really make this whole "one of
two different types" explicit.

Fix this by introducing a very explicit union of the two types, and
basically explaining to the compiler what is really going on.

Signed-off-by: Linus Torvalds <[email protected]>

sparc: avoid stringop-overread errors

The sparc mdesc code does pointer games with 'struct mdesc_hdr', but
didn't describe to the compiler how that header is then followed by the
data that the header describes.

As a result, gcc is now unhappy since it does stricter pointer range
tracking, and doesn't understand about how these things work.  This
results in various errors like:

    arch/sparc/kernel/mdesc.c: In function ‘mdesc_node_by_name’:
    arch/sparc/kernel/mdesc.c:647:22: error: ‘strcmp’ reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread]
      647 |                 if (!strcmp(names + ep[ret].name_offset, name))
          |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

which are easily avoided by just describing 'struct mdesc_hdr' better,
and making the node_block() helper function look into that unsized
data[] that follows the header.

This makes the sparc64 build happy again at least for my cross-compiler
version (gcc version 11.2.1).

Link: https://lore.kernel.org/lkml/CAHk-=wi4NW3NC0xWykkw=6LnjQD6D_rtRtxY9g8gQAJXtQMi8A@mail.gmail.com/
Cc: Guenter Roeck <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'absolute-pointer' (patches from Guenter)

Merge absolute_pointer macro series from Guenter Roeck:
"Kernel test builds currently fail for several architectures with error
  messages such as the following.

  drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
  arch/m68k/include/asm/string.h:72:25: error:
        '__builtin_memcpy' reading 6 bytes from a region of size 0
                [-Werror=stringop-overread]

  Such warnings may be reported by gcc 11.x for string and memory
  operations on fixed addresses if gcc's builtin functions are used for
  those operations.

  This series introduces absolute_pointer() to fix the problem.
  absolute_pointer() disassociates a pointer from its originating symbol
  type and context, and thus prevents gcc from making assumptions about
  pointers passed to memory operations"

* emailed patches from Guenter Roeck <[email protected]>:
  alpha: Use absolute_pointer to define COMMAND_LINE
  alpha: Move setup.h out of uapi
  net: i825xx: Use absolute_pointer for memcpy from fixed memory location
  compiler.h: Introduce absolute_pointer macro

alpha: Use absolute_pointer to define COMMAND_LINE

alpha:allmodconfig fails to build with the following error
when using gcc 11.x.

arch/alpha/kernel/setup.c: In function 'setup_arch':
arch/alpha/kernel/setup.c:493:13: error:
'strcmp' reading 1 or more bytes from a region of size 0

Avoid the problem by declaring COMMAND_LINE as absolute_pointer().

Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

alpha: Move setup.h out of uapi

Most of the contents of setup.h have no value for userspace
applications. The file was probably moved to uapi accidentally.

Keep the file in uapi to define the alpha-specific COMMAND_LINE_SIZE.
Move all other defines to arch/alpha/include/asm/setup.h.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

net: i825xx: Use absolute_pointer for memcpy from fixed memory location

gcc 11.x reports the following compiler warning/error.

drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
arch/m68k/include/asm/string.h:72:25: error:
'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]

Use absolute_pointer() to work around the problem.

Cc: Geert Uytterhoeven <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

compiler.h: Introduce absolute_pointer macro

absolute_pointer() disassociates a pointer from its originating symbol
type and context. Use it to prevent compiler warnings/errors such as

drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
arch/m68k/include/asm/string.h:72:25: error:
'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]

Such warnings may be reported by gcc 11.x for string and memory
operations on fixed addresses.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd

KASAN reports a use-after-free report when doing fuzz test:

[693354.104835] ==================================================================
[693354.105094] BUG: KASAN: use-after-free in bfq_io_set_weight_legacy+0xd3/0x160
[693354.105336] Read of size 4 at addr ffff888be0a35664 by task sh/1453338

[693354.105607] CPU: 41 PID: 1453338 Comm: sh Kdump: loaded Not tainted 4.18.0-147
[693354.105610] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.81 07/02/2018
[693354.105612] Call Trace:
[693354.105621]  dump_stack+0xf1/0x19b
[693354.105626]  ? show_regs_print_info+0x5/0x5
[693354.105634]  ? printk+0x9c/0xc3
[693354.105638]  ? cpumask_weight+0x1f/0x1f
[693354.105648]  print_address_description+0x70/0x360
[693354.105654]  kasan_report+0x1b2/0x330
[693354.105659]  ? bfq_io_set_weight_legacy+0xd3/0x160
[693354.105665]  ? bfq_io_set_weight_legacy+0xd3/0x160
[693354.105670]  bfq_io_set_weight_legacy+0xd3/0x160
[693354.105675]  ? bfq_cpd_init+0x20/0x20
[693354.105683]  cgroup_file_write+0x3aa/0x510
[693354.105693]  ? ___slab_alloc+0x507/0x540
[693354.105698]  ? cgroup_file_poll+0x60/0x60
[693354.105702]  ? 0xffffffff89600000
[693354.105708]  ? usercopy_abort+0x90/0x90
[693354.105716]  ? mutex_lock+0xef/0x180
[693354.105726]  kernfs_fop_write+0x1ab/0x280
[693354.105732]  ? cgroup_file_poll+0x60/0x60
[693354.105738]  vfs_write+0xe7/0x230
[693354.105744]  ksys_write+0xb0/0x140
[693354.105749]  ? __ia32_sys_read+0x50/0x50
[693354.105760]  do_syscall_64+0x112/0x370
[693354.105766]  ? syscall_return_slowpath+0x260/0x260
[693354.105772]  ? do_page_fault+0x9b/0x270
[693354.105779]  ? prepare_exit_to_usermode+0xf9/0x1a0
[693354.105784]  ? enter_from_user_mode+0x30/0x30
[693354.105793]  entry_SYSCALL_64_after_hwframe+0x65/0xca

[693354.105875] Allocated by task 1453337:
[693354.106001]  kasan_kmalloc+0xa0/0xd0
[693354.106006]  kmem_cache_alloc_node_trace+0x108/0x220
[693354.106010]  bfq_pd_alloc+0x96/0x120
[693354.106015]  blkcg_activate_policy+0x1b7/0x2b0
[693354.106020]  bfq_create_group_hierarchy+0x1e/0x80
[693354.106026]  bfq_init_queue+0x678/0x8c0
[693354.106031]  blk_mq_init_sched+0x1f8/0x460
[693354.106037]  elevator_switch_mq+0xe1/0x240
[693354.106041]  elevator_switch+0x25/0x40
[693354.106045]  elv_iosched_store+0x1a1/0x230
[693354.106049]  queue_attr_store+0x78/0xb0
[693354.106053]  kernfs_fop_write+0x1ab/0x280
[693354.106056]  vfs_write+0xe7/0x230
[693354.106060]  ksys_write+0xb0/0x140
[693354.106064]  do_syscall_64+0x112/0x370
[693354.106069]  entry_SYSCALL_64_after_hwframe+0x65/0xca

[693354.106114] Freed by task 1453336:
[693354.106225]  __kasan_slab_free+0x130/0x180
[693354.106229]  kfree+0x90/0x1b0
[693354.106233]  blkcg_deactivate_policy+0x12c/0x220
[693354.106238]  bfq_exit_queue+0xf5/0x110
[693354.106241]  blk_mq_exit_sched+0x104/0x130
[693354.106245]  __elevator_exit+0x45/0x60
[693354.106249]  elevator_switch_mq+0xd6/0x240
[693354.106253]  elevator_switch+0x25/0x40
[693354.106257]  elv_iosched_store+0x1a1/0x230
[693354.106261]  queue_attr_store+0x78/0xb0
[693354.106264]  kernfs_fop_write+0x1ab/0x280
[693354.106268]  vfs_write+0xe7/0x230
[693354.106271]  ksys_write+0xb0/0x140
[693354.106275]  do_syscall_64+0x112/0x370
[693354.106280]  entry_SYSCALL_64_after_hwframe+0x65/0xca

[693354.106329] The buggy address belongs to the object at ffff888be0a35580
                 which belongs to the cache kmalloc-1k of size 1024
[693354.106736] The buggy address is located 228 bytes inside of
                 1024-byte region [ffff888be0a35580, ffff888be0a35980)
[693354.107114] The buggy address belongs to the page:
[693354.107273] page:ffffea002f828c00 count:1 mapcount:0 mapping:ffff888107c17080 index:0x0 compound_mapcount: 0
[693354.107606] flags: 0x17ffffc0008100(slab|head)
[693354.107760] raw: 0017ffffc0008100 ffffea002fcbc808 ffffea0030bd3a08 ffff888107c17080
[693354.108020] raw: 0000000000000000 00000000001c001c 00000001ffffffff 0000000000000000
[693354.108278] page dumped because: kasan: bad access detected

[693354.108511] Memory state around the buggy address:
[693354.108671]  ffff888be0a35500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[693354.116396]  ffff888be0a35580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[693354.124473] >ffff888be0a35600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[693354.132421]                                                        ^
[693354.140284]  ffff888be0a35680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[693354.147912]  ffff888be0a35700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[693354.155281] ==================================================================

blkgs are protected by both queue and blkcg locks and holding
either should stabilize them. However, the path of destroying
blkg policy data is only protected by queue lock in
blkcg_activate_policy()/blkcg_deactivate_policy(). Other tasks
can get the blkg policy data before the blkg policy data is
destroyed, and use it after destroyed, which will result in a
use-after-free.

CPU0                             CPU1
blkcg_deactivate_policy
  spin_lock_irq(&q->queue_lock)
                                 bfq_io_set_weight_legacy
                                   spin_lock_irq(&blkcg->lock)
                                   blkg_to_bfqg(blkg)
                                     pd_to_bfqg(blkg->pd[pol->plid])
                                     ^^^^^^blkg->pd[pol->plid] != NULL
                                           bfqg != NULL
  pol->pd_free_fn(blkg->pd[pol->plid])
    pd_to_bfqg(blkg->pd[pol->plid])
    bfqg_put(bfqg)
      kfree(bfqg)
  blkg->pd[pol->plid] = NULL
  spin_unlock_irq(q->queue_lock);
                                   bfq_group_set_weight(bfqg, val, 0)
                                     bfqg->entity.new_weight
                                     ^^^^^^trigger uaf here
                                   spin_unlock_irq(&blkcg->lock);

Fix by grabbing the matching blkcg lock before trying to
destroy blkg policy data.

Suggested-by: Tejun Heo <[email protected]>
Signed-off-by: Li Jinlin <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

blkcg: fix memory leak in blk_iolatency_init

BUG: memory leak
unreferenced object 0xffff888129acdb80 (size 96):
  comm "syz-executor.1", pid 12661, jiffies 4294962682 (age 15.220s)
  hex dump (first 32 bytes):
    20 47 c9 85 ff ff ff ff 20 d4 8e 29 81 88 ff ff   G...... ..)....
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff82264ec8>] kmalloc include/linux/slab.h:591 [inline]
    [<ffffffff82264ec8>] kzalloc include/linux/slab.h:721 [inline]
    [<ffffffff82264ec8>] blk_iolatency_init+0x28/0x190 block/blk-iolatency.c:724
    [<ffffffff8225b8c4>] blkcg_init_queue+0xb4/0x1c0 block/blk-cgroup.c:1185
    [<ffffffff822253da>] blk_alloc_queue+0x22a/0x2e0 block/blk-core.c:566
    [<ffffffff8223b175>] blk_mq_init_queue_data block/blk-mq.c:3100 [inline]
    [<ffffffff8223b175>] __blk_mq_alloc_disk+0x25/0xd0 block/blk-mq.c:3124
    [<ffffffff826a9303>] loop_add+0x1c3/0x360 drivers/block/loop.c:2344
    [<ffffffff826a966e>] loop_control_get_free drivers/block/loop.c:2501 [inline]
    [<ffffffff826a966e>] loop_control_ioctl+0x17e/0x2e0 drivers/block/loop.c:2516
    [<ffffffff81597eec>] vfs_ioctl fs/ioctl.c:51 [inline]
    [<ffffffff81597eec>] __do_sys_ioctl fs/ioctl.c:874 [inline]
    [<ffffffff81597eec>] __se_sys_ioctl fs/ioctl.c:860 [inline]
    [<ffffffff81597eec>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
    [<ffffffff843fa745>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<ffffffff843fa745>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
    [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Once blk_throtl_init() queue init failed, blkcg_iolatency_exit() will
not be invoked for cleanup. That leads a memory leak. Swap the
blk_throtl_init() and blk_iolatency_init() calls can solve this.

Reported-by: [email protected]
Fixes: 19688d7f9592 (block/blk-cgroup: Swap the blk_throtl_init() and blk_iolatency_init() calls)
Signed-off-by: Yanfei Xu <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

tools/bootconfig: Define memblock_free_ptr() to fix build error

The lib/bootconfig.c file is shared with the 'bootconfig' tooling, and
as a result, the changes incommit 77e02cf57b6c ("memblock: introduce
saner 'memblock_free_ptr()' interface") need to also be reflected in the
tooling header file.

So define the new memblock_free_ptr() wrapper, and remove unused __pa()
and memblock_free().

Fixes: 77e02cf57b6c ("memblock: introduce saner 'memblock_free_ptr()' interface")
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

selftests: be sure to make khdr before other targets

LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?

Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.

Here we mark target khdr as NOTPARALLEL to make it be always built
first.

CC: Philip Li <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

locking/rwbase: Take care of ordering guarantee for fastpath reader

Readers of rwbase can lock and unlock without taking any inner lock, if
that happens, we need the ordering provided by atomic operations to
satisfy the ordering semantics of lock/unlock. Without that, considering
the follow case:

{ X = 0 initially }

CPU 0 CPU 1
===== =====
rt_write_lock();
X = 1
rt_write_unlock():
  atomic_add(READER_BIAS - WRITER_BIAS, ->readers);
  // ->readers is READER_BIAS.
rt_read_lock():
  if ((r = atomic_read(->readers)) < 0) // True
    atomic_try_cmpxchg(->readers, r, r + 1); // succeed.
  <acquire the read lock via fast path>

r1 = X; // r1 may be 0, because nothing prevent the reordering
        // of "X=1" and atomic_add() on CPU 1.

Therefore audit every usage of atomic operations that may happen in a
fast path, and add necessary barriers.

Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

locking/rwbase: Extract __rwbase_write_trylock()

The code in rwbase_write_lock() is a little non-obvious vs the
read+set 'trylock', extract the sequence into a helper function to
clarify the code.

This also provides a single site to fix fast-path ordering.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

locking/rwbase: Properly match set_and_save_state() to restore_state()

Noticed while looking at the readers race.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

events: Reuse value read using READ_ONCE instead of re-reading it

In perf_event_addr_filters_apply, the task associated with
the event (event->ctx->task) is read using READ_ONCE at the beginning
of the function, checked, and then re-read from event->ctx->task,
voiding all guarantees of the checks. Reuse the value that was read by
READ_ONCE to ensure the consistency of the task struct throughout the
function.

Fixes: 375637bc52495 ("perf/core: Introduce address range filtering")
Signed-off-by: Baptiste Lepers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

io_uring: move iopoll reissue into regular IO path

230d50d448acb ("io_uring: move reissue into regular IO path")
made non-IOPOLL I/O to not retry from ki_complete handler. Follow it
steps and do the same for IOPOLL. Same problems, same implementation,
same -EAGAIN assumptions.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/f80dfee2d5fa7678f0052a8ab3cfca9496a112ca.1631699928.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revert "iov_iter: track truncated size"

This reverts commit 2112ff5ce0c1128fe7b4d19cfe7f2b8ce5b595fa.

We no longer need to track the truncation count, the one user that did
need it has been converted to using iov_iter_restore() instead.

Signed-off-by: Jens Axboe <[email protected]>

io_uring: use iov_iter state save/restore helpers

Get rid of the need to do re-expand and revert on an iterator when we
encounter a short IO, or failure that warrants a retry. Use the new
state save/restore helpers instead.

We keep the iov_iter_state persistent across retries, if we need to
restart the read or write operation. If there's a pending retry, the
operation will always exit with the state correctly saved.

Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'nvme-5.15-2021-09-15' of git://git.infradead.org/nvme into block-5.15

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.15

- fix ANA state updates when a namespace is not present (Anton Eidelman)
- nvmet: fix a width vs precision bug in nvmet_subsys_attr_serial_show
   (Dan Carpenter)
- avoid race in shutdown namespace removal (Daniel Wagner)
- fix io_work priority inversion in nvme-tcp (Keith Busch)
- destroy cm id before destroy qp to avoid use after free (Ruozhu Li)"

* tag 'nvme-5.15-2021-09-15' of git://git.infradead.org/nvme:
  nvme-tcp: fix io_work priority inversion
  nvme-rdma: destroy cm id before destroy qp to avoid use after free
  nvme-multipath: fix ANA state updates when a namespace is not present
  nvme: avoid race in shutdown namespace removal
  nvmet: fix a width vs precision bug in nvmet_subsys_attr_serial_show()