Git Repo - linux.git/log

drm/amd/display: Always pass connector_state to stream validation

We need the connector_state for colorspace and scaling information
and can get it from connector->state.

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Joshua Ashton <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Allow drivers to pass list of supported colorspaces

Drivers might not support all colorspaces defined in
dp_colorspaces and hdmi_colorspaces. This results in
undefined behavior when userspace is setting an
unsupported colorspace.

Allow drivers to pass the list of supported colorspaces
when creating the colorspace property.

v2:
- Use 0 to indicate support for all colorspaces (Jani)
- Print drm_dbg_kms message when drivers pass 0
   to signal that drivers should specify supported
   colorspaecs explicity (Jani)

v3:
- Move changes to create a common colorspace_names array
   to separate patch

v6:
- Avoid magic when passing 0 for supported_colorspaces;
  be explicit in treating it as "all DP/HDMI"

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Print connector colorspace in state debugfs

v3: Fix kerneldocs (kernel test robot)

v4: Avoid returning NULL from drm_get_colorspace_name

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Use common colorspace_names array

We an use bitfields to track the support ones for HDMI
and DP. This allows us to print colorspaces in a consistent
manner without needing to know whether we're dealing with
DP or HDMI.

v4:
- Rename _MAX to _COUNT and leave comment to indicate
  it's not a valid value
- Fix misplaced function doc

v6:
- Drop magic in drm_mode_create_colorspace_property for
  dealing with "0" supported_colorspaces. Expect the caller
  to always provide a non-zero supported_colorspaces.
- Improve error checking and logging

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Pull out common create_colorspace_property code

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Add enum documentation to drm_colorspace

To match the other enums, and add more information about these values.

v2:
- Specify where an enum entry comes from
- Clarify DEFAULT and NO_DATA behavior
- BT.2020 CYCC is "constant luminance"
- correct type for BT.601

v4:
- drop DP/HDMI clarifications that might create
more questions than answers

v5:
- Add note on YCC and RGB variants

Signed-off-by: Joshua Ashton <[email protected]>
Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Acked-by: Pekka Paalanen <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/connector: Convert DRM_MODE_COLORIMETRY to enum

This allows us to use strongly typed arguments.

v2:
- Bring NO_DATA back
- Provide explicit enum values

v3:
- Drop unnecessary '&' from kerneldoc (emersion)

v4:
- Fix Normal Colorimetry comment

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Simon Ser <[email protected]>
Reviewed-by: Sebastian Wick <[email protected]>
Reviewed-by: Pekka Paalanen <[email protected]>
Reviewed-by: Joshua Ashton <[email protected]>
Cc: Pekka Paalanen <[email protected]>
Cc: Sebastian Wick <[email protected]>
Cc: [email protected]
Cc: Uma Shankar <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Joshua Ashton <[email protected]>
Cc: Simon Ser <[email protected]>
Cc: Melissa Wen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: Fix reserved SDMA queues handling

This patch fixes a regression caused by a bad merge where
the handling of reserved SDMA queues was accidentally removed.
With the fix, the reserved SDMA queues are again correctly
marked as unavailable for allocation.

Fixes: a805889a1531 ("drm/amdkfd: Update SDMA queue management for GFX9.4.3")
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add missing radeon secondary PCI ID

0x5b70 is a missing RV370 secondary id. Add it so
we don't try and probe it with amdgpu.

Cc: [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd: Check that a system is a NUMA system before looking for SRAT

It's pointless on laptops to look for the SRAT table as these are not
NUMA. Check the number of possible nodes is > 1 to decide whether to
look for SRAT.

Suggested-by: Felix Kuehling <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: fix vmfault signalling with additional data.

Exception handling for vmfaults should be raised with additional data.

Reported-by: Mukul Joshi <[email protected]>
Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Mukul Joshi <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Set EEPROM ras info

Set EEPROM ras info: rma status, health percent and bad
page threshold.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Calculate EEPROM table ras info bytes sum

It's more reasonable to check EEPROM table ras info bytes.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Add support EEPROM table v2.1

Add ras info to EEPROM table, app can analyse device ECC
status without GPU driver through EEPROM table ras info.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Support setting EEPROM table version

Add setting EEPROM table version interface for umcv8.10,
Add EEPROM table v2.1 to UMC v8.10.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Add RAS table v2.1 macro definition

Add RAS EEPROM table version 2.1 macro definition.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Rename ras table version

Rename RAS_TABLE_VER to RAS_TABLE_VER_V1,
move RAS_TABLE_VER_V1 from amdgpu_ras_eeprom.c to amdgpu_ras_eeprom.h.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Implement gfx9 patch functions for resubmission

Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9
during the resubmission scenario.

Signed-off-by: Jiadong Zhu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Modify indirect buffer packages for resubmission

When the preempted IB frame resubmitted to cp, we need to modify the frame
data including:
1. set PRE_RESUME 1 in CONTEXT_CONTROL.
2. use meta data(DE and CE) read from CSA in WRITE_DATA.

Add functions to save the location the first time IBs emitted and callback
to patch the package when resubmission happens.

Signed-off-by: Jiadong Zhu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/mmsch: Correct the definition for mmsch init header

For the header, it is version related, shouldn't use MAX_VCN_INSTANCES.

Signed-off-by: Emily Deng <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: potential error pointer dereference in ioctl

The "target" either comes from kfd_create_process() which returns error
pointers on error or kfd_lookup_process_by_pid() which returns NULL on
error. So we need to check for both types of errors.

Fixes: 0ab2d7532b05 ("drm/amdkfd: prepare per-process debug enable and disable")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Jonathan Kim <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Only use ODM2:1 policy for high pixel rate displays

We only gain a benefit of using the ODM2:1 dynamic policy if it allow us
to decrease DISPCLK to use the VMIN freq. If the display config can
already achieve VMIN DISPCLK freq without ODM2:1, don't apply the
policy.

This patch was reverted but that causes some IGT regressions. To
unblock, the patch is being applied again until IGT failures are
fixed.

Signed-off-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Aurabindo Pillai <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>

drm/amd/pm: Fix memory some memory corruption

The "od_table" is a pointer to a large struct, but this code is doing
pointer math as if it were pointing to bytes. It results in writing
far outside the struct.

Fixes: 2e8452ea4ef6 ("drm/amd/pm: fulfill the OD support for SMU13.0.0")
Fixes: 2a9aa52e4617 ("drm/amd/pm: fulfill the OD support for SMU13.0.7")
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: display/Kconfig: replace leading spaces with tab

This patch replace the leading spaces with tab, make them keep aligned with
the rest of the config options. No functional change.

Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: mark dml314's UseMinimumDCFCLK() as noinline_for_stack

clang reports:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:3892:6: error: stack frame size (2632) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
3892 | void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
1 error generated.

So, since UseMinimumDCFCLK() consumes a lot of stack space, mark it as
noinline_for_stack to prevent it from blowing up
dml314_ModeSupportAndSystemConfigurationFull()'s stack size.

Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: mark dml31's UseMinimumDCFCLK() as noinline_for_stack

clang reports:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3797:6: error: stack frame size (2632) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
3797 | void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
1 error generated.

So, since UseMinimumDCFCLK() consumes a lot of stack space, mark it as
noinline_for_stack to prevent it from blowing up
dml31_ModeSupportAndSystemConfigurationFull()'s stack size.

Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix unused variable ‘should_lock_all_pipes’

Fix below compilation error:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3524:7: error: unused variable 'should_lock_all_pipes' [-Werror,-Wunused-variable]
bool should_lock_all_pipes = (update_type != UPDATE_TYPE_FAST);

Cc: Rodrigo Siqueira <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Alex Deucher <[email protected]>
Reviewed-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Reduce sdp bw after urgent to 90%

[Description]
Reduce expected SDP bandwidth due to poor QoS and
arbitration issues on high bandwidth configs

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Reviewed-by: Nevenko Stupar <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Add control flag to dc_stream_state to skip eDP BL off/link off

Add control flag to dc_stream_state to skip eDP BL off/link off.

Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Max Tseng <[email protected]>
Reviewed-by: Anthony Koo <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Wrong index type for pipe iterator

[Why and How]
Type mismatch in index and pipe count might cause an infinite loop. code
Change should resolve this issue.

Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Saaem Rizvi <[email protected]>
Reviewed-by: Josip Pavic <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Refactor fast update to use new HWSS build sequence

[Description]
- Refactor HW sequencer to use a build / execute sequence
- Also move gamma updates to become fast

v2: squash in build fix ("drm/amd/display: Fix guarding of 'if (dc->debug.visual_confirm)'")

Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: fix dcn315 single stream crb allocation

Change to improve avoiding asymetric crb calculations for single stream
scenarios.

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Dmytro Laktyushkin <[email protected]>
Reviewed-by: Charlene Liu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: change reserved vram info print

The link object of mgr->reserved_pages is the blocks
variable in struct amdgpu_vram_reservation, not the
link variable in struct drm_buddy_block.

Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Arunpravin Paneer Selvam <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amdgpu: fix xclk freq on CHIP_STONEY

According to Alex, most APUs from that time seem to have the same issue
(vbios says 48Mhz, actual is 100Mhz). I only have a CHIP_STONEY so I
limit the fixup to CHIP_STONEY

Signed-off-by: Chia-I Wu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: fix race condition UAF in radeon_gem_set_domain_ioctl

Userspace can race to free the gobj(robj converted from), robj should not
be accessed again after drm_gem_object_put, otherwith it will result in
use-after-free.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Min Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Revert "drm/amdgpu: switch to golden tsc registers for raven/raven2"

This reverts commit f03eb1d26c2739b75580f58bbab4ab2d5d3eba46.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Revert "drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according to revision id"

This reverts commit 9d2d1827af295fd6971786672c41c4dba3657154.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Revert "drm/amdgpu: change the reference clock for raven/raven2"

This reverts commit fbc24293ca16b3b9ef891fe32ccd04735a6f8dc1.

This results in inconsistent timing reported via asynchronous
GPU queries.

Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: convert vcn/jpeg logical mask to physical mask

Changed from V1:
Remove amdgpu_ras_logical_mask_to_physical_mask
due to GET_MASK provides same feature.
Support convert VCN/JPEG logical mask to physical
mask.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: support check vcn jpeg block mask

Support VCN/JPEG instance mask checking, pass logical
mask directly except GFX/SDMA/VCN/JPEG blocks.

Changed from V1:
correct a typo

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: pass xcc mask to ras ta

pass xcc mask to ras ta, ras ta will compare
the mask with the one from chiplet topology.

Changed from V1:
Remove IP version checking.
Set ras_cmd->ras_init_message.init_flags.xcc_mask
directly due to xcc_mask is common structres to
all the devices.

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: update smu-driver if header for smu 13.0.0 and smu 13.0.10

update smu-driver if header for smu 13.0.0 and smu 13.0.10

Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/pm: notify driver unloading to PMFW for SMU v13.0.6 dGPU

Per requested, follow the same sequence as APU to send only
PPSMC_MSG_PrepareForDriverUnload to PMFW during driver unloading.

Signed-off-by: Le Ma <[email protected]>
Reviewed-by: Shiwu Zhang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Mark 'kgd_gfx_aldebaran_clear_address_watch' & 'kgd_gfx_v11_clear_address_watch' functions as static

Below two functions cause a warning because they lack a prototype:

drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c:164:10: warning: no previous prototype for ‘kgd_gfx_aldebaran_clear_address_watch’ [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c:782:10: warning: no previous prototype for ‘kgd_gfx_v11_clear_address_watch’ [-Wmissing-prototypes]

There are no callers from other files, so just mark them as 'static'.

Also fixes the following checks:

CHECK: Alignment should match open parenthesis +static uint32_t
kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device *adev,
uint32_t watch_id)

CHECK: Alignment should match open parenthesis +static uint32_t
kgd_gfx_v11_clear_address_watch(struct amdgpu_device *adev, uint32_t
watch_id)

Cc: Felix Kuehling <[email protected]>
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Program OTG vtotal min/max selectors unconditionally for DCN1+

For FPO/FAMS, DMCUB will try to change the output timings by writing to
the OTG registers. However, the timings written directly to the OTG
registers will not be honoured unless VMIN/VMAX selector registers are
programmed with the right bits and trigger source is selected correctly.
Proper solution needs to go into DMCUB but will require additional state
tracking to ensure that the selectors are set and reset correctly as per
driver state. Until fix is merged into firmware, apply the workaround in
driver to unconditionally write OTG vmin/vmax selectors.

Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Revert "drm/amd/display: Only use ODM2:1 policy for high pixel rate displays"

This reverts commit 047783cdd5f604d87398236beb4971abb4d43293 since it
causes higher power consumption for single display use case (4k60).

Also, this patch introduced a 35% performance drop in a Vulkan benchmark.

* The patch disabled the ODM-combination on most popular monitors, including 4K, 2K and FHD monitors at 60Hz.

* ODM-combination can halve the DPP clock to save power, that is the reason why we introduce ODM-combination, and the PM log shows single pipe consumes more power at 4K@60Hz.

* ODM-combination has 2 de-tiled buffer involved, which provides longer self-sustained time, that benefit to the memory power optimization.

Signed-off-by: Aurabindo Pillai <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Add gnu_printf format attribute for snprintf_count()

Fix the following W=1 kernel build warning:

display/dc/dcn10/dcn10_hw_sequencer_debug.c: In function ‘snprintf_count’:
display/dc/dcn10/dcn10_hw_sequencer_debug.c:56:2: warning: function ‘snprintf_count’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]

Use the __printf() attribute to let the compiler warn if
invalid format strings are passed in.

And fix the following checks:

CHECK: Avoid CamelCase: <pBuf> +unsigned int __printf(3, 4)
snprintf_count(char *pBuf, unsigned int bufSize, char *fmt, ...)

CHECK: Avoid CamelCase: <bufSize> +unsigned int __printf(3, 4)
snprintf_count(char *pBuf, unsigned int bufSize, char *fmt, ...)

Cc: Hamza Mahfooz <[email protected]>
Cc: Rodrigo Siqueira <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Address kdoc warnings in dcn30_fpu.c

Fixes the following gcc with W=1:

display/dc/dml/dcn30/dcn30_fpu.c:677: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Finds dummy_latency_index when MCLK switching using firmware based
display/dc/dml/dcn30/dcn30_fpu.c:688: warning: Function parameter or member 'dc' not described in 'dcn30_find_dummy_latency_index_for_fw_based_mclk_switch'
display/dc/dml/dcn30/dcn30_fpu.c:688: warning: Function parameter or member 'context' not described in 'dcn30_find_dummy_latency_index_for_fw_based_mclk_switch'
display/dc/dml/dcn30/dcn30_fpu.c:688: warning: Function parameter or member 'pipes' not described in 'dcn30_find_dummy_latency_index_for_fw_based_mclk_switch'
display/dc/dml/dcn30/dcn30_fpu.c:688: warning: Function parameter or member 'pipe_cnt' not described in 'dcn30_find_dummy_latency_index_for_fw_based_mclk_switch'
display/dc/dml/dcn30/dcn30_fpu.c:688: warning: Function parameter or member 'vlevel' not described in 'dcn30_find_dummy_latency_index_for_fw_based_mclk_switch'

Cc: Rodrigo Siqueira <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: fix compilation error due to shifting negative value

Currently compiling linux-next with allmodconfig triggers the following
error:

./drivers/gpu/drm/amd/amdgpu/../display/include/fixed31_32.h: In function ‘dc_fixpt_truncate’:
./drivers/gpu/drm/amd/amdgpu/../display/include/fixed31_32.h:528:22: error: left shift of negative value [-Werror=shift-negative-value]
528 | arg.value &= (~0LL) << (FIXED31_32_BITS_PER_FRACTIONAL_PART - frac_bits);
| ^~

Use `unsigned long long` instead.

Signed-off-by: GONG, Ruiqi <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/discovery: Replace fake flex-arrays with flexible-array members

Zero-length and one-element arrays are deprecated, and we are moving
towards adopting C99 flexible-array members, instead.

Use the DECLARE_FLEX_ARRAY() helper macro to transform zero-length
arrays in a union into flexible-array members. And replace a one-element
array with a C99 flexible-array member.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1009:89: warning: array subscript kk is outside array bounds of ‘uint32_t[0]’ {aka ‘unsigned int[]’} [-Warray-bounds=]
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1007:94: warning: array subscript kk is outside array bounds of ‘uint64_t[0]’ {aka ‘long long unsigned int[]’} [-Warray-bounds=]
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1310:94: warning: array subscript k is outside array bounds of ‘uint64_t[0]’ {aka ‘long long unsigned int[]’} [-Warray-bounds=]
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1309:57: warning: array subscript k is outside array bounds of ‘uint32_t[0]’ {aka ‘unsigned int[]’} [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

This results in no differences in binary output.

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/300
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

amdgpu: validate offset_in_bo of drm_amdgpu_gem_va

This is motivated by OOB access in amdgpu_vm_update_range when
offset_in_bo+map_size overflows.

v2: keep the validations in amdgpu_vm_bo_map
v3: add the validations to amdgpu_vm_bo_map/amdgpu_vm_bo_replace_map
rather than to amdgpu_gem_va_ioctl

Fixes: 9f7eb5367d00 ("drm/amdgpu: actually use the VM map parameters")
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Chia-I Wu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix debug wait on idle for gfx9.4.1

Wait calls for amd_ip_block_type not amd_hw_ip_block_type.

Reported-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: fix seamless odm transitions

Add missing programming and function pointers

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Dmytro Laktyushkin <[email protected]>
Reviewed-by: Charlene Liu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: add ODM case when looking for first split pipe

[Why]
When going from ODM 2:1 single display case to max displays, second
odm pipe needs to be repurposed for one of the new single displays.
However, acquire_first_split_pipe() only handles MPC case and not
ODM case

[How]
Add ODM conditions in acquire_first_split_pipe()
Add commit_minimal_transition_state() in commit_streams() to handle
odm 2:1 exit first, and then process new streams
Handle ODM condition in commit_minimal_transition_state()

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Acked-by: Stylon Wang <[email protected]>
Signed-off-by: Samson Tam <[email protected]>
Reviewed-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: clean up some inconsistent indenting

No functional modification involved.

drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:2377 link_set_dpms_on() warn: inconsistent indenting.

Reported-by: Abaci Robot <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5376
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Jiapeng Chong <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Fix dc/dcn20/dcn20_optc.c kdoc

Fix all kdoc warnings in dc/dcn20/dcn20_optc.c:

display/dc/dcn20/dcn20_optc.c:41: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Enable CRTC
display/dc/dcn20/dcn20_optc.c:76: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
*For the below, I'm not sure how your GSL parameters are stored in your
env,
display/dc/dcn20/dcn20_optc.c:85: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* There are (MAX_OPTC+1)/2 gsl groups available for use.

Cc: Rodrigo Siqueira <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Cc: Harry Wentland <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: fulfill the OD support for SMU13.0.7

Fulfill the interfaces for OD settings retrieving and setting.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: Fill metrics data for SMUv13.0.6

Populate metrics data table for SMU v13.0.6. Add PCIe link speed/width
information also.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: fulfill the OD support for SMU13.0.0

Fulfill the interfaces for OD settings retrieving and setting.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: fulfill SMU13 OD settings init and restore

Gfxclk fmin/fmax, Uclk fmin/fmax and Gfx v/f curve voltage offset
OD settings are supported for SMU13.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: bump kfd ioctl minor version for debug api availability

Bump the minor version to declare debugging capability is now
available.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug device snapshot operation

Similar to queue snapshot, return an array of device information using
an entry_size check and return.
Unlike queue snapshots, the debugger needs to pass to correct number of
devices that exist. If it fails to do so, the KFD will return the
number of actual devices so that the debugger can make a subsequent
successful call.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug queue snapshot operation

Allow the debugger to get a snapshot of a specified number of queues
containing various queue property information that is copied to the
debugger.

Since the debugger doesn't know how many queues exist at any given time,
allow the debugger to pass the requested number of snapshots as 0 to get
the actual number of potential snapshots to use for a subsequent snapshot
request for actual information.

To prevent future ABI breakage, pass in the requested entry_size.
The KFD will return it's own entry_size in case the debugger still wants
log the information in a core dump on sizing failure.

Also allow the debugger to clear exceptions when doing a snapshot.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug query exception info operation

Allow the debugger to query additional info based on an exception code.
For device exceptions, it's currently only memory violation information.
For process exceptions, it's currently only runtime information.
Queue exception only report the queue exception status.

The debugger has the option of clearing the target exception on query.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug query event operation

Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug set flags operation

Allow the debugger to set single memory and single ALU operations.

Some exceptions are imprecise (memory violations, address watch) in the
sense that a trap occurs only when the exception interrupt occurs and
not at the non-halting faulty instruction. Trap temporaries 0 & 1 save
the program counter address, which means that these values will not point
to the faulty instruction address but to whenever the interrupt was
raised.

Setting the Single Memory Operations flag will inject an automatic wait
on every memory operation instruction forcing imprecise memory exceptions
to become precise at the cost of performance. This setting is not
permitted on debug devices that support only a global setting of this
option.

Return the previous set flags to the debugger as well.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug set and clear address watch points operation

Shader read, write and atomic memory operations can be alerted to the
debugger as an address watch exception.

Allow the debugger to pass in a watch point to a particular memory
address per device.

Note that there exists only 4 watch points per devices to date, so have
the KFD keep track of what watch points are allocated or not.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug suspend and resume process queues operation

In order to inspect waves from the saved context at any point during a
debug session, the debugger must be able to preempt queues to trigger
context save by suspending them.

On queue suspend, the KFD will copy the context save header information
so that the debugger can correctly crawl the appropriate size of the saved
context. The debugger must then also be allowed to resume suspended queues.

A queue that is newly created cannot be suspended because queue ids are
recycled after destruction so the debugger needs to know that this has
occurred.  Query functions will be later added that will clear a given
queue of its new queue status.

A queue cannot be destroyed while it is suspended to preserve its saved
context during debugger inspection.  Have queue destruction block while
a queue is suspended and unblocked when it is resumed.  Likewise, if a
queue is about to be destroyed, it cannot be suspended.

Return the number of queues successfully suspended or resumed along with
a per queue status array where the upper bits per queue status show that
the request was invalid (new/destroyed queue suspend request, missing
queue) or an error occurred (HWS in a fatal state so it can't suspend or
resume queues).

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug wave launch mode operation

Allow the debugger to set wave behaviour on to either normally operate,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug wave launch override operation

This operation allows the debugger to override the enabled HW
exceptions on the device.

On debug devices that only support the debugging of a single process,
the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK
register.
Because they are global, only address watch exceptions are allowed to
be enabled. In other words, the debugger must preserve all non-address
watch exception states in normal mode operation by barring a full
replacement override or a non-address watch override request.

For multi-process debugging, all HW exception overrides are per-VMID so
all exceptions can be overridden or fully replaced.

In order for the debugger to know what is permissible, returned the
supported override mask back to the debugger along with the previously
enable overrides.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug set exceptions enabled operation

The debugger subscibes to nofication for requested exceptions on attach.
Allow the debugger to change its subsciption later on.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: update process interrupt handling for debug events

The debugger must be notified by any debugger subscribed exception
that comes from hardware interrupts.

If a debugger session exits, any exceptions it subscribed to may still
have interrupts in the interrupt ring buffer or KGD/KFD pipeline.
To prevent a new session from inheriting stale interrupts, when a new
queue is created, open an interrupt drain and allow the IH ring to drain
from a timestamped checkpoint.  Then inject a custom IV so that once
the custom IV is picked up by the KFD, it's safe to close the drain
and proceed with queue creation.

The drain must also be on debug disable as SW interrupts may still
be processed.  Drain at this time and clear all the exception status.

The debugger may also not be attached nor subscibed to certain
exceptions so forward them directly to the runtime.

GFX10 also requires its own IV processing, hence the creation of
kfd_int_process_v10.c.  This is because the IV from SQ interrupts are
packed into a new continguous format unlike GFX9. To make this clear,
a separate interrupting handling code file was created.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: update SMU13 header files for coming OD support

Correct the data structures for OD feature support.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug trap enabled flag to tma

Trap handler behavior will differ when a debugger is attached.

Make the debug trap flag available in the trap handler TMA.
Update it when the debug trap ioctl is invoked.

Signed-off-by: Jay Cornwall <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Jonathan Kim <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add runtime enable operation

The debugger can attach to a process prior to HSA enablement (i.e.
inferior is spawned by the debugger and attached to immediately before
target process has been enabled for HSA dispatches) or it
can attach to a running target that is already HSA enabled.  Either
way, the debugger needs to know the enablement status to know when
it can inspect queues.

For the scenario where the debugger spawns the target process,
it will have to wait for ROCr's runtime enable request from the target.
The runtime enable request will be able to see that its process has been
debug attached.  ROCr raises an EC_PROCESS_RUNTIME signal to the
debugger then blocks the target process while waiting the debugger's
response. Once the debugger has received the runtime signal, it will
unblock the target process.

For the scenario where the debugger attaches to a running target
process, ROCr will set the target process' runtime status as enabled so
that on an attach request, the debugger will be able to see this
status and will continue with debug enablement as normal.

A secondary requirement is to conditionally enable the trap tempories only
if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger
attaches with HSA runtime enabled.  This is because setting up the trap
temporaries incurs a performance overhead that is unacceptable for
microbench performance in normal mode for certain customers.

In the scenario where the debugger spawns the target process, when ROCr
detects that the debugger has attached during the runtime enable
request, it will enable the trap temporaries before it blocks the target
process while waiting for the debugger to respond.

In the scenario where the debugger attaches to a running target process,
it will enable to trap temporaries itself.

Finally, there is an additional restriction that is required to be
enforced with runtime enable and HW debug mode setting. The debugger must
first ensure that HW debug mode has been enabled before permitting HW debug
mode operations.

With single process debug devices, allowing the debugger to set debug
HW modes prior to trap activation means that debug HW mode setting can
occur before the KFD has reserved the debug VMID (0xf) from the hardware
scheduler's VMID allocation resource pool.  This can result in the
hardware scheduler assigning VMID 0xf to a non-debugged process and
having that process inherit debug HW mode settings intended for the
debugged target process instead, which is both incorrect and potentially
fatal for normal mode operation.

With multi process debug devices, allowing the debugger to set debug
HW modes prior to trap activation means that non-debugged processes
migrating to a new VMID could inherit unintended debug settings.

All debug operations that touch HW settings must require trap activation
where trap activation is triggered by both debug attach and runtime
enablement (target has KFD opened and is ready to dispatch work).

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add send exception operation

Add a debug operation that allows the debugger to send an exception
directly to runtime through a payload address.

For memory violations, normal vmfault signals will be applied to
notify runtime instead after passing in the saved exception data
when a memory violation was raised to the debugger.

For runtime exceptions, this will unblock the runtime enable
function which will be explained and implemented in a follow up
patch.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add raise exception event function

Exception events can be generated from interrupts or queue activitity.

The raise event function will save exception status of a queue, device
or process then notify the debugger of the status change by writing to
a debugger polled file descriptor that the debugger provides during
debug attach.

For memory violation exceptions, extra exception data will be saved.

The debugger will be able to query the saved exception states by query
operation that will be provided by follow up patches.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: apply trap workaround for gfx11

Due to a HW bug, waves in only half the shader arrays can enter trap.

When starting a debug session, relocate all waves to the first shader
array of each shader engine and mask off the 2nd shader array as
unavailable.

When ending a debug session, re-enable the 2nd shader array per
shader engine.

User CU masking per queue cannot be guaranteed to remain functional
if requested during debugging (e.g. user cu mask requests only 2nd shader
array as an available resource leading to zero HW resources available)
nor can runtime be alerted of any of these changes during execution.

Make user CU masking and debugging mutual exclusive with respect to
availability.

If the debugger tries to attach to a process with a user cu masked
queue, return the runtime status as enabled but busy.

If the debugger tries to attach and fails to reallocate queue waves to
the first shader array of each shader engine, return the runtime status
as enabled but with an error.

In addition, like any other mutli-process debug supported devices,
disable trap temporary setup per-process to avoid performance impact from
setup overhead.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add per process hw trap enable and disable functions

To enable HW debug mode per process, all devices must be debug enabled
successfully.  If a failure occures, rewind the enablement of debug mode
on the enabled devices.

A power management scenario that needs to be considered is HW
debug mode setting during GFXOFF.  During GFXOFF, these registers
will be unreachable so we have to transiently disable GFXOFF when
setting.  Also, some devices don't support the RLC save restore
function for these debug registers so we have to disable GFXOFF
completely during a debug session.

Cooperative launch also has debugging restriction based on HW/FW bugs.
If such bugs exists, the debugger cannot attach to a process that uses GWS
resources nor can GWS resources be requested if a process is being
debugged.

Multi-process debug devices can only enable trap temporaries based
on certain runtime scenerios, which will be explained when the
runtime enable functions are implemented in a follow up patch.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: expose debug api for mes

Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process
debug API.

The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES
from clearing the process context when the first queue is added to the
scheduler in order to maintain debug mode settings during queue preemption
and restore. The MES clears the process context in this case due to an
unresolved FW caching bug during normal mode operations.
During debug mode, the KFD will hold a reference to the target process
so the process context should never go stale and MES can afford to skip
this requirement.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: prepare map process for multi-process debug devices

Unlike single process debug devices, multi-process debug devices allow
debug mode setting per-VMID (non-device-global).

Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS API allows
the KFD to forward the required SPI debug register write requests.

To request a new debug mode setting change, the KFD must be able to
preempt all queues then remap all queues with these new setting
requests for MAP_PROCESS to take effect.

Note that by default, trap enablement in non-debug mode must be disabled
for performance reasons for multi-process debug devices due to setup
overhead in FW.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: prepare map process for single process debug devices

Older HW only supports debugging on a single process because the
SPI debug mode setting registers are device global.

The HWS has supplied a single pinned VMID (0xf) for MAP_PROCESS
for debug purposes. To pin the VMID, the KFD will remove the VMID from
the HWS dynamic VMID allocation via SET_RESOUCES so that a debugged
process will never migrate away from its pinned VMID.

The KFD is responsible for reserving and releasing this pinned VMID
accordingly whenever the debugger attaches and detaches respectively.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add configurable grace period for unmap queues

The HWS schedule allows a grace period for wave completion prior to
preemption for better performance by avoiding CWSR on waves that can
potentially complete quickly. The debugger, on the other hand, will
want to inspect wave status immediately after it actively triggers
preemption (a suspend function to be provided).

To minimize latency between preemption and debugger wave inspection, allow
immediate preemption by setting the grace period to 0.

Note that setting the preepmtion grace period to 0 will result in an
infinite grace period being set due to a CP FW bug so set it to 1 for now.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add gfx11 hw debug mode enable and disable calls

Implement the per-device calls to enable or disable HW debug mode
for GFX11.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add gfx9.4.2 hw debug mode enable and disable calls

GFX9.4.2 now supports per-VMID debug mode controls registers
(SPI_GDBG_PER_VMID_CNTL).

Because the KFD lets the HWS handle PASID-VMID mapping, the KFD will
forward all debug mode setting register writes to the HWS scheduler
using a new MAP_PROCESS API, so instead of writing to registers, return
the required register values that the HWS needs to write on debug enable
and disable.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add gfx10 hw debug mode enable and disable calls

Similar to GFX9 debug devices, set the hardware debug mode by draining
the SPI appropriately prior the mode setting request.

Because GFX10 has waves allocated by the work group boundary and each
SE's SPI instances do not communicate, the SPI drain time is much longer.
This long drain time will be fixed for GFX11 onwards.

Also remove a bunch of deprecated misplaced references for GFX10.3.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: fix kfd_suspend_all_processes

Flush delayed restore work in kfd_suspend_all_queues instead of
cancelling. Cancelling the work before it runs results in the queues
becoming permanently disabled. Flushing the work ensures that the
queue suspend/resume state stays balanced.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls

On GFX9.4.1, the implicit wait count instruction on s_barrier is
disabled by default in the driver during normal operation for
performance requirements.

There is a hardware bug in GFX9.4.1 where if the implicit wait count
instruction after an s_barrier instruction is disabled, any wave that
hits an exception may step over the s_barrier when returning from the
trap handler with the barrier logic having no ability to be
aware of this, thereby causing other waves to wait at the barrier
indefinitely resulting in a shader hang.  This bug has been corrected
for GFX9.4.2 and onward.

Since the debugger subscribes to hardware exceptions, in order to avoid
this bug, the debugger must enable implicit wait count on s_barrier
for a debug session and disable it on detach.

In order to change this setting in the in the device global SQ_CONFIG
register, the GFX pipeline must be idle.  GFX9.4.1 as a compute device
will either dispatch work through the compute ring buffers used for
image post processing or through the hardware scheduler by the KFD.

Have the KGD suspend and drain the compute ring buffer, then suspend the
hardware scheduler and block any future KFD process job requests before
changing the implicit wait count setting.  Once set, resume all work.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add gfx9 hw debug mode enable and disable calls

Implement the per-device calls to enable or disable HW debug mode for
GFX9 prior to GFX9.4.1.

GFX9.4.1 and onward will require their own enable/disable sequence as
follow on patches.

When hardware debug mode setting is requested, waves will inherit
these settings in the Shader Processor Input's (SPI) Sequencer Global
Block (SQG). This means that the KGD must drain all waves from the SPI
into SQG (approximately 96 SPI clock cycles) prior to debug mode setting
to ensure that the order of operations that the debugger expects with
regards to debug mode setting transaction requests and wave inheritence
of that mode is upheld.

Also ensure that exception overrides are reset to their original state
prior to debug enable or disable.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: clean up one inconsistent indenting

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:1036 kgd2kfd_interrupt() warn: inconsistent indenting

Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Drop unused DCN_BASE variable in dcn314_resource.c

Fixes the following W=1 kernel build warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn314/dcn314_resource.c:128:29: warning: ‘DCN_BASE’ defined but not used [-Wunused-const-variable=]
128 | static const struct IP_BASE DCN_BASE = { { { { 0x00000012, 0x000000C0, 0x000034C0, 0x00009000, 0x02403C00, 0, 0, 0 } },
| ^~~~~~~~

Suggested-by: Roman Li <[email protected]>
Cc: Hamza Mahfooz <[email protected]>
Cc: Rodrigo Siqueira <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Aurabindo Pillai <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Roman Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd: Make lack of `ACPI_FADT_LOW_POWER_S0` or `CONFIG_AMD_PMC` louder during suspend path

Users have reported that s2idle wasn't working on OEM Phoenix systems,
but it was root caused to be because `CONFIG_AMD_PMC` wasn't set in
the distribution kernel config.

To make this more apparent, raise the messaging to err instead of warn.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217497
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: setup hw debug registers on driver initialization

Add missing debug trap registers references and initialize all debug
registers on boot by clearing the hardware exception overrides and the
wave allocation ID index.

The debugger requires that TTMPs 6 & 7 save the dispatch ID to map
waves onto dispatch during compute context inspection.
In order to correctly set this up, set the special reserved CP bit by
default whenever the MQD is initailized.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add kgd hw debug mode setting interface

Introduce the require KGD debug calls that will execute hardware debug
mode setting.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: prepare per-process debug enable and disable

The ROCm debugger will attach to a process to debug by PTRACE and will
expect the KFD to prepare a process for the target PID, whether the
target PID has opened the KFD device or not.

This patch is to explicity handle this requirement. Further HW mode
setting and runtime coordination requirements will be handled in
following patches.

In the case where the target process has not opened the KFD device,
a new KFD process must be created for the target PID.
The debugger as well as the target process for this case will have not
acquired any VMs so handle process restoration to correctly account for
this.

To coordinate with HSA runtime, the debugger must be aware of the target
process' runtime enablement status and will copy the runtime status
information into the debugged KFD process for later query.

On enablement, the debugger will subscribe to a set of exceptions where
each exception events will notify the debugger through a pollable FIFO
file descriptor that the debugger provides to the KFD to manage.

Finally on process termination of either the debugger or the target,
debugging must be disabled if it has not been done so.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: display debug capabilities

Expose debug capabilities in the KFD topology node's HSA capabilities and
debug properties flags.

Ensure correct capabilities are exposed based on firmware support.

Flag definitions can be referenced in uapi/linux/kfd_sysfs.h.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: add debug and runtime enable interface

Introduce the GPU debug operations interface.

For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD GPU
instruction set, provide the necessary interface to allow the debugger
to HW debug-mode set and query exceptions per HSA queue, process or
device.

The runtime_enable interface coordinates exception handling with the
HSA runtime.

Usage is available in the kern docs at uapi/linux/kfd_ioctl.h.

Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

amd/amdkfd: drop unused KFD_IOCTL_SVM_FLAG_UNCACHED flag

Was leftover from GC 9.4.3 bring up and is currently
unused. Drop it for now.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: conditionally disable pcie lane switching for some sienna_cichlid SKUs

Disable the pcie lane switching for some sienna_cichlid SKUs since it
might not work well on some platforms.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: Fix power context allocation in SMU13

Use the right data structure for allocation.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>