Git Repo - linux.git/log

drm/xe/guc: s/guc_fini/guc_fini_hw/

Make it clear that is about cleaning up the HW/FW side, and not software
state.

Signed-off-by: Matthew Auld <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: move guc_fini over to devm

Make sure to actually call this when the device is removed. Currently we
only trigger it when the driver instance goes away, but that doesn't
work too well with hotunplug, since device can be removed and re-probed
with a new driver instance, where the guc_fini() is called too late.
Move the fini over to devm to ensure this is called when device is
removed.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1717
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/ggtt: use drm_dev_enter to mark device section

Device can be hotunplugged before we start destroying gem objects. In
such a case don't touch the GGTT entries, trigger any invalidations or
mess around with rpm.  This should already be taken care of when
removing the device, we just need to take care of dealing with the
software state, like removing the mm node.

v2: (Andrzej)
  - Avoid some duplication by tracking the bound status and checking
    that instead.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1717
Signed-off-by: Matthew Auld <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Jagmeet Randhawa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: covert sysfs over to devm

Hotunplugging the device seems to result in stuff like:

kobject_add_internal failed for tile0 with -EEXIST, don't try to
register things with the same name in the same directory.

We only remove the sysfs as part of drmm, however that is tied to the
lifetime of the driver instance and not the device underneath. Attempt
to fix by using devm for all of the remaining sysfs stuff related to the
device.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1667
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1432
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pci: remove broken driver_release

This is quite broken since we are nuking the pdev link to the private
driver struct, but note here that driver_release is called when the
drm_device is released (poor mans drmm), which can be long after the
device has been removed. So here what we are actually doing is nuking
the pdev link for what is potentially bound to a different drm_device.
If that happens before our pci remove callback is triggered (for the new
drm_device) we silently exit and skip some important cleanup steps,
resulting in hilarity.

There should be no reason to implement driver_release, when we already
have nicer stuff like drmm, so just remove completely. The actual pdev
link is already nuked when removing the device.

Signed-off-by: Matthew Auld <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/vf: Custom GuC initialization if VF

The GuC firmware is loaded and initialized by the PF driver. Make
sure VF drivers only perform permitted operations. For submission
initialization, use number of GuC context IDs from self config.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: Piotr Piórkowski <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Allow to initialize submission with limited set of IDs

While PF and native drivers may initialize submission code to use
all available GuC contexts IDs, the VF driver may only use limited
number of IDs. Update init function to accept number of context
IDs available for use.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Cleanup xe_mmio.h

We don't need <linux/delay.h> include since commit 5c09bd6ccd41
("drm/xe/mmio: Move xe_mmio_wait32() to xe_mmio.c").

We don't need <linux/io-64-nonatomic-lo-hi.h> include since commit
54c659660d63 ("drm/xe: Make xe_mmio_read|write() functions non-inline").

And since commit 924e6a9789a0 ("drm/xe/uapi: Remove MMIO ioctl")
we don't need forward declarations of drm_device and drm_file.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Don't rely on indirect includes from xe_mmio.h

These compilation units use udelay() or some GT oriented printk
functions without explicitly including proper header files, and
relying on #includes from the xe_mmio.h instead. Fix that.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/i915/display: Add missing include to intel_vga.c

This compilation unit uses udelay() function without including
it's header file. Fix that to break dependency on other code.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Jani Nikula <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_guc_pc.h

Prefer forward declaration over #include xe_guc_pc_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_huc.h

Prefer forward declaration over #include xe_huc_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_gsc.h

Prefer forward declaration over #include xe_gsc_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_uc.h

Prefer forward declaration over #include xe_uc_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/tests: Use uninterruptible VM lock

Interruptible lock can return error and needed a return value
check. This test should finish quick enough so use a uninterruptible
lock instead.

Cc: Matthew Auld <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Add warn when level can not be zero.

At xe_pt_zap_ptes_entry() and xe_pt_stage_unbind_entry, the level cannot
be 0. Therefore, add an independent check for the level. Since the level
cannot be zero at this point, there is no need to check for `is_compact`,
so remove that instead.

Cc: Matthew Auld <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe/uapi: Expose the L3 bank mask

The L3 bank mask is already generated and stored internally with
the rest of the GT topology. In user space, the compute runtime
now needs this information to be added to the device properties
therefore the topology mask query is extended to provide a new
mask which represents the L3 banks enabled on the GT.

The changes in the compute runtime are ready and approved, see
link below.

v2: Rewrite commit message and add a link to the compute
runtime PR (Francois Dugast)

Cc: Matt Roper <[email protected]>
Cc: Robert Krzemien <[email protected]>
Cc: Mateusz Jablonski <[email protected]>
Link: https://github.com/intel/compute-runtime/pull/722
Signed-off-by: Francois Dugast <[email protected]>
Acked-by: Mateusz Jablonski <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/client: Print runtime to fdinfo

Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <RUNTIME>
drm-total-cycles-<class>: <TIMESTAMP>

Userspace is expected to collect at least 2 samples, which allows to
know the client engine busyness as per:

    RUNTIME1 - RUNTIME0
busyness = ---------------------
  T1 - T0

Since drm-cycles-<class> always starts at 0, it's also possible to know
if and engine was ever used by a client.

It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace since it's just accumulated for all the
exec_queues in a 64-bit counter) but the measurement will not be
accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.

Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Acked-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Add helper to return any available hw engine

Get the first available engine from a gt, which helps in the case any
engine serves as a context, like when reading RING_TIMESTAMP.

Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Cache data about user-visible engines

gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update that when it's needed. This
will help implementing per client engine utilization so this (mostly
constant) information doesn't need to be re-calculated on every query.

Reviewed-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Add helper to accumulate exec queue runtime

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
    runtime when job is finished since xe_sched_job_completed() is not a
    notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
    redundant and may happen after xef is closed, leading to a
    use-after-free
  - Do not special case the first timestamp read: the default LRC sets
    CTX_TIMESTAMP to zero, so even the first sample should be a valid
    one.
  - Handle the parallel submission case by multiplying the runtime by
    width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Add helper to capture engine timestamp

Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.

Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe/lrc: Add helper to capture context timestamp

Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.

v2: Add kernel-doc to clarify expectation from caller

Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion

XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.

Reviewed-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Promote xe_hw_engine_class_to_str()

Move it out of the sysfs compilation unit so it can be re-used in other
places.

Reviewed-by: Nirmoy Das <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Replace RING_START_UDW by u64 RING_START

Other u64 registers are printed in a single line so RING_START
needs to follow that too.
As there is no upstream decoder tool parsing RING_START this will
not break any decoder application.

Cc: Niranjana Vishwanathapura <[email protected]>
Cc: Matt Roper <[email protected]>
Signed-off-by: José Roberto de Souza <[email protected]>
Reviewed-by: Niranjana Vishwanathapura <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs

For debug purposes we might want to view actual VF configuration
(including GGTT range, LMEM size, number of GuC contexts IDs or
doorbells) and the negotiated ABI versions (with GuC and PF).

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/vf: Custom hardware config load step if VF

The VF drivers may immediately communicate with the GuC to obtain
the hardware config since the firmware shall already be running.

With the GuC communication established, VFs can also obtain the
values of the runtime registers (fuses) from the PF driver.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/vf: Add support for VF to query its configuration

The VF driver doesn't know which GuC firmware was loaded by the PF
driver and must perform GuC ABI version handshake prior to sending
any other H2G actions to the GuC to submit workloads.

The VF driver also doesn't have access to the fuse registers and
must rely on the runtime info, which includes values of the fuse
registers, that the PF driver is exposing to the VFs.

Add functions to cover that functionality. We will use these
functions in upcoming patches.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Piotr Piórkowski <[email protected]>
Reviewed-by: Piotr Piórkowski <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add VF2GUC_QUERY_SINGLE_KLV to ABI

In upcoming patches we will add support to the VF driver to
read its configuration from the GuC using special H2G actions.
Add necessary definitions to our GuC firmware ABI header.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add VF2GUC_VF_RESET to ABI

The version negotiation between the VF driver and the GuC firmware
must start with explicit soft reset of the GuC state initiated by
the VF driver. Add VF2GUC action definitions to the ABI header.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add VF2GUC_MATCH_VERSION to ABI

In upcoming patches we will add a version negotiation between
the VF driver and the GuC firmware. Add necessary definitions
to our GuC firmware ABI header.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Expose PF monitor details via debugfs

For debug purposes we might want to view statistics maintained by
the PF driver about VFs activity.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Track adverse events notifications from GuC

When thresholds used to monitor VFs activities are configured,
then GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Start handling such messages.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add GUC2PF_ADVERSE_EVENT to ABI

When thresholds used to monitor VFs activities are configured,
then GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Add necessary definitions
to our GuC firmware ABI header.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Allow configuration of VF thresholds over debugfs

Initial values of all thresholds used by the GuC to monitor VF's
activity is zero (disabled) and we need to explicitly configure
them per each VF. Expose additional attributes over debugfs.

Definitions of all attributes are generated so we will not need
to make any changes if new thresholds would be added to the set.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Introduce functions to configure VF thresholds

The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
Add functions to allow configuration of these thresholds per VF.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add support for threshold KLVs in to_string() helper

Use MAKE_XE_GUC_KLV_THRESHOLDS_SET to generate missing conversion
of threshold KLV keys to string.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Introduce GuC KLV thresholds set

The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.

The available thresholds are defined in the GuC ABI as part of the
GuC VF Configuration KLVs. Threshold configurations performed by
the PF driver and notifications sent by the GuC rely on the KLV keys,
which are not zero-based and might not guarantee continuity.

To simplify the driver code and eliminate the need to repeat very
similar code for each threshold, introduce the threshold set macro
that allows to generate required code based on unique threshold tag.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Acked-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/guc: Add more KLV helper macros

In upcoming patches we will want to generate some of the KLV keys
from other macros. Add MAKE_GUC_KLV_{KEY|LEN} macros for that and
make sure they will correctly expand provided TAG parameter. Also
fix PREP_GUC_KLV_TAG to also work correctly within other macros.

Reviewed-by: Piotr Piórkowski <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Implement pci_driver.sriov_configure callback

The PCI subsystem already exposes the "sriov_numvfs" attribute
that users can use to enable or disable SR-IOV VFs. Add custom
implementation of the .sriov_configure callback defined by the
pci_driver to perform additional steps, including fair VFs
provisioning with the resources, as required by our platforms.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Piotr Piórkowski <[email protected]>
Cc: Badal Nilawar <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Piotr Piórkowski <[email protected]> #v2
Reviewed-by: Badal Nilawar <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/pf: Don't advertise support to enable VFs if not ready

Even if we have not enabled SR-IOV support using the platform
specific has_sriov flag, the hardware may still report SR-IOV
capability and the PCI layer may wrongly advertise driver support
to enable VFs. Explicitly reset the number of supported VFs to
zero to avoid confusion.

Applications may read the /sys/bus/pci/devices/.../sriov_totalvfs
prior to enabling VFs using the sriov_numvfs to check if such an
operation is possible.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Piotr Piórkowski <[email protected]>
Acked-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Only zap PTEs as needed

If PTEs are already invalidated no need to invalidate again.

Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/xe_guc_submit: Declare reset if banned or killed or wedged

Add an additional condition to the reset_status guc_exec_queue_op that
returns true if the exec queue has been banned or killed or wedged. The
reset_status op is only used for exiting any xe_wait_user_fence_ioctl
that waits on an exec queue without timing out, so doing this will exit
the ioctl early in cases where the exec queue can no longer function,
such as after a GuC stop during a reset.

Suggested-by: Matthew Brost <[email protected]>
Signed-off-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/xe_guc_submit: Allow lr exec queues to be banned

LR queues currently don't get banned during a GT/GuC reset because they
lack a job. Though they don't have a job to detect the reset status of,
it's still possible to tell when they should be banned by looking at the
LRC: if the LRC head and tail don't match, then the exec queue should be
banned and cleaned up.

This also requires swapping the usage of xe_sched_tdr_queue_imm with
xe_guc_exec_queue_trigger_cleanup, as the former is specific to non-lr
exec queues.

Suggested-by: Matthew Brost <[email protected]>
Signed-off-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/xe_guc_submit: Fix exec queue stop race condition

Reorder the xe_sched_tdr_queue_imm and set_exec_queue_banned calls in
guc_exec_queue_stop. This prevents a possible race condition between
the two events in which it's possible for xe_sched_tdr_queue_imm to
wake the ufence waiter before the exec queue is banned, causing the
ufence waiter to miss the banned state.

Suggested-by: Matthew Brost <[email protected]>
Signed-off-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/uc: Move GuC submission init to post hwconfig step

We shouldn't need anything from the GuC submission code until we
finish GuC initialization in post hwconfig step.

While around add diagnostic message if we fail uC init.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/uc: Reorder post hwconfig uC initialization step

We want to move the GuC submission initialization to the post
hwconfig step, but now this step is done too late as migration
initialization uses exec_queue that would crash due to a unset
exec_queue_ops. We can easily fix that by small function reorder.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Only use reserved BCS instances for usm migrate exec queue

The GuC context scheduling queue is 2 entires deep, thus it is possible
for a migration job to be stuck behind a fault if migration exec queue
shares engines with user jobs. This can deadlock as the migrate exec
queue is required to service page faults. Avoid deadlock by only using
reserved BCS instances for usm migrate exec queue.

Fixes: a043fbab7af5 ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
Cc: Matt Roper <[email protected]>
Cc: Niranjana Vishwanathapura <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Brian Welty <[email protected]>

drm/xe: Fix the warning conditions

The maximum timeout display uses in xe_pcode_request is 3 msec, add the
warning in cases the function is misused with higher timeouts.

Add a warning if pcode_try_request is not passed the timeout parameter
greater than 0.

Cc: Lucas De Marchi <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe: Change pcode timeout to 50msec while polling again

Polling is initially attempted with timeout_base_ms enabled for
preemption, and if it exceeds this timeframe, another attempt is made
without preemption, allowing an additional 50 ms before timing out.

v2
- Rebase

v3
- Move warnings to separate patch (Lucas)

Cc: Lucas De Marchi <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Himal Prasad Ghimiray <[email protected]>
Fixes: 7dc9b92dcfef ("drm/xe: Remove i915_utils dependency from xe_pcode.")
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe: Move sw-only pcode initialization

Move it to xe_gt_init_early() that initializes the sw-only part for each
gt.

Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Michał Winiarski <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Move xe_force_wake_init_gt() inside gt initialization

xe_force_wake_init_gt() is a software-only initialization and doesn't
need to be called from xe_device_probe(). Move it to initialize
together with the gt.

Reviewed-by: Michał Winiarski <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Move xe_gt_init_early() where it belongs

Early shall be early enough, stop doing other things with gt before it.
Now that xe_gt_init_early() doesn't need forcewake and doesn't depend on
the fake engine_mask initialization, move it where it belongs: it
doesn't need to be after hwconfig config anymore.

Reviewed-by: Michał Winiarski <[email protected]>
Reviewed-by: Vinay Belgaumkar <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Drop useless forcewake get/put

Forcewake used to be needed in xe_gt_init_early() since it was calling
xe_gt_topology_init(). That call was dropped in commit 4c47049d93b7
("drm/xe/guc: Fix missing topology init"), but the forcewake calls were
left behind. Remove them.

Cc: Zhanjun Dong <[email protected]>
Reviewed-by: Michał Winiarski <[email protected]>
Reviewed-by: Zhanjun Dong <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Drop __engine_mask

Not really used, it's just a copy of engine_mask, which already reads
the fuses to mark engines as available/not-available.

Reviewed-by: Michał Winiarski <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Fix xe_reg_sr.h

Prefer forward declarations over #include xe_reg_sr_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_lrc.h

Prefer forward declarations over #include xe_lrc_types.h

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_guc_ads.h

We don't need to include xe_guc_ads_types.h here.
Use forward declaration instead.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Fix xe_gt_throttle_sysfs.h

We don't need to include drm/drm_managed.h here.
We don't need to comment final #endif.
Also remove empty line at the end.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Rename few xe_args.h macros

To minimize the risk of future name collisions, rename macros to
always include the ARG or ARGS tag:

  DROP_FIRST to DROP_FIRST_ARG
  PICK_FIRST to FIRST_ARG
  PICK_LAST to LAST_ARG

Suggested-by: Andy Shevchenko <[email protected]>
Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]> #v2
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Move xe_gpu_commands.h file to instructions/

All other files with commands definitions are in instructions/
folder. Move xe_gpu_commands.h also there.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/hwmon: Remove unwanted write permission for currN_label

Change umode of currN_label from 0644 to 0444 as write permission
not needed for label.

Signed-off-by: Karthik Poosa <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe: Fix UBSAN shift-out-of-bounds failure

Here is the failure stack:
[   12.988209] ------------[ cut here ]------------
[   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G     U             6.9.0-rc6+prerelease1158+ #19
[   12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022
[   12.988239] Call Trace:
[   12.988240]  <TASK>
[   12.988242]  dump_stack_lvl+0xd7/0xf0
[   12.988248]  dump_stack+0x10/0x20
[   12.988250]  ubsan_epilogue+0x9/0x40
[   12.988253]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
[   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
[   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
[   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
[   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]

It is caused by the value 0 of parameter num_fences in function
drm_exec_prepare_obj.  And lead to in function __rounddown_pow_of_two,
"0 - 1" causes the shift-out-of-bounds.

By design drm_exec_prepare_obj() should be called only when there are
fences to be reserved. If num_fences is 0, calling drm_exec_lock_obj()
is sufficient as was done in commit 9377de4cb3e8 ("drm/xe/vm: Avoid
reserving zero fences")

Cc: Nirmoy Das <[email protected]>
Cc: Matthew Brost <[email protected]>
Signed-off-by: Shuicheng Lin <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe/xe2: Enable Indirect Ring State support for Xe2

Indirect Ring State is the recommended mode for Xe2 platforms,
enable it by default.

v2: Set has_indirect_ring_state to '1' instead of 'true'

Signed-off-by: Niranjana Vishwanathapura <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Dump Indirect Ring State registers

Dump INDIRECT_RING_STATE and RING_START_UDW registers.

v2: Add bspec reference

Bspec: 67137, 67138
Signed-off-by: Niranjana Vishwanathapura <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Add Indirect Ring State support

When Indirect Ring State is enabled, the Ring Buffer state and
Batch Buffer state are context save/restored to/from Indirect
Ring State instead of the LRC. The Indirect Ring State is a 4K
page mapped in global GTT at a 4K aligned address. This address
is programmed in the INDIRECT_RING_STATE register of the
corresponding context's LRC.

v2: Fix kernel-doc, add bspec reference
v3: Fix typo in commit text

Bspec: 67296, 67139
Signed-off-by: Niranjana Vishwanathapura <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Minor cleanup in LRC handling

Properly define register fields and remove redundant
lower_32_bits().

Signed-off-by: Niranjana Vishwanathapura <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Stuart Summers <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/xe2: Add workaround 14021402888

This workaround applies to Graphics 20.01 as RCS engine workaround

Signed-off-by: Bommu Krishnaiah <[email protected]>
Cc: Tejas Upadhyay <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe/ads: Use flexible-array

Zero-length arrays are deprecated and flexible arrays should be
used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arrays

Reported-by: kernel test robot <[email protected]>
Reported-by: Julia Lawall <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]/
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Fix xe_device.h

Some explicit includes are needed only from the xe_device.c.
And there is no need for redundant forward declarations.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Don't rely on xe_force_wake.h to be included elsewhere

While xe_force_wake.h is now included from the xe_device.h, we
want to drop that include as we don't need it there. Explicitly
include xe_force_wake.h where needed.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Don't rely on xe_assert.h to be included elsewhere

While xe_assert.h is now included and used by the xe_force_wake.h,
we want to stop include xe_force_wake.h from xe_device.h as it's
not needed there. Explicitly include xe_assert.h where needed.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: skip error capture when exec queue is killed

When user closes exec queue soon after job submission,
we are generating error coredump. Instead check if
exec queue is killed during job timeout then skip
error coredump capture.

V2:
- Just skip error capture - MattB

Signed-off-by: Tejas Upadhyay <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/vm_doc: Fix some typos

Fix some typos and add / remove / change a few words to improve
readability and prevent some ambiguities.

Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe: Fix xe_mocs.h

We don't need to include <linux/types.h>.
We don't use struct xe_exec_queue here.
We should sort forward declarations.

Signed-off-by: Michal Wajdeczko <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Use ordered WQ for G2H handler

System work queues are shared, use a dedicated work queue for G2H
processing to avoid G2H processing getting block behind system tasks.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/mocs: Add debugfs node to dump mocs

This is useful to check mocs configuration. Tests/Tools can use
this debugfs entry to get mocs info.

v2: Address review comments. Change debugfs output style similar
to pat debugfs. (Lucas De Marchi)

v3: rebase.

v4: Address review comments. Use function pointer inside ops
struct. Update Test-with links. Remove usage of flags wherever
not required. (Lucas De Marchi)

v5: Address review comments. Move register defines. Modify mocs
info struct to avoid holes. (Luca De Marchi)

Cc: Matt Roper <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Janga Rahul Kumar <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Relocate regs_are_mcr function

Relocate regs_are_mcr funciton to a higher position in the file
for improved visibility.

Cc: Matt Roper <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Janga Rahul Kumar <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe: Refactor default device atomic settings

The default behavior of device atomics depends on the
VM type and buffer allocation types. Device atomics are
expected to function with all types of allocations for
traditional applications/APIs. Additionally, in compute/SVM
API scenarios with fault mode or LR mode VMs, device atomics
must work with single-region allocations. In all other cases
device atomics should be disabled by default also on platforms
where we know device atomics doesn't on work on particular
allocations types.

v3: fault mode requires LR mode so only check for LR mode
    to determine compute API(Jose).
    Handle SMEM+LMEM BO's migration to LMEM where device
    atomics is expected to work. (Brian).
v2: Fix platform checks to correct atomics behaviour on PVC.

Acked-by: Michal Mrozek <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Add function to check if BO has single placement

A new helper function xe_bo_has_single_placement() to check
if a BO has single placement.

Reviewed-by: José Roberto de Souza <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Introduce has_device_atomics_on_smem device info

Add has_device_atomics_on_smem to specify that a device
supports device atomics on system memory. Currently XE2
supports this so set this for XE2.

v2: Set has_device_atomics_on_smem for all platform but
PVC.

Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Move vm bind bo validation to a helper function

Move vm bind bo validation to a helper function to make the
xe_vm_bind_ioctl() more readable.

v2: Capture ret value of xe_vm_bind_ioctl_validate_bo(Matt B).
Remove redundant coh_mode param.

Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Introduce has_atomic_enable_pte_bit device info

Add has_atomic_enable_pte_bit to specify that a device
has PTE_AE bit in its PTE feild. Currently XE2 and PVC
supports this so set this for those two.

This will help consolidate setting atomic access bit in PTE
logic which is spread between multiple files.

Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Nirmoy Das <[email protected]>

drm/xe: Demote CCS_MODE info to debug only

This information is printed in any gt_reset, which actually
occurs in any runtime resume, what can be so verbose in
production build. Let's demote it to debug only.

Cc: Niranjana Vishwanathapura <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe/debugfs: Get a runtime_pm reference when setting wedged mode

This function is another entry point where it must be ensured that
the device resumes before operating on the GuC, so grab a runtime_pm
reference. This fixes inner xe_pm_runtime_get_noresume calls which
were previously failing.

Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/xe/rtp: Prefer helper macros from xe_args.h

Some custom implementation can be replaced with generic macros
from the linux/args.h or xe_args.h.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/kunit: Add simple tests for new xe_args macros

We want to make sure that helper macros are working as expected.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Add helpers for manipulating macro arguments

Define generic helpers that will replace private definitions used
by the RTP code and will allow reuse by the new code.

Put them in new xe_args.h file (instead of infamous xe_macros.h)
as once we find more potential users outside of the Xe driver we
may want to move all of these macros as-is to linux/args.h.

Signed-off-by: Michal Wajdeczko <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Perform dma_map when moving system buffer objects to TT

Currently we dma_map on ttm_tt population and dma_unmap when
the pages are released in ttm_tt unpopulate.

Strictly, the dma_map is not needed until the bo is moved to the
XE_PL_TT placement, so perform the dma_mapping on such moves
instead, and remove the dma_mappig when moving to XE_PL_SYSTEM.

This is desired for the upcoming shrinker series where shrinking
of a ttm_tt might fail. That would lead to an odd construct where
we first dma_unmap, then shrink and if shrinking fails dma_map
again. If dma_mapping instead is performed on move like this,
shrinking does not need to care at all about dma mapping.

Finally, where a ttm_tt is destroyed while bound to a different
memory type than XE_PL_SYSTEM, we keep the dma_unmap in
unpopulate().

v2:
- Don't accidently unmap the dma-buf's sgtable.

Signed-off-by: Thomas Hellström <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/gt: Fix assert in L3 bank mask generation

What needs to be asserted is that the pattern fits in the number
of bits provided by the user in patternbits, otherwise it would
be truncated when replicated according to the mask, which is
likely not the intended use of this function.
The pattern argument is a bitmap so use find_last_bit() instead
of fls(). The bit position starts at index 0 so remove "or equal"
from the comparison. XE_MAX_L3_BANK_MASK_BITS would be the
returned value if the pattern is 0, which can be the case on some
platforms.

v2: Check the result does not overflow the array (Lucas De Marchi)

v3: Use __fls() for long and handle mask == 0 (Lucas De Marchi)

Cc: Matt Roper <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

Merge drm/drm-next into drm-xe-next

Avoid falling too far behind drm-next.

Signed-off-by: Thomas Hellström <[email protected]>

drm/xe/gsc: define GSCCS for LNL

LNL has 1 GSCCS, same as MTL. Note that the GSCCS will be disabled until
we have a GSC FW defined, but having it in the list of engine is a
requirement to add such definition.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Shekhar Chauhan <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe/gsc: Turn off GSCCS interrupts when disabling the engine

Starting on LNL, there is a new GSCCS interrupt that is triggered when
the GSC engine reset fails. If the HW is in a bad state, this interrupt
might end up being triggered even if we're not using the engine, which
will lead to a warning because we'll see it as unexpected. Since there
is no point in handling the interrupt in this scenario, we can just
make sure the interrupts are off when we disable the engine.

Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Matt Roper <[email protected]>
Tested-by: Matt Roper <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/xe: Remove uninitialized end var from xe_gt_tlb_invalidation_range()

This fixes commit c4f18703629d ("drm/xe: Add
xe_gt_tlb_invalidation_range and convert PT layer to use this")
which added the end variable as part of the function param.

v2: Add fixes tag(Matt)

Fixes: c4f18703629d ("drm/xe: Add xe_gt_tlb_invalidation_range and convert PT layer to use this")
Cc: Matthew Brost <[email protected]>
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

Merge tag 'amd-drm-next-6.10-2024-04-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.10-2024-04-26:

amdgpu:
- Misc code cleanups and refactors
- Support setting reset method at runtime
- Report OD status
- SMU 14.0.1 fixes
- SDMA 4.4.2 fixes
- VPE fixes
- MES fixes
- Update BO eviction priorities
- UMSCH fixes
- Reset fixes
- Freesync fixes
- GFXIP 9.4.3 fixes
- SDMA 5.2 fixes
- MES UAF fix
- RAS updates
- Devcoredump updates for dumping IP state
- DSC fixes
- JPEG fix
- Fix VRAM memory accounting
- VCN 5.0 fixes
- MES fixes
- UMC 12.0 updates
- Modify contiguous flags handling
- Initial support for mapping kernel queues via MES

amdkfd:
- Fix rescheduling of restore worker
- VRAM accounting for SVM migrations
- mGPU fix
- Enable SQ watchpoint for gfx10

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'drm-intel-gt-next-2024-04-26' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next

UAPI Changes:

- drm/i915/guc: Use context hints for GT frequency

    Allow user to provide a low latency context hint. When set, KMD
    sends a hint to GuC which results in special handling for this
    context. SLPC will ramp the GT frequency aggressively every time
    it switches to this context. The down freq threshold will also be
    lower so GuC will ramp down the GT freq for this context more slowly.
    We also disable waitboost for this context as that will interfere with
    the strategy.

    We need to enable the use of SLPC Compute strategy during init, but
    it will apply only to contexts that set this bit during context
    creation.

    Userland can check whether this feature is supported using a new param-
    I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
    enabled platforms as they use SLPC for frequency management.

    The Mesa usage model for this flag is here -
    https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint

- drm/i915/gt: Enable only one CCS for compute workload

    Enable only one CCS engine by default with all the compute sices
    allocated to it.

    While generating the list of UABI engines to be exposed to the
    user, exclude any additional CCS engines beyond the first
    instance

    ***

    NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
    default to mitigate a hardware bug. All the EUs will still remain
    usable, and all the userspace drivers have been confirmed to be able
    to dynamically detect the change in number of CCS engines and adjust.

    For the smaller percent of applications that get perf benefit from
    letting the userspace driver dispatch across all 4 CCS engines we will
    be introducing a sysfs control as a later patch to choose 4 CCS each
    with 25% EUs (or 50% if 2 CCS).

    NOTE: A regression has been reported at

    https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895

    However Andi has been triaging the issue and we're closing in a fix
    to the gap in the W/A implementation:

    https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html

Driver Changes:

- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
  Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
  Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
  partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)

- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)

. Selftest improvements (Janusz, Nirmoy, Daniele)

Signed-off-by: Dave Airlie <[email protected]>
From: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge v6.9-rc6 into drm-next

Thomas needs the defio fixes, Maíra needs the vkms fixes and Joonas
has some fun with i915-gem conflicts.

Signed-off-by: Daniel Vetter <[email protected]>

drm/xe: Merge 16021540221 and 18034896535 WAs

In order to detect duplicate implementations for the same workaround,
early in the implementation of RTP it was decided to error out even if
the values set are exactly the same. With the introduction of 18034896535
in commit 74671d23ca18 ("drm/xe/xe2: Add workaround 18034896535"), LNL
stepping with graphics stepping A1 now gives the following error on
module load:

xe 0000:00:02.0: [drm] *ERROR* GT0: [GT OTHER] \
discarding save-restore reg e48c (clear: 00000200, set: 00000200,\
masked: yes, mcr: yes): ret=-22

RTP may be improved in the future, but for now simply join the entries
like done with e.g. "1607297627, 1607030317, 1607186500".

Fixes: 74671d23ca18 ("drm/xe/xe2: Add workaround 18034896535")
Cc: Bommu Krishnaiah <[email protected]>
Cc: Tejas Upadhyay <[email protected]>
Cc: Matt Roper <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

drm/xe/xe2hpg: Add Wa_14021490052

Add Wa_14021490052 for Xe2HPG 20.01.

Signed-off-by: Shekhar Chauhan <[email protected]>
Reviewed-by: Gustavo Sousa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>