Chris Wilson [Wed, 2 Jan 2019 16:35:24 +0000 (16:35 +0000)]
drm/i915/gen6: Flush RING_IMR changes before changing the global GT IMR
On Baytail, notably, we can still detect missed interrupt syndrome
(where we never spot a completed request). In this case, it can be
alleviated by always keeping the interrupt unmasked, implying that the
interrupt is being lost in the window after modifying the IMR. (This is
the reason we still have the posting reads on enable_irq, if we remove
them we miss interrupts!) Having narrowed the issue down to the IMR,
rather than keeping it always enabled, applying the usual posting
read/flush of the RING_IMR before unmasking the GT IMR also seems to
prevent the missed interrupt. So be it.
Chris Wilson [Wed, 2 Jan 2019 11:44:31 +0000 (11:44 +0000)]
drm/i915/selftests: Take a breath during check_partial_mappings()
With kasan on a slow machine, it can take an age to check all the
partial mappings in a single iteration, so break it up with a
cond_resched) to avoid RCU stall reports.
Jani Nikula [Mon, 31 Dec 2018 14:56:45 +0000 (16:56 +0200)]
drm/i915: drop intel_device_info_dump()
The debugfs, error state and regular dmesg logging dump needs seem to be
different. Remove the generic dump function only used for the welcome
message. This may be added back later when better abstractions are
identified, but at the moment this seems to be the simplest considering
the device info rework in progress. No longer rely on device info being
a substruct of dev_priv.
Chris Wilson [Fri, 28 Dec 2018 17:16:41 +0000 (17:16 +0000)]
drm/i915: Drop unused engine->irq_seqno_barrier w/a
Now that we have eliminated the CPU-side irq_seqno_barrier by moving the
delays on the GPU before emitting the MI_USER_INTERRUPT, we can remove
the engine->irq_seqno_barrier infrastructure. Though intentionally
slowing down the GPU is nasty, so is the code we can now remove!
Chris Wilson [Fri, 28 Dec 2018 17:16:40 +0000 (17:16 +0000)]
drm/i915/ringbuffer: Move irq seqno barrier to the GPU for gen5
The irq_seqno_barrier is a tradeoff between doing work on every request
(on the GPU) and doing work after every interrupt (on the CPU). We
presume we have many more requests than interrupts! However, for
Ironlake, the workaround is a pretty hideous usleep() and so even though
it was found we need to repeat the MI_STORE_DWORD_IMM 8 times, or about
1us of GPU time, doing so is preferrable than requiring a sleep of
125-250us on the CPU where we desire to respond immediately (ideally from
within the interrupt handler)!
The additional MI_STORE_DWORD_IMM also have the side-effect of flushing
MI operations from userspace which are not caught by MI_FLUSH!
Chris Wilson [Fri, 28 Dec 2018 17:16:39 +0000 (17:16 +0000)]
drm/i915/ringbuffer: Move irq seqno barrier to the GPU for gen7
The irq_seqno_barrier is a tradeoff between doing work on every request
(on the GPU) and doing work after every interrupt (on the CPU). We
presume we have many more requests than interrupts! However, the current
w/a for Ivybridge is an implicit delay that currently fails sporadically
and consistently if we move the w/a into the irq handler itself. This
makes the CPU barrier untenable for upcoming interrupt handler changes
and so we need to replace it with a delay on the GPU before we send the
MI_USER_INTERRUPT. As it turns out that delay is 32x MI_STORE_DWORD_IMM,
or about 0.6us per request! Quite nasty, but the lesser of two evils
looking to the future.
Chris Wilson [Fri, 28 Dec 2018 17:16:38 +0000 (17:16 +0000)]
drm/i915/ringbuffer: Remove irq-seqno w/a for gen6 xcs
The MI_FLUSH_DW does appear coherent with the following
MI_USER_INTERRUPT, but only on Sandybridge. Ivybridge requires a heavier
hammer, but on Sandybridge we can stop requiring the irq_seqno barrier.
Chris Wilson [Fri, 28 Dec 2018 17:16:37 +0000 (17:16 +0000)]
drm/i915/ringbuffer: Remove irq-seqno w/a for gen6/7 rcs
Having transitioned to using PIPECONTROL to combine the flush with the
breadcrumb write using their post-sync functions, assume that this will
resolve the serialisation with the subsequent MI_USER_INTERRUPT. That is
when inspecting the breadcrumb after an interrupt we can rely on the write
being posted (i.e. the HWSP will be coherent).
Testing using gem_sync shows that the PIPECONTROL + CS stall does
serialise the command streamer sufficient that the breadcrumb lands
before the MI_USER_INTERRUPT. The same is not true for MI_FLUSH_DW.
Chris Wilson [Fri, 28 Dec 2018 17:16:36 +0000 (17:16 +0000)]
drm/i915: Remove redundant trailing request flush
Now that we perform the request flushing inline with emitting the
breadcrumb, we can remove the now redundant manual flush. And we can
also remove the infrastructure that remained only for its purpose.
v2: emit_breadcrumb_sz is in dwords, but rq->reserved_space is in bytes
Chris Wilson [Mon, 31 Dec 2018 14:35:05 +0000 (14:35 +0000)]
drm/i915: Update kerneldoc for intel_wm_need_update()
drivers/gpu/drm/i915/intel_display.c:10708: warning: Function parameter or member 'cur' not described in 'intel_wm_need_update'
drivers/gpu/drm/i915/intel_display.c:10708: warning: Function parameter or member 'new' not described in 'intel_wm_need_update'
drivers/gpu/drm/i915/intel_display.c:10708: warning: Excess function parameter 'plane' description in 'intel_wm_need_update'
drivers/gpu/drm/i915/intel_display.c:10708: warning: Excess function parameter 'state' description in 'intel_wm_need_update'
Jani Nikula [Thu, 27 Dec 2018 14:33:40 +0000 (16:33 +0200)]
drm/i915/params: set i915.enable_hangcheck permissions to 0600
i915.enable_hangcheck has been an outlier since its introduction in
commit 3e0dc6b01f53 ("drm/i915: hangcheck disable parameter") with 0644
permissions, while all the rest are either 0400 or 0600. Follow suit
with 0600.
IGT never reads the value, so there should be no impact.
Chris Wilson [Fri, 28 Dec 2018 15:31:14 +0000 (15:31 +0000)]
drm/i915/ringbuffer: Pull the render flush into breadcrumb emission
In preparation for removing the manual EMIT_FLUSH prior to emitting the
breadcrumb implement the flush inline with writing the breadcrumb for
ringbuffer emission.
With a combined flush+breadcrumb, we can use a single operation to both
flush and after the flush is complete (post-sync) write the breadcrumb.
This gives us a strongly ordered operation that should be sufficient to
serialise the write before we emit the interrupt; and therefore we may
take the opportunity to remove the irq_seqno_barrier w/a for gen6+.
Although using the PIPECONTROL to write the breadcrumb is slower than
MI_STORE_DWORD_IMM, by combining the operations into one and removing the
extra flush (next patch) it is faster
For gen2-5, we simply combine the MI_FLUSH into the breadcrumb emission,
though maybe we could find a solution here to the seqno-vs-interrupt
issue on Ironlake by mixing up the flush? The answer is no, adding an
MI_FLUSH before the interrupt is insufficient.
Chris Wilson [Fri, 28 Dec 2018 15:31:13 +0000 (15:31 +0000)]
drm/i915/execlists: Pull the render flush into breadcrumb emission
In preparation for removing the manual EMIT_FLUSH prior to emitting the
breadcrumb implement the flush inline with writing the breadcrumb for
execlists. Using one command to both flush and write the breadcrumb is
naturally a tiny bit faster than splitting it into two.
Chris Wilson [Fri, 28 Dec 2018 14:07:35 +0000 (14:07 +0000)]
drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation
The writing is on the wall for the existence of a single execution queue
along each engine, and as a consequence we will not be able to track
dependencies along the HW queue itself, i.e. we will not be able to use
HW semaphores on gen7 as they use a global set of registers (and unlike
gen8+ we can not effectively target memory to keep per-context seqno and
dependencies).
On the positive side, when we implement request reordering for gen7 we
also can not presume a simple execution queue and would also require
removing the current semaphore generation code. So this bring us another
step closer to request reordering for ringbuffer submission!
The negative side is that using interrupts to drive inter-engine
synchronisation is much slower (4us -> 15us to do a nop on each of the 3
engines on ivb). This is much better than it was at the time of introducing
the HW semaphores and equally important userspace weaned itself off
intermixing dependent BLT/RENDER operations (the prime culprit was glyph
rendering in UXA). So while we regress the microbenchmarks, it should not
impact the user.
Chris Wilson [Fri, 28 Dec 2018 14:07:34 +0000 (14:07 +0000)]
drm/i915: Restrict PSMI context load w/a to Haswell GT1
After we found a workaround for a hang on context load, Ben Widawsky
found confirmation that it was for an issue with waking from rc6 and
loading a context image.
The workaround from on high suggests that we should
in our rc6 setup for Haswell GT1, but on applying that we find instead
that the machine encounters a GT forcewake error and locks up.
As we are removing HW semaphore usage in the next patch, and the
suggested workaround is no improvement, we need to
decouple the PSMI workaround from HAS_SEMAPHORES to IS_HSW_GT1.
Young Xiao [Mon, 17 Dec 2018 12:23:03 +0000 (12:23 +0000)]
drm/i915: avoid division by zero on skl_calc_wrpll_link
If for some unexpected reason the registers all read zero it's better
to WARN and return instead of dividing by zero and completely freezing
the machine.
See commit 0e005888b833 ("drm/i915: avoid division by zero on
cnl_calc_wrpll_link") for detail.
Chris Wilson [Thu, 27 Dec 2018 12:15:49 +0000 (12:15 +0000)]
drm/i915: Remove debugfs/i915_ppgtt_info
The information presented here is not relevant to current development.
We can either use the context information, but more often we want to
inspect the active gpu state.
The ulterior motive is to eradicate dev->filelist.
Do not make it an error to call intel_edp_drrs_enable while drrs has
already been enabled, instead exit silently in this case.
This is a preparation patch for ensuring that DRRS is enabled on fastsets.
Note that the removed WARN_ON could also be triggered from userspace
through the i915_drrs_ctl debugfs entry which was added by
commit 35954e88bc50 ("drm/i915: Runtime disable for eDP DRRS")
Hans de Goede [Thu, 20 Dec 2018 13:21:18 +0000 (14:21 +0100)]
drm/i915: Add an update_pipe callback to intel_encoder and call this on fastsets (v2)
When we are doing a fastset (needs_modeset=false, update_pipe=true) we
may need to update some encoder-level things such as checking that PSR
is enabled.
This commit adds an update_pipe callback to intel_encoder and a new
intel_encoders_update_pipe helper which calls this for all encoders
connected to a crtc. The new intel_encoders_update_pipe helper is called
from intel_update_crtc when doing a fastset.
Changes in v2:
-Name the new encoder callback update_pipe instead of just update
Chris Wilson [Sat, 22 Dec 2018 03:06:23 +0000 (03:06 +0000)]
drm/i915: Unwind failure on pinning the gen7 ppgtt
If we fail to pin the ggtt vma slot for the ppgtt page tables, we need
to unwind the locals before reporting the error. Or else on subsequent
attempts to bind the page tables into the ggtt, we will already believe
that the vma has been pinned and continue on blithely. If something else
should happen to be at that location, choas ensues.
Paulo Zanoni [Wed, 14 Nov 2018 01:24:32 +0000 (17:24 -0800)]
drm/i915: don't apply Display WAs 1125 and 1126 to GLK/CNL+
BSpec does not show these WAs as applicable to GLK, and for CNL it
only shows them applicable for a super early pre-production stepping
we shouldn't be caring about anymore. Remove these so we can avoid
them on ICL too.
v2: Change how we check for gen9 display platforms (Ville).
Imre Deak [Fri, 14 Dec 2018 18:27:03 +0000 (20:27 +0200)]
drm/i915/icl: Add fallback detection method for TypeC legacy ports
Add a fallback detection method for TypeC legacy ports in case the
VBT port information used to detect normally such ports is
incorrect.
For the fallback method we use the TypeC legacy mode specific HPD
interrupt flag which should only be raised for a legacy port.
WARN if the VBT port info is incorrect.
In a case where we'd detect the port in a contradicting way both as a
legacy and also as a USB DP and/or TBT alternate port treat the port
as legacy (by also emitting a WARN from icl_update_tc_port_type).
v2:
- Repurpose the detection as a fallback method instead of using
it only for the DP legacy case. By now we should normally use VBT to
detect DP legacy ports as well.
Imre Deak [Fri, 14 Dec 2018 18:27:02 +0000 (20:27 +0200)]
drm/i915/icl: Fix HPD handling for TypeC legacy ports
Atm HPD disconnect events on TypeC ports will break things, since we'll
switch the TypeC mode (between legacy and disconnected modes as well as
among USB DP alternate, Thunderbolt alternate and disconnected modes) on
the fly from the HPD disconnect interrupt work while the port may be
still active.
Even if the port happens to be not active during the disconnect we'd
still have a problem during a subsequent modeset or AUX transfer that
could happen regardless of the port's connected state. For instance the
system resume display mode restore code and userspace could perform a
modeset on the port or userspace could start an AUX transfer even if the
port is in disconnected state.
To fix this keep TypeC legacy ports in legacy mode whenever we're not
suspended. This mode is a static configuration as opposed to the
Thunderbolt and USB DP alternate modes between which we can switch
dynamically.
We determine if a TypeC port is legacy (wired to a legacy HDMI or a
legacy DP connector) via the VBT DDI port specific USB-TypeC and
Thunderbolt flags. If both these flags are cleared then the port is
configured for legacy mode.
On such legacy ports we'll run the TypeC PHY connect sequence explicitly
during driver loading and system resume (vs. running the sequence during
HPD processing). The connect will succeed even if the display is not
connected to begin with (or disappears during the suspended state) since
for legacy ports the PORT_TX_DFLEXDPPMS / DP_PHY_MODE_STATUS_COMPLETED
flag is always set (as opposed to the USB DP alternate mode where it
gets set only when a display is connected).
Correspondingly run the TypeC PHY disconnect sequence during system
suspend and driver unloading. For the unloading case I had to split
up intel_dp_encoder_destroy() to be able to have the 1. flush any
pending encoder work, 2. disconnect TC PHY, 3. call DRM core cleanup and
kfree on the encoder object.
For now run the PHY disconnect during suspend only for TypeC legacy
ports. We will need to disconnect even in USB DP alternate mode in the
future, but atm we don't have a way to reconnect the port in this mode
during resume if the display disappears while being suspended. So for
now punt on this case.
Note that we do not disconnect the port during runtime suspend; in
legacy mode there are no shared HW resources (PHY lanes) with other HW
blocks (USB), so no need to release / reacquire these resources as with
USB DP alternate mode. The only reason to disconnect legacy ports during
system suspend is that the PORT_TX_DFLEXDPPMS /
DP_PHY_MODE_STATUS_COMPLETED flag must be rechecked and the port must be
connected again during system resume. We'll also have to turn the check
for this flag into a poll, after figuring out what's the proper timeout
value for it.
v2:
- Remove the redundant special casing of legacy mode when doing a
disconnect in icl_tc_port_connected(). It's guaranteed already that we
won't disconnect legacy ports in that function.
- Add a note about the new intel_ddi_encoder_destroy() hook.
- Reword the commit message after switching to the VBT based detection.
Imre Deak [Fri, 14 Dec 2018 18:27:01 +0000 (20:27 +0200)]
drm/i915/bios: Parse the VBT TypeC and Thunderbolt port flags
This is needed by the next patch to determine if a DDI TypeC port is
physically wired to a legacy DP or legacy HDMI connector or if the port
is wired to a USB-C/Thunderbolt connector.
Chris Wilson [Tue, 18 Dec 2018 10:27:12 +0000 (10:27 +0000)]
drm/i915: Apply missed interrupt after reset w/a to all ringbuffer gen
Having completed a test run of gem_eio across all machines in CI we also
observe the phenomenon (of lost interrupts after resetting the GPU) on
gen3 machines as well as the previously sighted gen6/gen7. Let's apply
the same HWSTAM workaround that was effective for gen6+ for all, as
although we haven't seen the same failure on gen4/5 it seems prudent to
keep the code the same.
As a consequence we can remove the extra setting of HWSTAM and apply the
register from a single site.
v2: Delazy and move the HWSTAM into its own function
v3: Mask off all HWSP writes on driver unload and engine cleanup.
v4: And what about the physical hwsp?
v5: No, engine->init_hw() is not called from driver_init_hw(), don't be
daft. Really scrub HWSTAM as early as we can in driver_init_mmio()
v6: Rename set_hwsp as it was setting the mask not the hwsp register.
v7: Ville pointed out that although vcs(bsd) was introduced for g4x/ilk,
per-engine HWSTAM was not introduced until gen6!
Clint Taylor [Mon, 17 Dec 2018 22:13:47 +0000 (14:13 -0800)]
drm/i915/icl: combo port vswing programming changes per BSPEC
In August 2018 the BSPEC changed the ICL port programming sequence to
closely resemble earlier gen programming sequence. Restrict combo phy to
HBR max rate unless eDP panel is connected to port.
Hans de Goede [Mon, 17 Dec 2018 14:19:03 +0000 (15:19 +0100)]
drm/i915: Update crtc scaler settings when update_pipe is set
When the pipe_config's update_pipe flag is set we may need to update the
panel fitting settings. On GEN9+ this means we need to update the crtc's
scaler settings.
This fixes the following WARN_ON, during i915 loading on an Asrock
B150M Pro4S/D3 board with an i5-6500 CPU / graphics:
[drm:pipe_config_err [i915]] *ERROR* mismatch in pch_pfit.enabled
(expected no, found yes)
pipe state doesn't match!
WARNING: CPU: 3 PID: 305 at drivers/gpu/drm/i915/intel_display.c:12084
With line 12084 being the I915_STATE_WARN call inside the
"if (!intel_pipe_config_compare())" block in verify_crtc_state().
On this board with 2 1920x1080 monitors connected over HDMI the GOP
initializes both monitors at 1920x1080 and despite no scaling being
necessary configures a scaler for one of them.
When booting with fastboot=1 on the initial modeset needs_modeset will
be false while update_pipe is true. Since we were not calling
skl_update_scaler_crtc() in this case we would leave the scaler enabled
causing this error.
Manasi Navare [Thu, 6 Dec 2018 00:54:07 +0000 (16:54 -0800)]
drm/i915/dsc: Add Per connector debugfs node for DSC support/enable
DSC can be supported per DP connector. This patch adds a per connector
debugfs node to expose DSC support capability by the kernel.
The same node can be used from userspace to force DSC enable.
force_dsc_en written through this debugfs node is used to force
DSC even for lower resolutions.
Credits to Ville Syrjala for suggesting the proper locks to be used
and to Lyude Paul for explaining how to use them in this context
v8:
* Add else if (ret) for drm_modeset_lock (Lyude)
v7:
* Get crtc, crtc_state from connector atomic state
and add proper locks and backoff (Ville, Chris Wilson, Lyude)
(Suggested-by: Ville Syrjala <[email protected]>)
* Use %zu for printing size_t variable (Lyude)
v6:
* Read fec_capable only for non edp (Manasi)
v5:
* Name it dsc sink support and also add
fec support in the same node (Ville)
v4:
* Add missed connector_status check (Manasi)
* Create i915_dsc_support node only for Gen >=10 (manasi)
* Access intel_dp->dsc_dpcd only if its not NULL (Manasi)
v3:
* Combine Force_dsc_en with this patch (Ville)
v2:
* Use kstrtobool_from_user to avoid explicit error checking (Lyude)
* Rebase on drm-tip (Manasi)
Oscar Mateo [Thu, 13 Dec 2018 09:15:22 +0000 (09:15 +0000)]
drm/i915/icl: Mind the SFC units when resetting VD or VEBox engines
SFC (Scaler & Format Converter) units are shared between VD and VEBoxes.
They also happen to have separate reset bits. So, whenever we want to reset
one or more of the media engines, we have to make sure the SFCs do not
change owner in the process and, if this owner happens to be one of the
engines being reset, we need to reset the SFC as well.
This happens in 4 steps:
1) Tell the engine that a software reset is going to happen. The engine
will then try to force lock the SFC (if currently locked, it will
remain so; if currently unlocked, it will ignore this and all new lock
requests).
2) Poll the ack bit to make sure the hardware has received the forced
lock from the driver. Once this bit is set, it indicates SFC status
(lock or unlock) will not change anymore (until we tell the engine it
is safe to unlock again).
3) Check the usage bit to see if the SFC has ended up being locked to
the engine we want to reset. If this is the case, we have to reset
the SFC as well.
4) Unlock all the SFCs once the reset sequence is completed.
Obviously, if we are resetting the whole GPU, we don't have to worry
about all of this.
Oscar Mateo [Thu, 13 Dec 2018 09:15:21 +0000 (09:15 +0000)]
drm/i915/icl: Record the valid VDBoxes with SFC capability
In Gen11, only even numbered "logical" VDBoxes are hooked up to an SFC
(Scaler & Format Converter) unit. We will use this information to decide
when the SFC units need to be reset.
Chris Wilson [Thu, 13 Dec 2018 09:15:20 +0000 (09:15 +0000)]
drm/i915/selftests: Verify we can perform resets from atomic context
We currently require that our per-engine reset can be called from any
context, even hardirq, and in the future wish to perform the device
reset without holding struct_mutex (which requires some lockless
shenanigans that demand the lowlevel intel_reset_gpu() be able to be
used in atomic context). Test that we meet the current requirements by
calling i915_reset_engine() from under various atomic contexts.
Chris Wilson [Thu, 13 Dec 2018 09:15:19 +0000 (09:15 +0000)]
drm/i915/selftests: Check we can recover a wedged device
After declaring a terminally wedged device, we allow ourselves to
recover on the next GPU reset (manually triggered), or resume. Check
that resetting a wedged device does work.
v2: Add rpm (taken explicitly in the subtest in case we remove the outer
wakeref) and early warning to i915_reset() for missed wakerefs
Lucas De Marchi [Wed, 12 Dec 2018 18:10:44 +0000 (10:10 -0800)]
drm/i915: merge gen checks to use range
Instead of using IS_GEN() for consecutive gen checks, let's pass the
range to IS_GEN_RANGE(). By code inspection these were the ranges deemed
necessary for spatch:
Lucas De Marchi [Wed, 12 Dec 2018 18:10:43 +0000 (10:10 -0800)]
drm/i915: replace IS_GEN<N> with IS_GEN(..., N)
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.
The following spatch was used to convert the users of these macros:
Matt Roper [Wed, 12 Dec 2018 19:17:20 +0000 (11:17 -0800)]
drm/i915: Don't forget to reset blocks when testing lower wm levels
During DDB allocation, we try to distribute enough blocks for each plane
to hit the highest watermark level; if that fails, we retry each lower
level (which should require fewer blocks) until we find one that's
possible (or until the whole commit is rejected as impossible). We need
to reset our running block count when trying each lower level, otherwise
all lower levels will fail as well.
Matt Roper [Tue, 11 Dec 2018 17:31:07 +0000 (09:31 -0800)]
drm/i915: Switch to level-based DDB allocation algorithm (v5)
The DDB allocation algorithm currently used by the driver grants each
plane a very small minimum allocation of DDB blocks and then divies up
all of the remaining blocks based on the percentage of the total data
rate that the plane makes up. It turns out that this proportional
allocation approach is overly-generous with the larger planes and can
leave very small planes wthout a big enough allocation to even hit their
level 0 watermark requirements (especially on APL, which has a smaller
DDB in general than other gen9 platforms). Or there can be situations
where the smallest planes hit a lower watermark level than they should
have been able to hit with a more equitable division of DDB blocks, thus
limiting the overall system sleep state that can be achieved.
The bspec now describes an alternate algorithm that can be used to
overcome these types of issues. With the new algorithm, we calculate
all plane watermark values for all wm levels first, then go back and
partition a pipe's DDB space second. The DDB allocation will calculate
what the highest watermark level that can be achieved on *all* active
planes, and then grant the blocks necessary to hit that level to each
plane. Any remaining blocks are then divided up proportionally
according to data rate, similar to the old algorithm.
There was a previous attempt to implement this algorithm a couple years
ago in bb9d85f6e9d ("drm/i915/skl: New ddb allocation algorithm"), but
some regressions were reported, the patch was reverted, and nobody
ever got around to figuring out exactly where the bug was in that
version. Our watermark code has evolved significantly in the meantime,
but we're still getting bug reports caused by the unfair proportional
algorithm, so let's give this another shot.
v2:
- Make sure cursor allocation stays constant and fixed at the end of
the pipe allocation.
- Fix some watermark level iterators that weren't handling the max
level.
v3:
- Ensure we don't leave any DDB blocks unused by using DIV_ROUND_UP+min
to calculate the extra blocks for each plane. (Ville)
- Replace a while() loop with a for() loop to be more consistent with
surrounding code. (Ville)
- Clean unattainable watermark levels with memset rather than directly
clearing the member fields. Also do the same for the transition
watermark values if they can't be achieved. (Ville)
- Drop min_disp_buf_needed calculations in skl_compute_plane_wm() since
the results are no longer needed or used. (Ville)
- Drop skl_latency[0] != 0 sanity check; both watermark methods already
account for an invalid 0 latency by returning FP_16_16_MAX. (Ville)
v4:
- Break DDB allocation loop when total_data_rate=0 rather than
alloc_size=0. If total_data_rate has dropped to 0, all remaining
planes are disabled, which isn't true for alloc_size (we might just
have not had any remaining blocks to hand out). Plus
total_data_rate=0 is the case we need to avoid to a prevent a
div-by-0. (Ville)
- s/DIV_ROUND_UP/DIV64_U64_ROUND_UP/ to prevent 32-bit breakage (Ville)
v5:
- Don't forget to move 'start' pointer forward for UV surface when
setting plane DDB boundaries. (Ville)
Matt Roper [Tue, 11 Dec 2018 17:31:06 +0000 (09:31 -0800)]
drm/i915: Don't use DDB allocation when choosing gen9 watermark method
The bspec gives an if/else chain for choosing whether to use "method 1"
or "method 2" for calculating the watermark "Selected Result Blocks"
value for a plane. One of the branches of the if chain is:
"Else If ('plane buffer allocation' is known and (plane buffer
allocation / plane blocks per line) >=1)"
Since our driver currently calculates DDB allocations first and the
actual watermark values second, the plane buffer allocation is known at
this point in our code and we include this test in our driver's logic.
However we plan to soon move to a "watermarks first, ddb allocation
second" sequence where we won't know the DDB allocation at this point.
Let's drop this arm of the if/else statement (effectively considering
the DDB allocation unknown) as an independent patch so that any
regressions can be more accurately bisected to either the different
watermark value (in this patch) or the new DDB allocation (in the next
patch).
Matt Roper [Mon, 10 Dec 2018 21:54:15 +0000 (13:54 -0800)]
drm/i915: Use intel_ types more consistently for color management code (v2)
Try to be more consistent about intel_* types rather than drm_* types
for lower-level driver functions. While we're at it, let's also be more
consistent with state variable naming (half of the platforms use the
name 'state' whereas the other half used 'crtc_state').
While we're touching these variables, let's also be more consistent
about always naming the intel_crtc_state's "crtc_state" rather than
"state" so that different platform types aren't using different naming
conventions.
v2:
- s/state/crtc_state/ for consistency between platform types (Ville)
- Drop the crtc parameter to intel_color_check(); we can just pull that
out of the state object.
Ville Syrjälä [Tue, 13 Nov 2018 17:23:30 +0000 (19:23 +0200)]
drm/i915: Remove dead update_wm_pre assignment from SKL wm code
SKL+ do not use crtc_state->update_wm_pre, so there is absolutely no
point it setting it. crtc_state->update_wm_pre only exists as a
temporary hack for pre-g4x platforms until we redo their
watermarks to be be atomic.
Ville Syrjälä [Tue, 13 Nov 2018 17:23:28 +0000 (19:23 +0200)]
drm/i915: Use explicit old crtc state in skl_compute_wm()
skl_compute_wm() wants to compare the old and new watermarks. Currently
it gets at the old watermarks via crtc->state, which is confusing since
it can point at either the old or the new state depending on where
in the sequence we are. In this case it is correct since we have not yet
swapped the states, but let's make it super clear what this is doing
by using the explicit old state.
Chris Wilson [Fri, 7 Dec 2018 13:40:37 +0000 (13:40 +0000)]
drm/i915: Flush GPU relocs harder for gen3
Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two but three extra store-dw. Doing so
for each GPU relocation was a little unsightly and since we need to
worry about userspace hitting the same issues, we should apply the dummy
store-dw into the EMIT_FLUSH.
Chris Wilson [Fri, 7 Dec 2018 11:05:54 +0000 (11:05 +0000)]
drm/i915: Skip the ERR_PTR error state
Although commit fb6f0b64e455 ("drm/i915: Prevent machine hang from
Broxton's vtd w/a and error capture") applied cleanly after a 24 month
hiatus, the code had moved on with new methods for peeking and fetching
the captured gpu info. Make sure we catch all uses of the stashed error
state and avoid dereferencing the error pointer.
v2: Move error pointer determination into i915_gpu_capture_state
v3: Restore early check to avoid capturing and then throwing away
subsequent GPU error states.
Chris Wilson [Fri, 7 Dec 2018 09:02:13 +0000 (09:02 +0000)]
drm/i915: Pipeline PDP updates for Braswell
Currently we face a severe problem on Braswell that manifests as invalid
ppGTT accesses. The code tries to maintain the PDP (page directory
pointers) inside the context in two ways, direct write into the context
and a pipelined LRI update. The direct write into the context is
fundamentally racy as it is unserialised with any access (read or write)
the GPU is doing. By asserting that Braswell is not used with vGPU
(currently an unsupported platform) we can eliminate the dangerous
direct write into the context image and solely use the pipelined update.
However, the LRI of the PDP fouls up the GPU, causing it to freeze and
take out the machine with "forcewake ack timeouts". This seems possible
to workaround by preventing the GPU from sleeping (via means of
disabling the power-state management interface, i.e. forcing each ring
to remain awake) around the update. Equally, it seems an EMIT_INVALIDATE
before the LRI is sufficient to prevent the forcewake errors.
Mika Kuoppala [Wed, 5 Dec 2018 13:46:12 +0000 (15:46 +0200)]
drm/i915/icl: Forcibly evict stale csb entries
Gen11 fails to deliver wrt global observation point on
tail/entry updates and we sometimes see old entry.
Use clflush to forcibly evict our possibly stale copy
of the cacheline in hopes that we get fresh one from gpu.
Obviously there is something amiss in the coherency protocol so
this can be consired as a workaround until real cause
is found.
The working hardware will do the evict without our cue anyways,
so the cost in there should be ameliorated by that fact.
v2: for next pass, s/flush/evict, add reset (Chris)
Chris Wilson [Thu, 6 Dec 2018 08:44:31 +0000 (08:44 +0000)]
drm/i915/execlists: Apply a full mb before execution for Braswell
Braswell is really picky about having our writes posted to memory before
we execute or else the GPU may see stale values. A wmb() is insufficient
as it only ensures the writes are visible to other cores, we need a full
mb() to ensure the writes are in memory and visible to the GPU.
The most frequent failure in flushing before execution is that we see
stale PTE values and execute the wrong pages.
Chris Wilson [Thu, 6 Dec 2018 18:07:13 +0000 (18:07 +0000)]
drm/i915/execlists: Move RCS mmio workaround to new common wa_list
We can move the remaining RCS workarounds applied to only gen8 to the
engine->wa_list, and then reduce all engine->init_hw callbacks to common
code. The benefit of using the new wa_list is that we verify that the
registers are indeed restored and keep their magic values.
v2: INSTPM_FORCE_ORDERING is already part of gen8_ctx_workarounds, and
as confirmed by the mmio verification is a part of the context image!
v3: MI_MODE is already part of gen8_ctx_workarounds...
The mmio readback for verify_gt_engine_wa() also needs a runtime-pm
wakeref, so effectively do the entirety of both engine workarounds
tests. As such simplify the rpm behaviour here by acquiring the wakeref
for the whole of each subtest. It would be still useful to later verify
the registers retain their magic values across rpm suspend/resume.
Ramalingam C [Wed, 5 Dec 2018 11:44:43 +0000 (17:14 +0530)]
drm/i915: Increase timeout for Encrypt status change
At enable/disable of the HDCP encryption, for encryption status change
we need minimum one frame duration. And we might program this bit any
point(start/End) in the previous frame.
With 20mSec, observed the timeout for change in encryption status.
Since this is not time critical operation and we need to hold on
until the status is changed, fixing the timeout to 50mSec. (Based on
trial and error method!)
Ramalingam C [Wed, 5 Dec 2018 11:44:40 +0000 (17:14 +0530)]
drm/i915: Fix GEN9 HDCP1.4 key load process
HDCP1.4 key load process varies between Intel platform to platform.
For Gen9 platforms except BXT and GLK, HDCP1.4 key is loaded using
the GT Driver Mailbox interface. So all GEN9_BC platforms will use
the GT Driver Mailbox interface for HDCP1.4 key load.
v2:
Using the IS_GEN9_BC for filtering the platforms [Ville]
According to DP spec (2.9.3.1 of DP 1.4) if
EXTENDED_RECEIVER_CAPABILITY_FIELD_PRESENT is set the addresses in DPCD
02200h through 0220Fh shall contain the DPRX's true capability. These
values will match 00000h through 0000Fh, except for DPCD_REV,
MAX_LINK_RATE, DOWN_STREAM_PORT_PRESENT.
Read from DPCD once for all 3 values as this is an expensive operation.
Spec mentions that all of address space 02200h through 0220Fh should
contain the right information however currently only 3 values can
differ.
There is no address space in the intel_dp->dpcd struct for addresses
02200h through 0220Fh, and since so much of the data is a identical,
simply overwrite the values stored in 00000h through 0000Fh with the
values that can be overwritten from addresses 02200h through 0220Fh.
This patch helps with backward compatibility for devices pre DP1.3.
v2: read only dpcd values which can be affected, remove incorrect check,
split into drm include changes into separate patch, commit message,
verbose debugging statements during overwrite.
v3: white space fixes
v4: make path dependent on DPCD revision > 1.2
v5: split into function, removed DPCD rev check
v6: add debugging prints for early exit conditions
v7 (From Manasi):
* Memcpy, memcmp and debig logging based on sizeof(dpcd_ext) (Jani N)
* Exit early (Jani N)
v8 (From Manasi):
* Get rid of superfluous debug prints (Jani N)
* Print entire base DPCD before memcpy (Jani N)
v9 (From Manasi):
* Add uniform newlines (Rodrigo)
drm/i915/psr: Check if source supports sink specific SU granularity
According to eDP spec, sink can required specific selective update
granularity that source must comply.
Here caching the value if required and checking if source supports
it.
v3:
- Returning the default granularity in case DPCD read fails(Dhinakaran)
- Changed DPCD error message level(Dhinakaran)
v4:
- Setting granularity to defaul when granularity read is equal to
0(Dhinakaran)
drm/i915/psr: Check if resolution is supported by default SU granularity
Selective updates have a default granularity requirements as stated
by eDP spec(PSR2 SELECTIVE UPDATE X GRANULARITY CAPABILITY register
definition), so check if HW can match those requirements before
enabling PSR2.
v3:
- Changes in the comments and commit message(Dhinakaran)
- Printing the hdisplay that do not match with default granularity
drm/i915: Remove old PSR2 FIXME about frontbuffer tracking
Our frontbuffer tracking improved over the years + the WA #0884
helped us keep PSR2 enabled while triggering screen updates when
necessary so this FIXME is not valid anymore.
drm/i915/icl: Do not change reserved registers related to PSR2
For ICL the bit 12 of CHICKEN_TRANS is reserved so we should not
touch it and as by default VSC_DATA_SEL_SOFTWARE_CONTROL is already
unset in gen10 + GLK we can just drop it and fix for both gens.
drm/i915/psr: Enable sink to trigger a interruption on PSR2 CRC mismatch
eDP spec states 2 different bits to enable sink to trigger a
interruption when there is a CRC mismatch.
DP_PSR_CRC_VERIFICATION is for PSR only and
DP_PSR_IRQ_HPD_WITH_CRC_ERRORS is for PSR2 only.
drm/i915/psr: Set PSR CRC verification bit in sink inside PSR1 block
As we have a else block for the 'if (dev_priv->psr.psr2_enabled) {'
and this bit is only set for PSR1 move it to that block to make it
more easy to read.
Chris Wilson [Tue, 4 Dec 2018 14:15:18 +0000 (14:15 +0000)]
drm/i915/selftests: Reorder request allocation vs vma pinning
Impose a restraint that we have all vma pinned for a request prior to
its allocation. This is to simplify request construction, and should
facilitate unravelling the lock interdependencies later.
Jani Nikula [Tue, 4 Dec 2018 10:19:26 +0000 (12:19 +0200)]
drm/i915/icl: fix transcoder state readout
Commit 2ca711caeca2 ("drm/i915/icl: Consider DSI for getting transcoder
state") clobbers the previously read TRANS_DDI_FUNC_CTL_EDP register
contents with TRANS_DDI_FUNC_CTL_DSI0 contents. Fix the state readout,
and handle DSI 1 while at it.
Use a bitmask for iterating and logging transcoders, because the allowed
combinations are a bit funky.
Chris Wilson [Tue, 4 Dec 2018 14:15:16 +0000 (14:15 +0000)]
drm/i915: Allocate a common scratch page
Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.
For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.
v2: Check that we allocated a large enough scratch for I830 w/a
Tvrtko Ursulin [Mon, 3 Dec 2018 12:50:14 +0000 (12:50 +0000)]
drm/i915: Trim unused workaround list entries
The new workaround list allocator grows the list in chunks so will end up
with some unused space. Trim it when the initialization phase is done to
free up a tiny bit of slab.
Tvrtko Ursulin [Mon, 3 Dec 2018 13:33:57 +0000 (13:33 +0000)]
drm/i915: Fuse per-context workaround handling with the common framework
Convert the per context workaround handling code to run against the newly
introduced common workaround framework and fuse the two to use the
existing smarter list add helper, the one which does the sorted insert and
merges registers where possible.
This completes migration of all four classes of workarounds onto the
common framework.
Existing macros are kept untouched for smaller code churn.
v2:
* Rename to list name ctx_wa_list and move from dev_priv to engine.
v3:
* API rename and parameters tweaking. (Chris Wilson)