Chris Wilson [Sun, 22 Dec 2019 23:35:58 +0000 (23:35 +0000)]
drm/i915: Mark the GEM context link as RCU protected
The only protection for intel_context.gem_cotext is granted by RCU, so
annotate it as a rcu protected pointer and carefully dereference it in
the few occasions we need to use it.
Chris Wilson [Sun, 22 Dec 2019 21:02:55 +0000 (21:02 +0000)]
drm/i915: Introduce a vma.kref
Start introducing a kref on i915_vma in order to protect the vma unbind
(i915_gem_object_unbind) from a parallel destruction (i915_vma_parked).
Later, we will use the refcount to manage all access and turn i915_vma
into a first class container.
drm/i915: Skip rotated offset adjustment for unsupported modifiers
During framebuffer creation, we pre-compute offsets for 90/270 plane
rotation. However, only Y and Yf modifiers support 90/270 rotation. So,
skip the calculations for other modifiers.
To keep the gem buffer size check still working for tiled planes, factor
out the logic needed for rotation setup and skip only this part for
tiled planes other than Y/Yf.
v2: Add a bounds check WARN for the rotation info array.
v3: Keep the gem buffer size check working for tiled planes.
Imre Deak [Sat, 21 Dec 2019 12:05:40 +0000 (14:05 +0200)]
drm/i915/tgl: Make sure FBs have a correct CCS plane stride
The CCS plane stride must be fixed on TGL, as it's not configurable for
the display. Instead the HW has a hardwired logic to determine it from
the main plane stride. Make sure userspace passes in the correct stride.
Gen-12 display decompression operates on Y-tiled compressed main surface.
The CCS is linear and has 4 bits of metadata for each main surface cache
line pair, a size ratio of 1:256. Gen-12 display decompression is
incompatible with buffers compressed by earlier GPUs, so make use of a new
modifier to identify gen-12 compression. Another notable change is that
render decompression is supported on all planes except cursor and on all
pipes. Start by adding render decompression support for [A,X]BGR888 pixel
formats.
v2: Fix checkpatch warnings (Lucas)
v3:
Rebase, disable color clear, styling changes and modify
intel_tile_width_bytes and intel_tile_height to handle linear CCS
v4:
- Use format block descriptors and the i915 specific func to get the
subsampling for each color plane.
- Use helpers to convert between CCS and main planes.
v5:
- Fix subsampling returned by intel_fb_plane_get_subsampling() for
the CCS plane of the first plane.
v6:
- Rebased on v2 of patch 4.
v7:
- Fix plane dimensions during FB check.
Imre Deak [Sat, 21 Dec 2019 12:05:37 +0000 (14:05 +0200)]
drm/i915: Add helpers to select correct ccs/aux planes
Using helpers instead of open coding this to select a CCS plane for a
main plane makes the code cleaner and less error-prone when the location
of CCS plane can be different based on the format (packed vs. YUV
semiplanar). The same applies to selecting an AUX plane which can be a
UV plane (for an uncompressed YUV semiplanar format), or a CCS plane.
Chris Wilson [Sun, 22 Dec 2019 12:07:52 +0000 (12:07 +0000)]
drm/i915/gt: Pull GT initialisation under intel_gt_init()
Begin pulling the GT setup underneath a single GT umbrella; let intel_gt
take ownership of its engines! As hinted, the complication is the
lifetime of the probed engine versus the active lifetime of the GT
backends. We need to detect the engine layout early and keep it until
the end so that we can sanitize state on takeover and release.
Chris Wilson [Sat, 21 Dec 2019 20:01:09 +0000 (20:01 +0000)]
drm/i915: Move i915_gem_init_contexts() earlier
As the GEM global context setup is now independent of the GT state
(although GT does currently still depend upon the global
i915->kernel_context), we can move its init earlier, leaving the gt init
ready to be extracted.
Chris Wilson [Sat, 21 Dec 2019 18:02:04 +0000 (18:02 +0000)]
drm/i915/gt: Repeat wait_for_idle for retirement workers
Since we may retire timelines from secondary workers,
intel_gt_retire_requests() is not always a reliable indicator that all
pending retirements are complete. If we do detect secondary workers are
in progress, recommend intel_gt_wait_for_idle() to repeat the retirement
check.
Chris Wilson [Sat, 21 Dec 2019 16:03:24 +0000 (16:03 +0000)]
drm/i915: Remove i915->kernel_context
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).
Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.
GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.
Ville Syrjälä [Thu, 19 Dec 2019 11:14:30 +0000 (13:14 +0200)]
drm/i915: Introduce intel_crtc_state_alloc()
We have several places where we want to allocate a pristine
crtc state. Some of those currently call intel_crtc_state_reset()
to properly initialize all the non-zero defaults in the state, but
some places do not. Let's add intel_crtc_state_alloc() to do both
the alloc and the reset, and call that everywhere we need a fresh
crtc state.
v2: s/kzalloc/kmalloc/ since we memset() anyway (José)
Yannick Fertré [Wed, 27 Nov 2019 10:23:38 +0000 (11:23 +0100)]
drm/stm: ltdc: move pinctrl to encoder mode set
The pin control must be set to default as soon as possible to
establish a good video link between tv & bridge hdmi
(encoder mode set is call before encoder enable).
Chris Wilson [Fri, 20 Dec 2019 10:12:30 +0000 (10:12 +0000)]
drm/i915: Push the use-semaphore marker onto the intel_context
Instead of rummaging through the intel_context to peek at the GEM
context in the middle of request submission to decide whether to use
semaphores, store that information on the intel_context itself.
Chris Wilson [Fri, 20 Dec 2019 10:12:29 +0000 (10:12 +0000)]
drm/i915: Drop GEM context as a direct link from i915_request
Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.
Chris Wilson [Thu, 19 Dec 2019 23:29:32 +0000 (23:29 +0000)]
drm/i915/gt: Teach veng to defer the context allocation
Since we added the context_alloc callback to intel_context_ops, we can
safely install a custom hook for the deferred virtual context allocation.
This means that all new contexts behave the same upon creation,
simplifying later code.
Chris Wilson [Thu, 19 Dec 2019 22:13:44 +0000 (22:13 +0000)]
drm/i915/gt: Add breadcrumb retire to physical engine
Avoid adding the retire workers to the virtual engine so that we don't
end up in the unenviable situation of trying to free the virtual engine
while its worker remains active.
Ville Syrjälä [Fri, 13 Dec 2019 13:34:53 +0000 (15:34 +0200)]
drm/i915: Rename pipe update tracepoints
All the other display related tracepoints use intel_ instead
if i915_ as the prefix. Do the same for the pipe update
tracepoints so I don't always have to spend time looking for
them.
Ville Syrjälä [Fri, 13 Dec 2019 13:34:49 +0000 (15:34 +0200)]
drm/i915/fbc: Remove second redundant intel_fbc_pre_update() call
I fumbled the conflict resolution a bit when applying the
fbc vblank wait w/a. Because of that we now call intel_fbc_pre_update()
twice. Remove the second redundant call.
Colin Ian King [Thu, 19 Dec 2019 19:09:16 +0000 (19:09 +0000)]
drm/i915: fix uninitialized pointer reads on pointers to and from
Currently pointers to and from are not initialized and may contain
garbage values. This will cause uninitialized pointer reads in the
call to intel_frontbuffer_track and later checks to see if to and from
are null. Fix this by ensuring to and from are initialized to NULL.
Chris Wilson [Wed, 18 Dec 2019 21:05:45 +0000 (21:05 +0000)]
drm/i915/gt: Suppress threshold updates on RPS parking
When we park RPS, we set the GPU to run at minimum 'idle' frequency.
However, as the GPU is idle, we also disable the worker and RPS
interrupts - changing the RPS thresholds has no effect, it just incurs
extra changes to restore them when we unpark. So on parking, leave the
thresholds set to the current power level and so we expect them to be
valid for our restart.
Chris Wilson [Wed, 18 Dec 2019 21:05:44 +0000 (21:05 +0000)]
drm/i915/gt: Use non-forcewake writes for RPS
Use non-forcewaked writes to queue RPS register changes that will take
effect when the write buffer is flushed, rather than wake the mmio
device for immediate effect. This is so that we can avoid a slow
forcewake dance upon unparking, and at our irregular updates.
Chris Wilson [Thu, 19 Dec 2019 12:43:53 +0000 (12:43 +0000)]
drm/i915/gt: Track engine round-trip times
Knowing the round trip time of an engine is useful for tracking the
health of the system as well as providing a metric for the baseline
responsiveness of the engine. We can use the latter metric for
automatically tuning our waits in selftests and when idling so we don't
confuse a slower system with a dead one.
Upon idling the engine, we send one last pulse to switch the context
away from precious user state to the volatile kernel context. We know
the engine is idle at this point, and the pulse is non-preemptible, so
this provides us with a good measurement of the round trip time. It also
provides us with faster engine parking for ringbuffer submission, which
is a welcome bonus (e.g. softer-rc6).
Chris Wilson [Thu, 19 Dec 2019 12:43:52 +0000 (12:43 +0000)]
drm/i915/gt: Schedule request retirement when signaler idles
Very similar to commit 4f88f8747fa4 ("drm/i915/gt: Schedule request
retirement when timeline idles"), but this time instead of coupling into
the execlists CS event interrupt, we couple into the breadcrumb
interrupt and queue a timeline's retirement when the last signaler is
completed. This should allow us to more rapidly park ringbuffer
submission, and so help reduce power consumption on older systems.
v2: Fixup intel_engine_add_retire() to handle concurrent callers
Jani Nikula [Thu, 12 Dec 2019 13:47:28 +0000 (15:47 +0200)]
drm/i915/dsc: fix DSC power domains for DSI
Fix several issues with DSC power domains that did not take DSI
transcoders into account:
- On TGL+ we need to use PW2 for DSC on pipe A, not transcoder A. There
is no longer an eDP transcoder, but there are two DSI transcoders
which may be connected to pipe A.
- On TGL+ we need to use the pipe, not transcoder, power domains for DSC
on pipes other than A. Again, there are DSI transcoders.
- On ICL we need to use PW2 for DSC also for DSI transcoders, not just
for the eDP transcoder.
Using is_pipe_dsc() also adds the warning about ICL pipe A DSC, which
does not exist.
Jani Nikula [Wed, 11 Dec 2019 16:23:47 +0000 (18:23 +0200)]
drm/i915/dsc: clarify DSC support for pipe A on ICL
The check for cpu_transcoder != TRANSCODER_A is more magic than
necessary, and potentially misleading. Before TGL, DSC is supported on
pipe A if, and only if, it's used with eDP or DSI transcoders. No
functional changes.
Chris Wilson [Wed, 18 Dec 2019 09:40:57 +0000 (09:40 +0000)]
drm/i915: Ratelimit i915_globals_park
When doing our global park, we like to be a good citizen and shrink our
slab caches (of which we have quite a few now), but each
kmem_cache_shrink() incurs a stop_machine() and so ends up being quite
expensive, causing machine-wide stalls. While ideally we would like to
throw away unused pages in our slab caches whenever it appears that we
are idling, doing so will require a much cheaper mechanism. In the
meantime use a delayed worked to impose a rate-limit that means we have
to have been idle for more than 2 seconds before we start shrinking.
Chris Wilson [Tue, 17 Dec 2019 09:56:41 +0000 (09:56 +0000)]
drm/i915/gt: Remove direct invocation of breadcrumb signaling
Only signal the breadcrumbs from inside the irq_work, simplifying our
interface and calling conventions. The micro-optimisation here is that
by always using the irq_work interface, we know we are always inside an
irq-off critical section for the breadcrumb signaling and can ellide
save/restore of the irq flags.
First is not assuming rc6 value zero means GT is asleep since that can
also mean GPU is fully busy and we do not want to enter the estimation
path in that case.
Second is ensuring monotonicity on the estimation path itself. I suspect
what is happening is with extremely rapid park/unpark cycles we get no
updates on the real rc6 and therefore have to careful not to
unconditionally trust use last known real rc6 when creating a new
estimation.
v2:
* Simplify logic by not tracking the estimate but last reported value.
Ville Syrjälä [Fri, 13 Dec 2019 19:52:17 +0000 (21:52 +0200)]
drm/i915: Move stuff from haswell_crtc_disable() into encoder .post_disable()
Move all of haswell_crtc_disable() into the encoder
.post_disable() hooks. Now we're left with just
calling the .disable() and .post_disable() hooks
back to back.
I chose to move the code into the .post_disable() hook instead
of the .disable() hook as most of the sequence is currently
implemented in the .post_disable() hook.
We should collapse it all down to just one hook and then the
encoders can drive the modeset sequence fully. But that may
need some further refactoring as we currently call the
ddi .post_disable() hook from mst code and we can't just
replace that with a call to the ddi .disable() hook.
Should also follow up with similar treatment for the enable
sequence but let's start here where it's easier.
Ville Syrjälä [Fri, 13 Dec 2019 19:52:16 +0000 (21:52 +0200)]
drm/i915: Pass old crtc state to intel_crtc_vblank_off()
To make life easier in the future let's pass the old crtc state
to intel_crtc_vblank_off() just like we already do for its
counterpart intel_crtc_vblank_on().
Ville Syrjälä [Fri, 13 Dec 2019 19:52:15 +0000 (21:52 +0200)]
drm/i915: Pass old crtc state to skylake_scaler_disable()
To make life easier in the future let's pass the old crtc state
to skylake_scaler_disable() just like we already do for
for its ancestor ironlake_pfit_disable().
Ville Syrjälä [Fri, 13 Dec 2019 19:52:14 +0000 (21:52 +0200)]
drm/i915: Nuke .post_pll_disable() for DDI platforms
HSW+ platforms call encoder .post_disable() and .post_pll_disable()
back to back. And since we don't even disable the PLL in between
let's just move everything into .post_disable().
intel_dp_mst does forward the .post_disable() call to intel_ddi at
the very end of its own .post_disable() hook, so this time MST
I shouldn't even break MST by accident.
Ville Syrjälä [Fri, 13 Dec 2019 19:52:13 +0000 (21:52 +0200)]
drm/i915: Call hsw_fdi_link_train() directly()
Remove the pointless vfunc detour for hsw_fdi_link_train()
and just call it directly. Also pass the encoder in so we
can nuke the silly encoder loop within.
Ville Syrjälä [Thu, 7 Nov 2019 14:24:17 +0000 (16:24 +0200)]
drm/i915: Introduce intel_plane_state_reset()
For the sake of symmetry with the crtc stuff let's add
a helper to reset the plane state to sane default values.
For the moment this only gets caller from the plane init.
Ville Syrjälä [Thu, 7 Nov 2019 14:24:16 +0000 (16:24 +0200)]
drm/i915: Introduce intel_crtc_state_reset()
We have a few places where we want to reset a crtc state to its
default values. Let's add a helper for that. We'll need the new
__drm_atomic_helper_crtc_state_reset() helper for this to allow
us to just reset the state itself without clobbering the
crtc->state pointer.
And while at it let's zero out the whole thing, except a few
choice member which we'll mark as "invalid". And thanks to this
we can now nuke intel_crtc_init_scalers().
Ville Syrjälä [Thu, 7 Nov 2019 14:24:15 +0000 (16:24 +0200)]
drm/i915: Introduce intel_crtc_{alloc,free}()
We already have alloc/free helpers for planes, add the same for
crtcs. The main benefit is we get to move all the annoying state
initialization out of the main crtc_init() flow.
Annoyingly __drm_atomic_helper_crtc_reset() does two
totally separate things:
a) reset the state to defaults values
b) assign the crtc->state pointer
I just want a) without the b) so let's split out part
a) into __drm_atomic_helper_crtc_state_reset(). And
of course we'll do the same thing for planes and connectors.
v2: Fix conn__state vs. conn_state typo (Lucas)
Make code and kerneldoc match for
__drm_atomic_helper_plane_state_reset()
Chris Wilson [Wed, 18 Dec 2019 09:35:04 +0000 (09:35 +0000)]
drm/i915/gt: Ratelimit display power w/a
For very light workloads that frequently park, acquiring the display
power well (required to prevent the dmc from trashing the system) takes
longer than the execution. A good example is the igt_coherency selftest,
which is slowed down by an order of magnitude in the worst case with
powerwell cycling. To prevent frequent cycling, while keeping our fast
soft-rc6, use a timer to delay release of the display powerwell.
Chris Wilson [Wed, 18 Dec 2019 10:40:43 +0000 (10:40 +0000)]
drm/i915: Hold reference to intel_frontbuffer as we track activity
Since obj->frontbuffer is no longer protected by the struct_mutex, as we
are processing the execbuf, it may be removed. Mark the
intel_frontbuffer as rcu protected, and so acquire a reference to
the struct as we track activity upon it.
At this point in time, compatible string "thine,thc63lvdm83d" is
backed by the lvds-codec driver, and the documentation contained
in thine,thc63lvdm83d.txt is basically the same as the one
contained in lvds-codec.yaml (generic fallback compatible string
aside), therefore absorb thine,thc63lvdm83d.txt.
In an effort to repurpose lvds-encoder.c to also serve the
function of LVDS decoders, we ended up defining a new "generic"
compatible string ("lvds-decoder"), therefore adapt the dt schema
to allow for the new compatible string.
The probe function needs to get ahold of the panel device tree
node, and it achieves that by using a combination of
of_graph_get_port_by_id, of_get_child_by_name, and
of_graph_get_remote_port_parent. We can achieve the same goal
by replacing those calls with a call to of_graph_get_remote_node
these days.
Fabrizio Castro [Tue, 17 Dec 2019 23:07:53 +0000 (01:07 +0200)]
drm/bridge: lvds-codec: Add "lvds-decoder" support
Add support for transparent LVDS decoders by adding a new
compatible string ("lvds-decoder") to the driver.
This patch also adds member connector_type to struct lvds_codec,
and that's because LVDS decoders have a different connector type
from LVDS encoders. We fill this new member up with the data
matching the compatible string.
Fabrizio Castro [Wed, 13 Nov 2019 15:51:24 +0000 (15:51 +0000)]
drm/bridge: Repurpose lvds-encoder.c
lvds-encoder.c implementation is also suitable for LVDS decoders,
not just LVDS encoders.
Instead of creating a new driver for addressing support for
transparent LVDS decoders, repurpose lvds-encoder.c for the greater
good with this patch.
This patch only "rebrands" the lvds-encoder.c driver, to make it
suitable for hosting LVDS decoders support. The actual support for
LVDS decoders will come with a later patch.
Compatible string "ti,sn75lvds83" is being used by device tree
rk3188-bqedison2qc.dts, but it's not documented anywhere, therefore
document it within lvds-transmitter.yaml.
ti,ds90c185.txt documents LVDS encoders using the same driver
as the one documented by lvds-transmitter.yaml.
Since the properties listed in ti,ds90c185.txt are the same
as the ones listed in lvds-transmitter.yaml, absorb the dt-binding
into lvds-transmitter.yaml.
Chris Wilson [Mon, 16 Dec 2019 16:17:16 +0000 (16:17 +0000)]
drm/i915: Unpin vma->obj on early error
If we inherit an error along the fence chain, we skip the main work
callback and go straight to the error. In the case of the vma bind
worker, we only dropped the pinned pages from the worker.
In the process, make sure we call the release earlier rather than wait
until the final reference to the fence is dropped (as a reference is
kept while being listened upon).
Gurchetan Singh [Tue, 17 Dec 2019 23:02:28 +0000 (15:02 -0800)]
udmabuf: fix dma-buf cpu access
I'm just going to put Chia's review comment here since it sums
the issue rather nicely:
"(1) Semantically, a dma-buf is in DMA domain. CPU access from the
importer must be surrounded by {begin,end}_cpu_access. This gives the
exporter a chance to move the buffer to the CPU domain temporarily.
(2) When the exporter itself has other means to do CPU access, it is
only reasonable for the exporter to move the buffer to the CPU domain
before access, and to the DMA domain after access. The exporter can
potentially reuse {begin,end}_cpu_access for that purpose.
Because of (1), udmabuf does need to implement the
{begin,end}_cpu_access hooks. But "begin" should mean
dma_sync_sg_for_cpu and "end" should mean dma_sync_sg_for_device.
Because of (2), if userspace wants to continuing accessing through the
memfd mapping, it should call udmabuf's {begin,end}_cpu_access to
avoid cache issues."
Fabrizio Castro [Tue, 17 Dec 2019 13:45:59 +0000 (13:45 +0000)]
drm: rcar-du: lvds: Allow for even and odd pixels swap
DT properties dual-lvds-even-pixels and dual-lvds-odd-pixels
can be used to work out if the driver needs to swap even
and odd pixels around.
This patch makes use of the return value from function
drm_of_lvds_get_dual_link_pixel_order to determine if we
need to swap odd and even pixels around for things to work
properly.
The dual_link boolean field from struct rcar_lvds is not
sufficient to describe the type of LVDS link anymore, since
we now have information related to pixel order, therefore
rename it to link_type and repurpose its usage to fit the
new requirements.
Fabrizio Castro [Tue, 17 Dec 2019 13:45:58 +0000 (13:45 +0000)]
drm: rcar-du: lvds: Get dual link configuration from DT
For dual-LVDS configurations, it is now possible to mark the
DT port nodes for the sink with boolean properties (like
dual-lvds-even-pixels and dual-lvds-odd-pixels) to let drivers
know the encoders need to be configured in dual-LVDS mode.
Rework the implementation of rcar_lvds_parse_dt_companion
to make use of the DT markers while keeping backward
compatibility.
Fabrizio Castro [Tue, 17 Dec 2019 13:45:57 +0000 (13:45 +0000)]
drm: rcar-du: lvds: Improve identification of panels
Dual-LVDS panels are mistakenly identified as bridges, this
commit replaces the current logic with a call to
drm_of_find_panel_or_bridge to sort that out.
An LVDS dual-link connection is made of two links, with even
pixels transitting on one link, and odd pixels on the other
link. The device tree can be used to fully describe dual-link
LVDS connections between encoders and bridges/panels.
The sink of an LVDS dual-link connection is made of two ports,
the corresponding OF graph port nodes can be marked
with either dual-lvds-even-pixels or dual-lvds-odd-pixels,
and that fully describes an LVDS dual-link connection,
including pixel order.
drm_of_lvds_get_dual_link_pixel_order is a new helper
added by this patch, given the source port nodes it
returns DRM_LVDS_DUAL_LINK_EVEN_ODD_PIXELS if the source
port nodes belong to an LVDS dual-link connection, with even
pixels expected to be generated from the first port, and odd
pixels expected to be generated from the second port.
If the new helper returns DRM_LVDS_DUAL_LINK_ODD_EVEN_PIXELS,
odd pixels are expected to be generated from the first port,
and even pixels from the other port.
Laurent Pinchart [Tue, 15 Oct 2019 23:06:52 +0000 (02:06 +0300)]
drm: rcar-du: lvds: Get mode from state
The R-Car LVDS encoder driver implements the bridge .mode_set()
operation for the sole purpose of storing the mode in the LVDS private
data, to be used later when enabling the encoder.
Switch to the bridge .atomic_enable() and .atomic_disable() operations
in order to access the global atomic state, and get the mode from the
state instead. Remove both the unneeded .mode_set() operation and the
display_mode and mode fields storing state data from the rcar_lvds
private structure.
As a side effect we get the CRTC from the state, replace the CRTC
pointer retrieved through the bridge's encoder that shouldn't be used by
atomic drivers.
While at it, clarify a few error messages in rcar_lvds_get_lvds_mode()
and turn them into warnings as they are not fatal.