Changbin Du [Thu, 1 Mar 2018 07:49:59 +0000 (15:49 +0800)]
drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead
The implementation of current kvmgt implicitly setup dma mapping at MPT
API gfn_to_mfn. First this design against the API's original purpose.
Second, there is no unmap hit in this design. The result is that the
dma mapping keep growing larger and larger. For mutl-vm case, they will
consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree
entries crated in the IOMMU IOVA allocator. Finally, single IOVA
allocation can take as long as ~70ms. Such latency is intolerable.
To address both above issues, this patch introduced two new MPT API:
o dma_map_guest_page - setup dma map for guest page
o dma_unmap_guest_page - cancel dma map for guest page
The kvmgt implements these 2 API. And to reduce dma setup overhead for
duplicated pages (eg. scratch pages), two caches are used: one is for
mapping gfn to struct gvt_dma, another is for mapping dma addr to
struct gvt_dma.
With these 2 new API, the gtt now is able to cancel dma mapping when page
table is invalidated. The dma mapping is not in a gradual increase now.
v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.
Zhenyu Wang [Thu, 22 Feb 2018 07:16:14 +0000 (15:16 +0800)]
drm/i915/gvt: Fix check error of vgpu create failure message
Fix check error at
CHECK drivers/gpu/drm/i915//gvt/kvmgt.c
drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454)
For failed vgpu create, just show error return in failure message.
Weinan Li [Fri, 23 Feb 2018 06:46:45 +0000 (14:46 +0800)]
drm/i915/gvt: init mmio by lri command in vgpu inhibit context
There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.
The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.
v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)
v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)
Changbin Du [Tue, 30 Jan 2018 11:19:54 +0000 (19:19 +0800)]
drm/i915/gvt: Manage shadow pages with radix tree
We don't know how many page tables will be shadowed. It varies
considerably corresponding to guest load. Radix tree is a better
choice for us. Since Page Frame Number is used as key so most of
the bits are common.
Here is some performance data (duration in us) of looking up a
element:
Before: (aka. ppgtt_find_shadow_page)
0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108
This time I didn't get the early data of hash table. The data is
measured when desktop is shown.
As last change, the overall benchmark almost is not changed, but
we get better scalability.
Changbin Du [Tue, 30 Jan 2018 11:19:53 +0000 (19:19 +0800)]
drm/i915/gvt: Provide generic page_track infrastructure for write-protected page
This patch provide generic page_track infrastructure for write-protected
guest page. The old page_track logic gets rewrote and now stays in a new
standalone page_track.c. This page track infrastructure can be both used
by vGUC and GTT shadowing.
The important change is that it uses radix tree instead of hash table.
We don't have a predictable number of pages that will be tracked.
Here is some performance data (duration in us) of looking up a element:
Before: (aka. intel_vgpu_find_tracked_page)
0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291
After: (aka. intel_vgpu_find_page_track)
0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105
The hash table has good performance at beginning, but turns bad with
more pages being tracked even no 3D applications are running. As
expected, radix tree has stable duration and very quick.
The overall benchmark (tested with Heaven Benchmark) marginally improved
since this is not the bottleneck. What we benefit more from this change
is scalability.
Changbin Du [Tue, 30 Jan 2018 11:19:51 +0000 (19:19 +0800)]
drm/i915/gvt: Rename mpt api {set, unset}_wp_page to {enable, disable}_page_track
The kvmgt's implementation of mpt api {set,unset}_wp_page is not real
write-protection - the data get written before invoke this two api.
As discussed, change the mpt api to match the real behavior.
Changbin Du [Tue, 30 Jan 2018 11:19:50 +0000 (19:19 +0800)]
drm/i915/gvt: Rename shadow_page to short name spt
The target structure of some functions is struct intel_vgpu_ppgtt_spt and
their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's
use short name 'spt' instead to reduce the length. As well as the hash
table name.
Changbin Du [Tue, 30 Jan 2018 11:19:49 +0000 (19:19 +0800)]
drm/i915/gvt: Rework shadow page management code
This is a another big one and the GVT shadow page management code is
heavily refined.
The new code only use struct intel_vgpu_ppgtt_spt to represent a vgpu
shadow page table - w/ or wo/ a guest page associated with. A pure shadow
page (no guest page associated) will be used to shadow splited 2M huge
gtt. In this case, the spt.guest_page.gfn should be a zero.
To search a existed shadow page table, we have two new interfaces:
- intel_vgpu_find_spt_by_gfn(), find a spt by guest gfn. It must not
be a pure spt.
- intel_vgpu_find_spt_by_mfn, Find the spt using shadow page mfn in
shadowed PTE.
The oos_page management is remained as what is was.
v2: Split some changes into small standalone patches.
Changbin Du [Tue, 30 Jan 2018 11:19:44 +0000 (19:19 +0800)]
drm/i915/gvt: Add verbose gtt shadow logs
This add a new macro gvt_vdbg_mm() to print more verbose logs for
gtt shadowing. The added verbose logs are very useful for debugging.
gvt_vdbg_mm() only comes into effect if VERBOSE_DEBUG is defined by
the developer.
Changbin Du [Tue, 30 Jan 2018 11:19:41 +0000 (19:19 +0800)]
drm/i915/gvt: Refine the intel_vgpu_mm reference management
If we manage an object with a reference count, then its life cycle
must flow the reference count operations. Meanwhile, change the
operation functions to generic name *put* and *get*.
This is a big one and the GVT shadow graphic memory management code is
heavily refined. The new code is more straightforward with less code.
The struct intel_vgpu_mm is restructured to be clearly defined, use
accurate names and some of the original fields are removed which are
really redundant.
Now we only manage ppgtt mm object with mm->ppgtt_mm.lru_list. No need
to mix ppgtt and ggtt together, since one vGPU only has one ggtt object.
v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm.
v3: Add GVT_RING_CTX_NR_PDPS to avoid confusing about the PDPs.
v2: Split some changes into small standalone patches.
Chris Wilson [Fri, 2 Mar 2018 14:32:45 +0000 (14:32 +0000)]
drm/i915/execlists: Split spinlock from its irq disabling side-effect
During reset/wedging, we have to clean up the requests on the timeline
and flush the pending interrupt state. Currently, we are abusing the irq
disabling of the timeline spinlock to protect the irq state in
conjunction to the engine's timeline requests, but this is accidental
and conflates the spinlock with the irq state. A baffling state of
affairs for the reader.
Instead, explicitly disable irqs over the critical section, and separate
modifying the irq state from the timeline's requests.
Chris Wilson [Fri, 2 Mar 2018 13:12:46 +0000 (13:12 +0000)]
drm/i915/execlists: Move irq state manipulation inside irq disabled region
Although this state (execlists->active and engine->irq_posted) itself is
not protected by the engine->timeline spinlock, it does conveniently
ensure that irqs are disabled. We can use this to protect our
manipulation of the state and so ensure that the next IRQ to arrive sees
consistent state and (hopefully) ignores the reset engine.
the conclusion is that the only place where the ports are reset to zero,
is from engine->cancel_requests called during i915_gem_set_wedged().
The race is horrible as it results from calling set-wedged on active HW
(the GPU reset failed) and as such we need to be careful as the HW state
changes beneath us. Fortunately, it's the same scary conditions as
affect normal reset, so we can reuse the same machinery to disable state
tracking as we clobber it.
Imre Deak [Thu, 1 Mar 2018 13:44:57 +0000 (15:44 +0200)]
drm/i915/gen9, gen10: Disable FBC on planes with a misaligned Y-offset
Enabling FBC on a plane having a Y-offset that isn't divisible by 4 may
cause pipe FIFO underruns and flickers, so disable FBC on such a config.
I tried the followings to work around the issue:
- enable each HW work around in ILK_DPFC_CHICKEN
- disable each compression algorithm in ILK_DPFC_CONTROL
- disable low-power watermarks
None of the above got rid of the problem. I haven't found this issue in
the Bspec/WA database either.
Besides the igt testcase below (yet to be merged) an easy way to
reproduce the issue is to enable a plane with FBC and a plane Y-offset
not aligned to 4 and then just enable/disable FBC in a loop, keeping the
plane enabled.
I could trigger the problem on BXT/GLK/SKL/CNL, so assume for now that it's
only present on GEN9 and GEN10.
v2: (Ville)
- Run the test/apply the WA on CNL as well.
- Use IS_GEN() instead of INTEL_GEN().
- Fix spelling.
drm/i915/uc: Make GuC/HuC fw fetch and loading functions/file structure symmetric
GuC load function is named intel_guc_fw_upload() and HuC load function is
named intel_huc_init_hw(). Make them consistent intel_*_fw_upload. Also
move HuC fw loading functions and declarations to separate files
intel_huc_fw.c|h like GuC.
While at this, do below changes
1. Update kernel-doc comment for intel_*_fw_upload() functions
2. s/huc_ucode_xfer/huc_fw_xfer
3. Introduce intel_huc_fw_init_early()
v2: Changed patch to update HuC functions instead of changing
guc_fw_upload and update file structure. (Michal Wajdeczko)
v3: Added SPDX License identifier to huc_fw.c|h. (Michal Wajdeczko)
drm/i915: Check for I915_MODE_FLAG_INHERITED before drm_atomic_helper_check_modeset
Moving the check upwards will mean we we no longer have to add planes
and connectors manually, because everything is handled correctly by
drm_atomic_helper_check_modeset() as intended.
Chris Wilson [Thu, 1 Mar 2018 10:33:38 +0000 (10:33 +0000)]
drm/i915: Replace open-coded wait-for loop
Now that we can pass arbitrary commands into the base __wait_for()
macro, we can reimplement the open-coded wait-for inside
i915_gem_idle_work_handler() using the new macro. This means that instead
of using ktime, we now use jiffies, and benefit from the exponential sleep
backoff that allows a fast response if the HW settles quickly.
We're seeing on CI that some contexts don't have the programmed OA
period timer that directs the OA unit on how often to write reports.
The issue is that we're not holding the drm lock from when we edit the
context images down to when we set the exclusive_stream variable. This
leaves a window for the deferred context allocation to call
i915_oa_init_reg_state() that will not program the expected OA timer
value, because we haven't set the exclusive_stream yet.
v2: Drop need_lock from gen8_configure_all_contexts() (Matt)
Mika Kuoppala [Wed, 28 Feb 2018 10:11:53 +0000 (12:11 +0200)]
drm/i915/icl: Interrupt handling
v2: Rebase.
v3:
* Remove DPF, it has been removed from SKL+.
* Fix -internal rebase wrt. execlists interrupt handling.
v4: Rebase.
v5:
* Updated for POR changes. (Daniele Ceraolo Spurio)
* Merged with irq handling fixes by Daniele Ceraolo Spurio:
* Simplify the code by using gen8_cs_irq_handler.
* Fix interrupt handling for the upstream kernel.
v6:
* Remove early bringup debug messages (Tvrtko)
* Add NB about arbitrary spin wait timeout (Tvrtko)
v7 (from Paulo):
* Don't try to write RO bits to registers.
* Don't check for PCH types that don't exist. PCH interrupts are not
here yet.
v9:
* squashed in selector and shared register handling (Daniele)
* skip writing of irq if data is not valid (Daniele)
* use time_after32 (Chris)
* use I915_MAX_VCS and I915_MAX_VECS (Daniele)
* remove fake pm interrupt handling for later patch (Mika)
v10:
* Direct processing of banks. clear banks early (Chris)
* remove poll on valid bit, only clear valid bit (Mika)
* use raw accessors, better naming (Chris)
v11:
* adapt to raw_reg_[read|write]
* bring back polling the valid bit (Daniele)
v12:
* continue if unset intr_dw (Daniele)
* comment the usage of gen8_de_irq_handler bits (Daniele)
Tvrtko Ursulin [Wed, 28 Feb 2018 10:11:52 +0000 (12:11 +0200)]
drm/i915/icl: Prepare for more rings
Gen11 will add more VCS and VECS rings so prepare the
infrastructure to support that.
Bspec: 7021
v2: Rebase.
v3: Rebase.
v4: Rebase.
v5: Rebase.
v6:
- Update for POR changes. (Daniele Ceraolo Spurio)
- Add provisional guc engine ids - to be checked and confirmed.
v7:
- Rebased.
- Added the new ring masks.
- Added the new HW ids.
v8:
- Introduce I915_MAX_VCS/VECS to avoid magic numbers (Michal)
Manasi Navare [Wed, 28 Feb 2018 22:31:50 +0000 (14:31 -0800)]
drm/i915/dp: Add HBR3 rate (8.1 Gbps) to dp_rates array
dp_rates[] array is a superset of all the link rates supported
by sink devices. DP 1.3 specification adds HBR3 (8.1Gbps) link rate
to the set of link rates supported by sink. This patch adds this rate
to dp_rates[] array that gets used to populate the sink_rates[]
array limited by max rate obtained from DP_MAX_LINK_RATE DPCD register.
v2:
* Rebased on top of Jani's localized rates patch
Dave Airlie [Thu, 1 Mar 2018 04:07:22 +0000 (14:07 +1000)]
Merge tag 'drm-intel-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:
- Lift alpha_support protection from Cannonlake (Rodrigo)
* Meaning the driver should mostly work for the hardware we had
at our disposal when testing
* Used to be preliminary_hw_support
- Add missing Cannonlake PCI device ID of 0x5A4C (Rodrigo)
- Cannonlake port register fix (Mahesh)
- Fix Dell Venue 8 Pro black screen after modeset (Hans)
- Fix for always returning zero out-fence from execbuf (Daniele)
- Fix HDMI audio when no no relevant video output is active (Jani)
- Fix memleak of VBT data on driver_unload (Hans)
- Fix for KASAN found locking issue (Maarten)
- RCU barrier consolidation to improve igt/gem_sync/idle (Chris)
- Optimizations to IRQ handlers (Chris)
- vblank tracking improvements (64-bit resolution, PM) (Dhinakaran)
- Pipe select bit corrections (Ville)
- Reduce runtime computed device_info fields (Chris)
- Tune down some WARN_ONs to GEM_BUG_ON now that CI has good coverage (Chris)
- A bunch of kerneldoc warning fixes (Chris)
* tag 'drm-intel-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-intel: (113 commits)
drm/i915: Update DRIVER_DATE to 20180221
drm/i915/fbc: Use PLANE_HAS_FENCE to determine if the plane is fenced
drm/i915/fbdev: Use the PLANE_HAS_FENCE flags from the time of pinning
drm/i915: Move the policy for placement of the GGTT vma into the caller
drm/i915: Also check view->type for a normal GGTT view
drm/i915: Drop WaDoubleCursorLP3Latency:ivb
drm/i915: Set the primary plane pipe select bits on gen4
drm/i915: Don't set cursor pipe select bits on g4x+
drm/i915: Assert that we don't overflow frontbuffer tracking bits
drm/i915: Track number of pending freed objects
drm/i915/: Initialise trans_min for skl_compute_transition_wm()
drm/i915: Clear the in-use marker on execbuf failure
drm/i915: Prune gen8_gt_irq_handler
drm/i915: Track GT interrupt handling using the master iir
drm/i915: Remove WARN_ONCE for failing to pm_runtime_if_in_use
drm: intel_dpio_phy: fix kernel-doc comments at nested struct
drm/i915: Release connector iterator on a digital port conflict.
drm/i915/execlists: Remove too early assert
drm/i915: Assert that we always complete a submission to guc/execlists
drm: move read_domains and write_domain into i915
...
Dave Airlie [Thu, 1 Mar 2018 04:04:30 +0000 (14:04 +1000)]
Merge tag 'tilcdc-4.17' of https://github.com/jsarha/linux into drm-next
drm/tilcdc changes to v4.17
* tag 'tilcdc-4.17' of https://github.com/jsarha/linux:
drm/tilcdc: tilcdc_panel: Rename device from "panel" to "tilcdc-panel"
drm/tilcdc: Add support for drm panels
drm/tilcdc: panel: Use common error handling code in of_get_panel_info()
drm/tilcdc: Delete an error message for a failed memory allocation in seven functions
Jani Nikula [Tue, 27 Feb 2018 10:59:11 +0000 (12:59 +0200)]
drm/i915/dp: move link rate arrays where they're used
Localize link rate arrays by moving them to the functions where they're
used. Further clarify the distinction between source and sink
capabilities. Split pre and post Haswell arrays, and get rid of the
array size arithmetics. Use a direct rate value in the paranoia case of
no common rates find.
Ville Syrjälä [Thu, 22 Feb 2018 18:10:33 +0000 (20:10 +0200)]
drm/i915: Consult aux_ch instead of port in ->get_aux_clock_divider()
While it seems totally unlikely that any system would mix a cpu/north
aux channel with a pch/south port (or vice versa) we should still
consult intel_dp->aux_ch rather than encoder->port when figuring out
which clock is actually used by the aux ch.
Although we protect the request itself, we don't lock inside
intel_engine_dump() and so the request maybe retired as we peek into it.
One consequence is that the request->ctx may be freed before we
dereference it, leading to a use-after-free. Replace the hw_id we are
peeking from inside request->ctx with the request->fence.context, with
which we can still track from which context the request originated
(although to tie to HW reports requires a little more legwork, but is
good enough to follow the GEM traces).
Jyri Sarha [Wed, 21 Feb 2018 16:38:24 +0000 (18:38 +0200)]
drm/tilcdc: tilcdc_panel: Rename device from "panel" to "tilcdc-panel"
Rename the bundled tilcdc_panel driver from just "panel" to
"tilcdc-panel" to avoid noisy error messages from the driver trying to
probe all device nodes named "panel".
Jyri Sarha [Sun, 18 Feb 2018 17:48:32 +0000 (19:48 +0200)]
drm/tilcdc: Add support for drm panels
Add support for drm panels to tilcdc. Adding the support on top of the
existing bridge support needs only couple of lines of code when using
using the drm panel bridge helpers.
Currently, BXT_PP is hardcoded with value '0'.
It practically disabled eDP backlight on MRB (BXT) platform.
This patch will tell which BXT_PP registers (there are two set of
PP_CONTROL in the spec) to be used as defined in VBT (Video Bios Timing
table) and this will enabled eDP backlight controller on MRB (BXT)
platform.
v2:
- Remove unnecessary information in commit message.
- Assign vbt.backlight.controller to a backlight_controller variable and
return the variable value.
v3:
- Rebased to latest code base.
- updated commit title.
Chris Wilson [Tue, 27 Feb 2018 21:18:16 +0000 (21:18 +0000)]
drm/i915: Repeat the GEM_BUG_ON message in the ftrace log
As the ftrace log is overflowing the pstore capture, we lose the last
gasps from dmesg which includes the GEM_BUG_ON function:line and condition
that failed. Vital information for tracking down the bug, so append it to
the frace log as well.
Dave Airlie [Wed, 28 Feb 2018 01:44:29 +0000 (11:44 +1000)]
Merge branch 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-next
- Expose thermal thresholds through hwmon properly
- Rework HDP flushing for rings and CPU
- Improved dual-link DVI handling in DC
- Lots of code clean up
- Additional DC clean up
- Allow scanout from system memory on CZ/BR/ST
- Improved PASID/VM integration
- Expose GPU voltage and power via hwmon
- Initial wattman-like support
- Initial power profiles for use-case optimized performance
- Rework GPUVM TLB flushing
- Rework IP offset handling for SOC15 asics
- Add CRC support in DC
- Fixes for mmhub powergating
- Initial regamma/degamma/CTM support in DC
- ttm cleanups and simplifications
- ttm OOM avoidance fixes
* 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux: (348 commits)
Revert "drm/radeon/pm: autoswitch power state when in balanced mode"
drm/radeon: use drm_gem_private_object_init
drm/amdgpu: use drm_gem_private_object_init
drm/amdgpu: mitigate workaround for i915
drm/amdgpu: implement amdgpu_gem_map_(attach/detach)
drm/amdgpu/powerplay/smu7: drop refresh rate checks for mclk switching
drm/amdgpu/cgs: add refresh rate checking to non-DC display code
drm/amd/powerplay/smu7: allow mclk switching with no displays
drm/amd/powerplay/vega10: allow mclk switching with no displays
drm/amd/powerplay: use PP_CAP macro for disable_mclk_switching_for_frame_lock
drm/amd/powerplay: remove unused headers
drm/amdgpu_gem: fix error handling path in amdgpu_gem_va_update_vm
drm/amdgpu: update the PASID mapping only on demand
drm/amdgpu: separate PASID mapping from VM flush v2
drm/amd/display: Fix increment when sampling OTF in DCE
drm/amd/display: De PQ implementation
drm/amd/display: Remove unused dm_pp_ interfaces
drm/amd/display: Add logging for aux DPCD access
drm/amd/display: Set vsc pack revision when DPCD revision is >= 1.2
drm/amd/display: provide an interface to query firmware version
...
Rodrigo Vivi [Tue, 27 Feb 2018 21:29:13 +0000 (13:29 -0800)]
drm/i915/psr: Don't avoid PSR when PSR2 conditions are not met.
We can still use PSR1 when PSR2 conditions are not met.
So, let's split the check in a way that we make sure has_psr
gets set independently of PSR2 criteria.
v2: Duh! Handle proper return to avoid breaking PSR2.
v3: (DK):
- better name for psr2 conditions check function
- Don't remove FIXME block and psr2.support check.
- Add a debug message to show us what PSR or PSR2 is
getting enabled now we have ways to enabled PSR on
PSR2 panels.
- s/PSR2 disabled/PSR2 not enabled
drm/i915/psr: Check for power state control capability.
eDP spec says - "If PSR/PSR2 is supported, the SET_POWER_CAPABLE bit in the
EDP_GENERAL_CAPABILITY_1 register (DPCD Address 00701h, bit d7) must be set
to 1."
Reject PSR on panels without this cap bit set as such panels cannot be
controlled via SET_POWER & SET_DP_PWR_VOLTAGE register and the DP source
needs to be able to do that for PSR.
Thanks to Nathan for debugging this.
Panel cap checks like this can be done just once, let's fix this
when PSR dpcd init movement lands.
drm/i915/frontbuffer: Mark frontbuffer flush and invalidate with might_sleep()
Frontbuffer flush and invalidate call psr, fbc and drrs functions that use
mutexes but they can be called in atomic contexts in the fbdev path. The
point where the spinlocks are acquired is up in the call stack that is not
entirely easy to spot, so annotate with might_sleep().
PSR on CNL requires AUX IO wells to be kept on and the existing AUX domain
for AUX-A enables DC_OFF well too. This is not required, so add a new
AUX_IO_A domain for AUX-A to allow DC states to remain enabled. Other AUX
channels re-use the existing AUX domains.
v4: Reword comment (Rodrigo and Ville)
Rename _get and _put functions to include aux_io substring(Rodrigo)
Remove unnecessary diff that got included.
v3: Extract aux domain selection into a function (Ville)
v2: Add AUX IO domain only for AUX-A
Rebased on top of Ville's AUX series.
Michał Winiarski [Mon, 26 Feb 2018 16:37:59 +0000 (17:37 +0100)]
drm/i915/guc: Fill preempt context once at init time
Since we're inhibiting context save of preempt context, we're no longer
tracking the position of HEAD/TAIL. With GuC, we're adding a new
breadcrumb for each preemption, which means that the HW will do more and
more breadcrumb writes. Eventually the ring is filled, and we're
submitting the preemption context with HEAD==TAIL==0, which won't result
in breadcrumb write, but will trigger hangcheck instead.
Instead of writing a new preempt breadcrumb for each preemption, let's
just fill the ring once at init time (which also saves a couple of
instructions in the tasklet).
v2: Assert that context save restore is inhibited, don't assert on ring
alignment. (Chris)
v3: Cleanup checkpatch.
Manasi Navare [Tue, 27 Feb 2018 03:11:15 +0000 (19:11 -0800)]
drm/i915/dp: Fix the order of platforms for setting DP source rates
The usual if ladder order should be from newest to oldest
platform. However the CNL conditional statement was misplaced.
This patch sets the DP source for platforms starting from the newest
to oldest.
Chris Wilson [Thu, 22 Feb 2018 14:22:29 +0000 (14:22 +0000)]
drm/i915/preemption: Allow preemption between submission ports
Sometimes we need to boost the priority of an in-flight request, which
may lead to the situation where the second submission port then contains
a higher priority context than the first and so we need to inject a
preemption event. To do so we must always check inside
execlists_dequeue() whether there is a priority inversion between the
ports themselves as well as the head of the priority sorted queue, and we
cannot just skip dequeuing if the queue is empty.
As Michał noted, this doesn't simply extend to handling more than 2-port
submission, as we may need to reorder within the array of executing
requests which themselves are lower priority than the first. A task for
later!
Ville Syrjälä [Thu, 22 Feb 2018 18:10:31 +0000 (20:10 +0200)]
drm/i915: Nuke aux regs from intel_dp
Just store function pointers that give us the correct register offsets
instead of storing the register offsets themselves. Slightly less
efficient perhaps but saves a few bytes and better matches how we do
things elsewhere.
Ville Syrjälä [Thu, 22 Feb 2018 18:10:30 +0000 (20:10 +0200)]
drm/i915: Add enum aux_ch and clean up the aux init to use it
Since we no longer have a 1:1 correspondence between ports and AUX
channels, let's give AUX channels their own enum. Makes it easier
to tell the apples from the oranges, and we get rid of the
port E AUX power domain FIXME since we now derive the power domain
from the actual AUX CH.
v2: Rebase due to AUX F
v3: Split out the power domain fix (Rodrigo)
Dave Airlie [Fri, 23 Feb 2018 01:12:52 +0000 (11:12 +1000)]
Merge tag 'drm-misc-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 4.17:
Cross-subsystem Changes:
- Backlight helpers to enable/disable and find devices in dt (Meghana)
Core Changes:
- Documentation improvements (Chris/Daniel/Jani)
- simple_kms_helper: Add mode_valid() support (Linus)
- mm: Fix bug in interval_tree causing nodes to be out-of-order (Chris)
Driver Changes:
- tinydrm/panel: Use the new backlight helpers (Meghana)
- rockchip: Support gem_prime_import_sg_table + some fixes (Various)
- sun4i: Add A83T HDMI support using dw-hdmi (Jernej)
Cc: Meghana Madhyastha <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Linus Walleij <[email protected]> Cc: Heiko Stuebner <[email protected]> Cc: Jernej Skrabec <[email protected]>
* tag 'drm-misc-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc: (41 commits)
drm/omapdrm: Use of_find_backlight helper
drm/panel: Use of_find_backlight helper
drm/omapdrm: Use backlight_enable/disable helpers
drm/panel: Use backlight_enable/disable helpers
drm/tinydrm: Call devres version of of_find_backlight
drm/tinydrm: Replace tinydrm_of_find_backlight with of_find_backlight
drm/tinydrm: Convert tinydrm_enable/disable_backlight to backlight_enable/disable
drm: add documentation for tv connector state margins
drm/doc: Use new substruct support
drm/doc: Polish for drm_mode_parse_command_line_for_connector
drm/docs: Document "scaling mode" property better
drm/docs: Align layout of optional plane blending properties
drm/docs: Discourage adding more to kms-properties.csv
drm: simple_kms_helper: Add mode_valid() callback support
drm/todo: Add idr_init_base todo
drm: Use idr_init_base(1) when using id==0 for invalid
drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
dma-buf/sw_sync: Fix kerneldoc warnings
drm: Fix kerneldoc warnings for drm_lease
...
Ville Syrjälä [Wed, 21 Feb 2018 16:02:35 +0000 (18:02 +0200)]
drm/i915: Add a FIXME about FBC vs. fence. 90/270 degree rotation
Currently the FBC code doesn't handle the 90/270 degree rotated case
correctly. We would need the GTT tracking to monitor the fence on the
normal GTT view (the rotated view doesn't even have a fence). Not quite
sure how we should program the fence Y offset etc. in that case. For now
we'll end up disabling FBC with 90/270 degree rotation. Add a FIXME
to remind people about this fact.
v2: Reword the text (Chris)
Move the FIXME to the fbc code
Ville Syrjälä [Wed, 21 Feb 2018 16:02:33 +0000 (18:02 +0200)]
drm/i915: Require fence only for FBC capable planes
As only a subset of primary planes are FBC capable there's no need
to waste fences on all of them. So let's skip the fence if the plane
isn't even fbc capable.
In the future we might extend this to skip the fence even for FBC
capable planes if the crtc and/or plane state isn't suitable
for FBC.
Ville Syrjälä [Wed, 21 Feb 2018 17:31:01 +0000 (19:31 +0200)]
drm/i915: Clean up fbc vs. plane checks
Let's record the information whether a plane can do fbc or not under
struct inte_plane.
v2: Rebase due to i9xx_plane_id
Handle BDW/HSW correctly
v3: Move inte_fbc_init() back since we depend on it happening
even with i915.disable_display, and populate
fbc->possible_framebuffer_bits directly from the
plane init code instead
v4: Add note about plane A being tied to pipe A on HSW+
Ville Syrjälä [Wed, 21 Feb 2018 18:48:07 +0000 (20:48 +0200)]
drm/i915: Only pin the fence for primary planes (and gen2/3)
Currently we pin a fence on every plane doing tiled scanout. The
number of planes we have available is fast apporaching the number
of fences so we really should stop wasting them. Only FBC needs
the fence on gen4+, so let's use fences only for the primary planes
on those platforms.
v2: drop the tiling check from plane_uses_fence() as the obj is
NULL during initial_plane_config() and we don't rally need the
check since i915_vma_pin_fence() does the check anyway
Johnson Lin [Tue, 30 Jan 2018 15:51:29 +0000 (21:21 +0530)]
drm/i915: Fix Limited Range Color Handling
Some panels support limited range output (16-235) compared
to full range RGB values (0-255). Also userspace can control
the RGB range using "Broadcast RGB" property. Currently the
code to handle full range to limited range is broken. This
patch fixes the same by properly scaling down all the full
range co-efficients with limited range scaling factor.
v2: Fixed Ville's review comments.
v3: Changed input to const and used correct data types as
suggested by Ville
drm/i915/hsw: add missing disabled EUs registers reads
It turns out that HSW has a register that tells us how many EUs are
disabled per half-slice (roughly a similar notion to subslice). We
didn't read those registers so far as most userspace drivers didn't
need those values prior to Gen8, but an internal library would like to
have access to this.
Since we already have the getparam interface, there is no harm in
exposing this.
Paulo Zanoni [Tue, 20 Feb 2018 15:37:52 +0000 (17:37 +0200)]
drm/i915/icl: Add the ICL PCI IDs
This is the current PCI ID list in our documentation.
Let's leave the _gt#_ part out for now since our current documentation
is not 100% clear and we don't need this info now anyway.
v2: Use the new ICL_11 naming (Kelvin Gardiner).
v3: Latest IDs as per BSpec (Oscar).
v4: Make it compile (Paulo).
v5: Remove comments (Lucas).
v6: Multile rebases (Paulo).
v7: Rebase (Mika)
Chris Wilson [Wed, 21 Feb 2018 13:32:36 +0000 (13:32 +0000)]
drm/i915/execlists: Remove the ring advancement under preemption
Load an empty ringbuffer for preemption, ignoring the lite-restore
workaround as we know the preempt context is always idle before preemption.
Note that after some digging by Michal Winiarski, we found that
RING_HEAD is no longer being updated (due to inhibiting context save
restore) so this patch is already in effect!
Chris Wilson [Wed, 21 Feb 2018 09:56:36 +0000 (09:56 +0000)]
drm/i915: Rename drm_i915_gem_request to i915_request
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)
A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.
Daniel Vetter [Tue, 20 Feb 2018 13:20:17 +0000 (14:20 +0100)]
drm/todo: i915 could use device_link_add
Noticed while reading some unrelated patches. Unfortunately Imre's
patch to add our early/late hooks predated the device_link
infrastructure by 2 years.
Tvrtko Ursulin [Tue, 20 Feb 2018 10:47:42 +0000 (10:47 +0000)]
drm/i915: Make global seqno known in i915_gem_request_execute tracepoint
Commit fe49789fab97 ("drm/i915: Deconstruct execute fence") re-arranged
the code and moved the i915_gem_request_execute tracepoint to before the
global seqno is assigned to the request.
We need to move the tracepoint a bit later so this information is once
again available.