Ville Syrjälä [Tue, 18 Sep 2018 14:02:43 +0000 (17:02 +0300)]
drm/i915: Check fb stride against plane max stride
commit 4e0b83a567e2 ("drm/i915: Extract per-platform plane->check()
functions") removed the plane max stride check for sprite planes.
I was going to add it back when introducing GTT remapping for the
display, but after further thought it seems better to re-introduce
it separately.
So let's add the max stride check back. And let's do it in a nicer
form than what we had before and do it for all plane types (easy
now that we have the ->max_stride() plane vfunc).
Only sprite planes really need this for now since primary planes
are capable of scanning out the current max fb size we allow, and
cursors have more stringent stride checks elsewhere.
Chris Wilson [Sat, 22 Sep 2018 14:18:03 +0000 (15:18 +0100)]
drm/i915: Match code to comment and enforce ppgtt for execlists
Our execlist dispatch code requires a ppGTT so make sure we enforce that
option in intel_sanitize_enable_ppgtt(). The comment already tries to
explain that execlists requires ppgtt, but was written when gen8 may
have also taken the legacy path; so rewrite the code to match the
comment by using HAS_EXECLISTS() feature instead of the gen.
drm/i915: use for_each_pipe loop to assign crtc_mask
This cleanup patch makes changes to use for_each_pipe loop
during bit-mask assignment of allowed crtc with encoder.
changes:
- use BIT(i) macro instead of (1 << i) (Chris)
changes from V2:
- use int for consistency (Jani)
changes from V3:
- instead use enum pipe (Ville)
changes from V4:
- drop DP/HDMI changes, as already part of patch from ville
we are missing the execlists_submission_tasklet() invocation before the
execlists_reset_fini() implying that either the queue is empty, or we
failed to schedule and run the tasklet on finish. Add an assert so we
are sure that on unsubmitting the incomplete request after reset, the
queue is indeed populated.
This patch setup voltage swing before enabling
combo PHY DDI (shared with DSI).
Note that DSI voltage swing programming is for
high speed data buffers. HW automatically handles
the voltage swing for the low power data buffers.
v2: Rebase
v3: Address various review comments related to VSWING
programming (Jani N)
drm/i915/icl: Configure lane sequencing of combo phy transmitter
This patch set the loadgen select and latency optimization for
aux and transmit lanes of combo phy transmitters. It will be
used for MIPI DSI HS operations.
v2: Rebase
v3: Add empty line to make code more legible (Ville).
On skylake we can switch to a high quality scaler mode when only 1 out
of 2 scalers are used, but on GLK and later bit 28 has a different
meaning. Don't set it, and make clear the distinction between
SKL and later PS values.
drm/i915: Replace call to commit_planes_on_crtc with internal update, v2.
drm_atomic_helper_commit_planes_on_crtc calls begin_commit,
then plane_update hooks, then flush_commit. Because we keep our own
visibility tracking through plane_state->visible there's no need to
rely on the atomic hooks for this.
By explicitly writing our own helper, we can update visible planes
as needed, which is useful to make NV12 support work as intended.
Changes since v1:
- Reword commit message. (Matt Roper)
- Rename to intel_update_planes_on_crtc(). (Matt)
drm/i915: Make intel_crtc_disable_planes() use active planes mask.
This will only disable planes we actually had marked as visible in
crtc_state->visible_planes and cleans up intel_crtc_disable_plane()
slightly.
This is also useful for when we start enabling NV12 support for gen11,
in which we will make the separate Y plane visible, but ignore the
Y plane's state.
We need to assume the plane has been visible before, even if no CRTC
is assigned to the plane. This is because when enabling a nv12 plane
on gen11, we will have to enable an extra plane and make it visible
by marking it in crtc_state->active_planes for
intel_update_planes_on_crtc().
Additionally, clear visible flag in intel_plane_atomic_check, in case
we ever hit a bug with visibility. Our code implicitly assumes that
plane_state->visible is only true when crtc and fb are set,
so we will either null deref in intel_fbc_choose_crtc() or
do something bad during the actual commit which cares even more.
Changes since v1:
- Unconditionally clear crtc_state->active_planes as well.
- Reword commit message, since this is now a preparation patch for
NV12 Y / UV plane linking.
While we may not update new_crtc_state, we may clear active_planes
if the new cursor update state will disable the cursor, but we fail
after. If this is immediately followed by a modeset disable, we may
soon not disable the planes correctly when we start depending on
active_planes.
Changes since v1:
- Clarify why we cannot swap crtc_state. (Matt)
Dave Airlie [Thu, 20 Sep 2018 23:52:34 +0000 (09:52 +1000)]
Merge branch 'drm-next-4.20' of git://people.freedesktop.org/~agd5f/linux into drm-next
This is a new pull for drm-next on top of last weeks with the following
changes:
- Fixed 64 bit divide
- Fixed vram type on vega20
- Misc vega20 fixes
- Misc DC fixes
- Fix GDS/GWS/OA domain handling
Previous changes from last week:
amdgpu/kfd:
- Picasso (new APU) support
- Raven2 (new APU) support
- Vega20 enablement
- ACP powergating improvements
- Add ABGR/XBGR display support
- VCN JPEG engine support
- Initial xGMI support
- Use load balancing for engine scheduling
- Lots of new documentation
- Rework and clean up i2c and aux handling in DC
- Add DP YCbCr 4:2:0 support in DC
- Add DMCU firmware loading for Raven (used for ABM and PSR)
- New debugfs features in DC
- LVDS support in DC
- Implement wave kill for gfx/compute (light weight reset for shaders)
- Use AGP aperture to avoid gart mappings when possible
- GPUVM performance improvements
- Bulk moves for more efficient GPUVM LRU handling
- Merge amdgpu and amdkfd into one module
- Enable gfxoff and stutter mode on Raven
- Misc cleanups
Chris Wilson [Thu, 20 Sep 2018 19:59:48 +0000 (20:59 +0100)]
drm/i915/execlists: Onion unwind for logical_ring_init() failure
Fix up the error unwind for logical_ring_init() failing by moving the
cleanup into the callers who own the various bits of state during
initialisation, so we don't forget to free the state allocated by the
caller.
Chris Wilson [Thu, 20 Sep 2018 16:13:43 +0000 (17:13 +0100)]
drm/i915: Park the GPU on module load
Once we have flushed the first request through the system to both load a
context and record the default state; tell the GPU to park and idle
itself, putting itself immediately (hopefully at least) into a
powersaving state, and allowing ourselves to start from known state
after setting up all our bookkeeping.
Chris Wilson [Thu, 20 Sep 2018 14:49:34 +0000 (15:49 +0100)]
drm/i915/selftests: Live tests emit requests and so require rpm
As we emit requests or touch HW directly for some of the live tests, the
requirement is that we hold the rpm wakeref before doing so. We want a
mix of granularity since we will want to test runtime suspend, so try to
mark up only the critical sections where we need rpm for the live test.
Chris Wilson [Wed, 19 Sep 2018 20:54:32 +0000 (21:54 +0100)]
drm/i915/guc: Restore preempt-context across S3/S4
Stolen memory is lost across S4 (hibernate) or S3-RST as it is a portion
of ordinary volatile RAM. As we allocate our rings from stolen, this may
include the rings used for our preempt context and their breadcrumb
instructions. In order to allow preemption following hibernation and
loss of stolen memory, we therefore need to repopulate the instructions
inside the lost ring upon resume. To handle both module load and resume,
we simply defer constructing the ring to first use.
Chris Wilson [Thu, 20 Sep 2018 10:58:09 +0000 (11:58 +0100)]
drm/i915/selftests: Basic stress test for rapid context switching
We need to exercise the HW and submission paths for switching contexts
rapidly to check that features such as execlists' wa_tail are adequate.
Plus it's an interesting baseline latency metric.
v2: Check the initial request for allocation errors
v3: Use finite waits for more robust handling of broken code
Dave Airlie [Thu, 20 Sep 2018 04:12:01 +0000 (14:12 +1000)]
Merge tag 'du-next-20180914' of git://linuxtv.org/pinchartl/media into drm-next
R-Car DU changes for v4.20
The pull request mostly contains updates to the R-Car DU driver, notably
support for interlaced modes on Gen3 hardware, support for the LVDS output on
R8A77980, and a set of miscellaneous bug fixes. There are also two SPDX
conversion patches for the drm shmobile and panel-lvds drivers, as well as an
update to MAINTAINERS to add Kieran Bingham as a co-maintainer for the DU
driver.
Dave Airlie [Thu, 20 Sep 2018 00:14:59 +0000 (10:14 +1000)]
Merge tag 'drm-misc-next-2018-09-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 4.20:
UAPI Changes:
- None
Cross-subsystem Changes:
- None
Core Changes:
- Allow drivers to disable features with per-device granularity (Ville)
- Use EOPNOTSUPP when iface/feature is unsupported instead of
EINVAL/errno soup (Chris)
- Simplify M/N DP quirk by using constant N to limit size of M/N (Shawn)
- add quirk for LG LP140WF6-SPM1 eDP panel (Shawn)
Driver Changes:
- i915/amdgpu: Disable DRIVER_ATOMIC for older/unsupported devices (Ville)
- sun4i: add support for R40 HDMI PHY (Icenowy)
Evan Quan [Thu, 13 Sep 2018 08:14:33 +0000 (16:14 +0800)]
drm/amd/powerplay: update OD to take voltage value instead of offset
With the latest SMC fw, we are able to get the voltage value for
specific frequency point. So, we update the OD relates to take
absolute voltage instead of offset.
vega20 should use umc_info v3_3 instead of v3_1. There are
serveral versions of umc_info for vega series. Compared to
various versions of these structures, vram_info strucure is
unified for vega series. The patch switch to query mem_type
from vram_info structure for all the vega series dGPU.
drm/radeon: change function signature to pass full range
In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against
255 is done. Since it is always false, change the signature of this
function to use an `int` instead, which match the type used in caller:
`radeon_atom_hw_i2c_xfer`.
Fix the following warning triggered with W=1:
CC [M] drivers/gpu/drm/radeon/atombios_i2c.o
drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’:
drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always false due to limited range of data type [-Wtype-limits]
if (num > ATOM_MAX_HW_I2C_READ) {
^
A. Wilcox [Mon, 2 Jul 2018 03:44:52 +0000 (22:44 -0500)]
drm/amdgpu: use processed values for counting
adev->gfx.rlc has the values from rlc_hdr already processed by
le32_to_cpu. Using the rlc_hdr values on big-endian machines causes
a kernel Oops due to writing well outside of the array (0x24000000
instead of 0x24).
The N value was computed by kernel driver that based on synchronous clock
mode. But only specific N value (0x8000) would be acceptable for
LG LP140WF6-SPM1 eDP panel which is running at asynchronous clock mode.
With the other N value, Tcon will enter BITS mode and display black screen.
Add this panel into quirk database and give particular N value when
calculate M/N divider.
v2: no update
v3: add lost commit messages back for version 2
v4: send patch to both intel-gfx and dri-devel
drm: Change limited M/N quirk to constant N quirk.
Some DP dongles in particular seem to be fussy about too large
link M/N values. Set specific value for N divider can resolve
this issue per dongle vendor's comment. So configure N as
constant value (0x8000) to instead of reduce M/N formula when
specific DP dongle connected.
v2: add more comments for issue description and fix typo.
v3: add lost commit messages back for version 2
v4: send patch to both intel-gfx and dri-devel
DP quirk list just compare sink or branch device's OUI so far.
That means particular vendor's products will be applied specific
change. This change would confirm device_id the same or not.
Then driver can implement some changes for branch/sink device
that really need additional WA.
v2: use sizeof instead of hard coded '6'
v3: add lost commit messages back for version 2
v4: send patch to both intel-gfx and dri-devel
With virtio gpu ttm-pages being dma mapped, dma sync is needed when
swiotlb is used as bounce buffers, before TRANSFER_TO_HOST_2D/3D
commands are sent.
drm/i915/psr: Enable AUX-A IO power well on ICL for PSR
PSR requires AUX IO power well to be enabled. This was already in place
for CNL, extend this for ICL too. Not enabling the power well results in
the aux error interrupts when the hardware exits PSR.
Ville Syrjälä [Mon, 17 Sep 2018 15:15:03 +0000 (18:15 +0300)]
drm/i915/sdvo: Fix multi function encoder stuff
SDVO encoders can have multiple different types of outputs hanging off
them. Currently the code tries to muck around with various is_foo
flags in the encoder to figure out which type its driving. That doesn't
work with atomic and other stuff, so let's nuke those flags and just
look at which type of connector we're actually dealing with.
The is_hdmi we'll need as that's not discoverable via the output flags,
but we'll just move it under the connector.
We'll also move the sdvo fixed mode handling out from the .get_modes()
hook into the sdvo lvds init function so that we can bail out properly
if there is no fixed mode to be found.
Ville Syrjälä [Tue, 18 Sep 2018 13:10:59 +0000 (16:10 +0300)]
drm/i915: Fix logic fumble in rotation vs. ccs check
Smatch reports:
../drivers/gpu/drm/i915/intel_sprite.c:1192 skl_plane_check_fb() warn: was || intended here instead of &&?
Obviously smatch is correct here since we're trying to check if we're
using either of the ccs modifiers. Since we now have is_ccs_modifier()
let's use it to fix this.
Ville Syrjälä [Mon, 17 Sep 2018 17:14:14 +0000 (20:14 +0300)]
drm/i915: Replace some PAGE_SHIFTs with I915_GTT_PAGE_SIZE
Clean up some cases where we're dealing with GTT pages instead of
system pages to use I915_GTT_PAGE_SIZE instead of PAGE_SHIT. So
just replace the the shifts with mul/div as appropriate. These
are the easy ones, the rest probably need some actual thought.
No real changes in the generated asm. Only gen8_ppgtt_insert_4lvl()
was affected as gcc decided to do the following change:
- be9: 89 d9 mov %ebx,%ecx
- beb: c1 e1 0c shl $0xc,%ecx
- bee: 48 63 c9 movslq %ecx,%rcx
+ be9: 48 63 cb movslq %ebx,%rcx
+ bec: 48 c1 e1 0c shl $0xc,%rcx
and that then shifted a bunch of the offset by one byte. I presume
the sign extensions in the asm are due to integer promotions from
u16 etc. Hopefully someone has confirmed that those don't end up
doing the wrong thing for us.
The Gen3 VSP used by the DU for display does not support the packed VYUY
pixel format. Gen2 VSP hardware is able to process this format, but
DU + VSP operation isn't enabled on Gen2, and VYUY isn't a strategic
format, so it can be ignored.
Remove the format from the capabilities of the DU driver.
Laurent Pinchart [Fri, 31 Aug 2018 18:12:59 +0000 (19:12 +0100)]
drm: rcar-du: Update framebuffer pitch and alignment limits for Gen3
The framebuffer pitch and alignment constraints reflect the limitations
of the Gen2 DU hardware. On Gen3, the DU has no memory interface and
thus doesn't impose any constraint. The limitations come instead from
the VSP that has a limit of 65535 bytes for the pitch and no alignment
constraint. Update the checks accordingly.
Koji Matsuoka [Fri, 31 Aug 2018 18:12:58 +0000 (19:12 +0100)]
drm: rcar-du: Add support for missing pixel formats
This patch supports pixel format of RGB332, ARGB4444, XRGB4444,
BGR888, RGB888, BGRA8888, BGRX8888 and YVYU.
VYUY pixel format is not supported by H/W specification.
Jacopo Mondi [Wed, 22 Aug 2018 07:21:48 +0000 (09:21 +0200)]
drm: rcar-du: Write ESCR and OTAR as CRTC registers
The ESCR and OTAR registers exist in each DU channel, but at different
offsets for odd and even channels. This led to usage of the group
register access API to write them, with offsets macros named ESCR/OTAR
and ESCR2/OTAR2 for the first and second ESCR/OTAR register in the group
respectively.
The names are confusing as it suggests that the ESCR/OTAR registers for
DU0 and DU2 are taken into account, especially with writes performed to
the group register access API.
Rename the offsets to ESCR/OTAR02 and ESCR/OTAR13, and use the CRTC
register access API to clarify the code. The offsets values are updated
accordingly.
Jacopo Mondi [Wed, 22 Aug 2018 07:21:47 +0000 (09:21 +0200)]
drm: rcar-du: Rename and document dpll_ch field
Document and re-name the 'dpll_ch' field to a more precise 'dpll_mask' for
consistency with the 'channels_mask' field defined in 'struct
rcar_du_device_info'.
Jacopo Mondi [Mon, 20 Aug 2018 15:26:17 +0000 (17:26 +0200)]
drm: rcar-du: Improve non-DPLL clock selection
DU channels not equipped with a DPLL use an SoC internal (provided by
the CPG) or external clock source combined with a DU internal divider to
generate the desired output dot clock frequency.
The current clock selection procedure does not fully exploit the ability
of external clock sources to generate the exact dot clock frequency by
themselves, but relies instead on tuning the internal DU clock divider
only, resulting in a less precise clock generation process.
When possible, and desirable, ask the external clock source for the
exact output dot clock frequency, and select the clock source that
produces the frequency closest to the desired output dot clock.
This patch specifically targets platforms (like Salvator-X[S] and ULCBs)
where the DU's input dotclock.in is generated by the versaclock VC5
clock source, which is capable of generating the exact rate the DU needs
as pixel clock output.
This patch fixes higher resolution modes which requires an high pixel
clock output currently not working on non-HDMI DU channel (such as
1920x1080@60Hz on the VGA output).
Fixes: 1b30dbde8596 ("drm: rcar-du: Add support for external pixel clock") Signed-off-by: Jacopo Mondi <[email protected]>
[Factor out code to a helper function] Signed-off-by: Laurent Pinchart <[email protected]> Acked-by: Jacopo Mondi <[email protected]>
drm: rcar-du: Rework clock configuration based on hardware limits
The DU channels that have a display PLL (DPLL) can only use external
clock sources, and don't have an internal clock divider (with the
exception of H3 ES1.x where the post-divider is present and needs to be
used as a workaround for a DPLL silicon issue).
Rework the clock configuration to take this into account, avoiding
selection of non-existing clock sources or usage of a missing
post-divider.
drm/amd/display: stop using switch for different CS revisions
Clock sources currently have support for asic specific
function pointers. But actual separation into functions
was never performed, leaving us with giant functions that
rely on switch.
This change creates separate functions, removing switch use.
Chris Wilson [Tue, 11 Sep 2018 11:57:47 +0000 (12:57 +0100)]
drm/i915: Limit number of capture objects
If we fail to allocate an array for a large number of user requested
capture objects, reduce the array size and try to grab at least some of
the objects!
Chris Wilson [Thu, 13 Sep 2018 19:20:50 +0000 (20:20 +0100)]
drm: Differentiate the lack of an interface from invalid parameter
If the ioctl is not supported on a particular piece of HW/driver
combination, report ENOTSUP (aka EOPNOTSUPP) so that it can be easily
distinguished from both the lack of the ioctl and from a regular invalid
parameter.
v2: Across all the kms ioctls we had a mixture of reporting EINVAL,
ENODEV and a few ENOTSUPP (most where EINVAL) for a failed
drm_core_check_feature(). Update everybody to report ENOTSUPP.
v3: ENOTSUPP is an internal errno! It's value (524) does not correspond
to a POSIX errno, the one we want is ENOTSUP. However,
uapi/asm-generic/errno.h doesn't include ENOTSUP but man errno says
"ENOTSUP and EOPNOTSUPP have the same value on Linux,
but according to POSIX.1 these error values should be
distinct."
Alex Deucher [Thu, 13 Sep 2018 20:41:57 +0000 (15:41 -0500)]
drm/amdgpu: simplify Raven, Raven2, and Picasso handling
Treat them all as Raven rather than adding a new picasso
asic type. This simplifies a lot of code and also handles the
case of rv2 chips with the 0x15d8 pci id. It also fixes dmcu
fw handling for picasso.
drm/amdgpu: Style fixes to PRIME code documentation
* Use consistent capitalization in the description of function arguments
* Define and consistently use the BO acronym for buffer objects
* Some minor wording improvements
Signed-off-by: Vijetha Malkai <[email protected]>
[ Michel Dänzer: Made commit log more specific ]
Michel Dänzer [Wed, 12 Sep 2018 16:07:10 +0000 (18:07 +0200)]
drm/amdgpu: Initialize fences array entries in amdgpu_sa_bo_next_hole
The entries were only initialized once in amdgpu_sa_bo_new. If a fence
wasn't signalled yet in the first amdgpu_sa_bo_next_hole call, but then
got signalled before a later amdgpu_sa_bo_next_hole call, it could
destroy the fence but leave its pointer in the array, resulting in
use-after-free in amdgpu_sa_bo_new.
Huang Rui [Tue, 16 Jan 2018 02:42:58 +0000 (10:42 +0800)]
drm/amdgpu: fix the VM fault while write at the top of the invisible vram
Raven2 has a HW issue that it is unable to use the vram which is out of
MC_VM_SYSTEM_APERTURE_HIGH_ADDR. So here is the workaround that increase system
aperture high address to get rid of the VM fault and hardware hang.