Git Repo - linux.git/log

drm/amdgpu: move taking mmap_sem into get_user_pages v2

This didn't helped as intended, just simplify the code.

v2: unlock mmap_sem in the error path as well

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Account for shadow PTs in mapping update IB size.

When amdgpu_vm_frag_ptes calls amdgpu_vm_update_ptes and the pt
has a shadow PT we mirror all the write to the shadow PT too, which
results in twice the commands.

Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more

With shared reservation objects __ttm_bo_reserve() can easily fail even on
destroyed BOs. This prevents correct handling when we need to individualize
the reservation object.

Fix this by individualizing the object before even trying to reserve it.

Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: revert "fix deadlock of reservation between cs and gpu reset v2"

This reverts commit 10e709cb296c98424c03408d23e3addeddcd4088.

The patch doesn't work at all:
1. The CS can still be blocked because of amdgpu_ctx_add_fence().
2. The order of submission isn't correct any more.
3. We could end up using freed up memory because we now drop the
ctx reference to early.

This needs to be fixed cleanly by doing the context handling after the BO
handling, but this is a larger task just avoid the obvious crashes for now.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Monk Liu [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: Fix psm_set_user_performance_state()

We now pass a pointer to a pointer which seems to be
what they meant in the first place. The previous version
was modifying a pointer passed by value.

Fixes bug that was introduced by

commit 332798d40c2e91:drm/amd/powerplay: delete eventmgr layer in poweprlay

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-By: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix userptr put_page handling

Move calling put_page into the unpopulate callback. Otherwise we mess up the pages
reference count when it is unbound multiple times.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: Fix configuration error around populate_and_map() functions

Fixed kbuild errors when IOMMU/SWIOTLB are disabled.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix wait_any_fence

first is incorrect if hit NULL/signaled fence

Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: set uvd/vce/nb/mclk level as UMD P-state required

Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: add UMD P-state in powerplay.

This feature is for UMD to run benchmark in a
power state that is as steady as possible. kmd
need to fix the power state as stable as possible.
now, kmd support four level:
profile_standard,peak,min_sclk,min_mclk

move common related code to amd_powerplay.c

Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: delete eventmgr related files.

Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: delete eventmgr layer in poweprlay

Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: fix sclk setting for profile mode for CZ/ST

Need to select dpm0 to avoid clock fluctuations.

Reviewed-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: Use correct path to trace include

The header comment in include/trace/define_trace.h specifies that the
TRACE_INCLUDE_PATH needs to be relative to the define_trace.h header
rather than the trace file including it. Most instances get that wrong
and work around it by adding the $(src) directory to the include path.

While this works, it is preferable to refer to the correct path to the
trace file in the first place and avoid any workaround.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Use correct path to trace include

The header comment in include/trace/define_trace.h specifies that the
TRACE_INCLUDE_PATH needs to be relative to the define_trace.h header
rather than the trace file including it. Most instances get that wrong
and work around it by adding the $(src) directory to the include path.

While this works, it is preferable to refer to the correct path to the
trace file in the first place and avoid any workaround.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: Fix trace include path (v2)

Reviewed-by: Thierry Reding <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(v2): Drop Makefile change too.

drm/amd/amdgpu: Cleanup gmc_v9_0_suspend()

Even though fini returns 0 always it could theoretically
fail in the future. Might as well return it instead of 0.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gmc_v9_0_hw_init()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gmc_v9_0_gart_enable()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Simplify gmc_v9_0_vm_fault_interrupt_state()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Support full range of GFX ring names

Right now there's only one but the rest of the code is being
setup to support more so might as well fix this up too.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix placement flags in amdgpu_ttm_bind

Otherwise we lose the NO_EVICT flag and can try to evict pinned BOs.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix moved list handling in the VM

Only move BOs to the moved/relocated list when they aren't already on a list.

This prevents accidential removal from the evicted list.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: declare the new firmware files needed by polaris asics

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Flora Cui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: handle all fragment sizes v4

This can improve performance for some cases.

v2 (chk): handle all sizes, simplify the patch quite a bit
v3 (chk): adjust dw estimation as well
v4 (chk): use single loop, make end mask 64bit

Signed-off-by: Roger He <[email protected]>
Signed-off-by: Christian König <[email protected]>
Tested-by: Roger He <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Simplify gfx_v9_0_wait_for_idle()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Fix indentation in gfx_v9_0_mqd_init()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_rlc_stop()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_gfx_dynamic_mg_power_gating()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_gfx_static_mg_power_gating()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_gfx_pipeline_powergating()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_gfx_cg_power_gating()

Make it consistent in style with the other CG/PG enable functions...

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_cp_power_gating()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_sck_slow_down_on_power_down()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_sck_slow_down_on_power_up()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_enable_save_restore_machine()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up gfx_v9_0_ngg_en()

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Tidy up register list formatting.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: bump version for support of local BOs

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add IOCTL interface for per VM BOs v3

Add the IOCTL interface so that applications can allocate per VM BOs.

Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.

v2: add some extra checks, remove the WIP tag
v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add support for per VM BOs v2

Per VM BOs are handled like VM PDs and PTs. They are always valid and don't
need to be specified in the BO lists.

v2: validate PDs/PTs first

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: restrict userptr even more

Don't allow them to be GEM imported into another process.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix new PD update code for Vega10 v2

We need to refer to the parent instead of the root BO for multi
level page tables on Vega10. Also don't set the PDE_PTE bit.

v2: Don't set the PDE_PTE bit either.

Signed-off-by: Christian König <[email protected]>
Reviewed-and-Tested-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: move hw generation check into amdgpu_doorbell_init v2

This way we can safely call it on SI as well.

v2: fix type in commit message

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: cleanup the VM code a bit more

The src isn't used any more after GART hack removal.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: rework page directory filling v2

Keep track off relocated PDs/PTs instead of walking and checking all PDs.

v2: fix root PD handling

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]> (v1)
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay/hwmgr: Remove null check before kfree

kfree on NULL pointer is a no-op and therefore checking is redundant.

Reviewed-by: Harry Wentland <[email protected]>
Reviewed-by: Rex Zhu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Himanshu Jha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd: Remove null check before kfree

Kfree on NULL pointer is a no-op and therefore checking is redundant.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Himanshu Jha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: track evicted page tables v2

Instead of validating all page tables when one was evicted,
track which one needs a validation.

v2: simplify amdgpu_vm_ready as well

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]> (v1)
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix comment on amdgpu_bo_va

Except for the reference count all other members are protected
by the VM PD being reserved.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add bo_va cleared flag again v2

We changed this to use an extra list a while back, but for the next
series I need a separate flag again.

v2: reorder to avoid unlocked list access

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: rework moved handling in the VM v2

Instead of using the vm_state use a separate flag to note
that the BO was moved.

v2: reorder patches to avoid temporary lockless access

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Add write() method to VRAM debugfs entry (v2)

Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
(v2): Call get_user before holding spinlock.

Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: notify smu once display changed on Rv.

when User turn off display or screen idle timeout,
smu need this message to start S0i2 entry.

Signed-off-by: Rex Zhu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: add dummy pp table for raven. (v2)

As there is no PPTable in RV, it is difficult to
cleanly decouple PPTABLE functionality in existing
codes.

v2: agd: squash in clean build fix

Signed-off-by: Rex Zhu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: refine pp code for raven

delete useless code.

Signed-off-by: Rex Zhu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/gfx9: adjust mqd allocation size

To allocate additional space for the dynamic cu masks.
Confirmed with the hw team that we only need 1 dword
for the mask. The mask is the same for each SE so
you only need 1 dword.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/gfx9: update mqd to include dynamic CU mask

Necessary for proper operation with KIQ.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/gfx8: drop cz mqd

It was unused and according to hw team, it's the same for
all asics in a gfx family so remove it.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/gfx8: apply dynamic cu mask to APUs as well

Confirmed with the hw team. It's the same for all asics.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/powerplay/vega10: fix typo in register base index

Probably a copy pasta. No functional difference, both have
the same value.

Reviewed-by: Felix Kuehling <[email protected]>
Reported-by: Michael von Khurja <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: cleanup GWS, GDS and OA allocation

Those are certainly not kernel allocations, instead set the NO_CPU_ACCESS flag.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix and cleanup VM ready check

Stop checking the mapped BO itself, cause that one is
certainly not a page table.

Additional to that move the code into amdgpu_vm.c

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix amdgpu_vm_bo_map trace point

That somehow got lost.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Move VBIOS version to sysfs

sysfs is more stable, and doesn't require root to access

Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Add debugfs file for VBIOS and version

Add 2 debugfs files, one that contains the VBIOS version, and one that
contains the VBIOS itself. These won't change after initialization,
so we can add the VBIOS version when we parse the atombios information.

This ensures that we can find out the VBIOS version, even when the dmesg
buffer fills up, and makes it easier to associate which VBIOS version is
for which GPU on mGPU configurations. Set the size to 20 characters in
case of some weird VBIOS version that exceeds the expected 17 character
format (3-8-3\0). The VBIOS dump also allows for easy debugging

v2: Move to debugfs, clarify commit message, add VBIOS dump file

Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: Remove needless 'extern' on functions in header.

Minor tidy up.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: use new TTM populate/dma map helper functions

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Use new TTM populate/map helper function

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: Add helper functions to populate/map in one call (v2)

These functions replace a section of common code found
in radeon/amdgpu drivers (and possibly others) as part
of the ttm_tt_*populate() callbacks.

v2: squash in fix for sw iommu from Tom

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/include: Add hdmi_redriver_set to atomfirmware

We'll need this for a some upcoming display changes

Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Remove AMDGPU tracepoint and use new TTM tracepoint (v2)

Switches the AMDGPU driver over to the TTM tracepoint and removes
our old one. Now you can enable traces before loading the module
and trace all mappings.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(v2): Use struct device instead of pci in trace.

drm/ttm: Add DMA map/unmap tracepoint (v3)

Also exports two functions that vendor drivers can call
to trace DMA mappings. This is meant to help translate
IOMMU mappings of bus addresses back to physical pages.

Used by the umr amdgpu debugger for instance.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(v2): Use dev_name() to get PCI path instead.
(v3): Use correct types for dma/phys addresses

drm/amdgpu: support polaris10/11/12 new cp firmwares

Newer versions of the CP firmware require changes in how the driver
initializes the hw block.
Change the firmware name for new firmware to maintain compatibility with
older kernels.

Acked-by: Christian König <[email protected]>
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: remove duplicate return statement

Remove a redundant identical return statement, it has no use.

Detected by CoverityScan, CID#1454586 ("Structurally dead code")

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: check memory allocation failure

Check memory allocation failure and return -ENOMEM in such a case.

'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.

The calling graph would be, in such a case!
   failure in amdgpu_cs_process_syncobj_out_dep()
      ---> error code returned by amdgpu_cs_dependencies()
         --> amdgpu_cs_parser_fini() is called

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: fix BANK_SELECT on Vega10 (v2)

BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.

v2: agd: fix warning

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: inline amdgpu_ttm_do_bind again

The function is called only once and doesn't do anything special.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix amdgpu_ttm_bind

Use ttm_bo_mem_space instead of manually allocating GART space.

This allows us to evict BOs when there isn't enought GART space any more.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: remove the GART copy hack

This isn't used since we don't map evicted BOs to GART any more.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm:fix wrong decoding of bo_count

we observe abnormal number from:
/sys/devices/virtual/drm/amdttm/buffer_objects/bo_count

bo_count is atomic_inc which is "int" type,
shouldn't explicitly turn it to unsigned long.

Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/ttm: fix missing inc bo_count

Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: set sched_hw_submission higher for KIQ (v3)

KIQ doesn't really use the GPU scheduler.  The base
drivers generally use the KIQ ring directly rather than
submitting IBs.  However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2.  KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.

v2: move some expressions to one line
    change KIQ sched_hw_submission to at least 16
v3: bump to 256

Reviewed-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: move default gart size setting into gmc modules

Move the asic specific code into the IP modules.

Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: refine default gart size

Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.

Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: ACG frequency added in PPTable

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: discard commands of killed processes

When a process is killed we shouldn't submit all waiting jobs, but instead
clean up as fast as possible.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix and cleanup shadow handling

Set the shadow flag on the shadow and not the parent, always bind shadow BOs
during allocation instead of manually, use the reservation_object wrappers
to grab the lock.

This fixes a couple of issues with binding the shadow BOs as well as correctly
evicting them when memory becomes tight.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: add automatic per asic settings for gart_size

We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode. Change the default size based on
the asic type.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/gfx8: fix spelling typo in mqd allocation

Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: unhalt mec after loading

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/virtual_dce: Virtual display doesn't support disable vblank immediately

For virtual display, it uses software timer to emulate the vsync interrupt,
it doesn't have high precision, so doesn't support disable vblank immediately.

BUG: SWDEV-129274

Signed-off-by: Emily Deng <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix huge page updates with CPU

Correctly detect system memory mappings when using CPU and don't use
huge pages for them.

Avoid incorrectly translating a physical page table GPU address when
splitting a huge page while mapping system memory.

Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Merge branch 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux into drm-next

vmwgfx add fence fd support.

* 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux:
  drm/vmwgfx: Bump the version for fence FD support
  drm/vmwgfx: Add export fence to file descriptor support
  drm/vmwgfx: Add support for imported Fence File Descriptor
  drm/vmwgfx: Prepare to support fence fd
  drm/vmwgfx: Fix incorrect command header offset at restart
  drm/vmwgfx: Support the NOP_ERROR command
  drm/vmwgfx: Restart command buffers after errors
  drm/vmwgfx: Move irq bottom half processing to threads
  drm/vmwgfx: Don't use drm_irq_[un]install

Merge tag 'exynos-drm-next-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

Summary:
- Provide NV12MT pixel format support of Mixer driver in generic way.
- Refactor Exynos KMS drivers
  . Refactoring to panel detection way
  . Refactoring to setting up possible_crtcs
  . Refactoring to video and command mode support
- Some cleanups

* tag 'exynos-drm-next-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: simplify set_pixfmt() in DECON and FIMD drivers
  drm/exynos: consistent use of cpp
  drm/exynos: mixer: remove src offset from mixer_graph_buffer()
  drm/exynos: mixer: simplify mixer_graph_buffer()
  drm/exynos: mixer: simplify vp_video_buffer()
  drm/exynos: mixer: enable NV12MT support for the video plane
  drm/exynos: mixer: fix chroma comment in vp_video_buffer()
  arm64: dts: exynos: remove i80-if-timings nodes
  dt-bindings: exynos5433-decon: remove i80-if-timings property
  drm/exynos/decon5433: use mode info stored in CRTC to detect i80 mode
  drm/exynos: add mode_valid callback to exynos_drm
  drm/exynos/decon5433: refactor irq requesting code
  drm/exynos/mic: use mode info stored in CRTC to detect i80 mode
  drm/exynos/dsi: propagate info about command mode from panel
  drm/exynos/dsi: refactor panel detection logic
  drm/exynos: use helper to set possible crtcs
  drm/exynos/decon5433: use readl_poll_timeout helpers

Merge tag 'drm-misc-next-fixes-2017-08-28' of git://anongit.freedesktop.org/git/drm-misc into drm-next

UAPI Changes:
- Rename u32 to __u32 in struct drm_format_modifier_blob (Lionel)

Cc: Lionel Landwerlin <[email protected]>
* tag 'drm-misc-next-fixes-2017-08-28' of git://anongit.freedesktop.org/git/drm-misc:
drm: rename u32 in __u32 in uapi

drm/syncobj: Add a signal ioctl (v3)

This IOCTL provides a mechanism for userspace to trigger a sync object
directly.  There are other ways that userspace can trigger a syncobj
such as submitting a dummy batch somewhere or hanging on to a triggered
sync_file and doing an import.  This just provides an easy way to
manually trigger the sync object without weird hacks.

The motivation for this IOCTL is Vulkan fences.  Vulkan lets you create
a fence already in the signaled state so that you can wait on it
immediatly without stalling.  We could also handle this with a new
create flag to ask the driver to create a syncobj that is already
signaled but the IOCTL seemed a bit cleaner and more generic.

v2:
- Take an array of sync objects (Dave Airlie)
v3:
- Throw -EINVAL if pad != 0

Signed-off-by: Jason Ekstrand <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

drm/syncobj: Add a reset ioctl (v3)

This just resets the dma_fence to NULL so it looks like it's never been
signaled. This will be useful once we add the new wait API for allowing
wait on "submit and signal" behavior.

v2:
- Take an array of sync objects (Dave Airlie)
v3:
- Throw -EINVAL if pad != 0

Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Christian König <[email protected]> (v1)
Signed-off-by: Dave Airlie <[email protected]>

drm/syncobj: Add a syncobj_array_find helper

The wait ioctl has a bunch of code to read an syncobj handle array from
userspace and turn it into an array of syncobj pointers. We're about to
add two new IOCTLs which will need to work with arrays of syncobj
handles so let's make some helpers.

Signed-off-by: Jason Ekstrand <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

drm/syncobj: Allow wait for submit and signal behavior (v5)

Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
- Fix a bug in the invalid syncobj error path
- Unify the wait-all and wait-any cases
v3:
- Unify the timeout == 0 case a bit with the timeout > 0 case
- Use wait_event_interruptible_timeout
v4:
- Use proxy fence
v5:
- Revert to a combination of v2 and v3
- Don't use proxy fences
- Don't use wait_event_interruptible_timeout because it just adds an
   extra layer of callbacks

Signed-off-by: Jason Ekstrand <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>