mbox series

[libdrm,0/9] amdgpu:

Message ID 20190624165406.13682-1-michel@daenzer.net (mailing list archive)
Headers show
Series amdgpu: | expand

Message

Michel Dänzer June 24, 2019, 4:53 p.m. UTC
From: Michel Dänzer <michel.daenzer@amd.com>

The motivation for these patches is https://bugs.freedesktop.org/110903 .

Patches 1-3 are preparatory.

Patches 4 & 5 are the core patches allowing the issues discussed in the
bug report to be fixed.

Patches 6-8 are further optimizations / cleanups.

Patch 9 is the Mesa patch making use of the new
amdgpu_bo_handle_type_kms_user API to fix the user visible problem.

Note that the libdrm patches need to land first, a libdrm release needs
to be made, and the Mesa patch needs to bump the libdrm_amdgpu version
requirement to that release's version before it can land.

Michel Dänzer (9):
  amdgpu: Pass file descriptor directly to amdgpu_close_kms_handle
  amdgpu: Add BO handle to table in amdgpu_bo_create
  amdgpu: Rename fd_mutex/list to dev_mutex/list
  amdgpu: Add struct amdgpu_core_device and amdgpu_core_bo
  amdgpu: Add amdgpu_bo_handle_type_kms_user
  amdgpu: Re-use an existing amdgpu_device when possible
  amdgpu: Use normal integers for device / core BO reference counting
  amdgpu: Drop refcount member from struct amdgpu_core_bo/device
  winsys/amdgpu: Use amdgpu_bo_handle_type_kms_user for API KMS handles

 amdgpu/amdgpu.h               |  14 +-
 amdgpu/amdgpu_asic_id.c       |   4 +-
 amdgpu/amdgpu_bo.c            | 367 ++++++++++++++++++++++------------
 amdgpu/amdgpu_cs.c            |  64 +++---
 amdgpu/amdgpu_device.c        | 281 +++++++++++++-------------
 amdgpu/amdgpu_gpu_info.c      |  35 ++--
 amdgpu/amdgpu_internal.h      |  31 ++-
 amdgpu/amdgpu_vamgr.c         |   9 +-
 amdgpu/amdgpu_vm.c            |   4 +-
 tests/amdgpu/amdgpu_test.c    |   2 +-
 tests/amdgpu/bo_tests.c       |   2 +-
 tests/amdgpu/cs_tests.c       |   8 +-
 tests/amdgpu/deadlock_tests.c |   8 +-
 tests/amdgpu/uvd_enc_tests.c  |   2 +-
 tests/amdgpu/vce_tests.c      |  12 +-
 tests/amdgpu/vcn_tests.c      |   4 +-
 tests/amdgpu/vm_tests.c       |   2 +-
 17 files changed, 500 insertions(+), 349 deletions(-)

Comments

Christian König June 24, 2019, 5:31 p.m. UTC | #1
Patches #1 - #3 look good to me, but I'm not sure if the rest is such a 
good idea.

Basically you not only want to use the same FD for CS, but also for 
basically all buffer functions and as far as I can see we break that here.

I would rather add a new function to export the KMS handle for a certain 
BO/FD pair. Exporting the handle based on a type was also like trying to 
squeeze a round pig through a square hole in the first place.

KMS, flink and DMA-buf fd are fundamentally different and shouldn't be 
handled by the same function in the first place.

The only tricky part in this scenario is to know when to close the KMS 
handle again.

Christian.

Am 24.06.19 um 18:53 schrieb Michel Dänzer:
> From: Michel Dänzer <michel.daenzer@amd.com>
>
> The motivation for these patches is https://bugs.freedesktop.org/110903 .
>
> Patches 1-3 are preparatory.
>
> Patches 4 & 5 are the core patches allowing the issues discussed in the
> bug report to be fixed.
>
> Patches 6-8 are further optimizations / cleanups.
>
> Patch 9 is the Mesa patch making use of the new
> amdgpu_bo_handle_type_kms_user API to fix the user visible problem.
>
> Note that the libdrm patches need to land first, a libdrm release needs
> to be made, and the Mesa patch needs to bump the libdrm_amdgpu version
> requirement to that release's version before it can land.
>
> Michel Dänzer (9):
>    amdgpu: Pass file descriptor directly to amdgpu_close_kms_handle
>    amdgpu: Add BO handle to table in amdgpu_bo_create
>    amdgpu: Rename fd_mutex/list to dev_mutex/list
>    amdgpu: Add struct amdgpu_core_device and amdgpu_core_bo
>    amdgpu: Add amdgpu_bo_handle_type_kms_user
>    amdgpu: Re-use an existing amdgpu_device when possible
>    amdgpu: Use normal integers for device / core BO reference counting
>    amdgpu: Drop refcount member from struct amdgpu_core_bo/device
>    winsys/amdgpu: Use amdgpu_bo_handle_type_kms_user for API KMS handles
>
>   amdgpu/amdgpu.h               |  14 +-
>   amdgpu/amdgpu_asic_id.c       |   4 +-
>   amdgpu/amdgpu_bo.c            | 367 ++++++++++++++++++++++------------
>   amdgpu/amdgpu_cs.c            |  64 +++---
>   amdgpu/amdgpu_device.c        | 281 +++++++++++++-------------
>   amdgpu/amdgpu_gpu_info.c      |  35 ++--
>   amdgpu/amdgpu_internal.h      |  31 ++-
>   amdgpu/amdgpu_vamgr.c         |   9 +-
>   amdgpu/amdgpu_vm.c            |   4 +-
>   tests/amdgpu/amdgpu_test.c    |   2 +-
>   tests/amdgpu/bo_tests.c       |   2 +-
>   tests/amdgpu/cs_tests.c       |   8 +-
>   tests/amdgpu/deadlock_tests.c |   8 +-
>   tests/amdgpu/uvd_enc_tests.c  |   2 +-
>   tests/amdgpu/vce_tests.c      |  12 +-
>   tests/amdgpu/vcn_tests.c      |   4 +-
>   tests/amdgpu/vm_tests.c       |   2 +-
>   17 files changed, 500 insertions(+), 349 deletions(-)
>
Michel Dänzer June 25, 2019, 8:02 a.m. UTC | #2
On 2019-06-24 7:31 p.m., Christian König wrote:
> Patches #1 - #3 look good to me, but I'm not sure if the rest is such a
> good idea.
> 
> Basically you not only want to use the same FD for CS, but also for
> basically all buffer functions and as far as I can see we break that here.

How so? The core FD is used for everything except flink and
amdgpu_bo_handle_type_kms_user.


> I would rather add a new function to export the KMS handle for a certain
> BO/FD pair. Exporting the handle based on a type was also like trying to
> squeeze a round pig through a square hole in the first place.
> 
> KMS, flink and DMA-buf fd are fundamentally different and shouldn't be
> handled by the same function in the first place.

I don't really see the problem. radeonsi & RADV are using
amdgpu_bo_handle_type_kms to get handles for command stream submission,
so those handles have to be valid for the core FD. Now there's
amdgpu_bo_handle_type_kms_user to get a handle valid for the FD passed
to amdgpu_device_initialize. Can you describe a scenario where a handle
valid for yet another file description is needed?
Christian König June 25, 2019, 9:44 a.m. UTC | #3
Am 25.06.19 um 10:02 schrieb Michel Dänzer:
> On 2019-06-24 7:31 p.m., Christian König wrote:
>> Patches #1 - #3 look good to me, but I'm not sure if the rest is such a
>> good idea.
>>
>> Basically you not only want to use the same FD for CS, but also for
>> basically all buffer functions and as far as I can see we break that here.
> How so? The core FD is used for everything except flink and
> amdgpu_bo_handle_type_kms_user.

IIRC in the Mesa winsys we compare the amdgpu_device and amdgpu_bo 
pointers to figure out if an opened device or imported BO is the same as 
one we already know.

With this patch that won't work any more and for example OpenGL and 
VA-API could potentially use separate amdgpu_bo pointers for the same 
underlying buffer. That in turn would break synchronization.

Additional to that we potentially could keep a bunch of file descriptors 
around which we don't necessary want. In other words we previously 
closed the extra file descriptors, but now we keep them open as well. 
But that should probably be only a minor problem.

>> I would rather add a new function to export the KMS handle for a certain
>> BO/FD pair. Exporting the handle based on a type was also like trying to
>> squeeze a round pig through a square hole in the first place.
>>
>> KMS, flink and DMA-buf fd are fundamentally different and shouldn't be
>> handled by the same function in the first place.
> I don't really see the problem. radeonsi & RADV are using
> amdgpu_bo_handle_type_kms to get handles for command stream submission,
> so those handles have to be valid for the core FD. Now there's
> amdgpu_bo_handle_type_kms_user to get a handle valid for the FD passed
> to amdgpu_device_initialize. Can you describe a scenario where a handle
> valid for yet another file description is needed?

Not yet another file descriptor, but the underlying problem here seems 
to be that we don't tell libdrm in which domain/file descriptor the KMS 
should actually be valid in.

Regards,
Christian.
Michel Dänzer June 25, 2019, 10:07 a.m. UTC | #4
On 2019-06-25 11:44 a.m., Koenig, Christian wrote:
> Am 25.06.19 um 10:02 schrieb Michel Dänzer:
>> On 2019-06-24 7:31 p.m., Christian König wrote:
>>> Patches #1 - #3 look good to me, but I'm not sure if the rest is such a
>>> good idea.
>>>
>>> Basically you not only want to use the same FD for CS, but also for
>>> basically all buffer functions and as far as I can see we break that here.
>> How so? The core FD is used for everything except flink and
>> amdgpu_bo_handle_type_kms_user.
> 
> IIRC in the Mesa winsys we compare the amdgpu_device and amdgpu_bo 
> pointers to figure out if an opened device or imported BO is the same as 
> one we already know.
> 
> With this patch that won't work any more and for example OpenGL and 
> VA-API could potentially use separate amdgpu_bo pointers for the same 
> underlying buffer. That in turn would break synchronization.

Hmm, I was hoping patch 6 would cover this, but it looks like OpenGL and
VA-API pass file descriptors referencing different file descriptions to
amdgpu_device_initialize, so you're right. :(


>>> I would rather add a new function to export the KMS handle for a certain
>>> BO/FD pair.

I'll try this approach then.