
[RFC] mesa: Export BOs in RW mode

Message ID 20190703133357.10217-1-boris.brezillon@collabora.com
State New, archived

Commit Message

Boris Brezillon July 3, 2019, 1:33 p.m. UTC
Exported BOs might be imported back, then mmap()-ed to be written
to. Most drivers handle that by mmap()-ing the GEM handle after it's
been imported, but, according to [1], this is illegal. The panfrost
driver has recently switched to this generic helper (which was renamed
into drm_gem_map_offset() in the meantime) [2], and mmap()-ing
of imported BOs now fails.

Now I'm wondering how this should be solved. I guess the first question
is, is mmap()-ing of imported BOs really forbidden for all BOs? I guess
calling mmap() on a buffer that's been exported by the DRM driver itself
then re-imported shouldn't hurt, so maybe we can check that before
failing.

Now, if we really want to forbid mmap() on imported BOs, that means we
need a solution to mmap() the dmabuf object directly, and sometimes this
mapping will request RW permissions. The problem is, all functions
exporting BOs in mesa export them in RO mode (the resulting FD is
O_RDONLY), thus preventing mmap()s in RW mode.

This patch modifies all
drmPrimeHandleToFD()/ioctl(DRM_IOCTL_PRIME_HANDLE_TO_FD) call sites
to pass the DRM_RDWR flag so that what's described above becomes
possible.

I'm not saying this is what we should do, it's more a way to start the
discussion. Feel free to propose alternatives to this solution.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gem.c?h=v5.2-rc7#n318
[2] https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/panfrost?id=583bbf46133c726bae277e8f4e32bfba2a528c7f

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: <dri-devel@lists.freedesktop.org>
Cc: <mesa-dev@lists.freedesktop.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Rob Herring <robh@kernel.org>
---
Cc-ing dri-devel is not a mistake, I really want to have feedback from the DRM
maintainers, since this started with a kernel-side change.
---
 src/etnaviv/drm/etnaviv_bo.c                      | 4 ++--
 src/freedreno/drm/freedreno_bo.c                  | 4 ++--
 src/freedreno/vulkan/tu_drm.c                     | 2 +-
 src/gallium/auxiliary/renderonly/renderonly.c     | 2 +-
 src/gallium/drivers/iris/iris_bufmgr.c            | 2 +-
 src/gallium/drivers/lima/lima_bo.c                | 2 +-
 src/gallium/drivers/panfrost/pan_drm.c            | 2 +-
 src/gallium/drivers/v3d/v3d_bufmgr.c              | 2 +-
 src/gallium/drivers/vc4/vc4_bufmgr.c              | 2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c     | 3 ++-
 src/gallium/winsys/svga/drm/vmw_screen_dri.c      | 5 +++--
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
 src/gallium/winsys/virgl/drm/virgl_drm_winsys.c   | 3 ++-
 src/intel/vulkan/anv_gem.c                        | 2 +-
 src/mesa/drivers/dri/i965/brw_bufmgr.c            | 2 +-
 15 files changed, 21 insertions(+), 18 deletions(-)

Comments

Rob Herring July 3, 2019, 1:45 p.m. UTC | #1
On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> Exported BOs might be imported back, then mmap()-ed to be written
> to. Most drivers handle that by mmap()-ing the GEM handle after it's
> been imported, but, according to [1], this is illegal.

It's not illegal, but is supposed to go through the dmabuf mmap
functions. However, none of the drivers I've looked at (etnaviv, msm,
v3d, vgem) do that. It probably works because it's the same driver
doing the import and export or both drivers have essentially the same
implementations.

> The panfrost
> driver has recently switched to this generic helper (which was renamed
> into drm_gem_map_offset() in the meantime) [2], and mmap()-ing
> of imported BOs now fails.

I think I'm going to revert this.

> Now I'm wondering how this should be solved. I guess the first question
> is, is mmap()-ing of imported BOs really forbidden for all BOs? I guess
> calling mmap() on a buffer that's been exported by the DRM driver itself
> then re-imported shouldn't hurt, so maybe we can check that before
> failing.
>
> Now, if we really want to forbid mmap() on imported BOs, that means we
> need a solution to mmap() the dmabuf object directly, and sometimes this
> mapping will request RW permissions. The problem is, all functions
> exporting BOs in mesa export them in RO mode (the resulting FD is
> O_RDONLY), thus preventing mmap()s in RW mode.
>
> This patch modifies all
> drmPrimeHandleToFD()/ioctl(DRM_IOCTL_PRIME_HANDLE_TO_FD) call sites
> to pass the DRM_RDWR flag so that what's described above becomes
> possible.
>
> I'm not saying this is what we should do, it's more a way to start the
> discussion. Feel free to propose alternatives to this solution.
>
> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gem.c?h=v5.2-rc7#n318
> [2]https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/panfrost?id=583bbf46133c726bae277e8f4e32bfba2a528c7f
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: <dri-devel@lists.freedesktop.org>
> Cc: <mesa-dev@lists.freedesktop.org>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> ---
> Cc-ing dri-devel is not a mistake, I really want to have feedback from the DRM
> maintainers, since this started with a kernel-side change.
> ---
>  src/etnaviv/drm/etnaviv_bo.c                      | 4 ++--
>  src/freedreno/drm/freedreno_bo.c                  | 4 ++--
>  src/freedreno/vulkan/tu_drm.c                     | 2 +-
>  src/gallium/auxiliary/renderonly/renderonly.c     | 2 +-
>  src/gallium/drivers/iris/iris_bufmgr.c            | 2 +-
>  src/gallium/drivers/lima/lima_bo.c                | 2 +-
>  src/gallium/drivers/panfrost/pan_drm.c            | 2 +-
>  src/gallium/drivers/v3d/v3d_bufmgr.c              | 2 +-
>  src/gallium/drivers/vc4/vc4_bufmgr.c              | 2 +-
>  src/gallium/winsys/radeon/drm/radeon_drm_bo.c     | 3 ++-
>  src/gallium/winsys/svga/drm/vmw_screen_dri.c      | 5 +++--
>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
>  src/gallium/winsys/virgl/drm/virgl_drm_winsys.c   | 3 ++-
>  src/intel/vulkan/anv_gem.c                        | 2 +-
>  src/mesa/drivers/dri/i965/brw_bufmgr.c            | 2 +-
>  15 files changed, 21 insertions(+), 18 deletions(-)
>
> diff --git a/src/etnaviv/drm/etnaviv_bo.c b/src/etnaviv/drm/etnaviv_bo.c
> index 6e952fa47858..92634141b580 100644
> --- a/src/etnaviv/drm/etnaviv_bo.c
> +++ b/src/etnaviv/drm/etnaviv_bo.c
> @@ -294,8 +294,8 @@ int etna_bo_dmabuf(struct etna_bo *bo)
>  {
>         int ret, prime_fd;
>
> -       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
> -                               &prime_fd);
> +       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
> +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
>         if (ret) {
>                 ERROR_MSG("failed to get dmabuf fd: %d", ret);
>                 return ret;
> diff --git a/src/freedreno/drm/freedreno_bo.c b/src/freedreno/drm/freedreno_bo.c
> index 7449160f1371..ba19b08d7c54 100644
> --- a/src/freedreno/drm/freedreno_bo.c
> +++ b/src/freedreno/drm/freedreno_bo.c
> @@ -318,8 +318,8 @@ int fd_bo_dmabuf(struct fd_bo *bo)
>  {
>         int ret, prime_fd;
>
> -       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
> -                       &prime_fd);
> +       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
> +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
>         if (ret) {
>                 ERROR_MSG("failed to get dmabuf fd: %d", ret);
>                 return ret;
> diff --git a/src/freedreno/vulkan/tu_drm.c b/src/freedreno/vulkan/tu_drm.c
> index 9b2e6f78879e..6bef3012ddb5 100644
> --- a/src/freedreno/vulkan/tu_drm.c
> +++ b/src/freedreno/vulkan/tu_drm.c
> @@ -147,7 +147,7 @@ tu_gem_export_dmabuf(const struct tu_device *dev, uint32_t gem_handle)
>  {
>     int prime_fd;
>     int ret = drmPrimeHandleToFD(dev->physical_device->local_fd, gem_handle,
> -                                DRM_CLOEXEC, &prime_fd);
> +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
>
>     return ret == 0 ? prime_fd : -1;
>  }
> diff --git a/src/gallium/auxiliary/renderonly/renderonly.c b/src/gallium/auxiliary/renderonly/renderonly.c
> index d6a344009378..c1cc31115105 100644
> --- a/src/gallium/auxiliary/renderonly/renderonly.c
> +++ b/src/gallium/auxiliary/renderonly/renderonly.c
> @@ -101,7 +101,7 @@ renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc,
>     out_handle->type = WINSYS_HANDLE_TYPE_FD;
>     out_handle->stride = create_dumb.pitch;
>
> -   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC,
> +   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC | O_RDWR,
>           (int *)&out_handle->handle);
>     if (err < 0) {
>        fprintf(stderr, "failed to export dumb buffer: %s\n", strerror(errno));
> diff --git a/src/gallium/drivers/iris/iris_bufmgr.c b/src/gallium/drivers/iris/iris_bufmgr.c
> index d80945824098..68e5d8a8bb13 100644
> --- a/src/gallium/drivers/iris/iris_bufmgr.c
> +++ b/src/gallium/drivers/iris/iris_bufmgr.c
> @@ -1369,7 +1369,7 @@ iris_bo_export_dmabuf(struct iris_bo *bo, int *prime_fd)
>     iris_bo_make_external(bo);
>
>     if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
> -                          DRM_CLOEXEC, prime_fd) != 0)
> +                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
>        return -errno;
>
>     bo->reusable = false;
> diff --git a/src/gallium/drivers/lima/lima_bo.c b/src/gallium/drivers/lima/lima_bo.c
> index 1d6dd720602a..df0114fc67ed 100644
> --- a/src/gallium/drivers/lima/lima_bo.c
> +++ b/src/gallium/drivers/lima/lima_bo.c
> @@ -205,7 +205,7 @@ bool lima_bo_export(struct lima_bo *bo, struct winsys_handle *handle)
>        return true;
>
>     case WINSYS_HANDLE_TYPE_FD:
> -      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC,
> +      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
>                               (int*)&handle->handle))
>           return false;
>
> diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
> index 8de4f483435c..a086575c453a 100644
> --- a/src/gallium/drivers/panfrost/pan_drm.c
> +++ b/src/gallium/drivers/panfrost/pan_drm.c
> @@ -181,7 +181,7 @@ panfrost_drm_export_bo(struct panfrost_screen *screen, const struct panfrost_bo
>  {
>          struct drm_prime_handle args = {
>                  .handle = bo->gem_handle,
> -                .flags = DRM_CLOEXEC,
> +                .flags = DRM_CLOEXEC | DRM_RDWR,
>          };
>
>          int ret = drmIoctl(screen->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
> diff --git a/src/gallium/drivers/v3d/v3d_bufmgr.c b/src/gallium/drivers/v3d/v3d_bufmgr.c
> index f0df1b983737..0c78994d899f 100644
> --- a/src/gallium/drivers/v3d/v3d_bufmgr.c
> +++ b/src/gallium/drivers/v3d/v3d_bufmgr.c
> @@ -424,7 +424,7 @@ v3d_bo_get_dmabuf(struct v3d_bo *bo)
>  {
>          int fd;
>          int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
> -                                     O_CLOEXEC, &fd);
> +                                     DRM_CLOEXEC | DRM_RDWR, &fd);
>          if (ret != 0) {
>                  fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
>                          bo->handle);
> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c
> index 716ca50ea069..c7e89b6a517e 100644
> --- a/src/gallium/drivers/vc4/vc4_bufmgr.c
> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
> @@ -462,7 +462,7 @@ vc4_bo_get_dmabuf(struct vc4_bo *bo)
>  {
>          int fd;
>          int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
> -                                     O_CLOEXEC, &fd);
> +                                     DRM_CLOEXEC | DRM_RDWR, &fd);
>          if (ret != 0) {
>                  fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
>                          bo->handle);
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> index d1e2a8685ba9..ae7958a8b892 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> @@ -1320,7 +1320,8 @@ static bool radeon_winsys_bo_get_handle(struct pb_buffer *buffer,
>      } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
>          whandle->handle = bo->handle;
>      } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
> -        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC, (int*)&whandle->handle))
> +        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
> +                               (int*)&whandle->handle))
>              return false;
>      }
>
> diff --git a/src/gallium/winsys/svga/drm/vmw_screen_dri.c b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> index a85ee18e45b9..24f6ddac0890 100644
> --- a/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> +++ b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> @@ -345,8 +345,9 @@ vmw_drm_surface_get_handle(struct svga_winsys_screen *sws,
>         whandle->handle = vsrf->sid;
>         break;
>      case WINSYS_HANDLE_TYPE_FD:
> -       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid, DRM_CLOEXEC,
> -                               (int *)&whandle->handle);
> +       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid,
> +                                DRM_CLOEXEC | DRM_RDWR,
> +                                (int *)&whandle->handle);
>         if (ret) {
>           vmw_error("Failed to get file descriptor from prime.\n");
>           return FALSE;
> diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> index d9b417dc4dad..ce47e66ab1e5 100644
> --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> @@ -446,7 +446,7 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
>        return TRUE;
>     case WINSYS_HANDLE_TYPE_FD:
>        if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
> -                             DRM_CLOEXEC, (int*)&whandle->handle)) {
> +                             DRM_CLOEXEC | DRM_RDWR, (int*)&whandle->handle)) {
>           whandle->stride = plane->stride;
>           whandle->offset = plane->offset;
>           return TRUE;
> diff --git a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> index 9eec92f57363..07fab2735e25 100644
> --- a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> +++ b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> @@ -418,7 +418,8 @@ static boolean virgl_drm_winsys_resource_get_handle(struct virgl_winsys *qws,
>     } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
>        whandle->handle = res->bo_handle;
>     } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
> -      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC, (int*)&whandle->handle))
> +      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC | DRM_RDWR,
> +                             (int*)&whandle->handle))
>              return FALSE;
>        mtx_lock(&qdws->bo_handles_mutex);
>        util_hash_table_set(qdws->bo_handles, (void *)(uintptr_t)res->bo_handle, res);
> diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> index 1bdf040c1a37..8f07928638ce 100644
> --- a/src/intel/vulkan/anv_gem.c
> +++ b/src/intel/vulkan/anv_gem.c
> @@ -399,7 +399,7 @@ anv_gem_handle_to_fd(struct anv_device *device, uint32_t gem_handle)
>  {
>     struct drm_prime_handle args = {
>        .handle = gem_handle,
> -      .flags = DRM_CLOEXEC,
> +      .flags = DRM_CLOEXEC | DRM_RDWR,
>     };
>
>     int ret = anv_ioctl(device->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index a7c684063158..ee639d2cf23a 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -1489,7 +1489,7 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd)
>     brw_bo_make_external(bo);
>
>     if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
> -                          DRM_CLOEXEC, prime_fd) != 0)
> +                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
>        return -errno;
>
>     bo->reusable = false;
> --
> 2.21.0
>
Boris Brezillon July 3, 2019, 1:56 p.m. UTC | #2
On Wed, 3 Jul 2019 07:45:32 -0600
Rob Herring <robh+dt@kernel.org> wrote:

> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> >
> > Exported BOs might be imported back, then mmap()-ed to be written
> > to. Most drivers handle that by mmap()-ing the GEM handle after it's
> > been imported, but, according to [1], this is illegal.  
> 
> It's not illegal, but is supposed to go through the dmabuf mmap
> functions.

That's basically what I'm proposing here; I just didn't post the patch
skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
instead of the DRM-node one, but I have it working for panfrost.

> However, none of the drivers I've looked at (etnaviv, msm,
> v3d, vgem) do that. It probably works because it's the same driver
> doing the import and export or both drivers have essentially the same
> implementations.

Yes, but maybe that's something we should start fixing if mmap()-ing
the dmabuf is the recommended solution.
Steven Price July 3, 2019, 2:13 p.m. UTC | #3
On 03/07/2019 14:56, Boris Brezillon wrote:
> On Wed, 3 Jul 2019 07:45:32 -0600
> Rob Herring <robh+dt@kernel.org> wrote:
> 
>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
>> <boris.brezillon@collabora.com> wrote:
>>>
>>> Exported BOs might be imported back, then mmap()-ed to be written
>>> to. Most drivers handle that by mmap()-ing the GEM handle after it's
>>> been imported, but, according to [1], this is illegal.  
>>
>> It's not illegal, but is supposed to go through the dmabuf mmap
>> functions.
> 
> That's basically what I'm proposing here; I just didn't post the patch
> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
> instead of the DRM-node one, but I have it working for panfrost.

If we want to we could make the Panfrost kernel driver internally call
dma_buf_mmap() so that mapping using the DRM-node "just works". This is
indeed what the kbase driver does.

>> However, none of the drivers I've looked at (etnaviv, msm,
>> v3d, vgem) do that. It probably works because it's the same driver
>> doing the import and export or both drivers have essentially the same
>> implementations.
> 
> Yes, but maybe that's something we should start fixing if mmap()-ing
> the dmabuf is the recommended solution.

I'm open to options here. User space calling mmap() on the dma_buf file
descriptor should always be safe (the exporter can do whatever is
necessary to make it work). If that happens then the patches I posted
close off the DRM node version which could be broken if the exporter
needs to do anything to prepare the buffer for CPU access (e.g. cache
maintenance).

Alternatively if user space wants/needs to use the DRM node then we can
take a look at what needs to change in the kernel. From a quick look at
the code it seems we'd need to split drm_gem_mmap() into a helper so
that it can return whether the exporter is handling everything or if the
caller needs to do some more work (e.g. drm_gem_shmem_mmap() needs to
allocate backing pages). But because drm_gem_mmap() is used as the
direct callback for some drivers we'd need to preserve the interface.

The below (completely untested) patch demonstrates the idea.

Steve

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index a8c4468f03d9..df661e24cadf 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1140,7 +1140,7 @@ EXPORT_SYMBOL(drm_gem_mmap_obj);
  * If the caller is not granted access to the buffer object, the mmap will fail
  * with EACCES. Please see the vma manager for more information.
  */
-int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+int drm_gem_mmap_helper(struct file *filp, struct vm_area_struct *vma)
 {
 	struct drm_file *priv = filp->private_data;
 	struct drm_device *dev = priv->minor->dev;
@@ -1189,6 +1189,11 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 		vma->vm_flags &= ~VM_MAYWRITE;
 	}

+	if (obj->import_attach) {
+		ret = dma_buf_mmap(obj->dma_buf, vma, 0);
+		return ret?:1;
+	}
+
 	ret = drm_gem_mmap_obj(obj, drm_vma_node_size(node) << PAGE_SHIFT,
 			       vma);

@@ -1196,6 +1201,16 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)

 	return ret;
 }
+
+int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	int ret;
+
+	ret = drm_gem_mmap_helper(filp, vma);
+	if (ret == 1)
+		return 0;
+	return ret;
+}
 EXPORT_SYMBOL(drm_gem_mmap);

 void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 472ea5d81f82..b85d84e4d4a8 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -466,8 +466,10 @@ int drm_gem_shmem_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct drm_gem_shmem_object *shmem;
 	int ret;

-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
+	ret = drm_gem_mmap_helper(filp, vma);
+	if (ret == 1)
+		return 0; /* Exporter handles the mapping */
+	else if (ret)
 		return ret;

 	shmem = to_drm_gem_shmem_obj(vma->vm_private_data);
Boris Brezillon July 3, 2019, 2:33 p.m. UTC | #4
On Wed, 3 Jul 2019 15:13:25 +0100
Steven Price <steven.price@arm.com> wrote:

> On 03/07/2019 14:56, Boris Brezillon wrote:
> > On Wed, 3 Jul 2019 07:45:32 -0600
> > Rob Herring <robh+dt@kernel.org> wrote:
> >   
> >> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> >> <boris.brezillon@collabora.com> wrote:  
> >>>
> >>> Exported BOs might be imported back, then mmap()-ed to be written
> >>> to. Most drivers handle that by mmap()-ing the GEM handle after it's
> >>> been imported, but, according to [1], this is illegal.    
> >>
> >> It's not illegal, but is supposed to go through the dmabuf mmap
> >> functions.  
> > 
> >>> That's basically what I'm proposing here; I just didn't post the patch
> > skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
> > instead of the DRM-node one, but I have it working for panfrost.  
> 
> If we want to we could make the Panfrost kernel driver internally call
> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
> indeed what the kbase driver does.

Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or
ignore its return code), so calling mmap() on the dmabuf FD instead of
the DRM-node FD shouldn't be that hard.

> 
> >> However, none of the drivers I've looked at (etnaviv, msm,
> >> v3d, vgem) do that. It probably works because it's the same driver
> >> doing the import and export or both drivers have essentially the same
> >> implementations.  
> > 
> > Yes, but maybe that's something we should start fixing if mmap()-ing
> > the dmabuf is the recommended solution.  
> 
> I'm open to options here. User space calling mmap() on the dma_buf file
> descriptor should always be safe (the exporter can do whatever is
> necessary to make it work). If that happens then the patches I posted
> close off the DRM node version which could be broken if the exporter
> needs to do anything to prepare the buffer for CPU access (e.g. cache
> maintenance).

Talking about CPU <-> GPU syncs, I was wondering if the
mmap(gem_handle) step was providing any guarantee that would
allow us to ignore all the cache maintenance operations that are
required when mmap()-ing a dmabuf directly. Note that in both cases the
dmabuf is imported.

> 
> Alternatively if user space wants/needs to use the DRM node then we can
> take a look at what needs to change in the kernel. From a quick look at
> the code it seems we'd need to split drm_gem_mmap() into a helper so
> that it can return whether the exporter is handling everything or if the
> caller needs to do some more work (e.g. drm_gem_shmem_mmap() needs to
> allocate backing pages). But because drm_gem_mmap() is used as the
> direct callback for some drivers we'd need to preserve the interface.
> 
> The below (completely untested) patch demonstrates the idea.
> 
> Steve
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index a8c4468f03d9..df661e24cadf 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1140,7 +1140,7 @@ EXPORT_SYMBOL(drm_gem_mmap_obj);
>   * If the caller is not granted access to the buffer object, the mmap will fail
>   * with EACCES. Please see the vma manager for more information.
>   */
> -int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
> +int drm_gem_mmap_helper(struct file *filp, struct vm_area_struct *vma)
>  {
>  	struct drm_file *priv = filp->private_data;
>  	struct drm_device *dev = priv->minor->dev;
> @@ -1189,6 +1189,11 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
>  		vma->vm_flags &= ~VM_MAYWRITE;
>  	}
> 
> +	if (obj->import_attach) {
> +		ret = dma_buf_mmap(obj->dma_buf, vma, 0);
> +		return ret?:1;
> +	}
> +
>  	ret = drm_gem_mmap_obj(obj, drm_vma_node_size(node) << PAGE_SHIFT,
>  			       vma);
> 
> @@ -1196,6 +1201,16 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
> 
>  	return ret;
>  }
> +
> +int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> +	int ret;
> +
> +	ret = drm_gem_mmap_helper(filp, vma);
> +	if (ret == 1)
> +		return 0;
> +	return ret;
> +}
>  EXPORT_SYMBOL(drm_gem_mmap);
> 
>  void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 472ea5d81f82..b85d84e4d4a8 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -466,8 +466,10 @@ int drm_gem_shmem_mmap(struct file *filp, struct vm_area_struct *vma)
>  	struct drm_gem_shmem_object *shmem;
>  	int ret;
> 
> -	ret = drm_gem_mmap(filp, vma);
> -	if (ret)
> +	ret = drm_gem_mmap_helper(filp, vma);
> +	if (ret == 1)
> +		return 0; /* Exporter handles the mapping */
> +	else if (ret)
>  		return ret;
> 
>  	shmem = to_drm_gem_shmem_obj(vma->vm_private_data);
Steven Price July 3, 2019, 2:50 p.m. UTC | #5
On 03/07/2019 15:33, Boris Brezillon wrote:
> On Wed, 3 Jul 2019 15:13:25 +0100
> Steven Price <steven.price@arm.com> wrote:
> 
>> On 03/07/2019 14:56, Boris Brezillon wrote:
>>> On Wed, 3 Jul 2019 07:45:32 -0600
>>> Rob Herring <robh+dt@kernel.org> wrote:
>>>   
>>>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
>>>> <boris.brezillon@collabora.com> wrote:  
>>>>>
>>>>> Exported BOs might be imported back, then mmap()-ed to be written
>>>>> to. Most drivers handle that by mmap()-ing the GEM handle after it's
>>>>> been imported, but, according to [1], this is illegal.    
>>>>
>>>> It's not illegal, but is supposed to go through the dmabuf mmap
>>>> functions.  
>>>
>>> That's basically what I'm proposing here; I just didn't post the patch
>>> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
>>> instead of the DRM-node one, but I have it working for panfrost.  
>>
>> If we want to we could make the Panfrost kernel driver internally call
>> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
>> indeed what the kbase driver does.
> 
> Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or
> ignore its return code), so calling mmap() on the dmabuf FD instead of
> the DRM-node FD shouldn't be that hard.

What I was suggesting is that user space would still call
DRM_IOCTL_PANFROST_MMAP_BO to get an offset which it uses in a call to
mmap(..., drm_node_fd, offset). The kernel could detect that the buffer
is imported and call the exporter for the actual mmap() functionality.

The alternative is that user space 'simply' remembers that a buffer is
imported and keeps the file descriptor around so that it can instead
directly mmap() the dma_buf fd, which is certainly easiest from the
kernel's perspective (and was what I assumed panfrost was doing - I
should have checked more closely!).

>>>> However, none of the drivers I've looked at (etnaviv, msm,
>>>> v3d, vgem) do that. It probably works because it's the same driver
>>>> doing the import and export or both drivers have essentially the same
>>>> implementations.  
>>>
>>> Yes, but maybe that's something we should start fixing if mmap()-ing
>>> the dmabuf is the recommended solution.  
>>
>> I'm open to options here. User space calling mmap() on the dma_buf file
>> descriptor should always be safe (the exporter can do whatever is
>> necessary to make it work). If that happens then the patches I posted
>> close off the DRM node version which could be broken if the exporter
>> needs to do anything to prepare the buffer for CPU access (e.g. cache
>> maintenance).
> 
> Talking about CPU <-> GPU syncs, I was wondering if the
> mmap(gem_handle) step was providing any guarantee that would
> allow us to ignore all the cache maintenance operations that are
> required when mmap()-ing a dmabuf directly. Note that in both cases the
> dmabuf is imported.

In theory the exporter should do whatever is required to ensure that the
CPU is synchronised when a user space mapping exists. There are some
issues here though:

* In theory the kernel driver should map the dma_buf purely for the
duration that a job is using the buffer (and unmap immediately after).
This gives the exporter the knowledge of when the GPU is using the
memory and allows the exporter to page the memory out if necessary.
In practice this map/unmap operation is expensive (updating the GPU's
page tables) so most drivers don't actually bother and keep the memory
mapped. This means the exporter cannot tell when the buffer is used or
move the pages.

* The CPU mappings can be faulted on demand (performing the necessary
CPU cache invalidate if needed) and shot-down to allow moving the
memory. In theory when the GPU needs the memory it should map the buffer
and the exporter can then shoot down the mappings, perform the CPU cache
clean and then allow the GPU to use the memory. A subsequent CPU access
would then refault the page, ensuring a CPU cache invalidate so the
latest data is visible.

* The majority of exporters are simple and deal with uncached memory
(e.g. frame buffers) or are actually exporting back to the same driver
(e.g. window surfaces). In these situations either the driver already
has the necessary "magic" to deal with caches (e.g. kbase provides
explicit cache maintenance operations), or it's "uncached" anyway so it
doesn't matter. This means that hardly anyone tests with the complex
cases...

From a user space ABI, my understanding is that a dma_buf mmap() mapping
should be coherent, and user space isn't expected to do anything to make
it work. Obviously any importing device might have it's own coherency
details which will be up to the ABI of that device (e.g. Mali has caches
which may need to be flushed - this is usually done at the start/end of
a job chain, so coherency is not guaranteed while the job chain is running).

Steve
Boris Brezillon July 3, 2019, 3:07 p.m. UTC | #6
On Wed, 3 Jul 2019 15:50:08 +0100
Steven Price <steven.price@arm.com> wrote:

> On 03/07/2019 15:33, Boris Brezillon wrote:
> > On Wed, 3 Jul 2019 15:13:25 +0100
> > Steven Price <steven.price@arm.com> wrote:
> >   
> >> On 03/07/2019 14:56, Boris Brezillon wrote:  
> >>> On Wed, 3 Jul 2019 07:45:32 -0600
> >>> Rob Herring <robh+dt@kernel.org> wrote:
> >>>     
> >>>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> >>>> <boris.brezillon@collabora.com> wrote:    
> >>>>>
> >>>>> Exported BOs might be imported back, then mmap()-ed to be written
> >>>>> to. Most drivers handle that by mmap()-ing the GEM handle after it's
> >>>>> been imported, but, according to [1], this is illegal.      
> >>>>
> >>>> It's not illegal, but is supposed to go thru the dmabuf mmap
> >>>> functions.    
> >>>
> >>> That's basically what I'm proposing here, just didn't post the patch
> >>> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
> >>> instead of the DRM-node one, but I have it working for panfrost.    
> >>
> >> If we want to we could make the Panfrost kernel driver internally call
> >> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
> >> indeed what the kbase driver does.  
> > 
> > Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or
> > ignore its return code), so calling mmap() on the dmabuf FD instead of
> > the DRM-node FD shouldn't be that hard.  
> 
> What I was suggesting is that user space would still call
> DRM_IOCTL_PANFROST_MMAP_BO to get an offset which it then uses in a call to
> mmap(..., drm_node_fd, offset). The kernel could detect that the buffer
> is imported and call the exporter for the actual mmap() functionality.

Oops, sorry, brain fart. I thought it was DRM_IOCTL_PANFROST_MMAP_BO
that was failing, but it's actually the mmap() call, so providing this
wrapper kernel-side should work.
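
Such a wrapper could look roughly like the following (untested
kernel-side sketch, modelled on what drivers like etnaviv do; the
function name and the native-BO fallback are placeholders):

```c
/* In the driver's GEM mmap path: when the object wraps an imported
 * dma-buf, forward the mapping to the exporter via dma_buf_mmap()
 * instead of mapping the backing pages ourselves, so any cache
 * maintenance the exporter needs still happens. */
static int panfrost_gem_obj_mmap(struct drm_gem_object *obj,
				 struct vm_area_struct *vma)
{
	if (obj->import_attach) {
		/* Drop the DRM fake offset before handing the vma to
		 * the exporter, which installs its own vm_ops. */
		vma->vm_pgoff = 0;
		return dma_buf_mmap(obj->import_attach->dmabuf, vma, 0);
	}

	/* Native BO: map the shmem pages as before (driver-specific). */
	return drm_gem_mmap_obj(obj, obj->size, vma);
}
```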

> 
> The alternative is that user space 'simply' remembers that a buffer is
> imported and keeps the file descriptor around so that it can instead
> directly mmap() the dma_buf fd. Which is certainly easiest from the
> kernel's perspective (and was what I assumed panfrost was doing - I
> should have checked more closely!).
> 
> >>>> However, none of the drivers I've looked at (etnaviv, msm,
> >>>> v3d, vgem) do that. It probably works because it's the same driver
> >>>> doing the import and export or both drivers have essentially the same
> >>>> implementations.    
> >>>
> >>> Yes, but maybe that's something we should start fixing if mmap()-ing
> >>> the dmabuf is the recommended solution.    
> >>
> >> I'm open to options here. User space calling mmap() on the dma_buf file
> >> descriptor should always be safe (the exporter can do whatever is
> >> necessary to make it work). If that happens then the patches I posted
> >> close off the DRM node version which could be broken if the exporter
> >> needs to do anything to prepare the buffer for CPU access (i.e. cache
> >> maintenance).  
> > 
> > Talking about CPU <-> GPU syncs, I was wondering if the
> > mmap(gem_handle) step was providing any guarantee that would
> > allow us to ignore all the cache maintenance operations that are
> > required when mmap()-ing a dmabuf directly. Note that in both cases the
> > dmabuf is imported.  
> 
> In theory the exporter should do whatever is required to ensure that the
> CPU is synchronised when a user space mapping exists. There are some
> issues here though:
> 
> * In theory the kernel driver should map the dma_buf purely for the
> duration that a job is using the buffer (and unmap immediately after).
> This gives the exporter the knowledge of when the GPU is using the
> memory and allows the exporter to page out of the memory if necessary.
> In practice this map/unmap operation is expensive (updating the GPU's
> page tables) so most drivers don't actually bother and keep the memory
> mapped. This means the exporter cannot tell when the buffer is used or
> move the pages.
> 
> * The CPU mappings can be faulted on demand (performing the necessary
> CPU cache invalidate if needed) and shot down to allow moving the
> memory. In theory when the GPU needs the memory it should map the buffer
> and the exporter can then shoot down the mappings, perform the CPU cache
> clean and then allow the GPU to use the memory. A subsequent CPU access
> would then refault the page, ensuring a CPU cache invalidate so the
> latest data is visible.
> 
> * The majority of exporters are simple and deal with uncached memory
> (e.g. frame buffers) or are actually exporting back to the same driver
> (e.g. window surfaces). In these situations either the driver already
> has the necessary "magic" to deal with caches (e.g. kbase provides
> explicit cache maintenance operations), or it's "uncached" anyway so it
> doesn't matter. This means that hardly anyone tests with the complex
> cases...
> 
> From a user space ABI perspective, my understanding is that a dma_buf
> mmap() mapping should be coherent, and user space isn't expected to do
> anything to make it work. Obviously any importing device might have its own coherency
> details which will be up to the ABI of that device (e.g. Mali has caches
> which may need to be flushed - this is usually done at the start/end of
> a job chain, so coherency is not guaranteed while the job chain is running).

Thanks for the detailed explanation.
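
For reference, the "remember the dma-buf fd and mmap() it directly"
alternative discussed above boils down to a plain mmap() on that fd,
which is exactly where the RO/RW export mode matters: a MAP_SHARED
PROT_WRITE mapping of an O_RDONLY fd fails with EACCES. A minimal
sketch (hypothetical helper, plain POSIX):

```c
#include <stddef.h>
#include <sys/mman.h>

/* Map an imported dma-buf for CPU read/write access through its fd.
 * This only succeeds if the exporter created the fd with O_RDWR,
 * i.e. DRM_RDWR was passed to DRM_IOCTL_PRIME_HANDLE_TO_FD; with a
 * read-only fd, mmap(MAP_SHARED, PROT_WRITE) fails with EACCES. */
static void *dmabuf_map_rw(int dmabuf_fd, size_t size)
{
	void *map = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, dmabuf_fd, 0);

	return map == MAP_FAILED ? NULL : map;
}
```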
Daniel Vetter July 3, 2019, 3:47 p.m. UTC | #7
On Wed, Jul 03, 2019 at 07:45:32AM -0600, Rob Herring wrote:
> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> >
> > Exported BOs might be imported back, then mmap()-ed to be written
> > to. Most drivers handle that by mmap()-ing the GEM handle after it's
> > been imported, but, according to [1], this is illegal.
> 
> It's not illegal, but is supposed to go thru the dmabuf mmap
> functions. However, none of the drivers I've looked at (etnaviv, msm,
> v3d, vgem) do that. It probably works because it's the same driver
> doing the import and export or both drivers have essentially the same
> implementations.

For self-import we short-circuit out the underlying dma-buf and directly
go to the gem_bo again.

> > The panfrost
> > driver has recently switched to this generic helper (which was renamed
> > into drm_gem_map_offset() in the meantime) [2], and mmap()-ing
> > of imported BOs now fails.
> 
> I think I'm going to revert this.

Can't we change the helper to force the rw mode to make this work?

Aside: The reason I'm not a huge fan of forwarding gem mmap to dma-buf
mmap is that generally the begin/end_cpu_access stuff gets lost on the way
there. Which mostly doesn't matter since everyone prefers to share
coherent buffers, except when it totally does matter. So by forwarding
mmap and pretending it all keeps working as-is we're digging ourselves a
bit into a hole I think :-/

Since the same discussion is happening with etnaviv: Why exactly does mesa
need to mmap imported buffers?
-Daniel

> 
> > Now I'm wondering how this should be solved. I guess the first question
> > is, is mmap()-ing of imported BOs really forbidden for all BOs? I guess
> > calling mmap() on a buffer that's been exported by the DRM driver itself
> > then re-imported shouldn't hurt, so maybe we can check that before
> > failing.
> >
> > Now, if we really want to forbid mmap() on imported BOs, that means we
> > need a solution to mmap() the dmabuf object directly, and sometimes this
> > mapping will request RW permissions. The problem is, all functions
> > exporting BOs in mesa are exporting them in RO-mode (resulting FD is
> > O_READ), thus preventing mmap()s in RW mode.
> >
> > This patch modifies all
> > drmPrimeHandleToFD()/ioctl(DRM_IOCTL_PRIME_HANDLE_TO_FD) call sites
> > to pass the DRM_RDWR flag so that what's described above becomes
> > possible.
> >
> > I'm not saying this is what we should do, it's more a way to start the
> > discussion. Feel free to propose alternatives to this solution.
> >
> > [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gem.c?h=v5.2-rc7#n318
> > [2]https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/panfrost?id=583bbf46133c726bae277e8f4e32bfba2a528c7f
> >
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: <dri-devel@lists.freedesktop.org>
> > Cc: <mesa-dev@lists.freedesktop.org>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Rob Herring <robh@kernel.org>
> > ---
> > Cc-ing dri-devel is not a mistake, I really want to have feedback from the DRM
> > maintainers, since this started with a kernel-side change.
> > ---
> >  src/etnaviv/drm/etnaviv_bo.c                      | 4 ++--
> >  src/freedreno/drm/freedreno_bo.c                  | 4 ++--
> >  src/freedreno/vulkan/tu_drm.c                     | 2 +-
> >  src/gallium/auxiliary/renderonly/renderonly.c     | 2 +-
> >  src/gallium/drivers/iris/iris_bufmgr.c            | 2 +-
> >  src/gallium/drivers/lima/lima_bo.c                | 2 +-
> >  src/gallium/drivers/panfrost/pan_drm.c            | 2 +-
> >  src/gallium/drivers/v3d/v3d_bufmgr.c              | 2 +-
> >  src/gallium/drivers/vc4/vc4_bufmgr.c              | 2 +-
> >  src/gallium/winsys/radeon/drm/radeon_drm_bo.c     | 3 ++-
> >  src/gallium/winsys/svga/drm/vmw_screen_dri.c      | 5 +++--
> >  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
> >  src/gallium/winsys/virgl/drm/virgl_drm_winsys.c   | 3 ++-
> >  src/intel/vulkan/anv_gem.c                        | 2 +-
> >  src/mesa/drivers/dri/i965/brw_bufmgr.c            | 2 +-
> >  15 files changed, 21 insertions(+), 18 deletions(-)
> >
> > diff --git a/src/etnaviv/drm/etnaviv_bo.c b/src/etnaviv/drm/etnaviv_bo.c
> > index 6e952fa47858..92634141b580 100644
> > --- a/src/etnaviv/drm/etnaviv_bo.c
> > +++ b/src/etnaviv/drm/etnaviv_bo.c
> > @@ -294,8 +294,8 @@ int etna_bo_dmabuf(struct etna_bo *bo)
> >  {
> >         int ret, prime_fd;
> >
> > -       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
> > -                               &prime_fd);
> > +       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
> > +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
> >         if (ret) {
> >                 ERROR_MSG("failed to get dmabuf fd: %d", ret);
> >                 return ret;
> > diff --git a/src/freedreno/drm/freedreno_bo.c b/src/freedreno/drm/freedreno_bo.c
> > index 7449160f1371..ba19b08d7c54 100644
> > --- a/src/freedreno/drm/freedreno_bo.c
> > +++ b/src/freedreno/drm/freedreno_bo.c
> > @@ -318,8 +318,8 @@ int fd_bo_dmabuf(struct fd_bo *bo)
> >  {
> >         int ret, prime_fd;
> >
> > -       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
> > -                       &prime_fd);
> > +       ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
> > +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
> >         if (ret) {
> >                 ERROR_MSG("failed to get dmabuf fd: %d", ret);
> >                 return ret;
> > diff --git a/src/freedreno/vulkan/tu_drm.c b/src/freedreno/vulkan/tu_drm.c
> > index 9b2e6f78879e..6bef3012ddb5 100644
> > --- a/src/freedreno/vulkan/tu_drm.c
> > +++ b/src/freedreno/vulkan/tu_drm.c
> > @@ -147,7 +147,7 @@ tu_gem_export_dmabuf(const struct tu_device *dev, uint32_t gem_handle)
> >  {
> >     int prime_fd;
> >     int ret = drmPrimeHandleToFD(dev->physical_device->local_fd, gem_handle,
> > -                                DRM_CLOEXEC, &prime_fd);
> > +                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
> >
> >     return ret == 0 ? prime_fd : -1;
> >  }
> > diff --git a/src/gallium/auxiliary/renderonly/renderonly.c b/src/gallium/auxiliary/renderonly/renderonly.c
> > index d6a344009378..c1cc31115105 100644
> > --- a/src/gallium/auxiliary/renderonly/renderonly.c
> > +++ b/src/gallium/auxiliary/renderonly/renderonly.c
> > @@ -101,7 +101,7 @@ renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc,
> >     out_handle->type = WINSYS_HANDLE_TYPE_FD;
> >     out_handle->stride = create_dumb.pitch;
> >
> > -   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC,
> > +   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC | O_RDWR,
> >           (int *)&out_handle->handle);
> >     if (err < 0) {
> >        fprintf(stderr, "failed to export dumb buffer: %s\n", strerror(errno));
> > diff --git a/src/gallium/drivers/iris/iris_bufmgr.c b/src/gallium/drivers/iris/iris_bufmgr.c
> > index d80945824098..68e5d8a8bb13 100644
> > --- a/src/gallium/drivers/iris/iris_bufmgr.c
> > +++ b/src/gallium/drivers/iris/iris_bufmgr.c
> > @@ -1369,7 +1369,7 @@ iris_bo_export_dmabuf(struct iris_bo *bo, int *prime_fd)
> >     iris_bo_make_external(bo);
> >
> >     if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
> > -                          DRM_CLOEXEC, prime_fd) != 0)
> > +                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
> >        return -errno;
> >
> >     bo->reusable = false;
> > diff --git a/src/gallium/drivers/lima/lima_bo.c b/src/gallium/drivers/lima/lima_bo.c
> > index 1d6dd720602a..df0114fc67ed 100644
> > --- a/src/gallium/drivers/lima/lima_bo.c
> > +++ b/src/gallium/drivers/lima/lima_bo.c
> > @@ -205,7 +205,7 @@ bool lima_bo_export(struct lima_bo *bo, struct winsys_handle *handle)
> >        return true;
> >
> >     case WINSYS_HANDLE_TYPE_FD:
> > -      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC,
> > +      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
> >                               (int*)&handle->handle))
> >           return false;
> >
> > diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
> > index 8de4f483435c..a086575c453a 100644
> > --- a/src/gallium/drivers/panfrost/pan_drm.c
> > +++ b/src/gallium/drivers/panfrost/pan_drm.c
> > @@ -181,7 +181,7 @@ panfrost_drm_export_bo(struct panfrost_screen *screen, const struct panfrost_bo
> >  {
> >          struct drm_prime_handle args = {
> >                  .handle = bo->gem_handle,
> > -                .flags = DRM_CLOEXEC,
> > +                .flags = DRM_CLOEXEC | DRM_RDWR,
> >          };
> >
> >          int ret = drmIoctl(screen->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
> > diff --git a/src/gallium/drivers/v3d/v3d_bufmgr.c b/src/gallium/drivers/v3d/v3d_bufmgr.c
> > index f0df1b983737..0c78994d899f 100644
> > --- a/src/gallium/drivers/v3d/v3d_bufmgr.c
> > +++ b/src/gallium/drivers/v3d/v3d_bufmgr.c
> > @@ -424,7 +424,7 @@ v3d_bo_get_dmabuf(struct v3d_bo *bo)
> >  {
> >          int fd;
> >          int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
> > -                                     O_CLOEXEC, &fd);
> > +                                     DRM_CLOEXEC | DRM_RDWR, &fd);
> >          if (ret != 0) {
> >                  fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
> >                          bo->handle);
> > diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c
> > index 716ca50ea069..c7e89b6a517e 100644
> > --- a/src/gallium/drivers/vc4/vc4_bufmgr.c
> > +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
> > @@ -462,7 +462,7 @@ vc4_bo_get_dmabuf(struct vc4_bo *bo)
> >  {
> >          int fd;
> >          int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
> > -                                     O_CLOEXEC, &fd);
> > +                                     DRM_CLOEXEC | DRM_RDWR, &fd);
> >          if (ret != 0) {
> >                  fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
> >                          bo->handle);
> > diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> > index d1e2a8685ba9..ae7958a8b892 100644
> > --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> > +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> > @@ -1320,7 +1320,8 @@ static bool radeon_winsys_bo_get_handle(struct pb_buffer *buffer,
> >      } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
> >          whandle->handle = bo->handle;
> >      } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
> > -        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC, (int*)&whandle->handle))
> > +        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
> > +                               (int*)&whandle->handle))
> >              return false;
> >      }
> >
> > diff --git a/src/gallium/winsys/svga/drm/vmw_screen_dri.c b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> > index a85ee18e45b9..24f6ddac0890 100644
> > --- a/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> > +++ b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> > @@ -345,8 +345,9 @@ vmw_drm_surface_get_handle(struct svga_winsys_screen *sws,
> >         whandle->handle = vsrf->sid;
> >         break;
> >      case WINSYS_HANDLE_TYPE_FD:
> > -       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid, DRM_CLOEXEC,
> > -                               (int *)&whandle->handle);
> > +       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid,
> > +                                DRM_CLOEXEC | DRM_RDWR,
> > +                                (int *)&whandle->handle);
> >         if (ret) {
> >           vmw_error("Failed to get file descriptor from prime.\n");
> >           return FALSE;
> > diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> > index d9b417dc4dad..ce47e66ab1e5 100644
> > --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> > +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> > @@ -446,7 +446,7 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
> >        return TRUE;
> >     case WINSYS_HANDLE_TYPE_FD:
> >        if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
> > -                             DRM_CLOEXEC, (int*)&whandle->handle)) {
> > +                             DRM_CLOEXEC | DRM_RDWR, (int*)&whandle->handle)) {
> >           whandle->stride = plane->stride;
> >           whandle->offset = plane->offset;
> >           return TRUE;
> > diff --git a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> > index 9eec92f57363..07fab2735e25 100644
> > --- a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> > +++ b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> > @@ -418,7 +418,8 @@ static boolean virgl_drm_winsys_resource_get_handle(struct virgl_winsys *qws,
> >     } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
> >        whandle->handle = res->bo_handle;
> >     } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
> > -      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC, (int*)&whandle->handle))
> > +      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC | DRM_RDWR,
> > +                             (int*)&whandle->handle))
> >              return FALSE;
> >        mtx_lock(&qdws->bo_handles_mutex);
> >        util_hash_table_set(qdws->bo_handles, (void *)(uintptr_t)res->bo_handle, res);
> > diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> > index 1bdf040c1a37..8f07928638ce 100644
> > --- a/src/intel/vulkan/anv_gem.c
> > +++ b/src/intel/vulkan/anv_gem.c
> > @@ -399,7 +399,7 @@ anv_gem_handle_to_fd(struct anv_device *device, uint32_t gem_handle)
> >  {
> >     struct drm_prime_handle args = {
> >        .handle = gem_handle,
> > -      .flags = DRM_CLOEXEC,
> > +      .flags = DRM_CLOEXEC | DRM_RDWR,
> >     };
> >
> >     int ret = anv_ioctl(device->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
> > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > index a7c684063158..ee639d2cf23a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > @@ -1489,7 +1489,7 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd)
> >     brw_bo_make_external(bo);
> >
> >     if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
> > -                          DRM_CLOEXEC, prime_fd) != 0)
> > +                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
> >        return -errno;
> >
> >     bo->reusable = false;
> > --
> > 2.21.0
> >
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Daniel Vetter July 3, 2019, 3:49 p.m. UTC | #8
On Wed, Jul 03, 2019 at 05:47:24PM +0200, Daniel Vetter wrote:
> On Wed, Jul 03, 2019 at 07:45:32AM -0600, Rob Herring wrote:
> > On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> > <boris.brezillon@collabora.com> wrote:
> > >
> > > Exported BOs might be imported back, then mmap()-ed to be written
> > > to. Most drivers handle that by mmap()-ing the GEM handle after it's
> > > been imported, but, according to [1], this is illegal.
> > 
> > It's not illegal, but is supposed to go thru the dmabuf mmap
> > functions. However, none of the drivers I've looked at (etnaviv, msm,
> > v3d, vgem) do that. It probably works because it's the same driver
> > doing the import and export or both drivers have essentially the same
> > implementations.
> 
> For self-import we short-circuit out the underlying dma-buf and directly
> go to the gem_bo again.
> 
> > > The panfrost
> > > driver has recently switched to this generic helper (which was renamed
> > > into drm_gem_map_offset() in the meantime) [2], and mmap()-ing
> > > of imported BOs now fails.
> > 
> > I think I'm going to revert this.
> 
> Can't we change the helper to force the rw mode to make this work?
> 
> Aside: The reason I'm not a huge fan of forwarding gem mmap to dma-buf
> mmap is that generally the begin/end_cpu_access stuff gets lost on the way
> there. Which mostly doesn't matter since everyone prefers to share
> coherent buffers, except when it totally does matter. So by forwarding
> mmap and pretending it all keeps working as-is we're digging ourselves a
> bit into a hole I think :-/
> 
> Since the same discussion is happening with etnaviv: Why exactly does mesa
> need to mmap imported buffers?

Also, for dumb mmap I very much want to keep this requirement. If you
import a dma-buf and then mmap it through the dumb buffer interfaces,
you're just doing all the things wrong :-)
-Daniel

> -Daniel
> 
> > 
> > > [...]
> > > +                             (int*)&whandle->handle))
> > >              return FALSE;
> > >        mtx_lock(&qdws->bo_handles_mutex);
> > >        util_hash_table_set(qdws->bo_handles, (void *)(uintptr_t)res->bo_handle, res);
> > > diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> > > index 1bdf040c1a37..8f07928638ce 100644
> > > --- a/src/intel/vulkan/anv_gem.c
> > > +++ b/src/intel/vulkan/anv_gem.c
> > > @@ -399,7 +399,7 @@ anv_gem_handle_to_fd(struct anv_device *device, uint32_t gem_handle)
> > >  {
> > >     struct drm_prime_handle args = {
> > >        .handle = gem_handle,
> > > -      .flags = DRM_CLOEXEC,
> > > +      .flags = DRM_CLOEXEC | DRM_RDWR,
> > >     };
> > >
> > >     int ret = anv_ioctl(device->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
> > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > > index a7c684063158..ee639d2cf23a 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > > @@ -1489,7 +1489,7 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd)
> > >     brw_bo_make_external(bo);
> > >
> > >     if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
> > > -                          DRM_CLOEXEC, prime_fd) != 0)
> > > +                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
> > >        return -errno;
> > >
> > >     bo->reusable = false;
> > > --
> > > 2.21.0
> > >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Rob Herring July 3, 2019, 3:59 p.m. UTC | #9
On Wed, Jul 3, 2019 at 8:13 AM Steven Price <steven.price@arm.com> wrote:
>
> On 03/07/2019 14:56, Boris Brezillon wrote:
> > On Wed, 3 Jul 2019 07:45:32 -0600
> > Rob Herring <robh+dt@kernel.org> wrote:
> >
> >> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> >> <boris.brezillon@collabora.com> wrote:
> >>>
> >>> Exported BOs might be imported back, then mmap()-ed to be written
> >>> too. Most drivers handle that by mmap()-ing the GEM handle after it's
> >>> been imported, but, according to [1], this is illegal.
> >>
> >> It's not illegal, but is supposed to go thru the dmabuf mmap
> >> functions.
> >
> > That's basically what I'm proposing here, just didn't post the patch
> > skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
> > instead of the DRM-node one, but I have it working for panfrost.
>
> If we want to we could make the Panfrost kernel driver internally call
> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
> indeed what the kbase driver does.
>
> >> However, none of the drivers I've looked at (etnaviv, msm,
> >> v3d, vgem) do that. It probably works because it's the same driver
> >> doing the import and export or both drivers have essentially the same
> >> implementations.
> >
> > Yes, but maybe that's something we should start fixing if mmap()-ing
> > the dmabuf is the recommended solution.
>
> I'm open to options here. User space calling mmap() on the dma_buf file
> descriptor should always be safe (the exporter can do whatever is
> necessary to make it work). If that happens then the patches I posted
> close off the DRM node version which could be broken if the exporter
> needs to do anything to prepare the buffer for CPU access (i.e. cache
> maintenance).
>
> Alternatively if user space wants/needs to use the DRM node then we can
> take a look at what needs to change in the kernel. From a quick look at
> the code it seems we'd need to split drm_gem_mmap() into a helper so
> that it can return whether the exporter is handling everything or if the
> caller needs to do some more work (e.g. drm_gem_shmem_mmap() needs to
> allocate backing pages). But because drm_gem_mmap() is used as the
> direct callback for some drivers we'd need to preserve the interface.
>
> The below (completely untested) patch demonstrates the idea.
>
> Steve
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index a8c4468f03d9..df661e24cadf 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1140,7 +1140,7 @@ EXPORT_SYMBOL(drm_gem_mmap_obj);
>   * If the caller is not granted access to the buffer object, the mmap will fail
>   * with EACCES. Please see the vma manager for more information.
>   */
> -int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
> +int drm_gem_mmap_helper(struct file *filp, struct vm_area_struct *vma)
>  {
>         struct drm_file *priv = filp->private_data;
>         struct drm_device *dev = priv->minor->dev;
> @@ -1189,6 +1189,11 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
>                 vma->vm_flags &= ~VM_MAYWRITE;
>         }
>
> +       if (obj->import_attach) {
> +               ret = dma_buf_mmap(obj->dma_buf, vma, 0);

I don't see how calling dma_buf_mmap() would work. Looking at etnaviv
which is the only GPU driver calling it, it looks to me like there is a
recursive loop:

driver->fops.mmap -> obj->mmap -> dma_buf_mmap -> .gem_prime_mmap ->
etnaviv_gem_prime_mmap -> obj->mmap -> dma_buf_mmap -> ...

Rob
Alyssa Rosenzweig July 3, 2019, 4:07 p.m. UTC | #10
> Since the same discussion is happening with etnaviv: Why exactly does mesa
> need to mmap imported buffers?

The golden question!

Seemingly, (one of the) reasons is that the state tracker does
colourspace conversion in software if the winsys wants a format that the
driver doesn't advertise. Unrelated to this, that's almost certainly a
performance issue and might explain quite a bit...

In this case, the winsys seems to want RGB10_A2 but we settle for RGBA8.
I thought that was just changing EGL negotiation stuff... I had no idea
it was being covered up in software... not good...
Steven Price July 3, 2019, 4:15 p.m. UTC | #11
On 03/07/2019 16:59, Rob Herring wrote:
> On Wed, Jul 3, 2019 at 8:13 AM Steven Price <steven.price@arm.com> wrote:
>>
>> On 03/07/2019 14:56, Boris Brezillon wrote:
>>> On Wed, 3 Jul 2019 07:45:32 -0600
>>> Rob Herring <robh+dt@kernel.org> wrote:
>>>
>>>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
>>>> <boris.brezillon@collabora.com> wrote:
>>>>>
>>>>> Exported BOs might be imported back, then mmap()-ed to be written
>>>>> too. Most drivers handle that by mmap()-ing the GEM handle after it's
>>>>> been imported, but, according to [1], this is illegal.
>>>>
>>>> It's not illegal, but is supposed to go thru the dmabuf mmap
>>>> functions.
>>>
>>> That's basically what I'm proposing here, just didn't post the patch
>>> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
>>> instead of the DRM-node one, but I have it working for panfrost.
>>
>> If we want to we could make the Panfrost kernel driver internally call
>> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
>> indeed what the kbase driver does.
>>
>>>> However, none of the drivers I've looked at (etnaviv, msm,
>>>> v3d, vgem) do that. It probably works because it's the same driver
>>>> doing the import and export or both drivers have essentially the same
>>>> implementations.
>>>
>>> Yes, but maybe that's something we should start fixing if mmap()-ing
>>> the dmabuf is the recommended solution.
>>
>> I'm open to options here. User space calling mmap() on the dma_buf file
>> descriptor should always be safe (the exporter can do whatever is
>> necessary to make it work). If that happens then the patches I posted
>> close off the DRM node version which could be broken if the exporter
>> needs to do anything to prepare the buffer for CPU access (i.e. cache
>> maintenance).
>>
>> Alternatively if user space wants/needs to use the DRM node then we can
>> take a look at what needs to change in the kernel. From a quick look at
>> the code it seems we'd need to split drm_gem_mmap() into a helper so
>> that it can return whether the exporter is handling everything or if the
>> caller needs to do some more work (e.g. drm_gem_shmem_mmap() needs to
>> allocate backing pages). But because drm_gem_mmap() is used as the
>> direct callback for some drivers we'd need to preserve the interface.
>>
>> The below (completely untested) patch demonstrates the idea.
>>
>> Steve
>>
>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>> index a8c4468f03d9..df661e24cadf 100644
>> --- a/drivers/gpu/drm/drm_gem.c
>> +++ b/drivers/gpu/drm/drm_gem.c
>> @@ -1140,7 +1140,7 @@ EXPORT_SYMBOL(drm_gem_mmap_obj);
>>   * If the caller is not granted access to the buffer object, the mmap will fail
>>   * with EACCES. Please see the vma manager for more information.
>>   */
>> -int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
>> +int drm_gem_mmap_helper(struct file *filp, struct vm_area_struct *vma)
>>  {
>>         struct drm_file *priv = filp->private_data;
>>         struct drm_device *dev = priv->minor->dev;
>> @@ -1189,6 +1189,11 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
>>                 vma->vm_flags &= ~VM_MAYWRITE;
>>         }
>>
>> +       if (obj->import_attach) {
>> +               ret = dma_buf_mmap(obj->dma_buf, vma, 0);
> 
> I don't see how calling dma_buf_mmap() would work. Looking at etnaviv
> which is the only GPU driver calling it, it looks to me like there is a
> recursive loop:
> 
> driver->fops.mmap -> obj->mmap -> dma_buf_mmap -> .gem_prime_mmap ->
> etnaviv_gem_prime_mmap -> obj->mmap -> dma_buf_mmap -> ...

I did say it was untested :)

The call to dma_buf_mmap() should only be in the driver->fops.mmap
callback, not in obj->mmap. I have to admit I didn't actually bother to
look where drm_gem_mmap() was being used. So drm_gem_shmem_mmap() needs
to call a function which does the dma_buf_mmap(), but the callback in
obj->mmap shouldn't.

There's a reason I initially went for simply preventing user space
mmap()ing imported objects (through the DRM node) rather than trying to
make it work ;)

Steve
Daniel Vetter July 3, 2019, 4:18 p.m. UTC | #12
On Wed, Jul 3, 2019 at 6:11 PM Steven Price <steven.price@arm.com> wrote:
>
> On 03/07/2019 15:33, Boris Brezillon wrote:
> > On Wed, 3 Jul 2019 15:13:25 +0100
> > Steven Price <steven.price@arm.com> wrote:
> >
> >> On 03/07/2019 14:56, Boris Brezillon wrote:
> >>> On Wed, 3 Jul 2019 07:45:32 -0600
> >>> Rob Herring <robh+dt@kernel.org> wrote:
> >>>
> >>>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon
> >>>> <boris.brezillon@collabora.com> wrote:
> >>>>>
> >>>>> Exported BOs might be imported back, then mmap()-ed to be written
> >>>>> too. Most drivers handle that by mmap()-ing the GEM handle after it's
> >>>>> been imported, but, according to [1], this is illegal.
> >>>>
> >>>> It's not illegal, but is supposed to go thru the dmabuf mmap
> >>>> functions.
> >>>
> >>> That's basically what I'm proposing here, just didn't post the patch
> >>> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD
> >>> instead of the DRM-node one, but I have it working for panfrost.
> >>
> >> If we want to we could make the Panfrost kernel driver internally call
> >> dma_buf_mmap() so that mapping using the DRM-node "just works". This is
> >> indeed what the kbase driver does.
> >
> > Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or
> > ignore its return code), so calling mmap() on the dmabuf FD instead of
> > the DRM-node FD shouldn't be that hard.
>
> What I was suggesting is that user space would still call
> DRM_IOCTL_PANFROST_MMAP_BO to get an offset which it then uses in a call
> mmap(..., drm_node_fd, offset). The kernel could detect that the buffer
> is imported and call the exporter for the actual mmap() functionality.
>
> The alternative is that user space 'simply' remembers that a buffer is
> imported and keeps the file descriptor around so that it can instead
> directly mmap() the dma_buf fd. Which is certainly easiest from the
> kernel's perspective (and was what I assumed panfrost was doing - I
> should have checked more closely!).
>
> >>>> However, none of the drivers I've looked at (etnaviv, msm,
> >>>> v3d, vgem) do that. It probably works because it's the same driver
> >>>> doing the import and export or both drivers have essentially the same
> >>>> implementations.
> >>>
> >>> Yes, but maybe that's something we should start fixing if mmap()-ing
> >>> the dmabuf is the recommended solution.
> >>
> >> I'm open to options here. User space calling mmap() on the dma_buf file
> >> descriptor should always be safe (the exporter can do whatever is
> >> necessary to make it work). If that happens then the patches I posted
> >> close off the DRM node version which could be broken if the exporter
> >> needs to do anything to prepare the buffer for CPU access (i.e. cache
> >> maintenance).
> >
> > Talking about CPU <-> GPU syncs, I was wondering if the
> > mmap(gem_handle) step was providing any guarantee that would
> > allow us to ignore all the cache maintenance operations that are
> > required when mmap()-ing a dmabuf directly. Note that in both cases the
> > dmabuf is imported.
>
> In theory the exporter should do whatever is required to ensure that the
> CPU is synchronised when a user space mapping exists. There are some
> issues here though:
>
> * In theory the kernel driver should map the dma_buf purely for the
> duration that a job is using the buffer (and unmap immediately after).
> This gives the exporter the knowledge of when the GPU is using the
> memory and allows the exporter to page the memory out if necessary.
> In practice this map/unmap operation is expensive (updating the GPU's
> page tables) so most drivers don't actually bother and keep the memory
> mapped. This means the exporter cannot tell when the buffer is used or
> move the pages.

Yeah refaulting is too slow if you care the least about performance.

> * The CPU mappings can be faulted on demand (performing the necessary
> CPU cache invalidate if needed) and shot-down to allow moving the
> memory. In theory when the GPU needs the memory it should map the buffer
> and the exporter can then shoot down the mappings, perform the CPU cache
> clean and then allow the GPU to use the memory. A subsequent CPU access
> would then refault the page, ensuring a CPU cache invalidate so the
> latest data is visible.

We thought that was the answer, until it was clear it's not. dma-buf
mmap isn't coherent, you need to call the begin/end ioctls.

> * The majority of exporters are simple and deal with uncached memory
> (e.g. frame buffers) or are actually exporting back to the same driver
> (e.g. window surfaces). In these situations either the driver already
> has the necessary "magic" to deal with caches (e.g. kbase provides
> explicit cache maintenance operations), or it's "uncached" anyway so it
> doesn't matter. This means that hardly anyone tests with the complex
> cases...
>
> From a user space ABI, my understanding is that a dma_buf mmap() mapping
> should be coherent, and user space isn't expected to do anything to make
> it work. Obviously any importing device might have its own coherency
> details which will be up to the ABI of that device (e.g. Mali has caches
> which may need to be flushed - this is usually done at the start/end of
> a job chain, so coherency is not guaranteed while the job chain is running).

See my other reply, but this isn't correct. dma-buf has explicit cache
maintenance ops. It's just that generally everyone (ok, display only
drivers using the cma helpers) ends up exporting coherent memory and
that's why this works. Doesn't make it a bright idea imo ...
-Daniel
Steven Price July 4, 2019, 9:26 a.m. UTC | #13
On 03/07/2019 17:18, Daniel Vetter wrote:
> On Wed, Jul 3, 2019 at 6:11 PM Steven Price <steven.price@arm.com> wrote:
[...]
>> In theory the exporter should do whatever is required to ensure that the
>> CPU is synchronised when a user space mapping exists. There are some
>> issues here though:
>>
>> * In theory the kernel driver should map the dma_buf purely for the
>> duration that a job is using the buffer (and unmap immediately after).
>> This gives the exporter the knowledge of when the GPU is using the
>> memory and allows the exporter to page the memory out if necessary.
>> In practice this map/unmap operation is expensive (updating the GPU's
>> page tables) so most drivers don't actually bother and keep the memory
>> mapped. This means the exporter cannot tell when the buffer is used or
>> move the pages.
> 
> Yeah refaulting is too slow if you care the least about performance.
> 
>> * The CPU mappings can be faulted on demand (performing the necessary
>> CPU cache invalidate if needed) and shot-down to allow moving the
>> memory. In theory when the GPU needs the memory it should map the buffer
>> and the exporter can then shoot down the mappings, perform the CPU cache
>> clean and then allow the GPU to use the memory. A subsequent CPU access
>> would then refault the page, ensuring a CPU cache invalidate so the
>> latest data is visible.
> 
> We thought that was the answer, until it was clear it's not. dma-buf
> mmap isn't coherent, you need to call the begin/end ioctls.
> 
>> * The majority of exporters are simple and deal with uncached memory
>> (e.g. frame buffers) or are actually exporting back to the same driver
>> (e.g. window surfaces). In these situations either the driver already
>> has the necessary "magic" to deal with caches (e.g. kbase provides
>> explicit cache maintenance operations), or it's "uncached" anyway so it
>> doesn't matter. This means that hardly anyone tests with the complex
>> cases...
>>
>> From a user space ABI, my understanding is that a dma_buf mmap() mapping
>> should be coherent, and user space isn't expected to do anything to make
>> it work. Obviously any importing device might have its own coherency
>> details which will be up to the ABI of that device (e.g. Mali has caches
>> which may need to be flushed - this is usually done at the start/end of
>> a job chain, so coherency is not guaranteed while the job chain is running).
> 
> See my other reply, but this isn't correct. dma-buf has explicit cache
> maintenance ops. It's just that generally everyone (ok, display only
> drivers using the cma helpers) ends up exporting coherent memory and
> that's why this works. Doesn't make it a bright idea imo ...

Sorry, I'd completely forgotten about the DMA_BUF_IOCTL_SYNC ioctl when
I wrote this. But to a large extent this ship has already sailed, and
indeed the current users of dma_buf_mmap() implicitly assume that no
sync is necessary (since there's no mechanism to forward the syncs onto
the exporter). Indeed the comment in dma-buf.c says:

 *   The assumption in the current dma-buf interfaces is that redirecting the
 *   initial mmap is all that's needed. A survey of some of the existing
 *   subsystems shows that no driver seems to do any nefarious thing like
 *   syncing up with outstanding asynchronous processing on the device or
 *   allocating special resources at fault time. So hopefully this is good
 *   enough, since adding interfaces to intercept pagefaults and allow pte
 *   shootdowns would increase the complexity quite a bit.

I'd be happy to kill off dma_buf_mmap() and require user space to mmap
via the dma_buf handle (and also therefore to handle the manual sync
ioctls), but I'm not sure how we achieve that while maintaining
backwards compatibility.

Thanks,

Steve
Daniel Vetter July 4, 2019, 9:46 a.m. UTC | #14
On Thu, Jul 4, 2019 at 11:26 AM Steven Price <steven.price@arm.com> wrote:
>
> On 03/07/2019 17:18, Daniel Vetter wrote:
> > On Wed, Jul 3, 2019 at 6:11 PM Steven Price <steven.price@arm.com> wrote:
> [...]
> >> In theory the exporter should do whatever is required to ensure that the
> >> CPU is synchronised when a user space mapping exists. There are some
> >> issues here though:
> >>
> >> * In theory the kernel driver should map the dma_buf purely for the
> >> duration that a job is using the buffer (and unmap immediately after).
> >> This gives the exporter the knowledge of when the GPU is using the
> >> memory and allows the exporter to page the memory out if necessary.
> >> In practice this map/unmap operation is expensive (updating the GPU's
> >> page tables) so most drivers don't actually bother and keep the memory
> >> mapped. This means the exporter cannot tell when the buffer is used or
> >> move the pages.
> >
> > Yeah refaulting is too slow if you care the least about performance.
> >
> >> * The CPU mappings can be faulted on demand (performing the necessary
> >> CPU cache invalidate if needed) and shot-down to allow moving the
> >> memory. In theory when the GPU needs the memory it should map the buffer
> >> and the exporter can then shoot down the mappings, perform the CPU cache
> >> clean and then allow the GPU to use the memory. A subsequent CPU access
> >> would then refault the page, ensuring a CPU cache invalidate so the
> >> latest data is visible.
> >
> > We thought that was the answer, until it was clear it's not. dma-buf
> > mmap isn't coherent, you need to call the begin/end ioctls.
> >
> >> * The majority of exporters are simple and deal with uncached memory
> >> (e.g. frame buffers) or are actually exporting back to the same driver
> >> (e.g. window surfaces). In these situations either the driver already
> >> has the necessary "magic" to deal with caches (e.g. kbase provides
> >> explicit cache maintenance operations), or it's "uncached" anyway so it
> >> doesn't matter. This means that hardly anyone tests with the complex
> >> cases...
> >>
> >> From a user space ABI, my understanding is that a dma_buf mmap() mapping
> >> should be coherent, and user space isn't expected to do anything to make
> >> it work. Obviously any importing device might have its own coherency
> >> details which will be up to the ABI of that device (e.g. Mali has caches
> >> which may need to be flushed - this is usually done at the start/end of
> >> a job chain, so coherency is not guaranteed while the job chain is running).
> >
> > See my other reply, but this isn't correct. dma-buf has explicit cache
> > maintenance ops. It's just that generally everyone (ok, display only
> > drivers using the cma helpers) ends up exporting coherent memory and
> > that's why this works. Doesn't make it a bright idea imo ...
>
> Sorry, I'd completely forgotten about the DMA_BUF_IOCTL_SYNC ioctl when
> I wrote this. But to a large extent this ship has already sailed, and
> indeed the current users of dma_buf_mmap() implicitly assume that no
> sync is necessary (since there's no mechanism to forward the syncs onto
> the exporter). Indeed the comment in dma-buf.c says:
>
>  *   The assumption in the current dma-buf interfaces is that redirecting the
>  *   initial mmap is all that's needed. A survey of some of the existing
>  *   subsystems shows that no driver seems to do any nefarious thing like
>  *   syncing up with outstanding asynchronous processing on the device or
>  *   allocating special resources at fault time. So hopefully this is good
>  *   enough, since adding interfaces to intercept pagefaults and allow pte
>  *   shootdowns would increase the complexity quite a bit.
>
> I'd be happy to kill off dma_buf_mmap() and require user space to mmap
> via the dma_buf handle (and also therefore to handle the manual sync
> ioctls), but I'm not sure how we achieve that while maintaining
> backwards compatibility.

Uh I forgot to fully update this in my doc update in 2016. That text
originally predates the sync ioctl. Care to type a patch?
-Daniel
Patch

diff --git a/src/etnaviv/drm/etnaviv_bo.c b/src/etnaviv/drm/etnaviv_bo.c
index 6e952fa47858..92634141b580 100644
--- a/src/etnaviv/drm/etnaviv_bo.c
+++ b/src/etnaviv/drm/etnaviv_bo.c
@@ -294,8 +294,8 @@  int etna_bo_dmabuf(struct etna_bo *bo)
 {
 	int ret, prime_fd;
 
-	ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
-				&prime_fd);
+	ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
+				 DRM_CLOEXEC | DRM_RDWR, &prime_fd);
 	if (ret) {
 		ERROR_MSG("failed to get dmabuf fd: %d", ret);
 		return ret;
diff --git a/src/freedreno/drm/freedreno_bo.c b/src/freedreno/drm/freedreno_bo.c
index 7449160f1371..ba19b08d7c54 100644
--- a/src/freedreno/drm/freedreno_bo.c
+++ b/src/freedreno/drm/freedreno_bo.c
@@ -318,8 +318,8 @@  int fd_bo_dmabuf(struct fd_bo *bo)
 {
 	int ret, prime_fd;
 
-	ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC,
-			&prime_fd);
+	ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle,
+				 DRM_CLOEXEC | DRM_RDWR, &prime_fd);
 	if (ret) {
 		ERROR_MSG("failed to get dmabuf fd: %d", ret);
 		return ret;
diff --git a/src/freedreno/vulkan/tu_drm.c b/src/freedreno/vulkan/tu_drm.c
index 9b2e6f78879e..6bef3012ddb5 100644
--- a/src/freedreno/vulkan/tu_drm.c
+++ b/src/freedreno/vulkan/tu_drm.c
@@ -147,7 +147,7 @@  tu_gem_export_dmabuf(const struct tu_device *dev, uint32_t gem_handle)
 {
    int prime_fd;
    int ret = drmPrimeHandleToFD(dev->physical_device->local_fd, gem_handle,
-                                DRM_CLOEXEC, &prime_fd);
+                                DRM_CLOEXEC | DRM_RDWR, &prime_fd);
 
    return ret == 0 ? prime_fd : -1;
 }
diff --git a/src/gallium/auxiliary/renderonly/renderonly.c b/src/gallium/auxiliary/renderonly/renderonly.c
index d6a344009378..c1cc31115105 100644
--- a/src/gallium/auxiliary/renderonly/renderonly.c
+++ b/src/gallium/auxiliary/renderonly/renderonly.c
@@ -101,7 +101,7 @@  renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc,
    out_handle->type = WINSYS_HANDLE_TYPE_FD;
    out_handle->stride = create_dumb.pitch;
 
-   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC,
+   err = drmPrimeHandleToFD(ro->kms_fd, create_dumb.handle, O_CLOEXEC | O_RDWR,
          (int *)&out_handle->handle);
    if (err < 0) {
       fprintf(stderr, "failed to export dumb buffer: %s\n", strerror(errno));
diff --git a/src/gallium/drivers/iris/iris_bufmgr.c b/src/gallium/drivers/iris/iris_bufmgr.c
index d80945824098..68e5d8a8bb13 100644
--- a/src/gallium/drivers/iris/iris_bufmgr.c
+++ b/src/gallium/drivers/iris/iris_bufmgr.c
@@ -1369,7 +1369,7 @@  iris_bo_export_dmabuf(struct iris_bo *bo, int *prime_fd)
    iris_bo_make_external(bo);
 
    if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
-                          DRM_CLOEXEC, prime_fd) != 0)
+                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
       return -errno;
 
    bo->reusable = false;
diff --git a/src/gallium/drivers/lima/lima_bo.c b/src/gallium/drivers/lima/lima_bo.c
index 1d6dd720602a..df0114fc67ed 100644
--- a/src/gallium/drivers/lima/lima_bo.c
+++ b/src/gallium/drivers/lima/lima_bo.c
@@ -205,7 +205,7 @@  bool lima_bo_export(struct lima_bo *bo, struct winsys_handle *handle)
       return true;
 
    case WINSYS_HANDLE_TYPE_FD:
-      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC,
+      if (drmPrimeHandleToFD(screen->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
                              (int*)&handle->handle))
          return false;
 
diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
index 8de4f483435c..a086575c453a 100644
--- a/src/gallium/drivers/panfrost/pan_drm.c
+++ b/src/gallium/drivers/panfrost/pan_drm.c
@@ -181,7 +181,7 @@  panfrost_drm_export_bo(struct panfrost_screen *screen, const struct panfrost_bo
 {
         struct drm_prime_handle args = {
                 .handle = bo->gem_handle,
-                .flags = DRM_CLOEXEC,
+                .flags = DRM_CLOEXEC | DRM_RDWR,
         };
 
         int ret = drmIoctl(screen->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
diff --git a/src/gallium/drivers/v3d/v3d_bufmgr.c b/src/gallium/drivers/v3d/v3d_bufmgr.c
index f0df1b983737..0c78994d899f 100644
--- a/src/gallium/drivers/v3d/v3d_bufmgr.c
+++ b/src/gallium/drivers/v3d/v3d_bufmgr.c
@@ -424,7 +424,7 @@  v3d_bo_get_dmabuf(struct v3d_bo *bo)
 {
         int fd;
         int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
-                                     O_CLOEXEC, &fd);
+                                     DRM_CLOEXEC | DRM_RDWR, &fd);
         if (ret != 0) {
                 fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
                         bo->handle);
diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c
index 716ca50ea069..c7e89b6a517e 100644
--- a/src/gallium/drivers/vc4/vc4_bufmgr.c
+++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
@@ -462,7 +462,7 @@  vc4_bo_get_dmabuf(struct vc4_bo *bo)
 {
         int fd;
         int ret = drmPrimeHandleToFD(bo->screen->fd, bo->handle,
-                                     O_CLOEXEC, &fd);
+                                     DRM_CLOEXEC | DRM_RDWR, &fd);
         if (ret != 0) {
                 fprintf(stderr, "Failed to export gem bo %d to dmabuf\n",
                         bo->handle);
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index d1e2a8685ba9..ae7958a8b892 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -1320,7 +1320,8 @@  static bool radeon_winsys_bo_get_handle(struct pb_buffer *buffer,
     } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
         whandle->handle = bo->handle;
     } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
-        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC, (int*)&whandle->handle))
+        if (drmPrimeHandleToFD(ws->fd, bo->handle, DRM_CLOEXEC | DRM_RDWR,
+                               (int*)&whandle->handle))
             return false;
     }
 
diff --git a/src/gallium/winsys/svga/drm/vmw_screen_dri.c b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
index a85ee18e45b9..24f6ddac0890 100644
--- a/src/gallium/winsys/svga/drm/vmw_screen_dri.c
+++ b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
@@ -345,8 +345,9 @@  vmw_drm_surface_get_handle(struct svga_winsys_screen *sws,
        whandle->handle = vsrf->sid;
        break;
     case WINSYS_HANDLE_TYPE_FD:
-       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid, DRM_CLOEXEC,
-				(int *)&whandle->handle);
+       ret = drmPrimeHandleToFD(vws->ioctl.drm_fd, vsrf->sid,
+                                DRM_CLOEXEC | DRM_RDWR,
+                                (int *)&whandle->handle);
        if (ret) {
 	  vmw_error("Failed to get file descriptor from prime.\n");
 	  return FALSE;
diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index d9b417dc4dad..ce47e66ab1e5 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -446,7 +446,7 @@  kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
       return TRUE;
    case WINSYS_HANDLE_TYPE_FD:
       if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
-                             DRM_CLOEXEC, (int*)&whandle->handle)) {
+                             DRM_CLOEXEC | DRM_RDWR, (int*)&whandle->handle)) {
          whandle->stride = plane->stride;
          whandle->offset = plane->offset;
          return TRUE;
diff --git a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
index 9eec92f57363..07fab2735e25 100644
--- a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
+++ b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
@@ -418,7 +418,8 @@  static boolean virgl_drm_winsys_resource_get_handle(struct virgl_winsys *qws,
    } else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
       whandle->handle = res->bo_handle;
    } else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
-      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC, (int*)&whandle->handle))
+      if (drmPrimeHandleToFD(qdws->fd, res->bo_handle, DRM_CLOEXEC | DRM_RDWR,
+                             (int*)&whandle->handle))
             return FALSE;
       mtx_lock(&qdws->bo_handles_mutex);
       util_hash_table_set(qdws->bo_handles, (void *)(uintptr_t)res->bo_handle, res);
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 1bdf040c1a37..8f07928638ce 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -399,7 +399,7 @@  anv_gem_handle_to_fd(struct anv_device *device, uint32_t gem_handle)
 {
    struct drm_prime_handle args = {
       .handle = gem_handle,
-      .flags = DRM_CLOEXEC,
+      .flags = DRM_CLOEXEC | DRM_RDWR,
    };
 
    int ret = anv_ioctl(device->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index a7c684063158..ee639d2cf23a 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -1489,7 +1489,7 @@  brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd)
    brw_bo_make_external(bo);
 
    if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle,
-                          DRM_CLOEXEC, prime_fd) != 0)
+                          DRM_CLOEXEC | DRM_RDWR, prime_fd) != 0)
       return -errno;
 
    bo->reusable = false;