Message ID | 1374084860-29768-4-git-send-email-dh.herrmann@gmail.com (mailing list archive) |
---|---|
State | New, archived |
A quick look, but not a full review:

Looks mostly good, but it looks like the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference is taken after lookup (please see the kref_get_unless_zero() documentation examples). When drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup.

Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller?

Thanks,
Thomas

On 07/17/2013 08:14 PM, David Herrmann wrote:
> Use the new vma-manager infrastructure. This doesn't change any
> implementation details as the vma-offset-manager is nearly copied 1-to-1
> from TTM.
>
> Even though the vma-manager uses its own locks, we still need bo->vm_lock
> to prevent bos from being destroyed before we can get a reference during
> lookup. However, this lock is not needed during vm-setup as we still
> hold a reference there.
>
> This also drops the addr_space_offset member as it is a copy of vm_start
> in vma_node objects. Use the accessor functions instead.
> > Cc: Dave Airlie <airlied@redhat.com> > Cc: Ben Skeggs <bskeggs@redhat.com> > Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> > Cc: Martin Peres <martin.peres@labri.fr> > Cc: Alex Deucher <alexander.deucher@amd.com> > Cc: Thomas Hellstrom <thellstrom@vmware.com> > Signed-off-by: David Herrmann <dh.herrmann@gmail.com> > --- > drivers/gpu/drm/ast/ast_main.c | 2 +- > drivers/gpu/drm/cirrus/cirrus_main.c | 2 +- > drivers/gpu/drm/mgag200/mgag200_main.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- > drivers/gpu/drm/qxl/qxl_object.h | 2 +- > drivers/gpu/drm/qxl/qxl_release.c | 2 +- > drivers/gpu/drm/radeon/radeon_object.h | 5 +- > drivers/gpu/drm/ttm/ttm_bo.c | 84 ++++++------------------------- > drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +- > drivers/gpu/drm/ttm/ttm_bo_vm.c | 81 ++++++++++++----------------- > drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- > include/drm/ttm/ttm_bo_api.h | 15 ++---- > include/drm/ttm/ttm_bo_driver.h | 7 +-- > 14 files changed, 65 insertions(+), 148 deletions(-) > > diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c > index f60fd7b..c195dc2 100644 > --- a/drivers/gpu/drm/ast/ast_main.c > +++ b/drivers/gpu/drm/ast/ast_main.c > @@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj) > > static inline u64 ast_bo_mmap_offset(struct ast_bo *bo) > { > - return bo->bo.addr_space_offset; > + return drm_vma_node_offset_addr(&bo->bo.vma_node); > } > int > ast_dumb_mmap_offset(struct drm_file *file, > diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c > index 35cbae8..3a7a0ef 100644 > --- a/drivers/gpu/drm/cirrus/cirrus_main.c > +++ b/drivers/gpu/drm/cirrus/cirrus_main.c > @@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object *obj) > > static inline u64 cirrus_bo_mmap_offset(struct cirrus_bo *bo) > { > - return bo->bo.addr_space_offset; > + return 
drm_vma_node_offset_addr(&bo->bo.vma_node); > } > > int > diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c > index 9fa5685..1a75ea3 100644 > --- a/drivers/gpu/drm/mgag200/mgag200_main.c > +++ b/drivers/gpu/drm/mgag200/mgag200_main.c > @@ -349,7 +349,7 @@ void mgag200_gem_free_object(struct drm_gem_object *obj) > > static inline u64 mgag200_bo_mmap_offset(struct mgag200_bo *bo) > { > - return bo->bo.addr_space_offset; > + return drm_vma_node_offset_addr(&bo->bo.vma_node); > } > > int > diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c > index 708b2d1..7a8caa1 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_display.c > +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > @@ -697,7 +697,7 @@ nouveau_display_dumb_map_offset(struct drm_file *file_priv, > gem = drm_gem_object_lookup(dev, file_priv, handle); > if (gem) { > struct nouveau_bo *bo = gem->driver_private; > - *poffset = bo->bo.addr_space_offset; > + *poffset = drm_vma_node_offset_addr(&bo->bo.vma_node); > drm_gem_object_unreference_unlocked(gem); > return 0; > } > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c > index e72d09c..86597eb 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c > @@ -226,7 +226,7 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem, > } > > rep->size = nvbo->bo.mem.num_pages << PAGE_SHIFT; > - rep->map_handle = nvbo->bo.addr_space_offset; > + rep->map_handle = drm_vma_node_offset_addr(&nvbo->bo.vma_node); > rep->tile_mode = nvbo->tile_mode; > rep->tile_flags = nvbo->tile_flags; > return 0; > diff --git a/drivers/gpu/drm/qxl/qxl_object.h b/drivers/gpu/drm/qxl/qxl_object.h > index ee7ad79..af10165 100644 > --- a/drivers/gpu/drm/qxl/qxl_object.h > +++ b/drivers/gpu/drm/qxl/qxl_object.h > @@ -59,7 +59,7 @@ static inline unsigned long qxl_bo_size(struct qxl_bo *bo) > > static inline u64 
qxl_bo_mmap_offset(struct qxl_bo *bo) > { > - return bo->tbo.addr_space_offset; > + return drm_vma_node_offset_addr(&bo->tbo.vma_node); > } > > static inline int qxl_bo_wait(struct qxl_bo *bo, u32 *mem_type, > diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c > index b443d67..1a648e1 100644 > --- a/drivers/gpu/drm/qxl/qxl_release.c > +++ b/drivers/gpu/drm/qxl/qxl_release.c > @@ -87,7 +87,7 @@ qxl_release_free(struct qxl_device *qdev, > > for (i = 0 ; i < release->bo_count; ++i) { > QXL_INFO(qdev, "release %llx\n", > - release->bos[i]->tbo.addr_space_offset > + drm_vma_node_offset_addr(&release->bos[i]->tbo.vma_node) > - DRM_FILE_OFFSET); > qxl_fence_remove_release(&release->bos[i]->fence, release->id); > qxl_bo_unref(&release->bos[i]); > diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h > index 91519a5..188c682 100644 > --- a/drivers/gpu/drm/radeon/radeon_object.h > +++ b/drivers/gpu/drm/radeon/radeon_object.h > @@ -113,13 +113,10 @@ static inline unsigned radeon_bo_gpu_page_alignment(struct radeon_bo *bo) > * @bo: radeon object for which we query the offset > * > * Returns mmap offset of the object. > - * > - * Note: addr_space_offset is constant after ttm bo init thus isn't protected > - * by any lock. 
> */ > static inline u64 radeon_bo_mmap_offset(struct radeon_bo *bo) > { > - return bo->tbo.addr_space_offset; > + return drm_vma_node_offset_addr(&bo->tbo.vma_node); > } > > extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type, > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index cb9dd67..245850b 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -616,11 +616,7 @@ static void ttm_bo_release(struct kref *kref) > struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type]; > > write_lock(&bdev->vm_lock); > - if (likely(bo->vm_node != NULL)) { > - rb_erase(&bo->vm_rb, &bdev->addr_space_rb); > - drm_mm_put_block(bo->vm_node); > - bo->vm_node = NULL; > - } > + drm_vma_offset_remove(&bdev->vma_manager, &bo->vma_node); > write_unlock(&bdev->vm_lock); > ttm_mem_io_lock(man, false); > ttm_mem_io_free_vm(bo); > @@ -1129,6 +1125,7 @@ int ttm_bo_init(struct ttm_bo_device *bdev, > bo->resv = &bo->ttm_resv; > reservation_object_init(bo->resv); > atomic_inc(&bo->glob->bo_count); > + drm_vma_node_reset(&bo->vma_node); > > ret = ttm_bo_check_placement(bo, placement); > > @@ -1424,9 +1421,8 @@ int ttm_bo_device_release(struct ttm_bo_device *bdev) > TTM_DEBUG("Swap list was clean\n"); > spin_unlock(&glob->lru_lock); > > - BUG_ON(!drm_mm_clean(&bdev->addr_space_mm)); > write_lock(&bdev->vm_lock); > - drm_mm_takedown(&bdev->addr_space_mm); > + drm_vma_offset_manager_destroy(&bdev->vma_manager); > write_unlock(&bdev->vm_lock); > > return ret; > @@ -1454,9 +1450,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev, > if (unlikely(ret != 0)) > goto out_no_sys; > > - bdev->addr_space_rb = RB_ROOT; > - drm_mm_init(&bdev->addr_space_mm, file_page_offset, 0x10000000); > - > + drm_vma_offset_manager_init(&bdev->vma_manager, file_page_offset, > + 0x10000000); > INIT_DELAYED_WORK(&bdev->wq, ttm_bo_delayed_workqueue); > INIT_LIST_HEAD(&bdev->ddestroy); > bdev->dev_mapping = NULL; > @@ -1498,12 +1493,17 @@ bool 
ttm_mem_reg_is_pci(struct ttm_bo_device *bdev, struct ttm_mem_reg *mem) > void ttm_bo_unmap_virtual_locked(struct ttm_buffer_object *bo) > { > struct ttm_bo_device *bdev = bo->bdev; > - loff_t offset = (loff_t) bo->addr_space_offset; > - loff_t holelen = ((loff_t) bo->mem.num_pages) << PAGE_SHIFT; > + loff_t offset, holelen; > > if (!bdev->dev_mapping) > return; > - unmap_mapping_range(bdev->dev_mapping, offset, holelen, 1); > + > + if (drm_vma_node_has_offset(&bo->vma_node)) { > + offset = (loff_t) drm_vma_node_offset_addr(&bo->vma_node); > + holelen = ((loff_t) bo->mem.num_pages) << PAGE_SHIFT; > + > + unmap_mapping_range(bdev->dev_mapping, offset, holelen, 1); > + } > ttm_mem_io_free_vm(bo); > } > > @@ -1520,31 +1520,6 @@ void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo) > > EXPORT_SYMBOL(ttm_bo_unmap_virtual); > > -static void ttm_bo_vm_insert_rb(struct ttm_buffer_object *bo) > -{ > - struct ttm_bo_device *bdev = bo->bdev; > - struct rb_node **cur = &bdev->addr_space_rb.rb_node; > - struct rb_node *parent = NULL; > - struct ttm_buffer_object *cur_bo; > - unsigned long offset = bo->vm_node->start; > - unsigned long cur_offset; > - > - while (*cur) { > - parent = *cur; > - cur_bo = rb_entry(parent, struct ttm_buffer_object, vm_rb); > - cur_offset = cur_bo->vm_node->start; > - if (offset < cur_offset) > - cur = &parent->rb_left; > - else if (offset > cur_offset) > - cur = &parent->rb_right; > - else > - BUG(); > - } > - > - rb_link_node(&bo->vm_rb, parent, cur); > - rb_insert_color(&bo->vm_rb, &bdev->addr_space_rb); > -} > - > /** > * ttm_bo_setup_vm: > * > @@ -1559,38 +1534,9 @@ static void ttm_bo_vm_insert_rb(struct ttm_buffer_object *bo) > static int ttm_bo_setup_vm(struct ttm_buffer_object *bo) > { > struct ttm_bo_device *bdev = bo->bdev; > - int ret; > - > -retry_pre_get: > - ret = drm_mm_pre_get(&bdev->addr_space_mm); > - if (unlikely(ret != 0)) > - return ret; > > - write_lock(&bdev->vm_lock); > - bo->vm_node = drm_mm_search_free(&bdev->addr_space_mm, 
> - bo->mem.num_pages, 0, 0); > - > - if (unlikely(bo->vm_node == NULL)) { > - ret = -ENOMEM; > - goto out_unlock; > - } > - > - bo->vm_node = drm_mm_get_block_atomic(bo->vm_node, > - bo->mem.num_pages, 0); > - > - if (unlikely(bo->vm_node == NULL)) { > - write_unlock(&bdev->vm_lock); > - goto retry_pre_get; > - } > - > - ttm_bo_vm_insert_rb(bo); > - write_unlock(&bdev->vm_lock); > - bo->addr_space_offset = ((uint64_t) bo->vm_node->start) << PAGE_SHIFT; > - > - return 0; > -out_unlock: > - write_unlock(&bdev->vm_lock); > - return ret; > + return drm_vma_offset_add(&bdev->vma_manager, &bo->vma_node, > + bo->mem.num_pages); > } > > int ttm_bo_wait(struct ttm_buffer_object *bo, > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c > index 319cf41..7cc904d 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c > @@ -30,6 +30,7 @@ > > #include <drm/ttm/ttm_bo_driver.h> > #include <drm/ttm/ttm_placement.h> > +#include <drm/drm_vma_manager.h> > #include <linux/io.h> > #include <linux/highmem.h> > #include <linux/wait.h> > @@ -450,7 +451,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, > INIT_LIST_HEAD(&fbo->lru); > INIT_LIST_HEAD(&fbo->swap); > INIT_LIST_HEAD(&fbo->io_reserve_lru); > - fbo->vm_node = NULL; > + drm_vma_node_reset(&fbo->vma_node); > atomic_set(&fbo->cpu_writers, 0); > > spin_lock(&bdev->fence_lock); > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c > index 3df9f16..54a67f1 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > @@ -33,6 +33,7 @@ > #include <ttm/ttm_module.h> > #include <ttm/ttm_bo_driver.h> > #include <ttm/ttm_placement.h> > +#include <drm/drm_vma_manager.h> > #include <linux/mm.h> > #include <linux/rbtree.h> > #include <linux/module.h> > @@ -40,37 +41,6 @@ > > #define TTM_BO_VM_NUM_PREFAULT 16 > > -static struct ttm_buffer_object *ttm_bo_vm_lookup_rb(struct ttm_bo_device *bdev, > - 
unsigned long page_start, > - unsigned long num_pages) > -{ > - struct rb_node *cur = bdev->addr_space_rb.rb_node; > - unsigned long cur_offset; > - struct ttm_buffer_object *bo; > - struct ttm_buffer_object *best_bo = NULL; > - > - while (likely(cur != NULL)) { > - bo = rb_entry(cur, struct ttm_buffer_object, vm_rb); > - cur_offset = bo->vm_node->start; > - if (page_start >= cur_offset) { > - cur = cur->rb_right; > - best_bo = bo; > - if (page_start == cur_offset) > - break; > - } else > - cur = cur->rb_left; > - } > - > - if (unlikely(best_bo == NULL)) > - return NULL; > - > - if (unlikely((best_bo->vm_node->start + best_bo->num_pages) < > - (page_start + num_pages))) > - return NULL; > - > - return best_bo; > -} > - > static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > { > struct ttm_buffer_object *bo = (struct ttm_buffer_object *) > @@ -146,9 +116,9 @@ static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) > } > > page_offset = ((address - vma->vm_start) >> PAGE_SHIFT) + > - bo->vm_node->start - vma->vm_pgoff; > + drm_vma_node_start(&bo->vma_node) - vma->vm_pgoff; > page_last = vma_pages(vma) + > - bo->vm_node->start - vma->vm_pgoff; > + drm_vma_node_start(&bo->vma_node) - vma->vm_pgoff; > > if (unlikely(page_offset >= bo->num_pages)) { > retval = VM_FAULT_SIGBUS; > @@ -249,6 +219,30 @@ static const struct vm_operations_struct ttm_bo_vm_ops = { > .close = ttm_bo_vm_close > }; > > +static struct ttm_buffer_object *ttm_bo_vm_lookup(struct ttm_bo_device *bdev, > + unsigned long offset, > + unsigned long pages) > +{ > + struct drm_vma_offset_node *node; > + struct ttm_buffer_object *bo = NULL; > + > + read_lock(&bdev->vm_lock); > + > + node = drm_vma_offset_lookup(&bdev->vma_manager, offset, pages); > + if (likely(node)) { > + bo = container_of(node, struct ttm_buffer_object, vma_node); > + if (!kref_get_unless_zero(&bo->kref)) > + bo = NULL; > + } > + > + read_unlock(&bdev->vm_lock); > + > + if (!bo) > + 
pr_err("Could not find buffer object to map\n"); > + > + return bo; > +} > + > int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma, > struct ttm_bo_device *bdev) > { > @@ -256,17 +250,9 @@ int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma, > struct ttm_buffer_object *bo; > int ret; > > - read_lock(&bdev->vm_lock); > - bo = ttm_bo_vm_lookup_rb(bdev, vma->vm_pgoff, > - vma_pages(vma)); > - if (likely(bo != NULL) && !kref_get_unless_zero(&bo->kref)) > - bo = NULL; > - read_unlock(&bdev->vm_lock); > - > - if (unlikely(bo == NULL)) { > - pr_err("Could not find buffer object to map\n"); > + bo = ttm_bo_vm_lookup(bdev, vma->vm_pgoff, vma_pages(vma)); > + if (unlikely(!bo)) > return -EINVAL; > - } > > driver = bo->bdev->driver; > if (unlikely(!driver->verify_access)) { > @@ -324,12 +310,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct file *filp, > bool no_wait = false; > bool dummy; > > - read_lock(&bdev->vm_lock); > - bo = ttm_bo_vm_lookup_rb(bdev, dev_offset, 1); > - if (likely(bo != NULL)) > - ttm_bo_reference(bo); > - read_unlock(&bdev->vm_lock); > - > + bo = ttm_bo_vm_lookup(bdev, dev_offset, 1); > if (unlikely(bo == NULL)) > return -EFAULT; > > @@ -343,7 +324,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct file *filp, > if (unlikely(ret != 0)) > goto out_unref; > > - kmap_offset = dev_offset - bo->vm_node->start; > + kmap_offset = dev_offset - drm_vma_node_start(&bo->vma_node); > if (unlikely(kmap_offset >= bo->num_pages)) { > ret = -EFBIG; > goto out_unref; > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > index 7953d1f..0e67cf4 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > @@ -500,7 +500,7 @@ int vmw_dmabuf_alloc_ioctl(struct drm_device *dev, void *data, > goto out_no_dmabuf; > > rep->handle = handle; > - rep->map_handle = dma_buf->base.addr_space_offset; > + rep->map_handle = 
drm_vma_node_offset_addr(&dma_buf->base.vma_node); > rep->cur_gmr_id = handle; > rep->cur_gmr_offset = 0; > > @@ -834,7 +834,7 @@ int vmw_dumb_map_offset(struct drm_file *file_priv, > if (ret != 0) > return -EINVAL; > > - *offset = out_buf->base.addr_space_offset; > + *offset = drm_vma_node_offset_addr(&out_buf->base.vma_node); > vmw_dmabuf_unreference(&out_buf); > return 0; > } > diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h > index 8a6aa56..751eaff 100644 > --- a/include/drm/ttm/ttm_bo_api.h > +++ b/include/drm/ttm/ttm_bo_api.h > @@ -32,12 +32,12 @@ > #define _TTM_BO_API_H_ > > #include <drm/drm_hashtab.h> > +#include <drm/drm_vma_manager.h> > #include <linux/kref.h> > #include <linux/list.h> > #include <linux/wait.h> > #include <linux/mutex.h> > #include <linux/mm.h> > -#include <linux/rbtree.h> > #include <linux/bitmap.h> > #include <linux/reservation.h> > > @@ -145,7 +145,6 @@ struct ttm_tt; > * @type: The bo type. > * @destroy: Destruction function. If NULL, kfree is used. > * @num_pages: Actual number of pages. > - * @addr_space_offset: Address space offset. > * @acc_size: Accounted size for this object. > * @kref: Reference count of this buffer object. When this refcount reaches > * zero, the object is put on the delayed delete list. > @@ -166,8 +165,7 @@ struct ttm_tt; > * @swap: List head for swap LRU list. > * @sync_obj: Pointer to a synchronization object. > * @priv_flags: Flags describing buffer object internal state. > - * @vm_rb: Rb node for the vm rb tree. > - * @vm_node: Address space manager node. > + * @vma_node: Address space manager node. > * @offset: The current GPU offset, which can have different meanings > * depending on the memory type. For SYSTEM type memory, it should be 0. > * @cur_placement: Hint of current placement. 
> @@ -194,7 +192,6 @@ struct ttm_buffer_object { > enum ttm_bo_type type; > void (*destroy) (struct ttm_buffer_object *); > unsigned long num_pages; > - uint64_t addr_space_offset; > size_t acc_size; > > /** > @@ -238,13 +235,7 @@ struct ttm_buffer_object { > void *sync_obj; > unsigned long priv_flags; > > - /** > - * Members protected by the bdev::vm_lock > - */ > - > - struct rb_node vm_rb; > - struct drm_mm_node *vm_node; > - > + struct drm_vma_offset_node vma_node; > > /** > * Special members that are protected by the reserve lock > diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h > index 984fc2d..d3b8479 100644 > --- a/include/drm/ttm/ttm_bo_driver.h > +++ b/include/drm/ttm/ttm_bo_driver.h > @@ -36,6 +36,7 @@ > #include <ttm/ttm_placement.h> > #include <drm/drm_mm.h> > #include <drm/drm_global.h> > +#include <drm/drm_vma_manager.h> > #include <linux/workqueue.h> > #include <linux/fs.h> > #include <linux/spinlock.h> > @@ -519,7 +520,7 @@ struct ttm_bo_global { > * @man: An array of mem_type_managers. > * @fence_lock: Protects the synchronizing members on *all* bos belonging > * to this device. > - * @addr_space_mm: Range manager for the device address space. > + * @vma_manager: Address space manager > * lru_lock: Spinlock that protects the buffer+device lru lists and > * ddestroy lists. > * @val_seq: Current validation sequence. > @@ -540,11 +541,11 @@ struct ttm_bo_device { > rwlock_t vm_lock; > struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES]; > spinlock_t fence_lock; > + > /* > * Protected by the vm lock. > */ > - struct rb_root addr_space_rb; > - struct drm_mm addr_space_mm; > + struct drm_vma_offset_manager vma_manager; > > /* > * Protected by the global:lru lock
On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> A quick look, but not a full review:
>
> Looks mostly good, but it looks like the TTM vm lock isn't needed at all
> anymore (provided the vma offset manager is properly protected), since
> kref_get_unless_zero() is used when a reference after lookup is taken.
> (please see the kref_get_unless_zero documentation examples). When
> drm_vma_offset_remove() is called, the kref value is unconditionally zero,
> and that should block any successful lookup.
>
> Otherwise, if the vma offset manager always needs external locking to make
> lookup + referencing atomic, then perhaps locking should be completely
> left to the caller?

Somehow I've thought plain gem drivers had this fixed, but it looks like I've never gotten around to it. So we need to rework things anyway.

What about a drm_vma_offset_lookup_unlocked which just checks that the caller is holding the offset manager's lock? That way everyone can follow up with the get_unless_zero dance correctly, and ttm could drop its own vm lock.

I've considered whether we should just move the vma node into struct drm_gem_object and let the offset manager do the dance, but that's much more invasive, and I'm not sure it's the right thing to do yet. So I think we should consider this only as a follow-up series.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Hi

On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> A quick look, but not a full review:
>
> Looks mostly good, but it looks like the TTM vm lock isn't needed at all
> anymore (provided the vma offset manager is properly protected), since
> kref_get_unless_zero() is used when a reference after lookup is taken.
> (please see the kref_get_unless_zero documentation examples). When
> drm_vma_offset_remove() is called, the kref value is unconditionally zero,
> and that should block any successful lookup.

If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario: drm_vma_offset_lookup() returns a valid node; however, before we call kref_get_*(), another thread destroys the last reference to the bo, so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guaranteed to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that.) I cannot see any locking around the mmap call that could guarantee this except vm_lock.

> Otherwise, if the vma offset manager always needs external locking to make
> lookup + referencing atomic, then perhaps locking should be completely
> left to the caller?

I would actually prefer to keep it in the VMA manager. It allows some fast paths which otherwise would need special semantics for the caller (for instance, see the access-management patches which are based on this patchset, or the reduced locking during setup/release). We'd also require the caller to use rwsems for good performance, which is not the case for gem.

So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap the read_lock() and read_unlock()?

Thanks!
David

> Thanks,
> Thomas
>
> On 07/17/2013 08:14 PM, David Herrmann wrote:
>> Use the new vma-manager infrastructure. [...]
>> [remainder of patch quoted verbatim; snipped, see full copy above]
offset, pages); >> + if (likely(node)) { >> + bo = container_of(node, struct ttm_buffer_object, >> vma_node); >> + if (!kref_get_unless_zero(&bo->kref)) >> + bo = NULL; >> + } >> + >> + read_unlock(&bdev->vm_lock); >> + >> + if (!bo) >> + pr_err("Could not find buffer object to map\n"); >> + >> + return bo; >> +} >> + >> int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma, >> struct ttm_bo_device *bdev) >> { >> @@ -256,17 +250,9 @@ int ttm_bo_mmap(struct file *filp, struct >> vm_area_struct *vma, >> struct ttm_buffer_object *bo; >> int ret; >> - read_lock(&bdev->vm_lock); >> - bo = ttm_bo_vm_lookup_rb(bdev, vma->vm_pgoff, >> - vma_pages(vma)); >> - if (likely(bo != NULL) && !kref_get_unless_zero(&bo->kref)) >> - bo = NULL; >> - read_unlock(&bdev->vm_lock); >> - >> - if (unlikely(bo == NULL)) { >> - pr_err("Could not find buffer object to map\n"); >> + bo = ttm_bo_vm_lookup(bdev, vma->vm_pgoff, vma_pages(vma)); >> + if (unlikely(!bo)) >> return -EINVAL; >> - } >> driver = bo->bdev->driver; >> if (unlikely(!driver->verify_access)) { >> @@ -324,12 +310,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct >> file *filp, >> bool no_wait = false; >> bool dummy; >> - read_lock(&bdev->vm_lock); >> - bo = ttm_bo_vm_lookup_rb(bdev, dev_offset, 1); >> - if (likely(bo != NULL)) >> - ttm_bo_reference(bo); >> - read_unlock(&bdev->vm_lock); >> - >> + bo = ttm_bo_vm_lookup(bdev, dev_offset, 1); >> if (unlikely(bo == NULL)) >> return -EFAULT; >> @@ -343,7 +324,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct >> file *filp, >> if (unlikely(ret != 0)) >> goto out_unref; >> - kmap_offset = dev_offset - bo->vm_node->start; >> + kmap_offset = dev_offset - drm_vma_node_start(&bo->vma_node); >> if (unlikely(kmap_offset >= bo->num_pages)) { >> ret = -EFBIG; >> goto out_unref; >> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c >> b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c >> index 7953d1f..0e67cf4 100644 >> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c >> +++ 
b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c >> @@ -500,7 +500,7 @@ int vmw_dmabuf_alloc_ioctl(struct drm_device *dev, >> void *data, >> goto out_no_dmabuf; >> rep->handle = handle; >> - rep->map_handle = dma_buf->base.addr_space_offset; >> + rep->map_handle = >> drm_vma_node_offset_addr(&dma_buf->base.vma_node); >> rep->cur_gmr_id = handle; >> rep->cur_gmr_offset = 0; >> @@ -834,7 +834,7 @@ int vmw_dumb_map_offset(struct drm_file *file_priv, >> if (ret != 0) >> return -EINVAL; >> - *offset = out_buf->base.addr_space_offset; >> + *offset = drm_vma_node_offset_addr(&out_buf->base.vma_node); >> vmw_dmabuf_unreference(&out_buf); >> return 0; >> } >> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h >> index 8a6aa56..751eaff 100644 >> --- a/include/drm/ttm/ttm_bo_api.h >> +++ b/include/drm/ttm/ttm_bo_api.h >> @@ -32,12 +32,12 @@ >> #define _TTM_BO_API_H_ >> #include <drm/drm_hashtab.h> >> +#include <drm/drm_vma_manager.h> >> #include <linux/kref.h> >> #include <linux/list.h> >> #include <linux/wait.h> >> #include <linux/mutex.h> >> #include <linux/mm.h> >> -#include <linux/rbtree.h> >> #include <linux/bitmap.h> >> #include <linux/reservation.h> >> @@ -145,7 +145,6 @@ struct ttm_tt; >> * @type: The bo type. >> * @destroy: Destruction function. If NULL, kfree is used. >> * @num_pages: Actual number of pages. >> - * @addr_space_offset: Address space offset. >> * @acc_size: Accounted size for this object. >> * @kref: Reference count of this buffer object. When this refcount >> reaches >> * zero, the object is put on the delayed delete list. >> @@ -166,8 +165,7 @@ struct ttm_tt; >> * @swap: List head for swap LRU list. >> * @sync_obj: Pointer to a synchronization object. >> * @priv_flags: Flags describing buffer object internal state. >> - * @vm_rb: Rb node for the vm rb tree. >> - * @vm_node: Address space manager node. >> + * @vma_node: Address space manager node. 
>> * @offset: The current GPU offset, which can have different meanings >> * depending on the memory type. For SYSTEM type memory, it should be 0. >> * @cur_placement: Hint of current placement. >> @@ -194,7 +192,6 @@ struct ttm_buffer_object { >> enum ttm_bo_type type; >> void (*destroy) (struct ttm_buffer_object *); >> unsigned long num_pages; >> - uint64_t addr_space_offset; >> size_t acc_size; >> /** >> @@ -238,13 +235,7 @@ struct ttm_buffer_object { >> void *sync_obj; >> unsigned long priv_flags; >> - /** >> - * Members protected by the bdev::vm_lock >> - */ >> - >> - struct rb_node vm_rb; >> - struct drm_mm_node *vm_node; >> - >> + struct drm_vma_offset_node vma_node; >> /** >> * Special members that are protected by the reserve lock >> diff --git a/include/drm/ttm/ttm_bo_driver.h >> b/include/drm/ttm/ttm_bo_driver.h >> index 984fc2d..d3b8479 100644 >> --- a/include/drm/ttm/ttm_bo_driver.h >> +++ b/include/drm/ttm/ttm_bo_driver.h >> @@ -36,6 +36,7 @@ >> #include <ttm/ttm_placement.h> >> #include <drm/drm_mm.h> >> #include <drm/drm_global.h> >> +#include <drm/drm_vma_manager.h> >> #include <linux/workqueue.h> >> #include <linux/fs.h> >> #include <linux/spinlock.h> >> @@ -519,7 +520,7 @@ struct ttm_bo_global { >> * @man: An array of mem_type_managers. >> * @fence_lock: Protects the synchronizing members on *all* bos >> belonging >> * to this device. >> - * @addr_space_mm: Range manager for the device address space. >> + * @vma_manager: Address space manager >> * lru_lock: Spinlock that protects the buffer+device lru lists and >> * ddestroy lists. >> * @val_seq: Current validation sequence. >> @@ -540,11 +541,11 @@ struct ttm_bo_device { >> rwlock_t vm_lock; >> struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES]; >> spinlock_t fence_lock; >> + >> /* >> * Protected by the vm lock. >> */ >> - struct rb_root addr_space_rb; >> - struct drm_mm addr_space_mm; >> + struct drm_vma_offset_manager vma_manager; >> /* >> * Protected by the global:lru lock
On 07/18/2013 01:07 PM, David Herrmann wrote: > Hi > > On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom > <thellstrom@vmware.com> wrote: >> A quick look, but not a full review: >> >> Looks mostly good, but it looks like the TTM vm lock isn't needed at all >> anymore (provided the vma offset manager is properly protected), since >> kref_get_unless_zero() is used when a reference after lookup is taken. >> (please see the kref_get_unless_zero documentation examples). When >> drm_vma_offset_remove() is called, the kref value is unconditionally zero, >> and that should block any successful lookup. > If we drop vm_lock without modifying TTM, this will not work. Even > kref_get_unless_zero() needs some guarantee that the object is still > valid. Assume this scenario: > > drm_vma_offset_lookup() returns a valid node, however, before we call > kref_get_*(), another thread destroys the last reference to the bo so > it gets kfree()d. kref_get_unless_zero() will thus reference memory > which can be _anything_ and is not guaranteed to stay 0. > (Documentation/kref.txt mentions this, too, and I guess it was you who > wrote that). > > I cannot see any locking around the mmap call that could guarantee > this except vm_lock. You're right. My apologies. Parental leave has taken its toll. > >> Otherwise, if the vma offset manager always needs external locking to make >> lookup + referencing atomic, then perhaps locking should be completely >> left to the caller? > I would actually prefer to keep it in the VMA manager. It allows some > fast-paths which otherwise would need special semantics for the caller > (for instance see the access-management patches which are based on > this patchset or the reduced locking during setup/release). We'd also > require the caller to use rwsems for good performance, which is not > the case for gem. 
> > So how about Daniel's proposal to add an _unlocked() version and > provide _lock_lookup() and _unlock_lookup() helpers which just wrap > the read_lock() and read_unlock()? > > Thanks! > David I think that if there are good reasons to keep locking internal, I'm fine with that (and also, of course, with Daniel's proposal). Currently the add / remove / lookup paths are mostly used by TTM during object creation and destruction. However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking. Thanks, Thomas > >> Thanks, >> Thomas >> >> >> >> >> On 07/17/2013 08:14 PM, David Herrmann wrote: >>> Use the new vma-manager infrastructure. This doesn't change any >>> implementation details as the vma-offset-manager is nearly copied 1-to-1 >>> from TTM. >>> >>> Even though the vma-manager uses its own locks, we still need bo->vm_lock >>> to prevent bos from being destroyed before we can get a reference during >>> lookup. However, this lock is not needed during vm-setup as we still >>> hold a reference there. >>> >>> This also drops the addr_space_offset member as it is a copy of vm_start >>> in vma_node objects. Use the accessor functions instead. 
>>> >>> Cc: Dave Airlie <airlied@redhat.com> >>> Cc: Ben Skeggs <bskeggs@redhat.com> >>> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> >>> Cc: Martin Peres <martin.peres@labri.fr> >>> Cc: Alex Deucher <alexander.deucher@amd.com> >>> Cc: Thomas Hellstrom <thellstrom@vmware.com> >>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> >>> --- >>> drivers/gpu/drm/ast/ast_main.c | 2 +- >>> drivers/gpu/drm/cirrus/cirrus_main.c | 2 +- >>> drivers/gpu/drm/mgag200/mgag200_main.c | 2 +- >>> drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- >>> drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- >>> drivers/gpu/drm/qxl/qxl_object.h | 2 +- >>> drivers/gpu/drm/qxl/qxl_release.c | 2 +- >>> drivers/gpu/drm/radeon/radeon_object.h | 5 +- >>> drivers/gpu/drm/ttm/ttm_bo.c | 84 >>> ++++++------------------------- >>> drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +- >>> drivers/gpu/drm/ttm/ttm_bo_vm.c | 81 >>> ++++++++++++----------------- >>> drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- >>> include/drm/ttm/ttm_bo_api.h | 15 ++---- >>> include/drm/ttm/ttm_bo_driver.h | 7 +-- >>> 14 files changed, 65 insertions(+), 148 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/ast/ast_main.c >>> b/drivers/gpu/drm/ast/ast_main.c >>> index f60fd7b..c195dc2 100644 >>> --- a/drivers/gpu/drm/ast/ast_main.c >>> +++ b/drivers/gpu/drm/ast/ast_main.c >>> @@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj) >>> static inline u64 ast_bo_mmap_offset(struct ast_bo *bo) >>> { >>> - return bo->bo.addr_space_offset; >>> + return drm_vma_node_offset_addr(&bo->bo.vma_node); >>> } >>> int >>> ast_dumb_mmap_offset(struct drm_file *file, >>> diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c >>> b/drivers/gpu/drm/cirrus/cirrus_main.c >>> index 35cbae8..3a7a0ef 100644 >>> --- a/drivers/gpu/drm/cirrus/cirrus_main.c >>> +++ b/drivers/gpu/drm/cirrus/cirrus_main.c >>> @@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object >>> *obj) >>> static inline u64 cirrus_bo_mmap_offset(struct 
cirrus_bo *bo) >>> { >>> - return bo->bo.addr_space_offset; >>> + return drm_vma_node_offset_addr(&bo->bo.vma_node); >>> } >>> int >>> diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c >>> b/drivers/gpu/drm/mgag200/mgag200_main.c >>> index 9fa5685..1a75ea3 100644 >>> --- a/drivers/gpu/drm/mgag200/mgag200_main.c >>> +++ b/drivers/gpu/drm/mgag200/mgag200_main.c >>> @@ -349,7 +349,7 @@ void mgag200_gem_free_object(struct drm_gem_object >>> *obj) >>> static inline u64 mgag200_bo_mmap_offset(struct mgag200_bo *bo) >>> { >>> - return bo->bo.addr_space_offset; >>> + return drm_vma_node_offset_addr(&bo->bo.vma_node); >>> } >>> int >>> diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c >>> b/drivers/gpu/drm/nouveau/nouveau_display.c >>> index 708b2d1..7a8caa1 100644 >>> --- a/drivers/gpu/drm/nouveau/nouveau_display.c >>> +++ b/drivers/gpu/drm/nouveau/nouveau_display.c >>> @@ -697,7 +697,7 @@ nouveau_display_dumb_map_offset(struct drm_file >>> *file_priv, >>> gem = drm_gem_object_lookup(dev, file_priv, handle); >>> if (gem) { >>> struct nouveau_bo *bo = gem->driver_private; >>> - *poffset = bo->bo.addr_space_offset; >>> + *poffset = drm_vma_node_offset_addr(&bo->bo.vma_node); >>> drm_gem_object_unreference_unlocked(gem); >>> return 0; >>> } >>> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c >>> b/drivers/gpu/drm/nouveau/nouveau_gem.c >>> index e72d09c..86597eb 100644 >>> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c >>> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c >>> @@ -226,7 +226,7 @@ nouveau_gem_info(struct drm_file *file_priv, struct >>> drm_gem_object *gem, >>> } >>> rep->size = nvbo->bo.mem.num_pages << PAGE_SHIFT; >>> - rep->map_handle = nvbo->bo.addr_space_offset; >>> + rep->map_handle = drm_vma_node_offset_addr(&nvbo->bo.vma_node); >>> rep->tile_mode = nvbo->tile_mode; >>> rep->tile_flags = nvbo->tile_flags; >>> return 0; >>> diff --git a/drivers/gpu/drm/qxl/qxl_object.h >>> b/drivers/gpu/drm/qxl/qxl_object.h >>> index ee7ad79..af10165 100644 >>> --- 
a/drivers/gpu/drm/qxl/qxl_object.h >>> +++ b/drivers/gpu/drm/qxl/qxl_object.h >>> @@ -59,7 +59,7 @@ static inline unsigned long qxl_bo_size(struct qxl_bo >>> *bo) >>> static inline u64 qxl_bo_mmap_offset(struct qxl_bo *bo) >>> { >>> - return bo->tbo.addr_space_offset; >>> + return drm_vma_node_offset_addr(&bo->tbo.vma_node); >>> } >>> static inline int qxl_bo_wait(struct qxl_bo *bo, u32 *mem_type, >>> diff --git a/drivers/gpu/drm/qxl/qxl_release.c >>> b/drivers/gpu/drm/qxl/qxl_release.c >>> index b443d67..1a648e1 100644 >>> --- a/drivers/gpu/drm/qxl/qxl_release.c >>> +++ b/drivers/gpu/drm/qxl/qxl_release.c >>> @@ -87,7 +87,7 @@ qxl_release_free(struct qxl_device *qdev, >>> for (i = 0 ; i < release->bo_count; ++i) { >>> QXL_INFO(qdev, "release %llx\n", >>> - release->bos[i]->tbo.addr_space_offset >>> + >>> drm_vma_node_offset_addr(&release->bos[i]->tbo.vma_node) >>> - DRM_FILE_OFFSET); >>> qxl_fence_remove_release(&release->bos[i]->fence, >>> release->id); >>> qxl_bo_unref(&release->bos[i]); >>> diff --git a/drivers/gpu/drm/radeon/radeon_object.h >>> b/drivers/gpu/drm/radeon/radeon_object.h >>> index 91519a5..188c682 100644 >>> --- a/drivers/gpu/drm/radeon/radeon_object.h >>> +++ b/drivers/gpu/drm/radeon/radeon_object.h >>> @@ -113,13 +113,10 @@ static inline unsigned >>> radeon_bo_gpu_page_alignment(struct radeon_bo *bo) >>> * @bo: radeon object for which we query the offset >>> * >>> * Returns mmap offset of the object. >>> - * >>> - * Note: addr_space_offset is constant after ttm bo init thus isn't >>> protected >>> - * by any lock. 
>>> */ >>> static inline u64 radeon_bo_mmap_offset(struct radeon_bo *bo) >>> { >>> - return bo->tbo.addr_space_offset; >>> + return drm_vma_node_offset_addr(&bo->tbo.vma_node); >>> } >>> extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type, >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >>> index cb9dd67..245850b 100644 >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >>> @@ -616,11 +616,7 @@ static void ttm_bo_release(struct kref *kref) >>> struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type]; >>> write_lock(&bdev->vm_lock); >>> - if (likely(bo->vm_node != NULL)) { >>> - rb_erase(&bo->vm_rb, &bdev->addr_space_rb); >>> - drm_mm_put_block(bo->vm_node); >>> - bo->vm_node = NULL; >>> - } >>> + drm_vma_offset_remove(&bdev->vma_manager, &bo->vma_node); >>> write_unlock(&bdev->vm_lock); >>> ttm_mem_io_lock(man, false); >>> ttm_mem_io_free_vm(bo); >>> @@ -1129,6 +1125,7 @@ int ttm_bo_init(struct ttm_bo_device *bdev, >>> bo->resv = &bo->ttm_resv; >>> reservation_object_init(bo->resv); >>> atomic_inc(&bo->glob->bo_count); >>> + drm_vma_node_reset(&bo->vma_node); >>> ret = ttm_bo_check_placement(bo, placement); >>> @@ -1424,9 +1421,8 @@ int ttm_bo_device_release(struct ttm_bo_device >>> *bdev) >>> TTM_DEBUG("Swap list was clean\n"); >>> spin_unlock(&glob->lru_lock); >>> - BUG_ON(!drm_mm_clean(&bdev->addr_space_mm)); >>> write_lock(&bdev->vm_lock); >>> - drm_mm_takedown(&bdev->addr_space_mm); >>> + drm_vma_offset_manager_destroy(&bdev->vma_manager); >>> write_unlock(&bdev->vm_lock); >>> return ret; >>> @@ -1454,9 +1450,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev, >>> if (unlikely(ret != 0)) >>> goto out_no_sys; >>> - bdev->addr_space_rb = RB_ROOT; >>> - drm_mm_init(&bdev->addr_space_mm, file_page_offset, 0x10000000); >>> - >>> + drm_vma_offset_manager_init(&bdev->vma_manager, file_page_offset, >>> + 0x10000000); >>> INIT_DELAYED_WORK(&bdev->wq, ttm_bo_delayed_workqueue); >>> 
INIT_LIST_HEAD(&bdev->ddestroy); >>> bdev->dev_mapping = NULL; >>> [remainder of the patch snipped - it is quoted in full earlier in the thread]
Hi

On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 07/18/2013 01:07 PM, David Herrmann wrote:
>>
>> Hi
>>
>> On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom
>> <thellstrom@vmware.com> wrote:
>>>
>>> A quick look, but not a full review:
>>>
>>> Looks mostly good, but it looks like the TTM vm lock isn't needed at all
>>> anymore (provided the vma offset manager is properly protected), since
>>> kref_get_unless_zero() is used when a reference after lookup is taken.
>>> (please see the kref_get_unless_zero documentation examples). When
>>> drm_vma_offset_remove() is called, the kref value is unconditionally
>>> zero,
>>> and that should block any successful lookup.
>>
>> If we drop vm_lock without modifying TTM, this will not work. Even
>> kref_get_unless_zero() needs some guarantee that the object is still
>> valid. Assume this scenario:
>>
>> drm_vma_offset_lookup() returns a valid node, however, before we call
>> kref_get_*(), another thread destroys the last reference to the bo so
>> it gets kfree()d. kref_get_unless_zero() will thus reference memory
>> which can be _anything_ and is not guaranteed to stay 0.
>> (Documentation/kref.txt mentions this, too, and I guess it was you who
>> wrote that).
>>
>> I cannot see any locking around the mmap call that could guarantee
>> this except vm_lock.
>
>
> You're right. My apologies. Parental leave has taken its toll.
>
>
>>
>>> Otherwise, if the vma offset manager always needs external locking to
>>> make
>>> lookup + referencing atomic, then perhaps locking should be completely
>>> left to the caller?
>>
>> I would actually prefer to keep it in the VMA manager. It allows some
>> fast-paths which otherwise would need special semantics for the caller
>> (for instance see the access-management patches which are based on
>> this patchset or the reduced locking during setup/release). We'd also
>> require the caller to use rwsems for good performance, which is not
>> the case for gem.
>>
>> So how about Daniel's proposal to add an _unlocked() version and
>> provide _lock_lookup() and _unlock_lookup() helpers which just wrap
>> the read_lock() and read_unlock()?
>>
>> Thanks!
>> David
>
>
> I think that if there are good reasons to keep locking internal, I'm fine
> with that (and also, of course, with
> Daniel's proposal). Currently the add / remove / lookup paths are mostly
> used by TTM during object creation and
> destruction.
>
> However, if the lookup path is ever used by pread / pwrite, that situation
> might change and we would then like to
> minimize the locking.

I tried to keep the change as minimal as I could. Follow-up patches
are welcome. I just thought pushing the lock into drm_vma_* would
simplify things. If there are benchmarks that prove me wrong, I'll
gladly spend some time optimizing that.

Apart from this, I'd like to see someone ack the
ttm_buffer_object_transfer() changes. I am not very confident about
that. Everything else should be trivial.

Thanks
David
On Thu, Jul 18, 2013 at 4:54 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
> Hi
>
> On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 07/18/2013 01:07 PM, David Herrmann wrote:
>>>
>>> Hi
>>>
>>> On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom
>>> <thellstrom@vmware.com> wrote:
>>>>
>>>> A quick look, but not a full review:
>>>>
>>>> Looks mostly good, but it looks like the TTM vm lock isn't needed at all
>>>> anymore (provided the vma offset manager is properly protected), since
>>>> kref_get_unless_zero() is used when a reference after lookup is taken.
>>>> (please see the kref_get_unless_zero documentation examples). When
>>>> drm_vma_offset_remove() is called, the kref value is unconditionally
>>>> zero,
>>>> and that should block any successful lookup.
>>>
>>> If we drop vm_lock without modifying TTM, this will not work. Even
>>> kref_get_unless_zero() needs some guarantee that the object is still
>>> valid. Assume this scenario:
>>>
>>> drm_vma_offset_lookup() returns a valid node, however, before we call
>>> kref_get_*(), another thread destroys the last reference to the bo so
>>> it gets kfree()d. kref_get_unless_zero() will thus reference memory
>>> which can be _anything_ and is not guaranteed to stay 0.
>>> (Documentation/kref.txt mentions this, too, and I guess it was you who
>>> wrote that).
>>>
>>> I cannot see any locking around the mmap call that could guarantee
>>> this except vm_lock.
>>
>>
>> You're right. My apologies. Parental leave has taken its toll.
>>
>>
>>>
>>>> Otherwise, if the vma offset manager always needs external locking to
>>>> make
>>>> lookup + referencing atomic, then perhaps locking should be completely
>>>> left to the caller?
>>>
>>> I would actually prefer to keep it in the VMA manager.
>>> It allows some
>>> fast-paths which otherwise would need special semantics for the caller
>>> (for instance see the access-management patches which are based on
>>> this patchset or the reduced locking during setup/release). We'd also
>>> require the caller to use rwsems for good performance, which is not
>>> the case for gem.
>>>
>>> So how about Daniel's proposal to add an _unlocked() version and
>>> provide _lock_lookup() and _unlock_lookup() helpers which just wrap
>>> the read_lock() and read_unlock()?
>>>
>>> Thanks!
>>> David
>>
>>
>> I think that if there are good reasons to keep locking internal, I'm fine
>> with that (and also, of course, with
>> Daniel's proposal). Currently the add / remove / lookup paths are mostly
>> used by TTM during object creation and
>> destruction.
>>
>> However, if the lookup path is ever used by pread / pwrite, that situation
>> might change and we would then like to
>> minimize the locking.
>
> I tried to keep the change as minimal as I could. Follow-up patches
> are welcome. I just thought pushing the lock into drm_vma_* would
> simplify things. If there are benchmarks that prove me wrong, I'll
> gladly spend some time optimizing that.
>
> Apart from this, I'd like to see someone ack the
> ttm_buffer_object_transfer() changes. I am not very confident about
> that. Everything else should be trivial.
>
> Thanks
> David

Looks good to me. The transfer object must have an empty vm_node/vma_node,
since we are only interested in tying the system RAM or VRAM to the ghost
object so that delayed delete can release it while the original buffer is
still valid; the original buffer's vm_node/vma_node must therefore not be
freed.

Cheers,
Jerome
Op 18-07-13 22:54, David Herrmann schreef:
> Hi
>
> On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 07/18/2013 01:07 PM, David Herrmann wrote:
>>> Hi
>>>
>>> On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom
>>> <thellstrom@vmware.com> wrote:
>>>> A quick look, but not a full review:
>>>>
>>>> Looks mostly good, but it looks like the TTM vm lock isn't needed at all
>>>> anymore (provided the vma offset manager is properly protected), since
>>>> kref_get_unless_zero() is used when a reference after lookup is taken.
>>>> (please see the kref_get_unless_zero documentation examples). When
>>>> drm_vma_offset_remove() is called, the kref value is unconditionally
>>>> zero,
>>>> and that should block any successful lookup.
>>> If we drop vm_lock without modifying TTM, this will not work. Even
>>> kref_get_unless_zero() needs some guarantee that the object is still
>>> valid. Assume this scenario:
>>>
>>> drm_vma_offset_lookup() returns a valid node, however, before we call
>>> kref_get_*(), another thread destroys the last reference to the bo so
>>> it gets kfree()d. kref_get_unless_zero() will thus reference memory
>>> which can be _anything_ and is not guaranteed to stay 0.
>>> (Documentation/kref.txt mentions this, too, and I guess it was you who
>>> wrote that).
>>>
>>> I cannot see any locking around the mmap call that could guarantee
>>> this except vm_lock.
>>
>> You're right. My apologies. Parental leave has taken its toll.
>>
>>
>>>> Otherwise, if the vma offset manager always needs external locking to
>>>> make
>>>> lookup + referencing atomic, then perhaps locking should be completely
>>>> left to the caller?
>>> I would actually prefer to keep it in the VMA manager. It allows some
>>> fast-paths which otherwise would need special semantics for the caller
>>> (for instance see the access-management patches which are based on
>>> this patchset or the reduced locking during setup/release).
>>> We'd also
>>> require the caller to use rwsems for good performance, which is not
>>> the case for gem.
>>>
>>> So how about Daniel's proposal to add an _unlocked() version and
>>> provide _lock_lookup() and _unlock_lookup() helpers which just wrap
>>> the read_lock() and read_unlock()?
>>>
>>> Thanks!
>>> David
>>
>> I think that if there are good reasons to keep locking internal, I'm fine
>> with that (and also, of course, with
>> Daniel's proposal). Currently the add / remove / lookup paths are mostly
>> used by TTM during object creation and
>> destruction.
>>
>> However, if the lookup path is ever used by pread / pwrite, that situation
>> might change and we would then like to
>> minimize the locking.
> I tried to keep the change as minimal as I could. Follow-up patches
> are welcome. I just thought pushing the lock into drm_vma_* would
> simplify things. If there are benchmarks that prove me wrong, I'll
> gladly spend some time optimizing that.
>
> Apart from this, I'd like to see someone ack the
> ttm_buffer_object_transfer() changes. I am not very confident about
> that. Everything else should be trivial.
>
The transfer object only exists to kill off the memory backing during
asynchronous eviction, by using the delayed-destroy mechanism. The
re-initialization there looks correct to me.

~Maarten
On 07/18/2013 10:54 PM, David Herrmann wrote:
> Hi
>
> On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:

...

>>
>> I think that if there are good reasons to keep locking internal, I'm fine
>> with that (and also, of course, with
>> Daniel's proposal). Currently the add / remove / lookup paths are mostly
>> used by TTM during object creation and
>> destruction.
>>
>> However, if the lookup path is ever used by pread / pwrite, that situation
>> might change and we would then like to
>> minimize the locking.
> I tried to keep the change as minimal as I could. Follow-up patches
> are welcome. I just thought pushing the lock into drm_vma_* would
> simplify things. If there are benchmarks that prove me wrong, I'll
> gladly spend some time optimizing that.

In the general case, one reason for designing the locking outside of
utilities like this is that different callers may have different
requirements. For example, the call path may be known not to be
multithreaded at all, or the caller may prefer a mutex over a spinlock
for various reasons. It might also be that some callers will want to use
RCU locking in the future if the lookup path becomes busy, and that
would require *all* users to adapt to RCU object destruction...

I haven't looked at the code closely enough to say whether any of this
applies in this particular case, though.

Thanks,
Thomas

> Thanks
> David
Sorry, I forgot to CC correctly.

On Mon, Jul 22, 2013 at 12:53 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
> Hi
>
> On Fri, Jul 19, 2013 at 11:13 AM, Thomas Hellstrom
> <thellstrom@vmware.com> wrote:
>> On 07/18/2013 10:54 PM, David Herrmann wrote:
>>>
>>> Hi
>>>
>>> On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com>
>>> wrote:
>>
>>
>> ...
>>
>>
>>>>
>>>> I think that if there are good reasons to keep locking internal, I'm fine
>>>> with that (and also, of course, with
>>>> Daniel's proposal). Currently the add / remove / lookup paths are mostly
>>>> used by TTM during object creation and
>>>> destruction.
>>>>
>>>> However, if the lookup path is ever used by pread / pwrite, that
>>>> situation
>>>> might change and we would then like to
>>>> minimize the locking.
>>>
>>> I tried to keep the change as minimal as I could. Follow-up patches
>>> are welcome. I just thought pushing the lock into drm_vma_* would
>>> simplify things. If there are benchmarks that prove me wrong, I'll
>>> gladly spend some time optimizing that.
>>
>>
>> In the general case, one reason for designing the locking outside of
>> utilities like this is that different callers may have
>> different requirements. For example, the call path is known not to be
>> multithreaded at all, or
>> the caller may prefer a mutex over a spinlock for various reasons. It might
>> also be that some callers will want to use
>> RCU locking in the future if the lookup path becomes busy, and that would
>> require *all* users to adapt to RCU object destruction...
>>
>> I haven't looked at the code closely enough to say that any of this applies
>> in this particular case, though.
>
> Some notes why I went with local locking:
> - mmap offset creation is done once and is independent of any other
> operations you might do on the BO in parallel
> - mmap lookup is also only done once in most cases.
> User-space rarely
> maps a buffer twice (setup/teardown of mmaps is expensive, but keeping
> them around is very cheap). And for shared buffers only the writer
> maps it, as the reader normally passes it to the kernel without
> mapping/touching it. Only for software rendering do we have two or more
> mappings of the same object.
> - creating/mapping/destroying buffer objects is expensive in its
> current form and buffers tend to stay around for a long time
>
> I couldn't find a situation where the vma-manager was part of a
> performance-critical path. But on the other hand, the paths where it is
> used are quite heavy. That's why I don't want to lock the whole buffer
> or even device. Instead, we need some "management lock" which is used
> for mmap setup or similar things that don't affect other operations on
> the buffer or device. We don't have such a lock, so I introduced local
> locking. If we end up with more use-cases, I volunteer to refactor
> this. But I am no big fan of overgeneralizing it before more uses are
> known.
>
> Anyway, I will resend with vm_lock removed (+ _unlocked() helpers) so
> we keep the status quo regarding locks. Thanks for the comments on TTM
> buffer transfer.
>
> Thanks
> David
On 07/22/2013 12:55 PM, David Herrmann wrote:
> Sorry, I forgot to CC correctly.
>
> On Mon, Jul 22, 2013 at 12:53 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
>> Hi
>>
>> On Fri, Jul 19, 2013 at 11:13 AM, Thomas Hellstrom
>> <thellstrom@vmware.com> wrote:
>>> On 07/18/2013 10:54 PM, David Herrmann wrote:
>>>> Hi
>>>>
>>>> On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com>
>>>> wrote:
>>>
>>> ...
>>>
>>>
>>>>> I think that if there are good reasons to keep locking internal, I'm fine
>>>>> with that (and also, of course, with
>>>>> Daniel's proposal). Currently the add / remove / lookup paths are mostly
>>>>> used by TTM during object creation and
>>>>> destruction.
>>>>>
>>>>> However, if the lookup path is ever used by pread / pwrite, that
>>>>> situation
>>>>> might change and we would then like to
>>>>> minimize the locking.
>>>> I tried to keep the change as minimal as I could. Follow-up patches
>>>> are welcome. I just thought pushing the lock into drm_vma_* would
>>>> simplify things. If there are benchmarks that prove me wrong, I'll
>>>> gladly spend some time optimizing that.
>>>
>>> In the general case, one reason for designing the locking outside of
>>> utilities like this is that different callers may have
>>> different requirements. For example, the call path is known not to be
>>> multithreaded at all, or
>>> the caller may prefer a mutex over a spinlock for various reasons. It might
>>> also be that some callers will want to use
>>> RCU locking in the future if the lookup path becomes busy, and that would
>>> require *all* users to adapt to RCU object destruction...
>>>
>>> I haven't looked at the code closely enough to say that any of this applies
>>> in this particular case, though.
>> Some notes why I went with local locking:
>> - mmap offset creation is done once and is independent of any other
>> operations you might do on the BO in parallel
>> - mmap lookup is also only done once in most cases.
>> User-space rarely
>> maps a buffer twice (setup/teardown of mmaps is expensive, but keeping
>> them around is very cheap). And for shared buffers only the writer
>> maps it, as the reader normally passes it to the kernel without
>> mapping/touching it. Only for software rendering do we have two or more
>> mappings of the same object.
>> - creating/mapping/destroying buffer objects is expensive in its
>> current form and buffers tend to stay around for a long time
>>
>> I couldn't find a situation where the vma-manager was part of a
>> performance-critical path. But on the other hand, the paths where it is
>> used are quite heavy. That's why I don't want to lock the whole buffer
>> or even device. Instead, we need some "management lock" which is used
>> for mmap setup or similar things that don't affect other operations on
>> the buffer or device. We don't have such a lock, so I introduced local
>> locking. If we end up with more use-cases, I volunteer to refactor
>> this. But I am no big fan of overgeneralizing it before more uses are
>> known.

I think we are discussing two different things here:

1) Having a separate lock for vma management, vs.
2) building that lock into the vma manager utility.

You're arguing for 1) and I completely agree with you, and although I
still think one generally should avoid building locks into utilities
like this (avoiding 2), I'm fine with the current approach.

Thanks,
Thomas
diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index f60fd7b..c195dc2 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -487,7 +487,7 @@ void ast_gem_free_object(struct drm_gem_object *obj)
 
 static inline u64 ast_bo_mmap_offset(struct ast_bo *bo)
 {
-	return bo->bo.addr_space_offset;
+	return drm_vma_node_offset_addr(&bo->bo.vma_node);
 }
 int
 ast_dumb_mmap_offset(struct drm_file *file,
diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c
index 35cbae8..3a7a0ef 100644
--- a/drivers/gpu/drm/cirrus/cirrus_main.c
+++ b/drivers/gpu/drm/cirrus/cirrus_main.c
@@ -294,7 +294,7 @@ void cirrus_gem_free_object(struct drm_gem_object *obj)
 
 static inline u64 cirrus_bo_mmap_offset(struct cirrus_bo *bo)
 {
-	return bo->bo.addr_space_offset;
+	return drm_vma_node_offset_addr(&bo->bo.vma_node);
 }
 
 int
diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c
index 9fa5685..1a75ea3 100644
--- a/drivers/gpu/drm/mgag200/mgag200_main.c
+++ b/drivers/gpu/drm/mgag200/mgag200_main.c
@@ -349,7 +349,7 @@ void mgag200_gem_free_object(struct drm_gem_object *obj)
 
 static inline u64 mgag200_bo_mmap_offset(struct mgag200_bo *bo)
 {
-	return bo->bo.addr_space_offset;
+	return drm_vma_node_offset_addr(&bo->bo.vma_node);
 }
 
 int
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 708b2d1..7a8caa1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -697,7 +697,7 @@ nouveau_display_dumb_map_offset(struct drm_file *file_priv,
 	gem = drm_gem_object_lookup(dev, file_priv, handle);
 	if (gem) {
 		struct nouveau_bo *bo = gem->driver_private;
-		*poffset = bo->bo.addr_space_offset;
+		*poffset = drm_vma_node_offset_addr(&bo->bo.vma_node);
 		drm_gem_object_unreference_unlocked(gem);
 		return 0;
 	}
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index e72d09c..86597eb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -226,7 +226,7 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem,
 	}
 
 	rep->size = nvbo->bo.mem.num_pages << PAGE_SHIFT;
-	rep->map_handle = nvbo->bo.addr_space_offset;
+	rep->map_handle = drm_vma_node_offset_addr(&nvbo->bo.vma_node);
 	rep->tile_mode = nvbo->tile_mode;
 	rep->tile_flags = nvbo->tile_flags;
 	return 0;
diff --git a/drivers/gpu/drm/qxl/qxl_object.h b/drivers/gpu/drm/qxl/qxl_object.h
index ee7ad79..af10165 100644
--- a/drivers/gpu/drm/qxl/qxl_object.h
+++ b/drivers/gpu/drm/qxl/qxl_object.h
@@ -59,7 +59,7 @@ static inline unsigned long qxl_bo_size(struct qxl_bo *bo)
 
 static inline u64 qxl_bo_mmap_offset(struct qxl_bo *bo)
 {
-	return bo->tbo.addr_space_offset;
+	return drm_vma_node_offset_addr(&bo->tbo.vma_node);
 }
 
 static inline int qxl_bo_wait(struct qxl_bo *bo, u32 *mem_type,
diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
index b443d67..1a648e1 100644
--- a/drivers/gpu/drm/qxl/qxl_release.c
+++ b/drivers/gpu/drm/qxl/qxl_release.c
@@ -87,7 +87,7 @@ qxl_release_free(struct qxl_device *qdev,
 	for (i = 0 ; i < release->bo_count; ++i) {
 		QXL_INFO(qdev, "release %llx\n",
-			release->bos[i]->tbo.addr_space_offset
+			drm_vma_node_offset_addr(&release->bos[i]->tbo.vma_node)
						- DRM_FILE_OFFSET);
 		qxl_fence_remove_release(&release->bos[i]->fence, release->id);
 		qxl_bo_unref(&release->bos[i]);
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index 91519a5..188c682 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -113,13 +113,10 @@ static inline unsigned radeon_bo_gpu_page_alignment(struct radeon_bo *bo)
  * @bo: radeon object for which we query the offset
  *
  * Returns mmap offset of the object.
- *
- * Note: addr_space_offset is constant after ttm bo init thus isn't protected
- * by any lock.
  */
 static inline u64 radeon_bo_mmap_offset(struct radeon_bo *bo)
 {
-	return bo->tbo.addr_space_offset;
+	return drm_vma_node_offset_addr(&bo->tbo.vma_node);
 }
 
 extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type,
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index cb9dd67..245850b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -616,11 +616,7 @@ static void ttm_bo_release(struct kref *kref)
 	struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type];
 
 	write_lock(&bdev->vm_lock);
-	if (likely(bo->vm_node != NULL)) {
-		rb_erase(&bo->vm_rb, &bdev->addr_space_rb);
-		drm_mm_put_block(bo->vm_node);
-		bo->vm_node = NULL;
-	}
+	drm_vma_offset_remove(&bdev->vma_manager, &bo->vma_node);
 	write_unlock(&bdev->vm_lock);
 	ttm_mem_io_lock(man, false);
 	ttm_mem_io_free_vm(bo);
@@ -1129,6 +1125,7 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
 		bo->resv = &bo->ttm_resv;
 	reservation_object_init(bo->resv);
 	atomic_inc(&bo->glob->bo_count);
+	drm_vma_node_reset(&bo->vma_node);
 
 	ret = ttm_bo_check_placement(bo, placement);
@@ -1424,9 +1421,8 @@ int ttm_bo_device_release(struct ttm_bo_device *bdev)
 		TTM_DEBUG("Swap list was clean\n");
 	spin_unlock(&glob->lru_lock);
 
-	BUG_ON(!drm_mm_clean(&bdev->addr_space_mm));
 	write_lock(&bdev->vm_lock);
-	drm_mm_takedown(&bdev->addr_space_mm);
+	drm_vma_offset_manager_destroy(&bdev->vma_manager);
 	write_unlock(&bdev->vm_lock);
 
 	return ret;
@@ -1454,9 +1450,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
 	if (unlikely(ret != 0))
 		goto out_no_sys;
 
-	bdev->addr_space_rb = RB_ROOT;
-	drm_mm_init(&bdev->addr_space_mm, file_page_offset, 0x10000000);
-
+	drm_vma_offset_manager_init(&bdev->vma_manager, file_page_offset,
+				    0x10000000);
 	INIT_DELAYED_WORK(&bdev->wq, ttm_bo_delayed_workqueue);
 	INIT_LIST_HEAD(&bdev->ddestroy);
 	bdev->dev_mapping = NULL;
@@ -1498,12 +1493,17 @@ bool ttm_mem_reg_is_pci(struct ttm_bo_device *bdev, struct ttm_mem_reg *mem)
 void ttm_bo_unmap_virtual_locked(struct ttm_buffer_object *bo)
 {
 	struct ttm_bo_device *bdev = bo->bdev;
-	loff_t offset = (loff_t) bo->addr_space_offset;
-	loff_t holelen = ((loff_t) bo->mem.num_pages) << PAGE_SHIFT;
+	loff_t offset, holelen;
 
 	if (!bdev->dev_mapping)
 		return;
-	unmap_mapping_range(bdev->dev_mapping, offset, holelen, 1);
+
+	if (drm_vma_node_has_offset(&bo->vma_node)) {
+		offset = (loff_t) drm_vma_node_offset_addr(&bo->vma_node);
+		holelen = ((loff_t) bo->mem.num_pages) << PAGE_SHIFT;
+
+		unmap_mapping_range(bdev->dev_mapping, offset, holelen, 1);
+	}
 	ttm_mem_io_free_vm(bo);
 }
@@ -1520,31 +1520,6 @@ void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo)
 
 EXPORT_SYMBOL(ttm_bo_unmap_virtual);
 
-static void ttm_bo_vm_insert_rb(struct ttm_buffer_object *bo)
-{
-	struct ttm_bo_device *bdev = bo->bdev;
-	struct rb_node **cur = &bdev->addr_space_rb.rb_node;
-	struct rb_node *parent = NULL;
-	struct ttm_buffer_object *cur_bo;
-	unsigned long offset = bo->vm_node->start;
-	unsigned long cur_offset;
-
-	while (*cur) {
-		parent = *cur;
-		cur_bo = rb_entry(parent, struct ttm_buffer_object, vm_rb);
-		cur_offset = cur_bo->vm_node->start;
-		if (offset < cur_offset)
-			cur = &parent->rb_left;
-		else if (offset > cur_offset)
-			cur = &parent->rb_right;
-		else
-			BUG();
-	}
-
-	rb_link_node(&bo->vm_rb, parent, cur);
-	rb_insert_color(&bo->vm_rb, &bdev->addr_space_rb);
-}
-
 /**
  * ttm_bo_setup_vm:
  *
@@ -1559,38 +1534,9 @@ static void ttm_bo_vm_insert_rb(struct ttm_buffer_object *bo)
 static int ttm_bo_setup_vm(struct ttm_buffer_object *bo)
 {
 	struct ttm_bo_device *bdev = bo->bdev;
-	int ret;
-
-retry_pre_get:
-	ret = drm_mm_pre_get(&bdev->addr_space_mm);
-	if (unlikely(ret != 0))
-		return ret;
 
-	write_lock(&bdev->vm_lock);
-	bo->vm_node = drm_mm_search_free(&bdev->addr_space_mm,
-					 bo->mem.num_pages, 0, 0);
-
-	if (unlikely(bo->vm_node == NULL)) {
-		ret = -ENOMEM;
-		goto out_unlock;
-	}
-
-	bo->vm_node = drm_mm_get_block_atomic(bo->vm_node,
-					      bo->mem.num_pages, 0);
-
-	if (unlikely(bo->vm_node == NULL)) {
-		write_unlock(&bdev->vm_lock);
-		goto retry_pre_get;
-	}
-
-	ttm_bo_vm_insert_rb(bo);
-	write_unlock(&bdev->vm_lock);
-	bo->addr_space_offset = ((uint64_t) bo->vm_node->start) << PAGE_SHIFT;
-
-	return 0;
-out_unlock:
-	write_unlock(&bdev->vm_lock);
-	return ret;
+	return drm_vma_offset_add(&bdev->vma_manager, &bo->vma_node,
+				  bo->mem.num_pages);
 }
 
 int ttm_bo_wait(struct ttm_buffer_object *bo,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 319cf41..7cc904d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -30,6 +30,7 @@
 
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_placement.h>
+#include <drm/drm_vma_manager.h>
 #include <linux/io.h>
 #include <linux/highmem.h>
 #include <linux/wait.h>
@@ -450,7 +451,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	INIT_LIST_HEAD(&fbo->lru);
 	INIT_LIST_HEAD(&fbo->swap);
 	INIT_LIST_HEAD(&fbo->io_reserve_lru);
-	fbo->vm_node = NULL;
+	drm_vma_node_reset(&fbo->vma_node);
 	atomic_set(&fbo->cpu_writers, 0);
 
 	spin_lock(&bdev->fence_lock);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 3df9f16..54a67f1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -33,6 +33,7 @@
 #include <ttm/ttm_module.h>
 #include <ttm/ttm_bo_driver.h>
 #include <ttm/ttm_placement.h>
+#include <drm/drm_vma_manager.h>
 #include <linux/mm.h>
 #include <linux/rbtree.h>
 #include <linux/module.h>
@@ -40,37 +41,6 @@
 
 #define TTM_BO_VM_NUM_PREFAULT 16
 
-static struct ttm_buffer_object *ttm_bo_vm_lookup_rb(struct ttm_bo_device *bdev,
-						     unsigned long page_start,
-						     unsigned long num_pages)
-{
-	struct rb_node *cur = bdev->addr_space_rb.rb_node;
-	unsigned long cur_offset;
-	struct ttm_buffer_object *bo;
-	struct ttm_buffer_object *best_bo = NULL;
-
-	while (likely(cur != NULL)) {
-		bo = rb_entry(cur, struct ttm_buffer_object, vm_rb);
-		cur_offset = bo->vm_node->start;
-		if (page_start >= cur_offset) {
-			cur = cur->rb_right;
-			best_bo = bo;
-			if (page_start == cur_offset)
-				break;
-		} else
-			cur = cur->rb_left;
-	}
-
-	if (unlikely(best_bo == NULL))
-		return NULL;
-
-	if (unlikely((best_bo->vm_node->start + best_bo->num_pages) <
-		     (page_start + num_pages)))
-		return NULL;
-
-	return best_bo;
-}
-
 static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct ttm_buffer_object *bo = (struct ttm_buffer_object *)
@@ -146,9 +116,9 @@ static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	page_offset = ((address - vma->vm_start) >> PAGE_SHIFT) +
-	    bo->vm_node->start - vma->vm_pgoff;
+	    drm_vma_node_start(&bo->vma_node) - vma->vm_pgoff;
 	page_last = vma_pages(vma) +
-	    bo->vm_node->start - vma->vm_pgoff;
+	    drm_vma_node_start(&bo->vma_node) - vma->vm_pgoff;
 
 	if (unlikely(page_offset >= bo->num_pages)) {
 		retval = VM_FAULT_SIGBUS;
@@ -249,6 +219,30 @@ static const struct vm_operations_struct ttm_bo_vm_ops = {
 	.close = ttm_bo_vm_close
 };
 
+static struct ttm_buffer_object *ttm_bo_vm_lookup(struct ttm_bo_device *bdev,
+						  unsigned long offset,
+						  unsigned long pages)
+{
+	struct drm_vma_offset_node *node;
+	struct ttm_buffer_object *bo = NULL;
+
+	read_lock(&bdev->vm_lock);
+
+	node = drm_vma_offset_lookup(&bdev->vma_manager, offset, pages);
+	if (likely(node)) {
+		bo = container_of(node, struct ttm_buffer_object, vma_node);
+		if (!kref_get_unless_zero(&bo->kref))
+			bo = NULL;
+	}
+
+	read_unlock(&bdev->vm_lock);
+
+	if (!bo)
+		pr_err("Could not find buffer object to map\n");
+
+	return bo;
+}
+
 int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma,
 		struct ttm_bo_device *bdev)
 {
@@ -256,17 +250,9 @@ int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma,
 	struct ttm_buffer_object *bo;
 	int ret;
 
-	read_lock(&bdev->vm_lock);
-	bo = ttm_bo_vm_lookup_rb(bdev, vma->vm_pgoff,
-				 vma_pages(vma));
-	if (likely(bo != NULL) && !kref_get_unless_zero(&bo->kref))
-		bo = NULL;
-	read_unlock(&bdev->vm_lock);
-
-	if (unlikely(bo == NULL)) {
-		pr_err("Could not find buffer object to map\n");
+	bo = ttm_bo_vm_lookup(bdev, vma->vm_pgoff, vma_pages(vma));
+	if (unlikely(!bo))
 		return -EINVAL;
-	}
 
 	driver = bo->bdev->driver;
 	if (unlikely(!driver->verify_access)) {
@@ -324,12 +310,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct file *filp,
 	bool no_wait = false;
 	bool dummy;
 
-	read_lock(&bdev->vm_lock);
-	bo = ttm_bo_vm_lookup_rb(bdev, dev_offset, 1);
-	if (likely(bo != NULL))
-		ttm_bo_reference(bo);
-	read_unlock(&bdev->vm_lock);
-
+	bo = ttm_bo_vm_lookup(bdev, dev_offset, 1);
 	if (unlikely(bo == NULL))
 		return -EFAULT;
 
@@ -343,7 +324,7 @@ ssize_t ttm_bo_io(struct ttm_bo_device *bdev, struct file *filp,
 	if (unlikely(ret != 0))
 		goto out_unref;
 
-	kmap_offset = dev_offset - bo->vm_node->start;
+	kmap_offset = dev_offset - drm_vma_node_start(&bo->vma_node);
 	if (unlikely(kmap_offset >= bo->num_pages)) {
 		ret = -EFBIG;
 		goto out_unref;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index 7953d1f..0e67cf4 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -500,7 +500,7 @@ int vmw_dmabuf_alloc_ioctl(struct drm_device *dev, void *data,
 		goto out_no_dmabuf;
 
 	rep->handle = handle;
-	rep->map_handle = dma_buf->base.addr_space_offset;
+	rep->map_handle = drm_vma_node_offset_addr(&dma_buf->base.vma_node);
 	rep->cur_gmr_id = handle;
 	rep->cur_gmr_offset = 0;
 
@@ -834,7 +834,7 @@ int vmw_dumb_map_offset(struct drm_file *file_priv,
 	if (ret != 0)
 		return -EINVAL;
 
-	*offset = out_buf->base.addr_space_offset;
+	*offset = drm_vma_node_offset_addr(&out_buf->base.vma_node);
 	vmw_dmabuf_unreference(&out_buf);
 	return 0;
 }
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 8a6aa56..751eaff 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -32,12 +32,12 @@
 #define _TTM_BO_API_H_
 
 #include <drm/drm_hashtab.h>
+#include <drm/drm_vma_manager.h>
 #include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/wait.h>
 #include <linux/mutex.h>
 #include <linux/mm.h>
-#include <linux/rbtree.h>
 #include <linux/bitmap.h>
 #include <linux/reservation.h>
 
@@ -145,7 +145,6 @@ struct ttm_tt;
  * @type: The bo type.
  * @destroy: Destruction function. If NULL, kfree is used.
  * @num_pages: Actual number of pages.
- * @addr_space_offset: Address space offset.
  * @acc_size: Accounted size for this object.
  * @kref: Reference count of this buffer object. When this refcount reaches
  * zero, the object is put on the delayed delete list.
@@ -166,8 +165,7 @@ struct ttm_tt;
  * @swap: List head for swap LRU list.
  * @sync_obj: Pointer to a synchronization object.
  * @priv_flags: Flags describing buffer object internal state.
- * @vm_rb: Rb node for the vm rb tree.
- * @vm_node: Address space manager node.
+ * @vma_node: Address space manager node.
  * @offset: The current GPU offset, which can have different meanings
  * depending on the memory type. For SYSTEM type memory, it should be 0.
  * @cur_placement: Hint of current placement.
@@ -194,7 +192,6 @@ struct ttm_buffer_object {
 	enum ttm_bo_type type;
 	void (*destroy) (struct ttm_buffer_object *);
 	unsigned long num_pages;
-	uint64_t addr_space_offset;
 	size_t acc_size;
 
 	/**
@@ -238,13 +235,7 @@ struct ttm_buffer_object {
 	void *sync_obj;
 	unsigned long priv_flags;
 
-	/**
-	 * Members protected by the bdev::vm_lock
-	 */
-
-	struct rb_node vm_rb;
-	struct drm_mm_node *vm_node;
-
+	struct drm_vma_offset_node vma_node;
 
 	/**
	 * Special members that are protected by the reserve lock
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 984fc2d..d3b8479 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -36,6 +36,7 @@
 #include <ttm/ttm_placement.h>
 #include <drm/drm_mm.h>
 #include <drm/drm_global.h>
+#include <drm/drm_vma_manager.h>
 #include <linux/workqueue.h>
 #include <linux/fs.h>
 #include <linux/spinlock.h>
@@ -519,7 +520,7 @@ struct ttm_bo_global {
  * @man: An array of mem_type_managers.
  * @fence_lock: Protects the synchronizing members on *all* bos belonging
  * to this device.
- * @addr_space_mm: Range manager for the device address space.
+ * @vma_manager: Address space manager
  * lru_lock: Spinlock that protects the buffer+device lru lists and
  * ddestroy lists.
  * @val_seq: Current validation sequence.
@@ -540,11 +541,11 @@ struct ttm_bo_device {
 	rwlock_t vm_lock;
 	struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES];
 	spinlock_t fence_lock;
+
 	/*
	 * Protected by the vm lock.
	 */
-	struct rb_root addr_space_rb;
-	struct drm_mm addr_space_mm;
+	struct drm_vma_offset_manager vma_manager;
 
 	/*
	 * Protected by the global:lru lock.
Use the new vma-manager infrastructure. This doesn't change any
implementation details as the vma-offset-manager is nearly copied 1-to-1
from TTM.

Even though the vma-manager uses its own locks, we still need bo->vm_lock
to prevent bos from being destroyed before we can get a reference during
lookup. However, this lock is not needed during vm-setup as we still
hold a reference there.

This also drops the addr_space_offset member as it is a copy of vm_start
in vma_node objects. Use the accessor functions instead.

Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Martin Peres <martin.peres@labri.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
 drivers/gpu/drm/ast/ast_main.c            |  2 +-
 drivers/gpu/drm/cirrus/cirrus_main.c      |  2 +-
 drivers/gpu/drm/mgag200/mgag200_main.c    |  2 +-
 drivers/gpu/drm/nouveau/nouveau_display.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c     |  2 +-
 drivers/gpu/drm/qxl/qxl_object.h          |  2 +-
 drivers/gpu/drm/qxl/qxl_release.c         |  2 +-
 drivers/gpu/drm/radeon/radeon_object.h    |  5 +-
 drivers/gpu/drm/ttm/ttm_bo.c              | 84 ++++++------------------------
 drivers/gpu/drm/ttm/ttm_bo_util.c         |  3 +-
 drivers/gpu/drm/ttm/ttm_bo_vm.c           | 81 ++++++++++++-----------------
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c  |  4 +-
 include/drm/ttm/ttm_bo_api.h              | 15 ++----
 include/drm/ttm/ttm_bo_driver.h           |  7 +--
 14 files changed, 65 insertions(+), 148 deletions(-)