From patchwork Wed Oct 2 13:47:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Thomas_Hellstr=C3=B6m_=28Intel=29?= X-Patchwork-Id: 11171089 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C731A112B for ; Wed, 2 Oct 2019 13:48:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 791BD21920 for ; Wed, 2 Oct 2019 13:48:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=shipmail.org header.i=@shipmail.org header.b="DBIjtaSV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 791BD21920 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=shipmail.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A95EC6B000A; Wed, 2 Oct 2019 09:47:51 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id A1C9B6B000D; Wed, 2 Oct 2019 09:47:51 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 90B036B000E; Wed, 2 Oct 2019 09:47:51 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 5381F6B000A for ; Wed, 2 Oct 2019 09:47:51 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 9F5CD824CA27 for ; Wed, 2 Oct 2019 13:47:50 +0000 (UTC) X-FDA: 75998972700.10.blade99_a22d6cb20838 X-Spam-Summary: 2,0,0,20618b4fe52ca668,d41d8cd98f00b204,thomas_os@shipmail.org,::linux-kernel@vger.kernel.org:torvalds@linux-foundation.org:thellstrom@vmware.com:akpm@linux-foundation.org:willy@infradead.org:will.deacon@arm.com:peterz@infradead.org:riel@surriel.com:minchan@kernel.org:mhocko@suse.com:ying.huang@intel.com:jglisse@redhat.com:kirill@shutemov.name:drawat@vmware.com,RULES_HIT:4:41:152:355:379:541:800:960:966:973:988:989:1260:1261:1277:1311:1313:1314:1345:1359:1431:1437:1515:1516:1518:1593:1594:1605:1676:1730:1747:1777:1792:1801:2196:2198:2199:2200:2393:2553:2559:2562:2639:2693:2731:2897:2898:2899:2904:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4250:4321:4385:4605:5007:6261:6653:6742:7576:7875:7903:7974:9036:9592:10004:10226:10394:11026:11232:11233:11473:11657:11658:11876:11914:12043:12296:12297:12438:12517:12519:12555:12679:12895:12986:13255:13868:13894:14196:14394:14659:21080:21324:21325:21433:21451:21611:21627:21789:21790:21795:21939:30012:3 0034:300 X-HE-Tag: blade99_a22d6cb20838 X-Filterd-Recvd-Size: 16055 Received: from pio-pvt-msa3.bahnhof.se (pio-pvt-msa3.bahnhof.se [79.136.2.42]) by imf34.hostedemail.com (Postfix) with ESMTP for ; Wed, 2 Oct 2019 13:47:49 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by pio-pvt-msa3.bahnhof.se (Postfix) with ESMTP id 456473F9E0; Wed, 2 Oct 2019 15:47:43 +0200 (CEST) Authentication-Results: pio-pvt-msa3.bahnhof.se; dkim=pass (1024-bit key; unprotected) header.d=shipmail.org header.i=@shipmail.org header.b=DBIjtaSV; dkim-atps=neutral X-Virus-Scanned: Debian amavisd-new at bahnhof.se X-Spam-Flag: NO X-Spam-Score: -2.1 X-Spam-Level: X-Spam-Status: No, score=-2.1 tagged_above=-999 required=6.31 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1] autolearn=ham autolearn_force=no Received: from pio-pvt-msa3.bahnhof.se ([127.0.0.1]) by localhost (pio-pvt-msa3.bahnhof.se [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id O94t6RWudTbT; Wed, 2 Oct 2019 15:47:41 +0200 (CEST) Received: from mail1.shipmail.org (h-205-35.A357.priv.bahnhof.se [155.4.205.35]) (Authenticated sender: mb878879) by pio-pvt-msa3.bahnhof.se (Postfix) with ESMTPA id 0867F3F9AD; Wed, 2 Oct 2019 15:47:40 +0200 (CEST) Received: from localhost.localdomain.localdomain (h-205-35.A357.priv.bahnhof.se [155.4.205.35]) by mail1.shipmail.org (Postfix) with ESMTPSA id 95E26360D9B; Wed, 2 Oct 2019 15:47:40 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=shipmail.org; s=mail; t=1570024060; bh=PpTFndekEIH3wph+G0WfgFcRnZ/FZuPs3QhIZYFgzKs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DBIjtaSVxqU48yM+a8Y1ly5Ty8mAwziYen2VHVWSUykFVOIK4N2kxNgb5POgVipKq H00NDbpw20fR+pLZ3Bnns8i5QFz58HLbBLdVOXUtQ6qWAQ2RjVEbm9u4Mmys2ahE1/ I9GbH32zlow6ai0ViJjsURBOwA09098yIAErvMQs= From: =?utf-8?q?Thomas_Hellstr=C3=B6m_=28VMware=29?= To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: torvalds@linux-foundation.org, Thomas Hellstrom , Andrew Morton , Matthew Wilcox , Will Deacon , Peter Zijlstra , Rik van Riel , Minchan Kim , Michal Hocko , Huang Ying , =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , "Kirill A . Shutemov" , Deepak Rawat Subject: [PATCH v3 6/7] drm/vmwgfx: Implement an infrastructure for read-coherent resources Date: Wed, 2 Oct 2019 15:47:29 +0200 Message-Id: <20191002134730.40985-7-thomas_os@shipmail.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191002134730.40985-1-thomas_os@shipmail.org> References: <20191002134730.40985-1-thomas_os@shipmail.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Thomas Hellstrom Similar to write-coherent resources, make sure that from the user-space point of view, GPU rendered contents is automatically available for reading by the CPU. Cc: Andrew Morton Cc: Matthew Wilcox Cc: Will Deacon Cc: Peter Zijlstra Cc: Rik van Riel Cc: Minchan Kim Cc: Michal Hocko Cc: Huang Ying Cc: Jérôme Glisse Cc: Kirill A. Shutemov Signed-off-by: Thomas Hellstrom Reviewed-by: Deepak Rawat --- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 7 +- drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c | 75 ++++++++++++- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 103 +++++++++++++++++- drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h | 2 + drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 3 +- 5 files changed, 179 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index 53f8522ae032..729a2e93acf1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -680,7 +680,8 @@ extern void vmw_resource_unreference(struct vmw_resource **p_res); extern struct vmw_resource *vmw_resource_reference(struct vmw_resource *res); extern struct vmw_resource * vmw_resource_reference_unless_doomed(struct vmw_resource *res); -extern int vmw_resource_validate(struct vmw_resource *res, bool intr); +extern int vmw_resource_validate(struct vmw_resource *res, bool intr, + bool dirtying); extern int vmw_resource_reserve(struct vmw_resource *res, bool interruptible, bool no_backup); extern bool vmw_resource_needs_backup(const struct vmw_resource *res); @@ -724,6 +725,8 @@ void vmw_resource_mob_attach(struct vmw_resource *res); void vmw_resource_mob_detach(struct vmw_resource *res); void vmw_resource_dirty_update(struct vmw_resource *res, pgoff_t start, pgoff_t end); +int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start, + pgoff_t end, pgoff_t *num_prefault); /** * vmw_resource_mob_attached - Whether a resource currently has a mob attached @@ -1417,6 +1420,8 @@ int vmw_bo_dirty_add(struct vmw_buffer_object *vbo); void vmw_bo_dirty_transfer_to_res(struct vmw_resource *res); void vmw_bo_dirty_clear_res(struct vmw_resource *res); void vmw_bo_dirty_release(struct vmw_buffer_object *vbo); +void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo, + pgoff_t start, pgoff_t end); vm_fault_t vmw_bo_vm_fault(struct vm_fault *vmf); vm_fault_t vmw_bo_vm_mkwrite(struct vm_fault *vmf); diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c index 2a5aa4db16e5..fe8bb76d6360 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c @@ -153,7 +153,6 @@ static void vmw_bo_dirty_scan_mkwrite(struct vmw_buffer_object *vbo) } } - /** * vmw_bo_dirty_scan - Scan for dirty pages and add them to the dirty * tracking structure @@ -171,6 +170,51 @@ void vmw_bo_dirty_scan(struct vmw_buffer_object *vbo) vmw_bo_dirty_scan_mkwrite(vbo); } +/** + * vmw_bo_dirty_pre_unmap - write-protect and pick up dirty pages before + * an unmap_mapping_range operation. + * @vbo: The buffer object, + * @start: First page of the range within the buffer object. + * @end: Last page of the range within the buffer object + 1. + * + * If we're using the _PAGETABLE scan method, we may leak dirty pages + * when calling unmap_mapping_range(). This function makes sure we pick + * up all dirty pages. + */ +static void vmw_bo_dirty_pre_unmap(struct vmw_buffer_object *vbo, + pgoff_t start, pgoff_t end) +{ + struct vmw_bo_dirty *dirty = vbo->dirty; + unsigned long offset = drm_vma_node_start(&vbo->base.base.vma_node); + struct address_space *mapping = vbo->base.bdev->dev_mapping; + + if (dirty->method != VMW_BO_DIRTY_PAGETABLE || start >= end) + return; + + as_dirty_wrprotect(mapping, start + offset, end - start); + as_dirty_clean(mapping, start + offset, end - start, offset, + &dirty->bitmap[0], &dirty->start, &dirty->end); +} + +/** + * vmw_bo_dirty_unmap - Clear all ptes pointing to a range within a bo + * @vbo: The buffer object, + * @start: First page of the range within the buffer object. + * @end: Last page of the range within the buffer object + 1. + * + * This is similar to ttm_bo_unmap_virtual_locked() except it takes a subrange. + */ +void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo, + pgoff_t start, pgoff_t end) +{ + unsigned long offset = drm_vma_node_start(&vbo->base.base.vma_node); + struct address_space *mapping = vbo->base.bdev->dev_mapping; + + vmw_bo_dirty_pre_unmap(vbo, start, end); + unmap_shared_mapping_range(mapping, (offset + start) << PAGE_SHIFT, + (loff_t) (end - start) << PAGE_SHIFT); +} + /** * vmw_bo_dirty_add - Add a dirty-tracking user to a buffer object * @vbo: The buffer object @@ -397,21 +441,42 @@ vm_fault_t vmw_bo_vm_fault(struct vm_fault *vmf) if (ret) return ret; + num_prefault = (vma->vm_flags & VM_RAND_READ) ? 1 : + TTM_BO_VM_NUM_PREFAULT; + + if (vbo->dirty) { + pgoff_t allowed_prefault; + unsigned long page_offset; + + page_offset = vmf->pgoff - + drm_vma_node_start(&bo->base.vma_node); + if (page_offset >= bo->num_pages || + vmw_resources_clean(vbo, page_offset, + page_offset + PAGE_SIZE, + &allowed_prefault)) { + ret = VM_FAULT_SIGBUS; + goto out_unlock; + } + + num_prefault = min(num_prefault, allowed_prefault); + } + /* - * This will cause mkwrite() to be called for each pte on - * write-enable vmas. + * If we don't track dirty using the MKWRITE method, make sure + * sure the page protection is write-enabled so we don't get + * a lot of unnecessary write faults. */ if (vbo->dirty && vbo->dirty->method == VMW_BO_DIRTY_MKWRITE) prot = vma->vm_page_prot; else prot = vm_get_page_prot(vma->vm_flags); - num_prefault = (vma->vm_flags & VM_RAND_READ) ? 0 : - TTM_BO_VM_NUM_PREFAULT; ret = ttm_bo_vm_fault_reserved(vmf, prot, num_prefault); if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) return ret; +out_unlock: dma_resv_unlock(bo->base.resv); + return ret; } diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c index 328ad46076ff..c76faf33972e 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c @@ -393,7 +393,8 @@ static int vmw_resource_buf_alloc(struct vmw_resource *res, * should be retried once resources have been freed up. */ static int vmw_resource_do_validate(struct vmw_resource *res, - struct ttm_validate_buffer *val_buf) + struct ttm_validate_buffer *val_buf, + bool dirtying) { int ret = 0; const struct vmw_res_func *func = res->func; @@ -435,6 +436,15 @@ static int vmw_resource_do_validate(struct vmw_resource *res, * the resource. */ if (res->dirty) { + if (dirtying && !res->res_dirty) { + pgoff_t start = res->backup_offset >> PAGE_SHIFT; + pgoff_t end = __KERNEL_DIV_ROUND_UP + (res->backup_offset + res->backup_size, + PAGE_SIZE); + + vmw_bo_dirty_unmap(res->backup, start, end); + } + vmw_bo_dirty_transfer_to_res(res); return func->dirty_sync(res); } @@ -679,6 +689,7 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket, * to the device. * @res: The resource to make visible to the device. * @intr: Perform waits interruptible if possible. + * @dirtying: Pending GPU operation will dirty the resource * * On succesful return, any backup DMA buffer pointed to by @res->backup will * be reserved and validated. @@ -688,7 +699,8 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket, * Return: Zero on success, -ERESTARTSYS if interrupted, negative error code * on failure. */ -int vmw_resource_validate(struct vmw_resource *res, bool intr) +int vmw_resource_validate(struct vmw_resource *res, bool intr, + bool dirtying) { int ret; struct vmw_resource *evict_res; @@ -705,7 +717,7 @@ int vmw_resource_validate(struct vmw_resource *res, bool intr) if (res->backup) val_buf.bo = &res->backup->base; do { - ret = vmw_resource_do_validate(res, &val_buf); + ret = vmw_resource_do_validate(res, &val_buf, dirtying); if (likely(ret != -EBUSY)) break; @@ -1005,7 +1017,7 @@ int vmw_resource_pin(struct vmw_resource *res, bool interruptible) /* Do we really need to pin the MOB as well? */ vmw_bo_pin_reserved(vbo, true); } - ret = vmw_resource_validate(res, interruptible); + ret = vmw_resource_validate(res, interruptible, true); if (vbo) ttm_bo_unreserve(&vbo->base); if (ret) @@ -1080,3 +1092,86 @@ void vmw_resource_dirty_update(struct vmw_resource *res, pgoff_t start, res->func->dirty_range_add(res, start << PAGE_SHIFT, end << PAGE_SHIFT); } + +/** + * vmw_resources_clean - Clean resources intersecting a mob range + * @vbo: The mob buffer object + * @start: The mob page offset starting the range + * @end: The mob page offset ending the range + * @num_prefault: Returns how many pages including the first have been + * cleaned and are ok to prefault + */ +int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start, + pgoff_t end, pgoff_t *num_prefault) +{ + struct rb_node *cur = vbo->res_tree.rb_node; + struct vmw_resource *found = NULL; + unsigned long res_start = start << PAGE_SHIFT; + unsigned long res_end = end << PAGE_SHIFT; + unsigned long last_cleaned = 0; + + /* + * Find the resource with lowest backup_offset that intersects the + * range. + */ + while (cur) { + struct vmw_resource *cur_res = + container_of(cur, struct vmw_resource, mob_node); + + if (cur_res->backup_offset >= res_end) { + cur = cur->rb_left; + } else if (cur_res->backup_offset + cur_res->backup_size <= + res_start) { + cur = cur->rb_right; + } else { + found = cur_res; + cur = cur->rb_left; + /* Continue to look for resources with lower offsets */ + } + } + + /* + * In order of increasing backup_offset, clean dirty resorces + * intersecting the range. + */ + while (found) { + if (found->res_dirty) { + int ret; + + if (!found->func->clean) + return -EINVAL; + + ret = found->func->clean(found); + if (ret) + return ret; + + found->res_dirty = false; + } + last_cleaned = found->backup_offset + found->backup_size; + cur = rb_next(&found->mob_node); + if (!cur) + break; + + found = container_of(cur, struct vmw_resource, mob_node); + if (found->backup_offset >= res_end) + break; + } + + /* + * Set number of pages allowed prefaulting and fence the buffer object + */ + *num_prefault = 1; + if (last_cleaned > res_start) { + struct ttm_buffer_object *bo = &vbo->base; + + *num_prefault = __KERNEL_DIV_ROUND_UP(last_cleaned - res_start, + PAGE_SIZE); + vmw_bo_fence_single(bo, NULL); + if (bo->moving) + dma_fence_put(bo->moving); + bo->moving = dma_fence_get + (dma_resv_get_excl(bo->base.resv)); + } + + return 0; +} diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h index c85144286cfe..3b7438b2d289 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h @@ -77,6 +77,7 @@ struct vmw_user_resource_conv { * @dirty_sync: Upload the dirty mob contents to the resource. * @dirty_add_range: Add a sequential dirty range to the resource * dirty tracker. + * @clean: Clean the resource. */ struct vmw_res_func { enum vmw_res_type res_type; @@ -101,6 +102,7 @@ struct vmw_res_func { int (*dirty_sync)(struct vmw_resource *res); void (*dirty_range_add)(struct vmw_resource *res, size_t start, size_t end); + int (*clean)(struct vmw_resource *res); }; /** diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c index 71349a7bae90..9aaf807ed73c 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c @@ -641,7 +641,8 @@ int vmw_validation_res_validate(struct vmw_validation_context *ctx, bool intr) struct vmw_resource *res = val->res; struct vmw_buffer_object *backup = res->backup; - ret = vmw_resource_validate(res, intr); + ret = vmw_resource_validate(res, intr, val->dirty_set && + val->dirty); if (ret) { if (ret != -ERESTARTSYS) DRM_ERROR("Failed to validate resource.\n");