From patchwork Fri Aug 18 15:08:42 2023
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH 1/4] drm/xe/vm: Use onion unwind for xe_vma_userptr_pin_pages()
Date: Fri, 18 Aug 2023 17:08:42 +0200
Message-ID: <20230818150845.96679-2-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230818150845.96679-1-thomas.hellstrom@linux.intel.com>

Use onion error unwind, since that makes the function easier to read
and extend. No functional change.
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 2e99f865d7ec..8bf7f62e6548 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -116,19 +116,17 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		kthread_unuse_mm(vma->userptr.notifier.mm);
 		mmput(vma->userptr.notifier.mm);
 	}
-mm_closed:
 	if (ret)
-		goto out;
+		goto out_release_pages;
 
 	ret = sg_alloc_table_from_pages_segment(&vma->userptr.sgt, pages,
 						pinned, 0,
 						(u64)pinned << PAGE_SHIFT,
 						xe_sg_segment_size(xe->drm.dev),
 						GFP_KERNEL);
-	if (ret) {
-		vma->userptr.sg = NULL;
-		goto out;
-	}
+	if (ret)
+		goto out_release_pages;
+
 	vma->userptr.sg = &vma->userptr.sgt;
 
 	ret = dma_map_sgtable(xe->drm.dev, vma->userptr.sg,
@@ -136,11 +134,8 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 			      DMA_BIDIRECTIONAL,
 			      DMA_ATTR_SKIP_CPU_SYNC |
 			      DMA_ATTR_NO_KERNEL_MAPPING);
-	if (ret) {
-		sg_free_table(vma->userptr.sg);
-		vma->userptr.sg = NULL;
-		goto out;
-	}
+	if (ret)
+		goto out_free_sg;
 
 	for (i = 0; i < pinned; ++i) {
 		if (!read_only) {
@@ -152,17 +147,23 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		mark_page_accessed(pages[i]);
 	}
 
-out:
 	release_pages(pages, pinned);
 	kvfree(pages);
 
-	if (!(ret < 0)) {
-		vma->userptr.notifier_seq = notifier_seq;
-		if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
-			goto retry;
-	}
+	vma->userptr.notifier_seq = notifier_seq;
+	if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
+		goto retry;
+
+	return 0;
 
-	return ret < 0 ? ret : 0;
+out_free_sg:
+	sg_free_table(vma->userptr.sg);
+	vma->userptr.sg = NULL;
+out_release_pages:
+	release_pages(pages, pinned);
+mm_closed:
+	kvfree(pages);
+	return ret;
 }
 
 static bool preempt_fences_waiting(struct xe_vm *vm)
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH 2/4] drm/xe/vm: Implement userptr page pinning
Date: Fri, 18 Aug 2023 17:08:43 +0200
Message-ID: <20230818150845.96679-3-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230818150845.96679-1-thomas.hellstrom@linux.intel.com>

Implement pinning of userptrs between VM_BIND and VM_UNBIND, which will
facilitate avoiding long hangs on non-preemptible workloads. But don't
hook it up to userspace just yet.
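Conceptually, the change below keeps the page array alive for pinned VMAs across repins instead of releasing it at the end of every pin cycle. A rough userspace sketch of that bookkeeping, using stand-in types (this is an illustration of the lifecycle only, not the driver code):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the userptr bookkeeping the patch adds: a pinned vma
 * keeps its page array (pinned_pages/num_pinned) across repins, an
 * unpinned one frees it at the end of every pin cycle. */
struct userptr {
	bool pinned;
	void **pinned_pages;
	unsigned long num_pinned;
};

static int pin_pages(struct userptr *u, unsigned long num_pages)
{
	void **pages;

	if (u->pinned_pages) {
		pages = u->pinned_pages;	/* re-pin: reuse the array */
	} else {
		pages = calloc(num_pages, sizeof(*pages));
		if (!pages)
			return -1;
	}

	/* ... fault in / pin the actual pages here ... */

	if (u->pinned) {
		u->pinned_pages = pages;	/* keep the array around */
		u->num_pinned = num_pages;
	} else {
		free(pages);			/* nothing outlives the call */
	}
	return 0;
}

static void destroy(struct userptr *u)
{
	free(u->pinned_pages);			/* free(NULL) is a no-op */
	u->pinned_pages = NULL;
	u->num_pinned = 0;
}
```

In the real patch the kept state additionally lets the MMU invalidation notifier return early for pinned VMAs, since pinned pages cannot move under the GPU.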
Signed-off-by: Thomas Hellström
---
 drivers/gpu/drm/xe/xe_vm.c       | 76 ++++++++++++++++++++++----------
 drivers/gpu/drm/xe/xe_vm.h       |  9 ++++
 drivers/gpu/drm/xe/xe_vm_types.h | 12 +++++
 3 files changed, 74 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 8bf7f62e6548..ecbcad696b60 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -74,10 +74,6 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	if (notifier_seq == vma->userptr.notifier_seq)
 		return 0;
 
-	pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
-	if (!pages)
-		return -ENOMEM;
-
 	if (vma->userptr.sg) {
 		dma_unmap_sgtable(xe->drm.dev,
 				  vma->userptr.sg,
@@ -87,6 +83,17 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 		vma->userptr.sg = NULL;
 	}
 
+	if (vma->userptr.pinned_pages) {
+		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
+					    vma->userptr.num_pinned,
+					    !read_only);
+		pages = vma->userptr.pinned_pages;
+	} else {
+		pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
+		if (!pages)
+			return -ENOMEM;
+	}
+
 	pinned = ret = 0;
 	if (in_kthread) {
 		if (!mmget_not_zero(vma->userptr.notifier.mm)) {
@@ -97,11 +104,18 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	}
 
 	while (pinned < num_pages) {
-		ret = get_user_pages_fast(xe_vma_userptr(vma) +
-					  pinned * PAGE_SIZE,
-					  num_pages - pinned,
-					  read_only ? 0 : FOLL_WRITE,
-					  &pages[pinned]);
+		if (xe_vma_is_pinned(vma))
+			ret = pin_user_pages_fast(xe_vma_userptr(vma) +
+						  pinned * PAGE_SIZE,
+						  num_pages - pinned,
+						  read_only ? 0 : FOLL_WRITE,
+						  &pages[pinned]);
+		else
+			ret = get_user_pages_fast(xe_vma_userptr(vma) +
+						  pinned * PAGE_SIZE,
+						  num_pages - pinned,
+						  read_only ? 0 : FOLL_WRITE,
+						  &pages[pinned]);
 		if (ret < 0) {
 			if (in_kthread)
 				ret = 0;
@@ -137,19 +151,24 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	if (ret)
 		goto out_free_sg;
 
-	for (i = 0; i < pinned; ++i) {
-		if (!read_only) {
-			lock_page(pages[i]);
-			set_page_dirty(pages[i]);
-			unlock_page(pages[i]);
+	if (!xe_vma_is_pinned(vma)) {
+		for (i = 0; i < pinned; ++i) {
+			if (!read_only) {
+				lock_page(pages[i]);
+				set_page_dirty(pages[i]);
+				unlock_page(pages[i]);
+			}
+
+			mark_page_accessed(pages[i]);
 		}
 
-		mark_page_accessed(pages[i]);
+		release_pages(pages, pinned);
+		kvfree(pages);
+	} else {
+		vma->userptr.pinned_pages = pages;
+		vma->userptr.num_pinned = pinned;
 	}
 
-	release_pages(pages, pinned);
-	kvfree(pages);
-
 	vma->userptr.notifier_seq = notifier_seq;
 	if (xe_vma_userptr_check_repin(vma) == -EAGAIN)
 		goto retry;
@@ -160,9 +179,14 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 	sg_free_table(vma->userptr.sg);
 	vma->userptr.sg = NULL;
 out_release_pages:
-	release_pages(pages, pinned);
+	if (!xe_vma_is_pinned(vma))
+		release_pages(pages, pinned);
+	else
+		unpin_user_pages(pages, pinned);
+	vma->userptr.num_pinned = 0;
 mm_closed:
 	kvfree(pages);
+	vma->userptr.pinned_pages = NULL;
 	return ret;
 }
 
@@ -721,7 +745,7 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
 	mmu_interval_set_seq(mni, cur_seq);
 
 	/* No need to stop gpu access if the userptr is not yet bound. */
-	if (!vma->userptr.initial_bind) {
+	if (xe_vma_is_pinned(vma) || !vma->userptr.initial_bind) {
 		up_write(&vm->userptr.notifier_lock);
 		return true;
 	}
@@ -976,10 +1000,16 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		vma->userptr.sg = NULL;
 	}
 
+	if (vma->userptr.pinned_pages) {
+		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
+					    vma->userptr.num_pinned,
+					    !read_only);
+		kvfree(vma->userptr.pinned_pages);
+	}
+
 	/*
-	 * Since userptr pages are not pinned, we can't remove
-	 * the notifer until we're sure the GPU is not accessing
-	 * them anymore
+	 * We can't remove the notifer until we're sure the GPU is
+	 * not accessing the pages anymore
 	 */
 	mmu_interval_notifier_remove(&vma->userptr.notifier);
 	xe_vm_put(vm);
diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
index 6de6e3edb24a..913544d7d995 100644
--- a/drivers/gpu/drm/xe/xe_vm.h
+++ b/drivers/gpu/drm/xe/xe_vm.h
@@ -139,6 +139,15 @@ static inline bool xe_vma_is_userptr(struct xe_vma *vma)
 	return xe_vma_has_no_bo(vma) && !xe_vma_is_null(vma);
 }
 
+/**
+ * xe_vma_is_pinned() - User has requested the backing store of this vma
+ * to be pinned.
+ */
+static inline bool xe_vma_is_pinned(struct xe_vma *vma)
+{
+	return xe_vma_is_userptr(vma) && (vma->gpuva.flags & XE_VMA_PINNED);
+}
+
 #define xe_vm_assert_held(vm) dma_resv_assert_held(&(vm)->resv)
 
 u64 xe_vm_pdp4_descriptor(struct xe_vm *vm, struct xe_tile *tile);
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 3681a5ff588b..9b90e649cd69 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -33,6 +33,8 @@ struct xe_vm;
 #define XE_VMA_PTE_4K		(DRM_GPUVA_USERBITS << 5)
 #define XE_VMA_PTE_2M		(DRM_GPUVA_USERBITS << 6)
 #define XE_VMA_PTE_1G		(DRM_GPUVA_USERBITS << 7)
+/* User requested backing store to be pinned */
+#define XE_VMA_PINNED		(DRM_GPUVA_USERBITS << 8)
 
 /** struct xe_userptr - User pointer */
 struct xe_userptr {
@@ -54,6 +56,16 @@ struct xe_userptr {
 	 * read: vm->userptr.notifier_lock in write mode or vm->resv held.
 	 */
 	bool initial_bind;
+	/**
+	 * @pinned_pages: List of pinned pages if xe_vma_pinned(),
+	 * NULL otherwise. protected by the vm lock.
+	 */
+	struct page **pinned_pages;
+	/**
+	 * @num_pinned: Number of pointers to pinned pages in @pinned_pages.
+	 * protected by the vm lock.
+	 */
+	unsigned long num_pinned;
 #if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
 	u32 divisor;
 #endif
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH 3/4] drm/xe/vm: Perform accounting of userptr pinned pages
Date: Fri, 18 Aug 2023 17:08:44 +0200
Message-ID: <20230818150845.96679-4-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230818150845.96679-1-thomas.hellstrom@linux.intel.com>

Account these pages against RLIMIT_MEMLOCK, following how RDMA does
this, with CAP_IPC_LOCK bypassing the limit.

Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c | 43 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index ecbcad696b60..d9c000689002 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -34,6 +34,33 @@
 
 #define TEST_VM_ASYNC_OPS_ERROR
 
+/*
+ * Perform userptr PIN accounting against RLIMIT_MEMLOCK for now, similarly
+ * to how RDMA does this.
+ */
+static int xe_vma_mlock_alloc(struct xe_vma *vma, unsigned long num_pages)
+{
+	unsigned long lock_limit, new_pinned;
+	struct mm_struct *mm = vma->userptr.notifier.mm;
+
+	if (!can_do_mlock())
+		return -EPERM;
+
+	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	new_pinned = atomic64_add_return(num_pages, &mm->pinned_vm);
+	if (new_pinned > lock_limit && !capable(CAP_IPC_LOCK)) {
+		atomic64_sub(num_pages, &mm->pinned_vm);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void xe_vma_mlock_free(struct xe_vma *vma, unsigned long num_pages)
+{
+	atomic64_sub(num_pages, &vma->userptr.notifier.mm->pinned_vm);
+}
+
 /**
  * xe_vma_userptr_check_repin() - Advisory check for repin needed
  * @vma: The userptr vma
@@ -89,9 +116,17 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 					    !read_only);
 		pages = vma->userptr.pinned_pages;
 	} else {
+		if (xe_vma_is_pinned(vma)) {
+			ret = xe_vma_mlock_alloc(vma, num_pages);
+			if (ret)
+				return ret;
+		}
+
 		pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
-		if (!pages)
-			return -ENOMEM;
+		if (!pages) {
+			ret = -ENOMEM;
+			goto out_account;
+		}
 	}
 
 	pinned = ret = 0;
@@ -187,6 +222,9 @@ int xe_vma_userptr_pin_pages(struct xe_vma *vma)
 mm_closed:
 	kvfree(pages);
 	vma->userptr.pinned_pages = NULL;
+out_account:
+	if (xe_vma_is_pinned(vma))
+		xe_vma_mlock_free(vma, num_pages);
 	return ret;
 }
 
@@ -1004,6 +1042,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
 		unpin_user_pages_dirty_lock(vma->userptr.pinned_pages,
 					    vma->userptr.num_pinned,
 					    !read_only);
+		xe_vma_mlock_free(vma, xe_vma_size(vma) >> PAGE_SHIFT);
 		kvfree(vma->userptr.pinned_pages);
 	}
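The accounting scheme in xe_vma_mlock_alloc() above mirrors RDMA's: optimistically add the page count to a per-mm pinned counter, then back out and fail if the new total exceeds the RLIMIT_MEMLOCK page limit and the caller lacks the bypass capability. A userspace sketch of the same check-and-backout logic (a stand-in atomic counter replaces mm->pinned_vm, and `has_ipc_lock` stands in for capable(CAP_IPC_LOCK)):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for mm->pinned_vm. */
static atomic_long pinned_vm;

/* Mirror of the xe_vma_mlock_alloc() logic: optimistically add, then
 * back out and fail if the new total exceeds the limit and the caller
 * lacks the bypass capability (CAP_IPC_LOCK in the kernel). */
static int mlock_account(long num_pages, long limit_pages, bool has_ipc_lock)
{
	long new_pinned = atomic_fetch_add(&pinned_vm, num_pages) + num_pages;

	if (new_pinned > limit_pages && !has_ipc_lock) {
		atomic_fetch_sub(&pinned_vm, num_pages);
		return -1;	/* -ENOMEM in the kernel */
	}
	return 0;
}

static void mlock_unaccount(long num_pages)
{
	atomic_fetch_sub(&pinned_vm, num_pages);
}
```

The optimistic add-then-subtract pattern keeps the common success path to a single atomic operation; the race where two concurrent pinners both briefly count their pages is benign because an over-count can only cause a spurious failure, never an over-pin.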
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Thomas Hellström, dri-devel@lists.freedesktop.org
Subject: [PATCH 4/4] drm/xe/uapi: Support pinning of userptr vmas
Date: Fri, 18 Aug 2023 17:08:45 +0200
Message-ID: <20230818150845.96679-5-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20230818150845.96679-1-thomas.hellstrom@linux.intel.com>

Support pinning of vmas using XE_VM_BIND_FLAG_PIN, initially for
userptr only. Pinned memory is accounted against RLIMIT_MEMLOCK, and
for processes with CAP_IPC_LOCK the limit is not applied. This is
pretty similar to mlock()'ing userptr memory, with the added benefit
that the driver is aware and can ignore some actions in the MMU
invalidation notifier.

This will initially be useful for compute VMs on hardware without
mid-thread-preemption capability, since with pinned pages the MMU
invalidation notifier never tries to preempt a running compute kernel.
If that were the only usage, we could restrict this to a flag that
always pins userptr VMAs on compute VMs on such hardware, but there
are indications that this may become needed in other situations as
well.

From a more general point of view, the usage pattern of a system may
be such that in most cases it only ever runs a single workload per
system, and then the sysadmin would want to configure the system to
allow extensive pinning for performance reasons.

Hence we might want to extend the pinning capability to bo-backed VMAs
as well. How that pinning will be accounted remains an open question;
building on the current drm CGROUP work would be one option.

Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/xe/xe_vm.c       | 33 +++++++++++++++++++++++++-------
 drivers/gpu/drm/xe/xe_vm_types.h |  2 ++
 include/uapi/drm/xe_drm.h        | 18 +++++++++++++++++
 3 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index d9c000689002..3832f1f21def 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -936,6 +936,7 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 				    u64 start, u64 end,
 				    bool read_only,
 				    bool is_null,
+				    bool pin,
 				    u8 tile_mask)
 {
 	struct xe_vma *vma;
@@ -967,6 +968,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
 		vma->gpuva.flags |= XE_VMA_READ_ONLY;
 	if (is_null)
 		vma->gpuva.flags |= DRM_GPUVA_SPARSE;
+	if (pin)
+		vma->gpuva.flags |= XE_VMA_PINNED;
 
 	if (tile_mask) {
 		vma->tile_mask = tile_mask;
@@ -2367,6 +2370,7 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 			op->map.read_only =
 				operation & XE_VM_BIND_FLAG_READONLY;
 			op->map.is_null = operation & XE_VM_BIND_FLAG_NULL;
+			op->map.pin = operation & XE_VM_BIND_FLAG_PIN;
 		}
 		break;
 	case XE_VM_BIND_OP_UNMAP:
@@ -2431,7 +2435,8 @@ vm_bind_ioctl_ops_create(struct xe_vm *vm, struct xe_bo *bo,
 }
 
 static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
-			      u8 tile_mask, bool read_only, bool is_null)
+			      u8 tile_mask, bool read_only, bool is_null,
+			      bool pin)
 {
 	struct xe_bo *bo = op->gem.obj ? gem_to_xe_bo(op->gem.obj) : NULL;
 	struct xe_vma *vma;
@@ -2447,7 +2452,7 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
 	}
 	vma = xe_vma_create(vm, bo, op->gem.offset,
 			    op->va.addr, op->va.addr +
-			    op->va.range - 1, read_only, is_null,
+			    op->va.range - 1, read_only, is_null, pin,
 			    tile_mask);
 	if (bo)
 		xe_bo_unlock(bo, &ww);
@@ -2562,7 +2567,7 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 			vma = new_vma(vm, &op->base.map,
 				      op->tile_mask, op->map.read_only,
-				      op->map.is_null);
+				      op->map.is_null, op->map.pin);
 			if (IS_ERR(vma)) {
 				err = PTR_ERR(vma);
 				goto free_fence;
@@ -2587,10 +2592,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 				bool is_null =
 					op->base.remap.unmap->va->flags &
 					DRM_GPUVA_SPARSE;
+				bool pin =
+					op->base.remap.unmap->va->flags &
+					XE_VMA_PINNED;
 
 				vma = new_vma(vm, op->base.remap.prev,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, pin);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -2623,10 +2631,13 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct xe_exec_queue *q,
 				bool is_null =
 					op->base.remap.unmap->va->flags &
 					DRM_GPUVA_SPARSE;
+				bool pin =
+					op->base.remap.unmap->va->flags &
+					XE_VMA_PINNED;
 
 				vma = new_vma(vm, op->base.remap.next,
 					      op->tile_mask, read_only,
-					      is_null);
+					      is_null, pin);
 				if (IS_ERR(vma)) {
 					err = PTR_ERR(vma);
 					goto free_fence;
@@ -3131,11 +3142,12 @@ static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm,
 #define SUPPORTED_FLAGS	\
 	(FORCE_ASYNC_OP_ERROR | XE_VM_BIND_FLAG_ASYNC | \
 	 XE_VM_BIND_FLAG_READONLY | XE_VM_BIND_FLAG_IMMEDIATE | \
-	 XE_VM_BIND_FLAG_NULL | 0xffff)
+	 XE_VM_BIND_FLAG_NULL | XE_VM_BIND_FLAG_PIN | 0xffff)
 #else
 #define SUPPORTED_FLAGS	\
 	(XE_VM_BIND_FLAG_ASYNC | XE_VM_BIND_FLAG_READONLY | \
-	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | 0xffff)
+	 XE_VM_BIND_FLAG_IMMEDIATE | XE_VM_BIND_FLAG_NULL | \
+	 XE_VM_BIND_FLAG_PIN | 0xffff)
 #endif
 #define XE_64K_PAGE_MASK 0xffffull
@@ -3205,6 +3217,13 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
 			goto free_bind_ops;
 		}
 
+		/* TODO: Support OP_PREFETCH, OP_MAP */
+		if (XE_IOCTL_DBG(xe, (op & XE_VM_BIND_FLAG_PIN) &&
+				 VM_BIND_OP(op) != XE_VM_BIND_OP_MAP_USERPTR)) {
+			err = -EINVAL;
+			goto free_bind_ops;
+		}
+
 		if (XE_IOCTL_DBG(xe, VM_BIND_OP(op) >
 				 XE_VM_BIND_OP_PREFETCH) ||
 		    XE_IOCTL_DBG(xe, op & ~SUPPORTED_FLAGS) ||
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 9b90e649cd69..024ccabadd12 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -360,6 +360,8 @@ struct xe_vma_op_map {
 	bool read_only;
 	/** @is_null: is NULL binding */
 	bool is_null;
+	/** @pin: pin underlying memory */
+	bool pin;
 };
 
 /** struct xe_vma_op_remap - VMA remap operation */
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 86f16d50e9cc..fc3d9cd4f8d0 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -631,6 +631,24 @@ struct drm_xe_vm_bind_op {
 	 * intended to implement VK sparse bindings.
 	 */
 #define XE_VM_BIND_FLAG_NULL		(0x1 << 19)
+	/*
+	 * When the PIN flag is set, the user requests the underlying
+	 * backing store of the vma to be pinned, that is, it will be
+	 * resident while bound and the underlying physical memory
+	 * will not change. For userptr VMAs this means that if the
+	 * user performs an operation that changes the underlying
+	 * pages of the CPU virtual space, the corresponding pinned
+	 * GPU virtual space will not pick up the new memory unless
+	 * an OP_UNMAP followed by an OP_MAP_USERPTR is performed.
+	 * Pinned userptr memory is accounted in the same way as
+	 * mlock(2), and if pinning fails the following error codes
+	 * may be returned:
+	 * -EINVAL: The memory region does not support pinning.
+	 * -EPERM: The process is not permitted to pin.
+	 * -ENOMEM: The pinning limit does not allow pinning.
+	 * For userptr memory, CAP_IPC_LOCK will bypass the limit checking.
+	 */
+#define XE_VM_BIND_FLAG_PIN		(0x1 << 20)
 
 	/** @op: Operation to perform (lower 16 bits) and flags (upper 16 bits) */
 	__u32 op;
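Per the comment on @op, the bind operation occupies the lower 16 bits and flags the upper bits, which is why XE_VM_BIND_FLAG_PIN can sit at bit 20 while SUPPORTED_FLAGS masks in 0xffff for the opcode space. A hedged userspace sketch of composing and decoding such an op word (flag values are taken from the diff; the XE_VM_BIND_OP_MAP_USERPTR opcode value and the VM_BIND_OP() definition are assumptions for illustration):

```c
#include <stdint.h>

/* Flag values visible in the diff; the opcode value is a stand-in. */
#define XE_VM_BIND_OP_MAP_USERPTR	0x2		/* assumed value */
#define XE_VM_BIND_FLAG_NULL		(0x1 << 19)
#define XE_VM_BIND_FLAG_PIN		(0x1 << 20)

/* Assumed decoder: the lower 16 bits select the operation. */
#define VM_BIND_OP(op)	((op) & 0xffff)

/* Compose an op word the way a userspace caller would: opcode in the
 * low bits, flags OR'ed into the high bits. */
static uint32_t make_op(uint32_t opcode, uint32_t flags)
{
	return opcode | flags;
}
```

With this layout, the check added in vm_bind_ioctl_check_args() is a plain decode: reject any op word carrying XE_VM_BIND_FLAG_PIN whose VM_BIND_OP() is not OP_MAP_USERPTR.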