From patchwork Fri Dec 13 21:56:03 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291581
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:03 -0800
Message-Id: <20191213215614.24558-2-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 01/12] drm/i915/svm: Add SVM documentation
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

Add Shared Virtual Memory (SVM) support information.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 Documentation/gpu/i915.rst | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index e539c42a3e78..0bc999963489 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -415,6 +415,35 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
    :doc: buffer object tiling
 
+Shared Virtual Memory (SVM)
+---------------------------
+
+Shared Virtual Memory (SVM) allows the programmer to use a single virtual
+address space which is shared between threads executing on CPUs and GPUs.
+It abstracts away from the user the location of the backing memory, and
+hence simplifies the user programming model.
+SVM supports two types of virtual memory allocation methods.
+The runtime allocator requires the driver to provide a memory allocation
+and management interface, like the buffer object (BO) interface.
+The system allocator, on the other hand, makes use of the default OS
+memory allocation and management support, like malloc().
+
+The Linux kernel has a Heterogeneous Memory Management (HMM) framework to
+support the SVM system allocator. HMM's address space mirroring support
+allows sharing of the address space by duplicating sections of CPU page
+tables in the device page tables. This enables both the CPU and the GPU
+to access a physical memory location using the same virtual address. The
+Linux kernel also provides the ability to plug device memory into the
+system (as a special ZONE_DEVICE type) and allocates a struct page for
+each device memory page. It also provides a mechanism to migrate pages
+from host to device memory and vice versa. More information on HMM can
+be found at https://www.kernel.org/doc/Documentation/vm/hmm.rst
+
+i915 supports both the SVM system and runtime allocators. As PCIe is a
+non-coherent bus, i915 plugs in device memory as DEVICE_PRIVATE and no
+memory access across the PCIe link is allowed. Any such access will
+trigger migration of the page(s) or BOs before the memory is accessed.
+
 Microcontrollers
 ================
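[To make the two allocation models above concrete, a minimal userspace
sketch follows. It is a hedged illustration only: it borrows the RFC uapi
that later patches in this series introduce (struct drm_i915_gem_vm_bind,
DRM_IOCTL_I915_GEM_VM_BIND, and the SVM_OBJ/SVM_BUFFER bind types from
patches 02 and 05); the drm_fd, vm_id and bo_handle setup is assumed and
error handling is elided.]

/* Hedged sketch, not part of the series: contrast the runtime (BO)
 * allocator with the system (malloc) allocator using the RFC uapi.
 */
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Runtime allocator: place an existing BO at a chosen GPU VA. */
static int svm_bind_bo(int drm_fd, uint32_t vm_id, uint32_t bo_handle,
                       uint64_t gpu_va)
{
        struct drm_i915_gem_vm_bind bind = {
                .start = gpu_va,
                .type = I915_GEM_VM_BIND_SVM_OBJ,
                .handle = bo_handle,
                .vm_id = vm_id,
        };

        return ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
}

/* System allocator: make a malloc'ed range GPU-visible at the same VA. */
static void *svm_alloc(int drm_fd, uint32_t vm_id, size_t size)
{
        void *ptr = malloc(size);
        struct drm_i915_gem_vm_bind bind = {
                .start = (uintptr_t)ptr,
                .length = size,
                .type = I915_GEM_VM_BIND_SVM_BUFFER,
                .vm_id = vm_id,
        };

        if (ptr && ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind)) {
                free(ptr);
                ptr = NULL;
        }
        return ptr; /* CPU and GPU now share this virtual address */
}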
From patchwork Fri Dec 13 21:56:04 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291585
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:04 -0800
Message-Id: <20191213215614.24558-3-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

Shared Virtual Memory (SVM) runtime allocator support allows binding a
shared virtual address to a buffer object (BO) in the device page table
through an ioctl call.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/gpu/drm/i915/Kconfig                 | 11 ++++
 drivers/gpu/drm/i915/Makefile                |  3 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c   | 58 ++++++++++++++----
 drivers/gpu/drm/i915/gem/i915_gem_svm.c      | 60 +++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_svm.h      | 22 +++++++
 drivers/gpu/drm/i915/i915_drv.c              | 21 +++++++
 drivers/gpu/drm/i915/i915_drv.h              | 22 +++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c          |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.h          | 13 ++++
 include/uapi/drm/i915_drm.h                  | 27 +++++++++
 10 files changed, 227 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index ba9595960bbe..c2e48710eec8 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -137,6 +137,16 @@ config DRM_I915_GVT_KVMGT
 	  Choose this option if you want to enable KVMGT support for
 	  Intel GVT-g.
 
+config DRM_I915_SVM
+	bool "Enable Shared Virtual Memory support in i915"
+	depends on STAGING
+	depends on DRM_I915
+	default n
+	help
+	  Choose this option if you want Shared Virtual Memory (SVM)
+	  support in i915. With SVM support, one can share the virtual
+	  address space between a process and the GPU.
+ menu "drm/i915 Debugging" depends on DRM_I915 depends on EXPERT diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e0fd10c0cfb8..75fe45633779 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -153,6 +153,9 @@ i915-y += \ intel_region_lmem.o \ intel_wopcm.o +# SVM code +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o + # general-purpose microcontroller (GuC) support obj-y += gt/uc/ i915-y += gt/uc/intel_uc.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 5003e616a1ad..af360238a392 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2836,10 +2836,14 @@ int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; struct drm_i915_gem_execbuffer2 *args = data; - struct drm_i915_gem_exec_object2 *exec2_list; - struct drm_syncobj **fences = NULL; const size_t count = args->buffer_count; + struct drm_syncobj **fences = NULL; + unsigned int i = 0, svm_count = 0; + struct i915_address_space *vm; + struct i915_gem_context *ctx; + struct i915_svm_obj *svm_obj; int err; if (!check_buffer_count(count)) { @@ -2851,15 +2855,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (err) return err; + ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); + if (!ctx || !rcu_access_pointer(ctx->vm)) + return -ENOENT; + + rcu_read_lock(); + vm = i915_vm_get(ctx->vm); + rcu_read_unlock(); + +alloc_again: + svm_count = vm->svm_count; /* Allocate an extra slot for use by the command parser */ - exec2_list = kvmalloc_array(count + 1, eb_element_size(), + exec2_list = kvmalloc_array(count + svm_count + 1, eb_element_size(), __GFP_NOWARN | GFP_KERNEL); if (exec2_list == NULL) { DRM_DEBUG("Failed to allocate exec list for %zd buffers\n", - count); + count + svm_count); return -ENOMEM; } - if (copy_from_user(exec2_list, + mutex_lock(&vm->mutex); + if (svm_count != vm->svm_count) { + mutex_unlock(&vm->mutex); + kvfree(exec2_list); + goto alloc_again; + } + + list_for_each_entry(svm_obj, &vm->svm_list, link) { + memset(&exec2_list[i], 0, sizeof(*exec2_list)); + exec2_list[i].handle = svm_obj->handle; + exec2_list[i].offset = svm_obj->offset; + exec2_list[i].flags = EXEC_OBJECT_PINNED | + EXEC_OBJECT_SUPPORTS_48B_ADDRESS; + i++; + } + exec2_list_user = &exec2_list[i]; + args->buffer_count += svm_count; + mutex_unlock(&vm->mutex); + i915_vm_put(vm); + i915_gem_context_put(ctx); + + if (copy_from_user(exec2_list_user, u64_to_user_ptr(args->buffers_ptr), sizeof(*exec2_list) * count)) { DRM_DEBUG("copy %zd exec entries failed\n", count); @@ -2876,6 +2911,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, } err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences); + args->buffer_count -= svm_count; /* * Now that we have begun execution of the batchbuffer, we ignore @@ -2886,7 +2922,6 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (args->flags & __EXEC_HAS_RELOC) { struct drm_i915_gem_exec_object2 __user *user_exec_list = u64_to_user_ptr(args->buffers_ptr); - unsigned int i; /* Copy the new buffer offsets back to the user's exec list. 
 
 		/*
@@ -2900,13 +2935,14 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			goto end;
 
 		for (i = 0; i < args->buffer_count; i++) {
-			if (!(exec2_list[i].offset & UPDATE))
+			u64 *offset = &exec2_list_user[i].offset;
+
+			if (!(*offset & UPDATE))
 				continue;
 
-			exec2_list[i].offset =
-				gen8_canonical_addr(exec2_list[i].offset & PIN_OFFSET_MASK);
-			unsafe_put_user(exec2_list[i].offset,
-					&user_exec_list[i].offset,
+			*offset = gen8_canonical_addr(*offset &
+						      PIN_OFFSET_MASK);
+			unsafe_put_user(*offset, &user_exec_list[i].offset,
 					end_user);
 		}
 end_user:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.c b/drivers/gpu/drm/i915/gem/i915_gem_svm.c
new file mode 100644
index 000000000000..882fe56138e2
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+#include "i915_gem_lmem.h"
+
+int i915_gem_vm_bind_svm_obj(struct i915_address_space *vm,
+			     struct drm_i915_gem_vm_bind *args,
+			     struct drm_file *file)
+{
+	struct i915_svm_obj *svm_obj, *tmp;
+	struct drm_i915_gem_object *obj;
+	int ret = 0;
+
+	obj = i915_gem_object_lookup(file, args->handle);
+	if (!obj)
+		return -ENOENT;
+
+	/* For dgfx, ensure the obj is in device local memory only */
+	if (IS_DGFX(vm->i915) && !i915_gem_object_is_lmem(obj))
+		return -EINVAL;
+
+	/* FIXME: Need to handle case with unending batch buffers */
+	if (!(args->flags & I915_GEM_VM_BIND_UNBIND)) {
+		svm_obj = kmalloc(sizeof(*svm_obj), GFP_KERNEL);
+		if (!svm_obj) {
+			ret = -ENOMEM;
+			goto put_obj;
+		}
+		svm_obj->handle = args->handle;
+		svm_obj->offset = args->start;
+	}
+
+	mutex_lock(&vm->mutex);
+	if (!(args->flags & I915_GEM_VM_BIND_UNBIND)) {
+		list_add(&svm_obj->link, &vm->svm_list);
+		vm->svm_count++;
+	} else {
+		/*
+		 * FIXME: Need to handle case where object is migrated/closed
+		 * without unbinding first.
+		 */
+		list_for_each_entry_safe(svm_obj, tmp, &vm->svm_list, link) {
+			if (svm_obj->handle != args->handle)
+				continue;
+
+			list_del_init(&svm_obj->link);
+			vm->svm_count--;
+			kfree(svm_obj);
+			break;
+		}
+	}
+	mutex_unlock(&vm->mutex);
+put_obj:
+	i915_gem_object_put(obj);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.h b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
new file mode 100644
index 000000000000..d60b35c7d21a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_SVM_H
+#define __I915_GEM_SVM_H
+
+#include "i915_drv.h"
+
+#if defined(CONFIG_DRM_I915_SVM)
+int i915_gem_vm_bind_svm_obj(struct i915_address_space *vm,
+			     struct drm_i915_gem_vm_bind *args,
+			     struct drm_file *file);
+#else
+static inline int i915_gem_vm_bind_svm_obj(struct i915_address_space *vm,
+					   struct drm_i915_gem_vm_bind *args,
+					   struct drm_file *file)
+{ return -ENOTSUPP; }
+#endif
+
+#endif /* __I915_GEM_SVM_H */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2a11f60c4fd2..d452ea8e40b3 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2680,6 +2680,26 @@ i915_gem_reject_pin_ioctl(struct drm_device *dev, void *data,
 	return -ENODEV;
 }
 
+static int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+				  struct drm_file *file)
+{
+	struct drm_i915_gem_vm_bind *args = data;
+	struct i915_address_space *vm;
+	int ret = -EINVAL;
+
+	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
+	if (unlikely(!vm))
+		return -ENOENT;
+
+	switch (args->type) {
+	case I915_GEM_VM_BIND_SVM_OBJ:
+		ret = i915_gem_vm_bind_svm_obj(vm, args, file);
+	}
+
+	i915_vm_put(vm);
+	return ret;
+}
+
 static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_FLUSH, drm_noop, DRM_AUTH),
@@ -2739,6 +2759,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ce130e1f1e47..2d0a7cd2dc44 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1909,6 +1909,28 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline struct i915_address_space *
+__i915_gem_address_space_lookup_rcu(struct drm_i915_file_private *file_priv,
+				    u32 id)
+{
+	return idr_find(&file_priv->vm_idr, id);
+}
+
+static inline struct i915_address_space *
+i915_gem_address_space_lookup(struct drm_i915_file_private *file_priv,
+			      u32 id)
+{
+	struct i915_address_space *vm;
+
+	rcu_read_lock();
+	vm = __i915_gem_address_space_lookup_rcu(file_priv, id);
+	if (vm)
+		vm = i915_vm_get(vm);
+	rcu_read_unlock();
+
+	return vm;
+}
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index be36719e7987..7d4f5fa84b02 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -586,6 +586,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	stash_init(&vm->free_pages);
 
 	INIT_LIST_HEAD(&vm->bound_list);
+	INIT_LIST_HEAD(&vm->svm_list);
 }
 
 static int __setup_page_dma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 31a4a96ddd0d..7c1b54c9677d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -285,6 +285,13 @@ struct pagestash {
 	struct pagevec pvec;
 };
 
+struct i915_svm_obj {
+	/** This obj's place in the SVM object list */
+	struct list_head link;
+	u32 handle;
+	u64 offset;
+};
+
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -329,6 +336,12 @@ struct i915_address_space {
 	 */
 	struct list_head bound_list;
 
+	/**
+	 * List of SVM bind objects.
+	 */
+	struct list_head svm_list;
+	unsigned int svm_count;
+
 	struct pagestash free_pages;
 
 	/* Global GTT */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 20314eea632a..e10d7bf2cf9f 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -360,6 +360,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY		0x3b
 #define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
+#define DRM_I915_GEM_VM_BIND		0x3c
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -424,6 +425,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
+#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -2300,6 +2302,31 @@ struct drm_i915_query_perf_config {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_gem_vm_bind
+ *
+ * Bind an object in a vm's page table.
+ */
+struct drm_i915_gem_vm_bind {
+	/** VA start to bind **/
+	__u64 start;
+
+	/** Type of memory to [un]bind **/
+	__u32 type;
+#define I915_GEM_VM_BIND_SVM_OBJ	0
+
+	/** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/
+	__u32 handle;
+
+	/** vm to [un]bind **/
+	__u32 vm_id;
+
+	/** Flags **/
+	__u32 flags;
+#define I915_GEM_VM_BIND_UNBIND	(1 << 0)
+#define I915_GEM_VM_BIND_READONLY	(1 << 1)
+};
+
 #if defined(__cplusplus)
 }
 #endif
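[A usage note on the ioctl defined above: once a BO is bound with
I915_GEM_VM_BIND_SVM_OBJ, the execbuffer changes in this patch implicitly
prepend it (pinned, 48B-addressable) to every submission on that VM, so
it no longer needs to appear in the per-execbuf object list. Below is a
hedged sketch of a bind/unbind pair, not part of the series; it assumes
the same headers as the earlier sketch and that drm_fd, vm_id, bo_handle
and gpu_va come from the usual GEM/VM setup.]

/* Hedged sketch: bind a BO for SVM use, then unbind it with the same
 * arguments plus the UNBIND flag.
 */
static int svm_obj_bind_unbind(int drm_fd, uint32_t vm_id,
                               uint32_t bo_handle, uint64_t gpu_va)
{
        struct drm_i915_gem_vm_bind bind = {
                .start = gpu_va,
                .type = I915_GEM_VM_BIND_SVM_OBJ,
                .handle = bo_handle,
                .vm_id = vm_id,
        };

        /* After this, every execbuf on vm_id sees the BO at gpu_va. */
        if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind))
                return -1;

        /* ... submit batches referencing gpu_va directly ... */

        bind.flags = I915_GEM_VM_BIND_UNBIND;
        return ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
}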
From patchwork Fri Dec 13 21:56:05 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291579
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:05 -0800
Message-Id: <20191213215614.24558-4-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 03/12] drm/i915/svm: Implicitly migrate BOs upon CPU access
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

From: Venkata Sandeep Dhanalakota

As PCIe is a non-coherent link, do not allow direct access to buffer
objects across the PCIe link in the SVM case. Upon CPU access (mmap,
pread), migrate the buffer object to host memory.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Cc: Niranjana Vishwanathapura
Signed-off-by: Venkata Sandeep Dhanalakota
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 10 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 29 +++++++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  3 +++
 drivers/gpu/drm/i915/intel_memory_region.c |  4 ---
 drivers/gpu/drm/i915/intel_memory_region.h |  4 +++
 5 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 879fff8adc48..fc1a11f0bec9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -14,6 +14,7 @@
 #include "i915_drv.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_ioctls.h"
+#include "i915_gem_lmem.h"
 #include "i915_gem_object.h"
 #include "i915_gem_mman.h"
 #include "i915_trace.h"
@@ -295,6 +296,15 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
 	if (i915_gem_object_is_readonly(obj) && write)
 		return VM_FAULT_SIGBUS;
 
+	/* Implicitly migrate BO to SMEM if it is SVM mapped */
+	if (i915_gem_object_svm_mapped(obj) && i915_gem_object_is_lmem(obj)) {
+		u32 regions[] = { REGION_MAP(INTEL_MEMORY_SYSTEM, 0) };
+
+		ret = i915_gem_object_migrate_region(obj, regions, 1);
+		if (ret)
+			goto err;
+	}
+
 	/* We don't use vmf->pgoff since that has the fake offset */
 	page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 025c26266801..003d81c171d2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -517,12 +517,17 @@ __region_id(u32 region)
 	return INTEL_REGION_UNKNOWN;
 }
 
+bool
+i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
+{
+	return false;
+}
+
 static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 					 struct drm_i915_gem_object_param *args,
 					 struct drm_file *file,
 					 struct drm_i915_gem_object *obj)
 {
-	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
 	u32 __user *uregions = u64_to_user_ptr(args->data);
 	u32 uregions_copy[INTEL_REGION_UNKNOWN];
 	int i, ret;
@@ -542,16 +547,28 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 		++uregions;
 	}
 
+	ret = i915_gem_object_migrate_region(obj, uregions_copy,
+					     args->size);
+
+	return ret;
+}
+
+int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj,
+				   u32 *regions, int size)
+{
+	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+	struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context;
+	int i, ret;
+
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	ret = i915_gem_object_prepare_move(obj);
 	if (ret) {
 		DRM_ERROR("Cannot set memory region, object in use\n");
-	        goto err;
+		goto err;
 	}
 
-	for (i = 0; i < args->size; i++) {
-		u32 region = uregions_copy[i];
-		enum intel_region_id id = __region_id(region);
+	for (i = 0; i < size; i++) {
+		enum intel_region_id id = __region_id(regions[i]);
 
 		if (id == INTEL_REGION_UNKNOWN) {
 			ret = -EINVAL;
@@ -561,7 +578,7 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv,
 		ret = i915_gem_object_migrate(obj, ce, id);
 		if (!ret) {
 			if (!i915_gem_object_has_pages(obj) &&
-			    MEMORY_TYPE_FROM_REGION(region) ==
+			    MEMORY_TYPE_FROM_REGION(regions[i]) ==
 			    INTEL_MEMORY_LOCAL) {
 				/*
 				 * TODO: this should be part of get_pages(),
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 87e6b6f18d91..6d8ca3f0ccf7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,9 @@ int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj);
 int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
 			    struct intel_context *ce,
 			    enum intel_region_id id);
+bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj);
+int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj,
+				   u32 *uregions, int size);
 
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index baaeaecc64af..049a1b482d9d 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -6,10 +6,6 @@
 #include "intel_memory_region.h"
 #include "i915_drv.h"
 
-/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */
-#define REGION_MAP(type, inst) \
-	BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst)
-
 const u32 intel_region_map[] = {
 	[INTEL_REGION_SMEM] = REGION_MAP(INTEL_MEMORY_SYSTEM, 0),
 	[INTEL_REGION_LMEM] = REGION_MAP(INTEL_MEMORY_LOCAL, 0),
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 238722009677..e3e8ab946d78 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -44,6 +44,10 @@ enum intel_region_id {
 #define MEMORY_TYPE_FROM_REGION(r) (ilog2((r) >> INTEL_MEMORY_TYPE_SHIFT))
 #define MEMORY_INSTANCE_FROM_REGION(r) (ilog2((r) & 0xffff))
 
+/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */
+#define REGION_MAP(type, inst) \
+	BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst)
+
 #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
 #define I915_ALLOC_CONTIGUOUS     BIT(1)
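[The migration added in this patch is transparent to userspace: a CPU
fault on an SVM-mapped, LMEM-resident BO moves it to system memory before
the access completes. Below is a hedged sketch of the observable
sequence, not part of the series; obtaining the fake mmap offset via the
GTT mmap ioctl is elided and assumed to have happened already.]

/* Hedged sketch: CPU access to an SVM-mapped LMEM BO. The vm_fault_gtt()
 * change above implicitly migrates the BO to SMEM on first touch, since
 * direct CPU access across the non-coherent PCIe link is not allowed.
 */
#include <stdio.h>
#include <sys/mman.h>

static void touch_svm_bo(int drm_fd, uint64_t mmap_offset, size_t size)
{
        volatile char *ptr;

        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                   drm_fd, mmap_offset);
        if (ptr == MAP_FAILED)
                return;

        /* First touch faults; the BO migrates to host memory here. */
        printf("first byte: %d\n", ptr[0]);
        munmap((void *)ptr, size);
}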
From patchwork Fri Dec 13 21:56:06 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291583
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:06 -0800
Message-Id: <20191213215614.24558-5-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 04/12] drm/i915/svm: Page table update support for SVM
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

For the Shared Virtual Memory (SVM) system (SYS) allocator, there is no
backing buffer object (BO). Add support for binding a VA-to-PA mapping
in the device page table.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 60 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 10 +++++
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7d4f5fa84b02..6657ff41dc3f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -195,6 +195,50 @@ static void ppgtt_unbind_vma(struct i915_vma *vma)
 	vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 }
 
+int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size)
+{
+	return vm->allocate_va_range(vm, start, size);
+}
+
+int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size,
+			 u64 flags, struct sg_table *st, u32 sg_page_sizes)
+{
+	struct i915_vma vma = {0};
+	u32 pte_flags = 0;
+
+	/* use a vma wrapper */
+	vma.page_sizes.sg = sg_page_sizes;
+	vma.node.start = start;
+	vma.node.size = size;
+	vma.pages = st;
+	vma.vm = vm;
+
+	/* Applicable to VLV, and gen8+ */
+	if (flags & I915_GTT_SVM_READONLY)
+		pte_flags |= PTE_READ_ONLY;
+
+	vm->insert_entries(vm, &vma, 0, pte_flags);
+	return 0;
+}
+
+int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size,
+		  u64 flags, struct sg_table *st, u32 sg_page_sizes)
+{
+	int ret;
+
+	ret = svm_bind_addr_prepare(vm, start, size);
+	if (ret)
+		return ret;
+
+	return svm_bind_addr_commit(vm, start, size, flags, st, sg_page_sizes);
+}
+
+void svm_unbind_addr(struct i915_address_space *vm,
+		     u64 start, u64 size)
+{
+	vm->clear_range(vm, start, size);
+}
+
 static int ppgtt_set_pages(struct i915_vma *vma)
 {
 	GEM_BUG_ON(vma->pages);
@@ -985,11 +1029,21 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 	DBG("%s(%p):{ lvl:%d, start:%llx, end:%llx, idx:%d, len:%d, used:%d }\n",
 	    __func__, vm, lvl + 1, start, end, idx, len, atomic_read(px_used(pd)));
 
-	GEM_BUG_ON(!len || len >= atomic_read(px_used(pd)));
+	/*
+	 * FIXME: In the SVM case, during mmu invalidation, we need to clear
+	 * the ppgtt, but we don't know whether the entry exists or not. So,
+	 * we can't assume that this is called only when the entry exists.
+	 * Revisit.
+	 * Also need to add the ability to properly handle partial
+	 * invalidations by downgrading the large mappings.
+	 */
+	GEM_BUG_ON(!len);
 
 	do {
 		struct i915_page_table *pt = pd->entry[idx];
 
+		if (!pt)
+			continue;
+
 		if (atomic_fetch_inc(&pt->used) >> gen8_pd_shift(1) &&
 		    gen8_pd_contains(start, end, lvl)) {
 			DBG("%s(%p):{ lvl:%d, idx:%d, start:%llx, end:%llx } removing pd\n",
@@ -1012,7 +1066,9 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    __func__, vm, lvl, start, end,
 			    gen8_pd_index(start, 0), count,
 			    atomic_read(&pt->used));
-			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
+			GEM_BUG_ON(!count);
+			if (count > atomic_read(&pt->used))
+				count = atomic_read(&pt->used);
 
 			vaddr = kmap_atomic_px(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 7c1b54c9677d..8a8a314e1295 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -679,4 +680,13 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
 
+/* SVM UAPI */
+#define I915_GTT_SVM_READONLY  BIT(0)
+
+int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size);
+int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size,
+			 u64 flags, struct sg_table *st, u32 sg_page_sizes);
+int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size,
+		  u64 flags, struct sg_table *st, u32 sg_page_sizes);
+void svm_unbind_addr(struct i915_address_space *vm, u64 start, u64 size);
+
 #endif
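[For context, a hedged kernel-side sketch, not from the series, of how a
caller might use the prepare/commit split introduced above: _prepare
allocates the VA range's page-table memory and may sleep, so _commit can
later run in contexts where blocking allocations are undesirable, such
as under the SVM mutex that the next patch takes around page-table
updates. The dma_addr_t value here is an assumed, already-mapped
address, and the single-page sg_table is built inline.]

/* Hedged sketch: bind one 4K page at gpu_va via the split API. */
#include <linux/scatterlist.h>

static int example_bind_one_page(struct i915_address_space *vm,
                                 u64 gpu_va, dma_addr_t addr)
{
        struct sg_table st;
        int ret;

        ret = sg_alloc_table(&st, 1, GFP_KERNEL);
        if (ret)
                return ret;

        sg_dma_address(st.sgl) = addr;
        sg_dma_len(st.sgl) = PAGE_SIZE;
        st.sgl->length = PAGE_SIZE;

        /* allocate_va_range, then insert_entries, in one call */
        ret = svm_bind_addr(vm, gpu_va, PAGE_SIZE, 0, &st, PAGE_SIZE);

        sg_free_table(&st);
        return ret;
}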
From patchwork Fri Dec 13 21:56:07 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291587
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:07 -0800
Message-Id: <20191213215614.24558-6-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 05/12] drm/i915/svm: Page table mirroring support
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

Use HMM page table mirroring support to build the device page table.
Implement the bind ioctl and bind the process address range in the
specified context's ppgtt.
Handle invalidation notifications by unbinding the address range.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/gpu/drm/i915/Kconfig        |   3 +
 drivers/gpu/drm/i915/Makefile       |   3 +-
 drivers/gpu/drm/i915/i915_drv.c     |   5 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |   5 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |   4 +
 drivers/gpu/drm/i915/i915_svm.c     | 324 ++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_svm.h     |  50 +++++
 include/uapi/drm/i915_drm.h         |   9 +-
 8 files changed, 401 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm.h

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index c2e48710eec8..689e57fe3973 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -141,6 +141,9 @@ config DRM_I915_SVM
 	bool "Enable Shared Virtual Memory support in i915"
 	depends on STAGING
 	depends on DRM_I915
+	depends on MMU
+	select HMM_MIRROR
+	select MMU_NOTIFIER
 	default n
 	help
 	  Choose this option if you want Shared Virtual Memory (SVM)
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 75fe45633779..7d4cd9eefd12 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -154,7 +154,8 @@ i915-y += \
 	intel_wopcm.o
 
 # SVM code
-i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o
+i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
+			       i915_svm.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d452ea8e40b3..866d3cbb1edf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -62,6 +62,7 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
 #include "gem/i915_gem_mman.h"
+#include "gem/i915_gem_svm.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
@@ -73,6 +74,7 @@
 #include "i915_perf.h"
 #include "i915_query.h"
 #include "i915_suspend.h"
+#include "i915_svm.h"
 #include "i915_switcheroo.h"
 #include "i915_sysfs.h"
 #include "i915_trace.h"
@@ -2694,6 +2696,9 @@ static int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
 	switch (args->type) {
 	case I915_GEM_VM_BIND_SVM_OBJ:
 		ret = i915_gem_vm_bind_svm_obj(vm, args, file);
+		break;
+	case I915_GEM_VM_BIND_SVM_BUFFER:
+		ret = i915_gem_vm_bind_svm_buffer(vm, args);
 	}
 
 	i915_vm_put(vm);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6657ff41dc3f..192674f03e4e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -42,6 +42,7 @@
 
 #include "i915_drv.h"
 #include "i915_scatterlist.h"
+#include "i915_svm.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
@@ -562,6 +563,7 @@ static void i915_address_space_fini(struct i915_address_space *vm)
 	drm_mm_takedown(&vm->mm);
 
 	mutex_destroy(&vm->mutex);
+	mutex_destroy(&vm->svm_mutex);
 }
 
 void __i915_vm_close(struct i915_address_space *vm)
@@ -591,6 +593,7 @@ static void __i915_vm_release(struct work_struct *work)
 	struct i915_address_space *vm =
 		container_of(work, struct i915_address_space, rcu.work);
 
+	i915_svm_unbind_mm(vm);
 	vm->cleanup(vm);
 	i915_address_space_fini(vm);
 
@@ -620,6 +623,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	 * attempt holding the lock is immediately reported by lockdep.
 	 */
 	mutex_init(&vm->mutex);
+	mutex_init(&vm->svm_mutex);
 	lockdep_set_subclass(&vm->mutex, subclass);
 	i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
@@ -631,6 +635,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 
 	INIT_LIST_HEAD(&vm->bound_list);
 	INIT_LIST_HEAD(&vm->svm_list);
+	RCU_INIT_POINTER(vm->svm, NULL);
 }
 
 static int __setup_page_dma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8a8a314e1295..e06e6447e0d7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -293,6 +293,8 @@ struct i915_svm_obj {
 	u64 offset;
 };
 
+struct i915_svm;
+
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -342,6 +344,8 @@ struct i915_address_space {
 	 */
 	struct list_head svm_list;
 	unsigned int svm_count;
+	struct i915_svm *svm;
+	struct mutex svm_mutex; /* protects svm enabling */
 
 	struct pagestash free_pages;
 
diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c
new file mode 100644
index 000000000000..5941be5b5803
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include
+#include
+
+#include "i915_svm.h"
+#include "intel_memory_region.h"
+#include "gem/i915_gem_context.h"
+
+struct svm_notifier {
+	struct mmu_interval_notifier notifier;
+	struct i915_svm *svm;
+};
+
+static const u64 i915_range_flags[HMM_PFN_FLAG_MAX] = {
+	[HMM_PFN_VALID]          = (1 << 0), /* HMM_PFN_VALID */
+	[HMM_PFN_WRITE]          = (1 << 1), /* HMM_PFN_WRITE */
+	[HMM_PFN_DEVICE_PRIVATE] = (1 << 2) /* HMM_PFN_DEVICE_PRIVATE */
+};
+
+static const u64 i915_range_values[HMM_PFN_VALUE_MAX] = {
+	[HMM_PFN_ERROR]   = 0xfffffffffffffffeUL, /* HMM_PFN_ERROR */
+	[HMM_PFN_NONE]    = 0, /* HMM_PFN_NONE */
+	[HMM_PFN_SPECIAL] = 0xfffffffffffffffcUL /* HMM_PFN_SPECIAL */
+};
+
+static struct i915_svm *vm_get_svm(struct i915_address_space *vm)
+{
+	struct i915_svm *svm = vm->svm;
+
+	mutex_lock(&vm->svm_mutex);
+	if (svm && !kref_get_unless_zero(&svm->ref))
+		svm = NULL;
+
+	mutex_unlock(&vm->svm_mutex);
+	return svm;
+}
+
+static void release_svm(struct kref *ref)
+{
+	struct i915_svm *svm = container_of(ref, typeof(*svm), ref);
+	struct i915_address_space *vm = svm->vm;
+
+	mmu_notifier_unregister(&svm->notifier, svm->notifier.mm);
+	mutex_destroy(&svm->mutex);
+	vm->svm = NULL;
+	kfree(svm);
+}
+
+static void vm_put_svm(struct i915_address_space *vm)
+{
+	mutex_lock(&vm->svm_mutex);
+	if (vm->svm)
+		kref_put(&vm->svm->ref, release_svm);
+	mutex_unlock(&vm->svm_mutex);
+}
+
+static u32 i915_svm_build_sg(struct i915_address_space *vm,
+			     struct hmm_range *range,
+			     struct sg_table *st)
+{
+	struct scatterlist *sg;
+	u32 sg_page_sizes = 0;
+	u64 i, npages;
+
+	sg = NULL;
+	st->nents = 0;
+	npages = (range->end - range->start) / PAGE_SIZE;
+
+	/*
+	 * No need to dma map the host pages and later unmap it, as
+	 * GPU is not allowed to access it with SVM.
+	 * XXX: Need to dma map host pages for integrated graphics while
+	 * extending SVM support there.
+	 */
+	for (i = 0; i < npages; i++) {
+		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
+
+		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
+			sg->length += PAGE_SIZE;
+			sg_dma_len(sg) += PAGE_SIZE;
+			continue;
+		}
+
+		if (sg)
+			sg_page_sizes |= sg->length;
+
+		sg = sg ? __sg_next(sg) : st->sgl;
+		sg_dma_address(sg) = addr;
+		sg_dma_len(sg) = PAGE_SIZE;
+		sg->length = PAGE_SIZE;
+		st->nents++;
+	}
+
+	sg_page_sizes |= sg->length;
+	sg_mark_end(sg);
+	return sg_page_sizes;
+}
+
+static bool i915_svm_range_invalidate(struct mmu_interval_notifier *mni,
+				      const struct mmu_notifier_range *range,
+				      unsigned long cur_seq)
+{
+	struct svm_notifier *sn =
+		container_of(mni, struct svm_notifier, notifier);
+
+	/*
+	 * serializes the update to mni->invalidate_seq done by caller and
+	 * prevents invalidation of the PTE from progressing while HW is being
+	 * programmed. This is very hacky and only works because the normal
+	 * notifier that does invalidation is always called after the range
+	 * notifier.
+	 */
+	if (mmu_notifier_range_blockable(range))
+		mutex_lock(&sn->svm->mutex);
+	else if (!mutex_trylock(&sn->svm->mutex))
+		return false;
+	mmu_interval_set_seq(mni, cur_seq);
+	mutex_unlock(&sn->svm->mutex);
+	return true;
+}
+
+static const struct mmu_interval_notifier_ops i915_svm_mni_ops = {
+	.invalidate = i915_svm_range_invalidate,
+};
+
+static int i915_range_fault(struct svm_notifier *sn,
+			    struct drm_i915_gem_vm_bind *args,
+			    struct sg_table *st, u64 *pfns)
+{
+	unsigned long timeout =
+		jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+	/* Have HMM fault pages within the fault window to the GPU. */
+	struct hmm_range range = {
+		.notifier = &sn->notifier,
+		.start = sn->notifier.interval_tree.start,
+		.end = sn->notifier.interval_tree.last + 1,
+		.pfns = pfns,
+		.pfn_shift = PAGE_SHIFT,
+		.flags = i915_range_flags,
+		.values = i915_range_values,
+		.default_flags = (range.flags[HMM_PFN_VALID] |
+				  ((args->flags & I915_GEM_VM_BIND_READONLY) ?
+				   0 : range.flags[HMM_PFN_WRITE])),
+		.pfn_flags_mask = 0,
+
+	};
+	struct i915_svm *svm = sn->svm;
+	struct mm_struct *mm = sn->notifier.mm;
+	struct i915_address_space *vm = svm->vm;
+	u32 sg_page_sizes;
+	u64 flags;
+	long ret;
+
+	while (true) {
+		if (time_after(jiffies, timeout))
+			return -EBUSY;
+
+		range.notifier_seq = mmu_interval_read_begin(range.notifier);
+		down_read(&mm->mmap_sem);
+		ret = hmm_range_fault(&range, 0);
+		up_read(&mm->mmap_sem);
+		if (ret <= 0) {
+			if (ret == 0 || ret == -EBUSY)
+				continue;
+			return ret;
+		}
+
+		sg_page_sizes = i915_svm_build_sg(vm, &range, st);
+
+		mutex_lock(&svm->mutex);
+		if (mmu_interval_read_retry(range.notifier,
+					    range.notifier_seq)) {
+			mutex_unlock(&svm->mutex);
+			continue;
+		}
+		break;
+	}
+
+	flags = (args->flags & I915_GEM_VM_BIND_READONLY) ?
+		I915_GTT_SVM_READONLY : 0;
+	ret = svm_bind_addr_commit(vm, args->start, args->length,
+				   flags, st, sg_page_sizes);
+	mutex_unlock(&svm->mutex);
+
+	return ret;
+}
+
+static void i915_gem_vm_unbind_svm_buffer(struct i915_address_space *vm,
+					  u64 start, u64 length)
+{
+	struct i915_svm *svm = vm->svm;
+
+	mutex_lock(&svm->mutex);
+	/* FIXME: Need to flush the TLB */
+	svm_unbind_addr(vm, start, length);
+	mutex_unlock(&svm->mutex);
+}
+
+int i915_gem_vm_bind_svm_buffer(struct i915_address_space *vm,
+				struct drm_i915_gem_vm_bind *args)
+{
+	struct svm_notifier sn;
+	struct i915_svm *svm;
+	struct mm_struct *mm;
+	struct sg_table st;
+	u64 npages, *pfns;
+	int ret = 0;
+
+	svm = vm_get_svm(vm);
+	if (!svm)
+		return -EINVAL;
+
+	mm = svm->notifier.mm;
+	if (mm != current->mm) {
+		ret = -EPERM;
+		goto bind_done;
+	}
+
+	args->length += (args->start & ~PAGE_MASK);
+	args->start &= PAGE_MASK;
+	DRM_DEBUG_DRIVER("%sing start 0x%llx length 0x%llx vm_id 0x%x\n",
+			 (args->flags & I915_GEM_VM_BIND_UNBIND) ?
+			 "Unbind" : "Bind", args->start, args->length,
+			 args->vm_id);
+	if (args->flags & I915_GEM_VM_BIND_UNBIND) {
+		i915_gem_vm_unbind_svm_buffer(vm, args->start, args->length);
+		goto bind_done;
+	}
+
+	npages = args->length / PAGE_SIZE;
+	if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL))) {
+		ret = -ENOMEM;
+		goto bind_done;
+	}
+
+	pfns = kvmalloc_array(npages, sizeof(uint64_t), GFP_KERNEL);
+	if (unlikely(!pfns)) {
+		ret = -ENOMEM;
+		goto range_done;
+	}
+
+	ret = svm_bind_addr_prepare(vm, args->start, args->length);
+	if (unlikely(ret))
+		goto prepare_done;
+
+	sn.svm = svm;
+	ret = mmu_interval_notifier_insert(&sn.notifier, mm,
+					   args->start, args->length,
+					   &i915_svm_mni_ops);
+	if (!ret) {
+		ret = i915_range_fault(&sn, args, &st, pfns);
+		mmu_interval_notifier_remove(&sn.notifier);
+	}
+
+	if (unlikely(ret))
+		i915_gem_vm_unbind_svm_buffer(vm, args->start, args->length);
+prepare_done:
+	kvfree(pfns);
+range_done:
+	sg_free_table(&st);
+bind_done:
+	vm_put_svm(vm);
+	return ret;
+}
+
+static int
+i915_svm_invalidate_range_start(struct mmu_notifier *mn,
+				const struct mmu_notifier_range *update)
+{
+	struct i915_svm *svm = container_of(mn, struct i915_svm, notifier);
+	unsigned long length = update->end - update->start;
+
+	DRM_DEBUG_DRIVER("start 0x%lx length 0x%lx\n", update->start, length);
+	if (!mmu_notifier_range_blockable(update))
+		return -EAGAIN;
+
+	i915_gem_vm_unbind_svm_buffer(svm->vm, update->start, length);
+	return 0;
+}
+
+static const struct mmu_notifier_ops i915_mn_ops = {
+	.invalidate_range_start = i915_svm_invalidate_range_start,
+};
+
+void i915_svm_unbind_mm(struct i915_address_space *vm)
+{
+	vm_put_svm(vm);
+}
+
+int i915_svm_bind_mm(struct i915_address_space *vm)
+{
+	struct mm_struct *mm = current->mm;
+	struct i915_svm *svm;
+	int ret = 0;
+
+	down_write(&mm->mmap_sem);
+	mutex_lock(&vm->svm_mutex);
+	if (vm->svm)
+		goto bind_out;
+
+	svm = kzalloc(sizeof(*svm), GFP_KERNEL);
+	if (!svm) {
+		ret = -ENOMEM;
+		goto bind_out;
+	}
+	mutex_init(&svm->mutex);
+	kref_init(&svm->ref);
+	svm->vm = vm;
+
+	svm->notifier.ops = &i915_mn_ops;
+	ret = __mmu_notifier_register(&svm->notifier, mm);
+	if (ret)
+		goto bind_out;
+
+	vm->svm = svm;
+bind_out:
+	mutex_unlock(&vm->svm_mutex);
+	up_write(&mm->mmap_sem);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h
new file mode 100644
index 000000000000..a91e6a637f10
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_svm.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_SVM_H
+#define __I915_SVM_H
+
+#include
+
+#include "i915_drv.h"
+
+#if defined(CONFIG_DRM_I915_SVM)
+struct i915_svm {
+	/* i915 address space */
+	struct i915_address_space *vm;
+
+	struct mmu_notifier notifier;
+	struct mutex mutex; /* protects svm operations */
+	/*
+	 * XXX: Probably just make use of mmu_notifier's reference
+	 * counting (get/put) instead of our own.
+	 */
+	struct kref ref;
+};
+
+int i915_gem_vm_bind_svm_buffer(struct i915_address_space *vm,
+				struct drm_i915_gem_vm_bind *args);
+void i915_svm_unbind_mm(struct i915_address_space *vm);
+int i915_svm_bind_mm(struct i915_address_space *vm);
+static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
+{
+	return vm->svm;
+}
+
+#else
+
+struct i915_svm { };
+static inline int i915_gem_vm_bind_svm_buffer(struct i915_address_space *vm,
+					      struct drm_i915_gem_vm_bind *args)
+{ return -ENOTSUPP; }
+static inline void i915_svm_unbind_mm(struct i915_address_space *vm) { }
+static inline int i915_svm_bind_mm(struct i915_address_space *vm)
+{ return -ENOTSUPP; }
+static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm)
+{ return false; }
+
+#endif
+
+#endif /* __I915_SVM_H */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e10d7bf2cf9f..3164045446d8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2305,15 +2305,22 @@ struct drm_i915_query_perf_config {
 
 /**
  * struct drm_i915_gem_vm_bind
  *
- * Bind an object in a vm's page table.
+ * Bind an object/buffer in a vm's page table.
  */
 struct drm_i915_gem_vm_bind {
 	/** VA start to bind **/
 	__u64 start;
 
+	/**
+	 * VA length to [un]bind
+	 * length only required while binding buffers.
+	 */
+	__u64 length;
+
 	/** Type of memory to [un]bind **/
 	__u32 type;
 #define I915_GEM_VM_BIND_SVM_OBJ	0
+#define I915_GEM_VM_BIND_SVM_BUFFER	1
 
 	/** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/
 	__u32 handle;
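[A usage note for the buffer bind type added above: start/length need not
be page aligned, since the ioctl expands them to page boundaries, and
once bound, the MMU notifier keeps the GPU mapping in sync with the CPU
mm, unbinding on invalidation. Below is a hedged userspace sketch, not
part of the series, of a read-only bind; it assumes the same headers and
drm_fd/vm_id setup as the earlier sketches.]

/* Hedged sketch: give the GPU read-only access to an existing heap
 * allocation at its CPU virtual address.
 */
static int svm_bind_readonly(int drm_fd, uint32_t vm_id,
                             const void *buf, size_t len)
{
        struct drm_i915_gem_vm_bind bind = {
                .start = (uintptr_t)buf, /* page-aligned by the kernel */
                .length = len,
                .type = I915_GEM_VM_BIND_SVM_BUFFER,
                .vm_id = vm_id,
                .flags = I915_GEM_VM_BIND_READONLY,
        };

        return ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
}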
From patchwork Fri Dec 13 21:56:08 2019
X-Patchwork-Submitter: Niranjana Vishwanathapura
X-Patchwork-Id: 11291593
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:08 -0800
Message-Id: <20191213215614.24558-7-niranjana.vishwanathapura@intel.com>
In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 06/12] drm/i915/svm: Device memory support
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

Plug in device memory through HMM as DEVICE_PRIVATE. Add support
functions to allocate and free pages from device memory. Implement an
ioctl to prefetch pages from host to device memory. For now, only
migrating pages from host memory to device memory is supported.

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/gpu/drm/i915/Kconfig               |   9 +
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  13 -
 drivers/gpu/drm/i915/i915_buddy.h          |  12 +
 drivers/gpu/drm/i915/i915_drv.c            |   1 +
 drivers/gpu/drm/i915/i915_svm.c            |   6 +
 drivers/gpu/drm/i915/i915_svm.h            |  15 +
 drivers/gpu/drm/i915/i915_svm_devmem.c     | 400 +++++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.h |  14 +
 drivers/gpu/drm/i915/intel_region_lmem.c   |  10 +
 include/uapi/drm/i915_drm.h                |  22 ++
 11 files changed, 491 insertions(+), 14 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 689e57fe3973..66337f2ca2bf 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -141,9 +141,18 @@ config DRM_I915_SVM
 	bool "Enable Shared Virtual Memory support in i915"
 	depends on STAGING
 	depends on DRM_I915
+	depends on ARCH_ENABLE_MEMORY_HOTPLUG
+	depends on ARCH_ENABLE_MEMORY_HOTREMOVE
+	depends on MEMORY_HOTPLUG
+	depends on MEMORY_HOTREMOVE
+	depends on ARCH_HAS_PTE_DEVMAP
+	depends on SPARSEMEM_VMEMMAP
+	depends on ZONE_DEVICE
+	depends on DEVICE_PRIVATE
 	depends on MMU
 	select HMM_MIRROR
 	select MMU_NOTIFIER
+	select MIGRATE_VMA_HELPER
 	default n
 	help
 	  Choose this option if you want Shared Virtual Memory (SVM)
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7d4cd9eefd12..b574ec31ea2e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -155,7 +155,8 @@ i915-y += \
 
 # SVM code
 i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \
-			       i915_svm.o
+			       i915_svm.o \
+			       i915_svm_devmem.o
 
 # general-purpose microcontroller (GuC) support
 obj-y += gt/uc/
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 003d81c171d2..f868a301fc04 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -504,19 +504,6 @@ int __init i915_global_objects_init(void) return 0; } -static enum intel_region_id -__region_id(u32 region) -{ - enum intel_region_id id; - - for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) { - if (intel_region_map[id] == region) - return id; - } - - return INTEL_REGION_UNKNOWN; -} - bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h index ed41f3507cdc..afc493e6c130 100644 --- a/drivers/gpu/drm/i915/i915_buddy.h +++ b/drivers/gpu/drm/i915/i915_buddy.h @@ -9,6 +9,9 @@ #include #include +/* 512 bits (one per page) supports 2MB blocks */ +#define I915_BUDDY_MAX_PAGES 512 + struct i915_buddy_block { #define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12) #define I915_BUDDY_HEADER_STATE GENMASK_ULL(11, 10) @@ -32,6 +35,15 @@ struct i915_buddy_block { */ struct list_head link; struct list_head tmp_link; + + unsigned long pfn_first; + /* + * FIXME: There are other alternatives to bitmap. Like splitting the + * block into contiguous 4K sized blocks. But it is part of bigger + * issues involving partially invalidating large mapping, freeing the + * blocks etc., revisit. + */ + unsigned long bitmap[BITS_TO_LONGS(I915_BUDDY_MAX_PAGES)]; }; #define I915_BUDDY_MAX_ORDER I915_BUDDY_HEADER_ORDER diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 866d3cbb1edf..f1b92fd3d234 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2765,6 +2765,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_GEM_VM_PREFETCH, i915_gem_vm_prefetch_ioctl, DRM_RENDER_ALLOW) }; static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c index 5941be5b5803..31a80ae0dd45 100644 --- a/drivers/gpu/drm/i915/i915_svm.c +++ b/drivers/gpu/drm/i915/i915_svm.c @@ -152,6 +152,7 @@ static int i915_range_fault(struct svm_notifier *sn, struct mm_struct *mm = sn->notifier.mm; struct i915_address_space *vm = svm->vm; u32 sg_page_sizes; + int regions; u64 flags; long ret; @@ -169,6 +170,11 @@ static int i915_range_fault(struct svm_notifier *sn, return ret; } + /* For dgfx, ensure the range is in device local memory only */ + regions = i915_dmem_convert_pfn(vm->i915, &range); + if (!regions || (IS_DGFX(vm->i915) && (regions & REGION_SMEM))) + return -EINVAL; + sg_page_sizes = i915_svm_build_sg(vm, &range, st); mutex_lock(&svm->mutex); diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h index a91e6a637f10..ae39c2ef2f18 100644 --- a/drivers/gpu/drm/i915/i915_svm.h +++ b/drivers/gpu/drm/i915/i915_svm.h @@ -33,6 +33,14 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) return vm->svm; } +int i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, + struct hmm_range *range); +int i915_gem_vm_prefetch_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, + u64 size); +void i915_svm_devmem_remove(struct i915_devmem *devmem); + #else struct i915_svm { }; @@ -45,6 +53,13 @@ static inline int i915_svm_bind_mm(struct
i915_address_space *vm) static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) { return false; } +static inline int i915_gem_vm_prefetch_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ return -ENOTSUPP; } +static inline +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size) +{ return NULL; } +static inline void i915_svm_devmem_remove(struct i915_devmem *devmem) { } #endif #endif /* __I915_SVM_H */ diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c new file mode 100644 index 000000000000..0a1f1394f196 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -0,0 +1,400 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include +#include + +#include "i915_svm.h" +#include "intel_memory_region.h" + +struct i915_devmem_migrate { + struct drm_i915_private *i915; + struct migrate_vma *args; + + enum intel_region_id src_id; + enum intel_region_id dst_id; + u64 npages; +}; + +struct i915_devmem { + struct drm_i915_private *i915; + struct dev_pagemap pagemap; + unsigned long pfn_first; + unsigned long pfn_last; +}; + +static inline bool +i915_dmem_page(struct drm_i915_private *dev_priv, struct page *page) +{ + if (!is_device_private_page(page)) + return false; + + return true; +} + +int i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, + struct hmm_range *range) +{ + unsigned long i, npages; + int regions = 0; + + npages = (range->end - range->start) >> PAGE_SHIFT; + for (i = 0; i < npages; ++i) { + struct i915_buddy_block *block; + struct intel_memory_region *mem; + struct page *page; + u64 addr; + + page = hmm_device_entry_to_page(range, range->pfns[i]); + if (!page) + continue; + + if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE])) { + regions |= REGION_SMEM; + continue; + } + + if (!i915_dmem_page(dev_priv, page)) { + WARN(1, "Some unknown device memory !\n"); + range->pfns[i] = 0; + continue; + } + + regions |= REGION_LMEM; + block = page->zone_device_data; + mem = block->private; + addr = mem->region.start + + i915_buddy_block_offset(block); + addr += (page_to_pfn(page) - block->pfn_first) << PAGE_SHIFT; + + range->pfns[i] &= ~range->flags[HMM_PFN_DEVICE_PRIVATE]; + range->pfns[i] &= ((1UL << range->pfn_shift) - 1); + range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift; + } + + return regions; +} + +static int +i915_devmem_page_alloc_locked(struct intel_memory_region *mem, + unsigned long npages, + struct list_head *blocks) +{ + unsigned long size = ALIGN((npages * PAGE_SIZE), mem->mm.chunk_size); + struct i915_buddy_block *block; + int ret; + + INIT_LIST_HEAD(blocks); + ret = __intel_memory_region_get_pages_buddy(mem, size, 0, blocks); + if (unlikely(ret)) + goto alloc_failed; + + list_for_each_entry(block, blocks, link) { + block->pfn_first = mem->devmem->pfn_first; + block->pfn_first += i915_buddy_block_offset(block) / + PAGE_SIZE; + bitmap_zero(block->bitmap, I915_BUDDY_MAX_PAGES); + DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n", + "Allocated block", block->pfn_first, + i915_buddy_block_offset(block), + i915_buddy_block_size(&mem->mm, block)); + } + +alloc_failed: + return ret; +} + +static struct page * +i915_devmem_page_get_locked(struct intel_memory_region *mem, + struct list_head *blocks) +{ + struct i915_buddy_block *block, *on; + + list_for_each_entry_safe(block, on, blocks, link) { + unsigned long weight, max; + unsigned long i, pfn; + struct page *page; + + max = 
i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE; + i = find_first_zero_bit(block->bitmap, max); + if (unlikely(i == max)) { + WARN(1, "Getting a page should have never failed\n"); + break; + } + + set_bit(i, block->bitmap); + pfn = block->pfn_first + i; + page = pfn_to_page(pfn); + get_page(page); + lock_page(page); + page->zone_device_data = block; + weight = bitmap_weight(block->bitmap, max); + if (weight == max) + list_del_init(&block->link); + DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n", + "Allocated page", pfn, weight); + return page; + } + return NULL; +} + +static void +i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, + struct page *page) +{ + unlock_page(page); + put_page(page); +} + +static int +i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) +{ + struct drm_i915_private *i915 = migrate->i915; + struct migrate_vma *args = migrate->args; + struct intel_memory_region *mem; + struct list_head blocks = {0}; + unsigned long i, npages, cnt; + struct page *page; + int ret; + + npages = (args->end - args->start) >> PAGE_SHIFT; + DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages); + + /* Check source pages */ + for (i = 0, cnt = 0; i < npages; i++) { + args->dst[i] = 0; + page = migrate_pfn_to_page(args->src[i]); + if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) + continue; + + args->dst[i] = MIGRATE_PFN_VALID; + cnt++; + } + + if (!cnt) { + ret = -ENOMEM; + goto migrate_out; + } + + mem = i915->mm.regions[migrate->dst_id]; + ret = i915_devmem_page_alloc_locked(mem, cnt, &blocks); + if (unlikely(ret)) + goto migrate_out; + + /* Allocate device memory */ + for (i = 0, cnt = 0; i < npages; i++) { + if (!args->dst[i]) + continue; + + page = i915_devmem_page_get_locked(mem, &blocks); + if (unlikely(!page)) { + WARN(1, "Failed to get dst page\n"); + args->dst[i] = 0; + continue; + } + + cnt++; + args->dst[i] = migrate_pfn(page_to_pfn(page)) | + MIGRATE_PFN_LOCKED; + } + + if (!cnt) { + ret = -ENOMEM; + goto migrate_out; + } + + /* Copy the pages */ + migrate->npages = npages; +migrate_out: + if (unlikely(ret)) { + for (i = 0; i < npages; i++) { + if (args->dst[i] & MIGRATE_PFN_LOCKED) { + page = migrate_pfn_to_page(args->dst[i]); + i915_devmem_page_free_locked(i915, page); + } + args->dst[i] = 0; + } + } + + return ret; +} + +void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate) +{ + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); +} + +static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate) +{ + int ret; + + ret = i915_devmem_migrate_alloc_and_copy(migrate); + if (!ret) { + migrate_vma_pages(migrate->args); + i915_devmem_migrate_finalize_and_map(migrate); + } + migrate_vma_finalize(migrate->args); +} + +int i915_devmem_migrate_vma(struct intel_memory_region *mem, + struct vm_area_struct *vma, + unsigned long start, + unsigned long end) +{ + unsigned long npages = (end - start) >> PAGE_SHIFT; + unsigned long max = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages); + struct i915_devmem_migrate migrate = {0}; + struct migrate_vma args = { + .vma = vma, + .start = start, + }; + unsigned long c, i; + int ret = 0; + + /* XXX: Opportunistically migrate additional pages? 
*/ + DRM_DEBUG_DRIVER("start 0x%lx end 0x%lx\n", start, end); + args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL); + if (unlikely(!args.src)) + return -ENOMEM; + + args.dst = kcalloc(max, sizeof(*args.dst), GFP_KERNEL); + if (unlikely(!args.dst)) { + kfree(args.src); + return -ENOMEM; + } + + /* XXX: Support migrating from LMEM to SMEM */ + migrate.args = &args; + migrate.i915 = mem->i915; + migrate.src_id = INTEL_REGION_SMEM; + migrate.dst_id = MEMORY_TYPE_FROM_REGION(mem->id); + for (i = 0; i < npages; i += c) { + c = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages - i); + args.end = args.start + (c << PAGE_SHIFT); + ret = migrate_vma_setup(&args); + if (unlikely(ret)) + goto migrate_done; + if (args.cpages) + i915_devmem_migrate_chunk(&migrate); + args.start = args.end; + } +migrate_done: + kfree(args.dst); + kfree(args.src); + return ret; +} + +static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + +static void i915_devmem_page_free(struct page *page) +{ + struct i915_buddy_block *block = page->zone_device_data; + struct intel_memory_region *mem = block->private; + unsigned long i, max, weight; + + max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE; + i = page_to_pfn(page) - block->pfn_first; + clear_bit(i, block->bitmap); + weight = bitmap_weight(block->bitmap, max); + DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n", + "Freeing page", page_to_pfn(page), weight); + if (!weight) { + DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n", + "Freeing block", block->pfn_first, + i915_buddy_block_offset(block), + i915_buddy_block_size(&mem->mm, block)); + __intel_memory_region_put_block_buddy(block); + } +} + +static const struct dev_pagemap_ops i915_devmem_pagemap_ops = { + .page_free = i915_devmem_page_free, + .migrate_to_ram = i915_devmem_migrate_to_ram, +}; + +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size) +{ + struct device *dev = &i915->drm.pdev->dev; + struct i915_devmem *devmem; + struct resource *res; + + devmem = kzalloc(sizeof(*devmem), GFP_KERNEL); + if (!devmem) + return NULL; + + devmem->i915 = i915; + res = devm_request_free_mem_region(dev, &iomem_resource, size); + if (IS_ERR(res)) + goto out_free; + + devmem->pagemap.type = MEMORY_DEVICE_PRIVATE; + devmem->pagemap.res = *res; + devmem->pagemap.ops = &i915_devmem_pagemap_ops; + if (IS_ERR(devm_memremap_pages(dev, &devmem->pagemap))) + goto out_free; + + devmem->pfn_first = res->start >> PAGE_SHIFT; + devmem->pfn_last = res->end >> PAGE_SHIFT; + return devmem; +out_free: + kfree(devmem); + return NULL; +} + +void i915_svm_devmem_remove(struct i915_devmem *devmem) +{ + /* XXX: Is it the right way to release?
*/ + release_resource(&devmem->pagemap.res); + kfree(devmem); +} + +int i915_gem_vm_prefetch_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_i915_private *i915 = to_i915(dev); + struct drm_i915_gem_vm_prefetch *args = data; + unsigned long addr, end, size = args->length; + struct intel_memory_region *mem; + enum intel_region_id id; + struct mm_struct *mm; + + if (args->type != I915_GEM_VM_PREFETCH_SVM_BUFFER) + return -EINVAL; + + DRM_DEBUG_DRIVER("start 0x%llx length 0x%llx region 0x%x\n", + args->start, args->length, args->region); + id = __region_id(args->region); + if ((MEMORY_TYPE_FROM_REGION(args->region) != INTEL_MEMORY_LOCAL) || + id == INTEL_REGION_UNKNOWN) + return -EINVAL; + + mem = i915->mm.regions[id]; + + mm = get_task_mm(current); + down_read(&mm->mmap_sem); + + for (addr = args->start, end = args->start + size; addr < end;) { + struct vm_area_struct *vma; + unsigned long next; + + vma = find_vma_intersection(mm, addr, end); + if (!vma) + break; + + addr &= PAGE_MASK; + next = min(vma->vm_end, end); + next = round_up(next, PAGE_SIZE); + /* This is a best effort so we ignore errors */ + i915_devmem_migrate_vma(mem, vma, addr, next); + addr = next; + } + + up_read(&mm->mmap_sem); + mmput(mm); + return 0; +} diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h index e3e8ab946d78..4c9dab6bca83 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.h +++ b/drivers/gpu/drm/i915/intel_memory_region.h @@ -56,6 +56,19 @@ enum intel_region_id { */ extern const u32 intel_region_map[]; +static inline enum intel_region_id +__region_id(u32 region) +{ + enum intel_region_id id; + + for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) { + if (intel_region_map[id] == region) + return id; + } + + return INTEL_REGION_UNKNOWN; +} + struct intel_memory_region_ops { unsigned int flags; @@ -71,6 +84,7 @@ struct intel_memory_region_ops { struct intel_memory_region { struct drm_i915_private *i915; + struct i915_devmem *devmem; const struct intel_memory_region_ops *ops; struct io_mapping iomap; diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c index eddb392917aa..2ba4a4720eb6 100644 --- a/drivers/gpu/drm/i915/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/intel_region_lmem.c @@ -4,6 +4,7 @@ */ #include "i915_drv.h" +#include "i915_svm.h" #include "intel_memory_region.h" #include "gem/i915_gem_lmem.h" #include "gem/i915_gem_region.h" @@ -66,6 +67,7 @@ static void release_fake_lmem_bar(struct intel_memory_region *mem) static void region_lmem_release(struct intel_memory_region *mem) { + i915_svm_devmem_remove(mem->devmem); release_fake_lmem_bar(mem); io_mapping_fini(&mem->iomap); intel_memory_region_release_buddy(mem); @@ -122,6 +124,14 @@ intel_setup_fake_lmem(struct drm_i915_private *i915) PAGE_SIZE, io_start, &intel_region_lmem_ops); + if (!IS_ERR(mem)) { + mem->devmem = i915_svm_devmem_add(i915, mappable_end); + if (IS_ERR(mem->devmem)) { + intel_memory_region_put(mem); + mem = ERR_CAST(mem->devmem); + } + } + if (!IS_ERR(mem)) { DRM_INFO("Intel graphics fake LMEM: %pR\n", &mem->region); DRM_INFO("Intel graphics fake LMEM IO start: %llx\n", diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 3164045446d8..f49e29716460 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -361,6 +361,7 @@ typedef struct _drm_i915_sarea { #define DRM_I915_GEM_VM_DESTROY 0x3b #define DRM_I915_GEM_OBJECT_SETPARAM DRM_I915_GEM_CONTEXT_SETPARAM 
#define DRM_I915_GEM_VM_BIND 0x3c +#define DRM_I915_GEM_VM_PREFETCH 0x3d /* Must be kept compact -- no holes */ #define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t) @@ -426,6 +427,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GEM_VM_DESTROY DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control) #define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param) #define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind) +#define DRM_IOCTL_I915_GEM_VM_PREFETCH DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_PREFETCH, struct drm_i915_gem_vm_prefetch) /* Allow drivers to submit batchbuffers directly to hardware, relying * on the security mechanisms provided by hardware. @@ -2334,6 +2336,26 @@ struct drm_i915_gem_vm_bind { #define I915_GEM_VM_BIND_READONLY (1 << 1) }; +/** + * struct drm_i915_gem_vm_prefetch + * + * Prefetch an address range to a memory region. + */ +struct drm_i915_gem_vm_prefetch { + /** Type of memory to prefetch **/ + __u32 type; +#define I915_GEM_VM_PREFETCH_SVM_BUFFER 0 + + /** Memory region to prefetch to **/ + __u32 region; + + /** VA start to prefetch **/ + __u64 start; + + /** VA length to prefetch **/ + __u64 length; +}; + #if defined(__cplusplus) } #endif From patchwork Fri Dec 13 21:56:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niranjana Vishwanathapura X-Patchwork-Id: 11291607 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A3EE139A for ; Fri, 13 Dec 2019 22:08:02 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0EAD02073D for ; Fri, 13 Dec 2019 22:08:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0EAD02073D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8765A6EDF4; Fri, 13 Dec 2019 22:07:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 51CB56EDE9; Fri, 13 Dec 2019 22:07:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2019 14:07:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,311,1571727600"; d="scan'208";a="216558909" Received: from nvishwa1-desk.sc.intel.com ([10.3.160.185]) by orsmga006.jf.intel.com with ESMTP; 13 Dec 2019 14:07:31 -0800 From: Niranjana Vishwanathapura To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Dec 2019 13:56:09 -0800 Message-Id: <20191213215614.24558-8-niranjana.vishwanathapura@intel.com> X-Mailer: git-send-email 2.21.0.rc0.32.g243a4c7e27 In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> References: 
<20191213215614.24558-1-niranjana.vishwanathapura@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC v2 07/12] drm/i915/svm: Implicitly migrate pages upon CPU fault X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com, dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com, dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com, dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" As PCIe is a non-coherent link, do not allow direct memory access across the PCIe link. Handle CPU faults by migrating pages back to host memory. Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_svm_devmem.c | 70 +++++++++++++++++++++++++- 1 file changed, 69 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 0a1f1394f196..98ab27879041 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -286,9 +286,77 @@ int i915_devmem_migrate_vma(struct intel_memory_region *mem, return ret; } +static vm_fault_t +i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) +{ + struct device *dev = migrate->i915->drm.dev; + struct migrate_vma *args = migrate->args; + struct page *dpage, *spage; + + DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); + /* Allocate host page */ + spage = migrate_pfn_to_page(args->src[0]); + if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE))) + return 0; + + dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, args->start); + if (unlikely(!dpage)) + return VM_FAULT_SIGBUS; + lock_page(dpage); + + args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + + /* Copy the pages */ + migrate->npages = 1; + + return 0; +} + +void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate) +{ + DRM_DEBUG_DRIVER("\n"); +} + +static inline struct i915_devmem *page_to_devmem(struct page *page) +{ + return container_of(page->pgmap, struct i915_devmem, pagemap); +} + static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) { - return VM_FAULT_SIGBUS; + struct i915_devmem *devmem = page_to_devmem(vmf->page); + struct drm_i915_private *i915 = devmem->i915; + struct i915_devmem_migrate migrate = {0}; + unsigned long src = 0, dst = 0; + vm_fault_t ret; + struct migrate_vma args = { + .vma = vmf->vma, + .start = vmf->address, + .end = vmf->address + PAGE_SIZE, + .src = &src, + .dst = &dst, + }; + + /* XXX: Opportunistically migrate more pages?
*/ + DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start); + migrate.i915 = i915; + migrate.args = &args; + migrate.src_id = INTEL_REGION_LMEM; + migrate.dst_id = INTEL_REGION_SMEM; + if (migrate_vma_setup(&args) < 0) + return VM_FAULT_SIGBUS; + if (!args.cpages) + return 0; + + ret = i915_devmem_fault_alloc_and_copy(&migrate); + if (ret || dst == 0) + goto done; + + migrate_vma_pages(&args); + i915_devmem_fault_finalize_and_map(&migrate); +done: + migrate_vma_finalize(&args); + return ret; } static void i915_devmem_page_free(struct page *page) From patchwork Fri Dec 13 21:56:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niranjana Vishwanathapura X-Patchwork-Id: 11291609 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC50414B7 for ; Fri, 13 Dec 2019 22:08:02 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B065D2073D for ; Fri, 13 Dec 2019 22:08:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B065D2073D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 215846EDEE; Fri, 13 Dec 2019 22:07:40 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 049BD6EDE6; Fri, 13 Dec 2019 22:07:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2019 14:07:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,311,1571727600"; d="scan'208";a="216558915" Received: from nvishwa1-desk.sc.intel.com ([10.3.160.185]) by orsmga006.jf.intel.com with ESMTP; 13 Dec 2019 14:07:32 -0800 From: Niranjana Vishwanathapura To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Dec 2019 13:56:10 -0800 Message-Id: <20191213215614.24558-9-niranjana.vishwanathapura@intel.com> X-Mailer: git-send-email 2.21.0.rc0.32.g243a4c7e27 In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC v2 08/12] drm/i915/svm: Page copy support during migration X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com, dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com, dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com, dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Copy the pages during SVM migration using memcpy().
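The CPU copy added here reduces, per page, to mapping the source and destination pages and doing a memcpy(). A minimal sketch of the idea for the system-memory-to-system-memory case (illustrative only; the helper name is hypothetical, and the real i915_migrate_cpu_copy() in the diff below instead uses phys_to_virt() for system pages and the region's io_mapping for device pages):

static void sketch_copy_one_page(struct page *dst, struct page *src)
{
	/*
	 * Both pages assumed to be in system memory; LMEM pages
	 * cannot be touched via kmap and would need
	 * io_mapping_map_atomic_wc() instead.
	 */
	void *dst_va = kmap_atomic(dst);
	void *src_va = kmap_atomic(src);

	memcpy(dst_va, src_va, PAGE_SIZE);

	kunmap_atomic(src_va);
	kunmap_atomic(dst_va);
}
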
Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_svm_devmem.c | 72 ++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 98ab27879041..2b0c7f4c51b8 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -148,6 +148,69 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, put_page(page); } +static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate) +{ + const unsigned long *src = migrate->args->src; + unsigned long *dst = migrate->args->dst; + struct drm_i915_private *i915 = migrate->i915; + struct intel_memory_region *mem; + void *src_vaddr, *dst_vaddr; + u64 src_addr, dst_addr; + struct page *page; + int i, ret = 0; + + /* XXX: Copy multiple pages at a time */ + for (i = 0; !ret && i < migrate->npages; i++) { + if (!dst[i]) + continue; + + page = migrate_pfn_to_page(dst[i]); + mem = i915->mm.regions[migrate->dst_id]; + dst_addr = page_to_pfn(page); + if (migrate->dst_id != INTEL_REGION_SMEM) + dst_addr -= mem->devmem->pfn_first; + dst_addr <<= PAGE_SHIFT; + + if (migrate->dst_id == INTEL_REGION_SMEM) + dst_vaddr = phys_to_virt(dst_addr); + else + dst_vaddr = io_mapping_map_atomic_wc(&mem->iomap, + dst_addr); + if (unlikely(!dst_vaddr)) + return -ENOMEM; + + page = migrate_pfn_to_page(src[i]); + mem = i915->mm.regions[migrate->src_id]; + src_addr = page_to_pfn(page); + if (migrate->src_id != INTEL_REGION_SMEM) + src_addr -= mem->devmem->pfn_first; + src_addr <<= PAGE_SHIFT; + + if (migrate->src_id == INTEL_REGION_SMEM) + src_vaddr = phys_to_virt(src_addr); + else + src_vaddr = io_mapping_map_atomic_wc(&mem->iomap, + src_addr); + + if (likely(src_vaddr)) + memcpy(dst_vaddr, src_vaddr, PAGE_SIZE); + else + ret = -ENOMEM; + + if (migrate->dst_id != INTEL_REGION_SMEM) + io_mapping_unmap_atomic(dst_vaddr); + + if (migrate->src_id != INTEL_REGION_SMEM && src_vaddr) + io_mapping_unmap_atomic(src_vaddr); + + DRM_DEBUG_DRIVER("src [%d] 0x%llx, dst [%d] 0x%llx\n", + migrate->src_id, src_addr, + migrate->dst_id, dst_addr); + } + + return ret; +} + static int i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) { @@ -207,6 +270,8 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) /* Copy the pages */ migrate->npages = npages; + /* XXX: Flush the caches? 
*/ + ret = i915_migrate_cpu_copy(migrate); migrate_out: if (unlikely(ret)) { for (i = 0; i < npages; i++) { @@ -292,6 +357,7 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) struct device *dev = migrate->i915->drm.dev; struct migrate_vma *args = migrate->args; struct page *dpage, *spage; + int err; DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); /* Allocate host page */ @@ -308,6 +374,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) /* Copy the pages */ migrate->npages = 1; + err = i915_migrate_cpu_copy(migrate); + if (unlikely(err)) { + __free_page(dpage); + args->dst[0] = 0; + return VM_FAULT_SIGBUS; + } return 0; } From patchwork Fri Dec 13 21:56:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Niranjana Vishwanathapura X-Patchwork-Id: 11291613 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0AD9B139A for ; Fri, 13 Dec 2019 22:08:04 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F34BF2073D for ; Fri, 13 Dec 2019 22:08:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F34BF2073D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D5BE86EDF7; Fri, 13 Dec 2019 22:07:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id AC28F6EDE9; Fri, 13 Dec 2019 22:07:33 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2019 14:07:33 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,311,1571727600"; d="scan'208";a="216558919" Received: from nvishwa1-desk.sc.intel.com ([10.3.160.185]) by orsmga006.jf.intel.com with ESMTP; 13 Dec 2019 14:07:32 -0800 From: Niranjana Vishwanathapura To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Dec 2019 13:56:11 -0800 Message-Id: <20191213215614.24558-10-niranjana.vishwanathapura@intel.com> X-Mailer: git-send-email 2.21.0.rc0.32.g243a4c7e27 In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC v2 09/12] drm/i915/svm: Add functions to blitter copy SVM buffers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com, dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com, dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com, dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Add support function to 
blitter copy SVM VAs without requiring any gem objects. Also add function to wait for completion of the copy. Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 3 + drivers/gpu/drm/i915/gem/i915_gem_wait.c | 2 +- drivers/gpu/drm/i915/i915_svm.h | 6 + drivers/gpu/drm/i915/i915_svm_copy.c | 172 +++++++++++++++++++++ 5 files changed, 184 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_svm_copy.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b574ec31ea2e..97d40172bf27 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -156,7 +156,8 @@ i915-y += \ # SVM code i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \ i915_svm.o \ - i915_svm_devmem.o + i915_svm_devmem.o \ + i915_svm_copy.o # general-purpose microcontroller (GuC) support obj-y += gt/uc/ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 6d8ca3f0ccf7..1defc227c729 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -478,6 +478,9 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj) int i915_gem_object_wait(struct drm_i915_gem_object *obj, unsigned int flags, long timeout); +long i915_gem_object_wait_fence(struct dma_fence *fence, + unsigned int flags, + long timeout); int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, unsigned int flags, const struct i915_sched_attr *attr); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index 8af55cd3e690..b7905aa8f821 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -12,7 +12,7 @@ #include "i915_gem_ioctls.h" #include "i915_gem_object.h" -static long +long i915_gem_object_wait_fence(struct dma_fence *fence, unsigned int flags, long timeout) diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h index ae39c2ef2f18..f8bedb4569b5 100644 --- a/drivers/gpu/drm/i915/i915_svm.h +++ b/drivers/gpu/drm/i915/i915_svm.h @@ -33,6 +33,12 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) return vm->svm; } +int i915_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 size, + struct dma_fence **fence); +int i915_svm_copy_blt_wait(struct drm_i915_private *i915, + struct dma_fence *fence); + int i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, struct hmm_range *range); int i915_gem_vm_prefetch_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_svm_copy.c b/drivers/gpu/drm/i915/i915_svm_copy.c new file mode 100644 index 000000000000..42f7d563f6b4 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm_copy.c @@ -0,0 +1,172 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include "i915_drv.h" +#include "gem/i915_gem_object_blt.h" +#include "gt/intel_engine_pm.h" +#include "gt/intel_engine_pool.h" +#include "gt/intel_gpu_commands.h" +#include "gt/intel_gt.h" + +static struct i915_vma * +intel_emit_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 buff_size) +{ + struct drm_i915_private *i915 = ce->vm->i915; + const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */ + struct intel_engine_pool_node *pool; + struct i915_vma *batch; + u64 count, rem; + u32 size, *cmd; 
+ int err; + + GEM_BUG_ON(intel_engine_is_virtual(ce->engine)); + intel_engine_pm_get(ce->engine); + + if (INTEL_GEN(i915) < 8) + return ERR_PTR(-ENOTSUPP); + + count = div_u64(round_up(buff_size, block_size), block_size); + size = (1 + 11 * count) * sizeof(u32); + size = round_up(size, PAGE_SIZE); + pool = intel_engine_get_pool(ce->engine, size); + if (IS_ERR(pool)) { + err = PTR_ERR(pool); + goto out_pm; + } + + cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC); + if (IS_ERR(cmd)) { + err = PTR_ERR(cmd); + goto out_put; + } + + rem = buff_size; + do { + size = min_t(u64, rem, block_size); + GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX); + + if (INTEL_GEN(i915) >= 9) { + *cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2); + *cmd++ = BLT_DEPTH_32 | PAGE_SIZE; + *cmd++ = 0; + *cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cmd++ = lower_32_bits(dst_start); + *cmd++ = upper_32_bits(dst_start); + *cmd++ = 0; + *cmd++ = PAGE_SIZE; + *cmd++ = lower_32_bits(src_start); + *cmd++ = upper_32_bits(src_start); + } else if (INTEL_GEN(i915) >= 8) { + *cmd++ = XY_SRC_COPY_BLT_CMD | + BLT_WRITE_RGBA | (10 - 2); + *cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE; + *cmd++ = 0; + *cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cmd++ = lower_32_bits(dst_start); + *cmd++ = upper_32_bits(dst_start); + *cmd++ = 0; + *cmd++ = PAGE_SIZE; + *cmd++ = lower_32_bits(src_start); + *cmd++ = upper_32_bits(src_start); + } + + /* Allow ourselves to be preempted in between blocks */ + *cmd++ = MI_ARB_CHECK; + + src_start += size; + dst_start += size; + rem -= size; + } while (rem); + + *cmd = MI_BATCH_BUFFER_END; + intel_gt_chipset_flush(ce->vm->gt); + + i915_gem_object_unpin_map(pool->obj); + + batch = i915_vma_instance(pool->obj, ce->vm, NULL); + if (IS_ERR(batch)) { + err = PTR_ERR(batch); + goto out_put; + } + + err = i915_vma_pin(batch, 0, 0, PIN_USER); + if (unlikely(err)) + goto out_put; + + batch->private = pool; + return batch; + +out_put: + intel_engine_pool_put(pool); +out_pm: + intel_engine_pm_put(ce->engine); + return ERR_PTR(err); +} + +int i915_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 size, + struct dma_fence **fence) +{ + struct drm_i915_private *i915 = ce->gem_context->i915; + struct i915_request *rq; + struct i915_vma *batch; + int err; + + DRM_DEBUG_DRIVER("src_start 0x%llx dst_start 0x%llx size 0x%llx\n", + src_start, dst_start, size); + mutex_lock(&i915->drm.struct_mutex); + batch = intel_emit_svm_copy_blt(ce, src_start, dst_start, size); + if (IS_ERR(batch)) { + err = PTR_ERR(batch); + goto out_unlock; + } + + rq = i915_request_create(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_batch; + } + + err = intel_emit_vma_mark_active(batch, rq); + if (unlikely(err)) + goto out_request; + + if (ce->engine->emit_init_breadcrumb) { + err = ce->engine->emit_init_breadcrumb(rq); + if (unlikely(err)) + goto out_request; + } + + err = rq->engine->emit_bb_start(rq, + batch->node.start, batch->node.size, + 0); +out_request: + if (unlikely(err)) + i915_request_skip(rq, err); + else + *fence = &rq->fence; + + i915_request_add(rq); +out_batch: + intel_emit_vma_release(ce, batch); +out_unlock: + mutex_unlock(&i915->drm.struct_mutex); + return err; +} + +int i915_svm_copy_blt_wait(struct drm_i915_private *i915, + struct dma_fence *fence) +{ + long timeout; + + mutex_lock(&i915->drm.struct_mutex); + timeout = i915_gem_object_wait_fence(fence, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + mutex_unlock(&i915->drm.struct_mutex); + return 
(timeout < 0) ? timeout : 0; } From patchwork Fri Dec 13 21:56:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niranjana Vishwanathapura X-Patchwork-Id: 11291615 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B765F14B7 for ; Fri, 13 Dec 2019 22:08:04 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id ABD232073D for ; Fri, 13 Dec 2019 22:08:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org ABD232073D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BC2726EDEB; Fri, 13 Dec 2019 22:07:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id A2BE16EDEE; Fri, 13 Dec 2019 22:07:34 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2019 14:07:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,311,1571727600"; d="scan'208";a="216558923" Received: from nvishwa1-desk.sc.intel.com ([10.3.160.185]) by orsmga006.jf.intel.com with ESMTP; 13 Dec 2019 14:07:33 -0800 From: Niranjana Vishwanathapura To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Dec 2019 13:56:12 -0800 Message-Id: <20191213215614.24558-11-niranjana.vishwanathapura@intel.com> X-Mailer: git-send-email 2.21.0.rc0.32.g243a4c7e27 In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC v2 10/12] drm/i915/svm: Use blitter copy for migration X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com, dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com, dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com, dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Use the blitter engine to copy pages during migration. As the blitter context's virtual address space is shared with other flows, ensure virtual addresses are allocated properly from that address space. Also ensure completion of the blitter copy by waiting on the fence of the issued request.
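Roughly, the migration path drives the helpers added in patch 09 as follows: bind the source and destination ranges into the blitter context's address space, issue the copy, then wait on the returned fence. A sketch with the bind steps and error unwinding elided (see i915_migrate_blitter_copy() in the diff below for the full sequence):

	struct dma_fence *fence = NULL;
	int err;

	/* src/dst ranges already bound at src_node/dst_node */
	err = i915_svm_copy_blt(ce, migrate->src_node.start,
				migrate->dst_node.start,
				migrate->npages * PAGE_SIZE, &fence);
	if (!err)
		err = i915_svm_copy_blt_wait(i915, fence);
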
Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_svm_devmem.c | 249 ++++++++++++++++++++++++- 1 file changed, 245 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 2b0c7f4c51b8..589cf42bc2da 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -15,7 +15,15 @@ struct i915_devmem_migrate { enum intel_region_id src_id; enum intel_region_id dst_id; + dma_addr_t *host_dma; + bool blitter_copy; u64 npages; + + /* required for blitter copy */ + struct drm_mm_node src_node; + struct drm_mm_node dst_node; + struct intel_context *ce; + struct dma_fence *fence; }; struct i915_devmem { @@ -148,6 +156,139 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, put_page(page); } +static int i915_devmem_bind_addr(struct i915_devmem_migrate *migrate, + bool is_src) +{ + struct i915_gem_context *ctx = migrate->ce->gem_context; + struct drm_i915_private *i915 = migrate->i915; + struct i915_address_space *vm = ctx->vm; + struct intel_memory_region *mem; + u64 npages = migrate->npages; + enum intel_region_id mem_id; + struct drm_mm_node *node; + struct scatterlist *sg; + u32 sg_page_sizes = 0; + struct sg_table st; + u64 flags = 0; + int i, ret; + + if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL))) + return -ENOMEM; + + if (is_src) { + node = &migrate->src_node; + mem_id = migrate->src_id; + flags |= I915_GTT_SVM_READONLY; + } else { + node = &migrate->dst_node; + mem_id = migrate->dst_id; + } + + mutex_lock(&vm->mutex); + ret = i915_gem_gtt_insert(vm, node, npages * PAGE_SIZE, + I915_GTT_PAGE_SIZE_2M, 0, + 0, vm->total, PIN_USER); + mutex_unlock(&vm->mutex); + if (unlikely(ret)) + return ret; + + sg = NULL; + st.nents = 0; + + /* + * XXX: If the source page is missing, source it from a temporary + * zero filled page. If needed, destination page missing scenarios + * can be similarly handled by draining data into a temporary page. + */ + for (i = 0; i < npages; i++) { + u64 addr; + + if (mem_id == INTEL_REGION_SMEM) { + addr = migrate->host_dma[i]; + } else { + struct page *page; + unsigned long mpfn; + + mpfn = is_src ? migrate->args->src[i] : + migrate->args->dst[i]; + page = migrate_pfn_to_page(mpfn); + mem = i915->mm.regions[mem_id]; + addr = page_to_pfn(page) - mem->devmem->pfn_first; + addr <<= PAGE_SHIFT; + addr += mem->region.start; + } + + if (sg && (addr == (sg_dma_address(sg) + sg->length))) { + sg->length += PAGE_SIZE; + sg_dma_len(sg) += PAGE_SIZE; + continue; + } + + if (sg) + sg_page_sizes |= sg->length; + + sg = sg ? __sg_next(sg) : st.sgl; + sg_dma_address(sg) = addr; + sg_dma_len(sg) = PAGE_SIZE; + sg->length = PAGE_SIZE; + st.nents++; + } + + sg_page_sizes |= sg->length; + sg_mark_end(sg); + i915_sg_trim(&st); + + ret = svm_bind_addr(vm, node->start, npages * PAGE_SIZE, + flags, &st, sg_page_sizes); + sg_free_table(&st); + + return ret; +} + +static void i915_devmem_unbind_addr(struct i915_devmem_migrate *migrate, + bool is_src) +{ + struct i915_gem_context *ctx = migrate->ce->gem_context; + struct i915_address_space *vm = ctx->vm; + struct drm_mm_node *node; + + node = is_src ? 
&migrate->src_node : &migrate->dst_node; + svm_unbind_addr(vm, node->start, migrate->npages * PAGE_SIZE); + mutex_lock(&vm->mutex); + drm_mm_remove_node(node); + mutex_unlock(&vm->mutex); +} + +static int i915_migrate_blitter_copy(struct i915_devmem_migrate *migrate) +{ + struct drm_i915_private *i915 = migrate->i915; + int ret; + + migrate->ce = i915->engine[BCS0]->kernel_context; + ret = i915_devmem_bind_addr(migrate, true); + if (unlikely(ret)) + return ret; + + ret = i915_devmem_bind_addr(migrate, false); + if (unlikely(ret)) { + i915_devmem_unbind_addr(migrate, true); + return ret; + } + + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); + ret = i915_svm_copy_blt(migrate->ce, + migrate->src_node.start, + migrate->dst_node.start, + migrate->npages * PAGE_SIZE, + &migrate->fence); + if (unlikely(ret)) { + i915_devmem_unbind_addr(migrate, false); + i915_devmem_unbind_addr(migrate, true); + } + + return ret; +} + static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate) { const unsigned long *src = migrate->args->src; @@ -216,6 +357,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) { struct drm_i915_private *i915 = migrate->i915; struct migrate_vma *args = migrate->args; + struct device *dev = i915->drm.dev; struct intel_memory_region *mem; struct list_head blocks = {0}; unsigned long i, npages, cnt; @@ -224,13 +366,35 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) npages = (args->end - args->start) >> PAGE_SHIFT; DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages); + /* + * XXX: Set proper condition for cpu vs blitter copy for performance, + * functionality and to avoid any deadlocks around blitter usage. + */ + migrate->blitter_copy = true; + + if (migrate->blitter_copy) { + /* Allocate storage for DMA addresses, so we can unmap later. */ + migrate->host_dma = kcalloc(npages, sizeof(*migrate->host_dma), + GFP_KERNEL); + if (unlikely(!migrate->host_dma)) + migrate->blitter_copy = false; + } - /* Check source pages */ + /* Check and DMA map source pages */ for (i = 0, cnt = 0; i < npages; i++) { args->dst[i] = 0; page = migrate_pfn_to_page(args->src[i]); - if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) + if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) { + migrate->blitter_copy = false; continue; + } + + migrate->host_dma[i] = dma_map_page(dev, page, 0, PAGE_SIZE, + PCI_DMA_TODEVICE); + if (unlikely(dma_mapping_error(dev, migrate->host_dma[i]))) { + migrate->blitter_copy = false; + migrate->host_dma[i] = 0; + } args->dst[i] = MIGRATE_PFN_VALID; cnt++; @@ -254,6 +418,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) page = i915_devmem_page_get_locked(mem, &blocks); if (unlikely(!page)) { WARN(1, "Failed to get dst page\n"); + migrate->blitter_copy = false; args->dst[i] = 0; continue; } @@ -271,7 +436,11 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) /* Copy the pages */ migrate->npages = npages; /* XXX: Flush the caches? */ - ret = i915_migrate_cpu_copy(migrate); + /* XXX: Fallback to cpu_copy if blitter_copy fails? 
*/ + if (migrate->blitter_copy) + ret = i915_migrate_blitter_copy(migrate); + else + ret = i915_migrate_cpu_copy(migrate); migrate_out: if (unlikely(ret)) { for (i = 0; i < npages; i++) { @@ -283,12 +452,42 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) } } + if (unlikely(migrate->host_dma && (!migrate->blitter_copy || ret))) { + for (i = 0; i < npages; i++) { + if (migrate->host_dma[i]) + dma_unmap_page(dev, migrate->host_dma[i], + PAGE_SIZE, PCI_DMA_TODEVICE); + } + kfree(migrate->host_dma); + migrate->host_dma = NULL; + } + return ret; } void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate) { + struct drm_i915_private *i915 = migrate->i915; + unsigned long i; + int ret; + + if (!migrate->blitter_copy) + return; + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); + ret = i915_svm_copy_blt_wait(i915, migrate->fence); + if (unlikely(ret < 0)) + WARN(1, "Waiting for dma fence failed %d\n", ret); + + i915_devmem_unbind_addr(migrate, true); + i915_devmem_unbind_addr(migrate, false); + + for (i = 0; i < migrate->npages; i++) + dma_unmap_page(i915->drm.dev, migrate->host_dma[i], + PAGE_SIZE, PCI_DMA_TODEVICE); + + kfree(migrate->host_dma); + migrate->host_dma = NULL; } static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate) @@ -360,6 +559,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) int err; DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); + /* + * XXX: Set proper condition for cpu vs blitter copy for performance, + * functionality and to avoid any deadlocks around blitter usage. + */ + migrate->blitter_copy = false; + /* Allocate host page */ spage = migrate_pfn_to_page(args->src[0]); if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE))) @@ -370,12 +575,31 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) return VM_FAULT_SIGBUS; lock_page(dpage); + /* DMA map host page */ + if (migrate->blitter_copy) { + migrate->host_dma[0] = dma_map_page(dev, dpage, 0, PAGE_SIZE, + PCI_DMA_FROMDEVICE); + if (unlikely(dma_mapping_error(dev, migrate->host_dma[0]))) { + migrate->host_dma[0] = 0; + err = -ENOMEM; + goto fault_out; + } + } + args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; /* Copy the pages */ migrate->npages = 1; - err = i915_migrate_cpu_copy(migrate); + /* XXX: Fallback to cpu_copy if blitter_copy fails? 
*/ + if (migrate->blitter_copy) + err = i915_migrate_blitter_copy(migrate); + else + err = i915_migrate_cpu_copy(migrate); +fault_out: if (unlikely(err)) { + if (migrate->host_dma[0]) + dma_unmap_page(dev, migrate->host_dma[0], + PAGE_SIZE, PCI_DMA_FROMDEVICE); __free_page(dpage); args->dst[0] = 0; return VM_FAULT_SIGBUS; @@ -386,7 +610,22 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate) { + struct drm_i915_private *i915 = migrate->i915; + int err; + DRM_DEBUG_DRIVER("\n"); + if (!migrate->blitter_copy) + return; + + err = i915_svm_copy_blt_wait(i915, migrate->fence); + if (unlikely(err < 0)) + WARN(1, "Waiting for dma fence failed %d\n", err); + + i915_devmem_unbind_addr(migrate, true); + i915_devmem_unbind_addr(migrate, false); + + dma_unmap_page(i915->drm.dev, migrate->host_dma[0], + PAGE_SIZE, PCI_DMA_FROMDEVICE); } static inline struct i915_devmem *page_to_devmem(struct page *page) @@ -400,6 +639,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) struct drm_i915_private *i915 = devmem->i915; struct i915_devmem_migrate migrate = {0}; unsigned long src = 0, dst = 0; + dma_addr_t dma_addr = 0; vm_fault_t ret; struct migrate_vma args = { .vma = vmf->vma, @@ -413,6 +653,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start); migrate.i915 = i915; migrate.args = &args; + migrate.host_dma = &dma_addr; migrate.src_id = INTEL_REGION_LMEM; migrate.dst_id = INTEL_REGION_SMEM; if (migrate_vma_setup(&args) < 0) From patchwork Fri Dec 13 21:56:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Niranjana Vishwanathapura X-Patchwork-Id: 11291611 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F3B3175D for ; Fri, 13 Dec 2019 22:08:03 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 538A82073D for ; Fri, 13 Dec 2019 22:08:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 538A82073D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CFAA56EDF5; Fri, 13 Dec 2019 22:07:39 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 18F556EDF4; Fri, 13 Dec 2019 22:07:35 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2019 14:07:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,311,1571727600"; d="scan'208";a="216558930" Received: from nvishwa1-desk.sc.intel.com ([10.3.160.185]) by orsmga006.jf.intel.com with ESMTP; 13 Dec 2019 14:07:34 -0800 From: Niranjana Vishwanathapura To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Dec 2019 13:56:13 -0800 Message-Id: 
<20191213215614.24558-12-niranjana.vishwanathapura@intel.com> X-Mailer: git-send-email 2.21.0.rc0.32.g243a4c7e27 In-Reply-To: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> References: <20191213215614.24558-1-niranjana.vishwanathapura@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC v2 11/12] drm/i915/svm: Add support to en/disable SVM X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com, dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com, dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com, dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Add SVM as a capability and allow user to enable/disable SVM functionality on a per context basis. Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Venkata Sandeep Dhanalakota --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 95 ++++++++++++++++++- drivers/gpu/drm/i915/gem/i915_gem_context.h | 2 + .../gpu/drm/i915/gem/i915_gem_context_types.h | 1 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 ++ drivers/gpu/drm/i915/gem/i915_gem_object.c | 11 +++ drivers/gpu/drm/i915/i915_drv.c | 4 +- drivers/gpu/drm/i915/i915_drv.h | 10 ++ drivers/gpu/drm/i915/i915_getparam.c | 3 + include/uapi/drm/i915_drm.h | 17 ++++ 9 files changed, 145 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index d91975efc940..7db09c8bacb5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -77,6 +77,7 @@ #include "i915_gem_context.h" #include "i915_globals.h" +#include "i915_svm.h" #include "i915_trace.h" #include "i915_user_extensions.h" #include "i915_gem_ioctls.h" @@ -961,6 +962,78 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data, return 0; } +static int i915_gem_vm_setparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_vm_param *args = data; + struct i915_address_space *vm; + int err = 0; + u32 id; + + id = args->vm_id; + if (!id) + return -ENOENT; + + err = mutex_lock_interruptible(&file_priv->vm_idr_lock); + if (err) + return err; + + vm = idr_find(&file_priv->vm_idr, id); + + mutex_unlock(&file_priv->vm_idr_lock); + if (!vm) + return -ENOENT; + + switch (lower_32_bits(args->param)) { + case I915_GEM_VM_PARAM_SVM: + /* FIXME: Ensure ppgtt is empty before switching */ + if (!i915_has_svm(file_priv->dev_priv)) + err = -ENOTSUPP; + else if (args->value) + err = i915_svm_bind_mm(vm); + else + i915_svm_unbind_mm(vm); + break; + default: + err = -EINVAL; + } + return err; +} + +static int i915_gem_vm_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_vm_param *args = data; + struct i915_address_space *vm; + int err = 0; + u32 id; + + id = args->vm_id; + if (!id) + return -ENOENT; + + err = mutex_lock_interruptible(&file_priv->vm_idr_lock); + if (err) + return err; + + vm = idr_find(&file_priv->vm_idr, id); + + mutex_unlock(&file_priv->vm_idr_lock); + if (!vm) + return -ENOENT; + + 
+	switch (lower_32_bits(args->param)) {
+	case I915_GEM_VM_PARAM_SVM:
+		args->value = i915_vm_is_svm_enabled(vm);
+		break;
+	default:
+		err = -EINVAL;
+	}
+	return err;
+}
+
 struct context_barrier_task {
 	struct i915_active base;
 	void (*task)(void *data);
@@ -2382,6 +2455,21 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }

+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file)
+{
+	struct drm_i915_gem_context_param *args = data;
+	u32 class = upper_32_bits(args->param);
+
+	switch (class) {
+	case 0:
+		return i915_gem_context_getparam_ioctl(dev, data, file);
+	case 2:
+		return i915_gem_vm_getparam_ioctl(dev, data, file);
+	}
+	return -EINVAL;
+}
+
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -2404,14 +2492,15 @@ int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
 {
 	struct drm_i915_gem_context_param *args = data;
-	u32 object_class = upper_32_bits(args->param);
+	u32 class = upper_32_bits(args->param);

-	switch (object_class) {
+	switch (class) {
 	case 0:
 		return i915_gem_context_setparam_ioctl(dev, data, file);
 	case 1:
 		return i915_gem_object_setparam_ioctl(dev, data, file);
-
+	case 2:
+		return i915_gem_vm_setparam_ioctl(dev, data, file);
 	}
 	return -EINVAL;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 0b17636cfea2..59eedb9e8ae5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -175,6 +175,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file);
 int i915_gem_setparam_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 69df5459c350..edda953a4d96 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -129,6 +129,7 @@ struct i915_gem_context {
 #define UCONTEXT_BANNABLE	2
 #define UCONTEXT_RECOVERABLE	3
 #define UCONTEXT_PERSISTENCE	4
+#define UCONTEXT_SVM_ENABLED	5

 	/**
 	 * @flags: small set of booleans
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index af360238a392..a7ac24de2017 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_svm.h"
 #include "i915_trace.h"

 enum {
@@ -443,6 +444,11 @@ eb_validate_vma(struct i915_execbuffer *eb,
 	if (unlikely(entry->alignment && !is_power_of_2(entry->alignment)))
 		return -EINVAL;

+	/* Only allow user PINNED addresses for SVM enabled contexts */
+	if (unlikely(i915_vm_is_svm_enabled(eb->gem_context->vm) &&
+		     !(entry->flags & EXEC_OBJECT_PINNED)))
+		return -EINVAL;
+
 	/*
 	 * Offset can be used as input (EXEC_OBJECT_PINNED), reject
 	 * any non-page-aligned or non-canonical addresses.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index f868a301fc04..cfad5b9f9bb9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -36,6 +36,7 @@
 #include "i915_gem_object_blt.h"
 #include "i915_gem_region.h"
 #include "i915_globals.h"
+#include "i915_svm.h"
 #include "i915_trace.h"

 static struct i915_global_object {
@@ -507,6 +508,16 @@ int __init i915_global_objects_init(void)

 bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj)
 {
+	struct i915_vma *vma;
+
+	spin_lock(&obj->vma.lock);
+	list_for_each_entry(vma, &obj->vma.list, obj_link)
+		if (i915_vm_is_svm_enabled(vma->vm)) {
+			spin_unlock(&obj->vma.lock);
+			return true;
+		}
+
+	spin_unlock(&obj->vma.lock);
 	return false;
 }

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f1b92fd3d234..e6b826e80d55 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2692,6 +2692,8 @@ static int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
 	vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id);
 	if (unlikely(!vm))
 		return -ENOENT;
+	if (unlikely(!i915_vm_is_svm_enabled(vm)))
+		return -ENOTSUPP;

 	switch (args->type) {
 	case I915_GEM_VM_BIND_SVM_OBJ:
@@ -2756,7 +2758,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_getparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2d0a7cd2dc44..365f67a77171 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1736,6 +1737,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 /* Only valid when HAS_DISPLAY() is true */
 #define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)

+static inline bool i915_has_svm(struct drm_i915_private *dev_priv)
+{
+#ifdef CONFIG_DRM_I915_SVM
+	return INTEL_GEN(dev_priv) >= 8;
+#else
+	return false;
+#endif
+}
+
 static inline bool intel_vtd_active(void)
 {
 #ifdef CONFIG_INTEL_IOMMU
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index 54fce81d5724..c35402103e0f 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -161,6 +161,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_PERF_REVISION:
 		value = i915_perf_ioctl_version();
 		break;
+	case I915_PARAM_HAS_SVM:
+		value = i915_has_svm(i915);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f49e29716460..1ebcba014908 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -360,6 +360,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY	0x3b
 #define DRM_I915_GEM_OBJECT_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
+#define DRM_I915_GEM_VM_GETPARAM	DRM_I915_GEM_CONTEXT_GETPARAM
+#define DRM_I915_GEM_VM_SETPARAM	DRM_I915_GEM_CONTEXT_SETPARAM
 #define DRM_I915_GEM_VM_BIND		0x3c
 #define DRM_I915_GEM_VM_PREFETCH	0x3d
 /* Must be kept compact -- no holes */
@@ -426,6 +428,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param)
+#define DRM_IOCTL_I915_GEM_VM_GETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_GETPARAM, struct drm_i915_gem_vm_param)
+#define DRM_IOCTL_I915_GEM_VM_SETPARAM	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_SETPARAM, struct drm_i915_gem_vm_param)
 #define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
 #define DRM_IOCTL_I915_GEM_VM_PREFETCH	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_PREFETCH, struct drm_i915_gem_vm_prefetch)

@@ -625,6 +629,8 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_PERF_REVISION	54

+/* Shared Virtual Memory (SVM) support capability */
+#define I915_PARAM_HAS_SVM		55
 /* Must be kept compact -- no holes and well documented */

 typedef struct drm_i915_getparam {
@@ -1851,6 +1857,17 @@ struct drm_i915_gem_vm_control {
 	__u32 vm_id;
 };

+struct drm_i915_gem_vm_param {
+	__u32 vm_id;
+	__u32 rsvd;
+
+#define I915_VM_PARAM		(2ull << 32)
+#define I915_GEM_VM_PARAM_SVM	0x1
+	__u64 param;
+
+	__u64 value;
+};
+
 struct drm_i915_reg_read {
 	/*
 	 * Register offset.
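[Editor's usage sketch] To make the new uapi concrete, here is a minimal,
hypothetical userspace snippet (not part of the patch) that probes the
I915_PARAM_HAS_SVM capability, enables SVM on a VM through
DRM_IOCTL_I915_GEM_VM_SETPARAM, and reads the switch back through
DRM_IOCTL_I915_GEM_VM_GETPARAM. The drm_fd and vm_id are assumed to come
from the usual DRM open and DRM_IOCTL_I915_GEM_VM_CREATE flow, the helper
name is illustrative, and the code only builds against headers carrying
this series:

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <drm/i915_drm.h>	/* assumes the uapi additions above */

	static int i915_vm_set_svm(int drm_fd, uint32_t vm_id, int enable)
	{
		struct drm_i915_getparam gp = {};
		struct drm_i915_gem_vm_param vp = {};
		int has_svm = 0;

		/* Probe the global capability added in i915_getparam.c. */
		gp.param = I915_PARAM_HAS_SVM;
		gp.value = &has_svm;
		if (ioctl(drm_fd, DRM_IOCTL_I915_GETPARAM, &gp) || !has_svm)
			return -1;

		/*
		 * The upper 32 bits of ->param carry the parameter class;
		 * class 2 (I915_VM_PARAM) routes the shared CONTEXT_SETPARAM
		 * ioctl number to i915_gem_vm_setparam_ioctl().
		 */
		vp.vm_id = vm_id;
		vp.param = I915_VM_PARAM | I915_GEM_VM_PARAM_SVM;
		vp.value = enable;
		if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_SETPARAM, &vp))
			return -1;

		/* Read the switch back to confirm the VM's SVM state. */
		vp.value = 0;
		if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_VM_GETPARAM, &vp))
			return -1;
		return vp.value == (__u64)enable ? 0 : -1;
	}

Note how the class encoding lets the existing CONTEXT_GETPARAM/SETPARAM
ioctl numbers be reused for VM parameters without allocating new ones.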
From patchwork Fri Dec 13 21:56:14 2019
From: Niranjana Vishwanathapura
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Dec 2019 13:56:14 -0800
Message-Id: <20191213215614.24558-13-niranjana.vishwanathapura@intel.com>
Subject: [Intel-gfx] [RFC v2 12/12] drm/i915/svm: Add page table dump support
Cc: kenneth.w.graunke@intel.com, sanjay.k.kumar@intel.com,
 dri-devel@lists.freedesktop.org, jason.ekstrand@intel.com,
 dave.hansen@intel.com, jglisse@redhat.com, daniel.vetter@intel.com,
 dan.j.williams@intel.com, ira.weiny@intel.com, jgg@mellanox.com

Add support for dumping the PPGTT page tables for debugging. Here is an
example dump.
The format is, [<entry index>] <start vaddr>: <entry value>, following the
dump's "[0x%03x] 0x%llx: 0x%llx" format string. (A small decoder sketch
follows the diff.)

Page Table dump start 0x0 len 0xffffffffffffffff
[0x0fe] 0x7f0000000: 0x6b0003
[0x1e6] 0x7f7980000: 0x6c0003
[0x16d] 0x7f79ada00: 0x5f0003
[0x000] 0x7f79ada00: 0x610803
[0x16e] 0x7f79adc00: 0x6d0003
[0x000] 0x7f79adc00: 0x630803
[0x100] 0x800000000: 0x6f0003
[0x000] 0x800000000: 0x700003
[0x000] 0x800000000: 0x710003
[0x000] 0x800000000: 0x5d0803

Cc: Joonas Lahtinen
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Sudeep Dutt
Signed-off-by: Niranjana Vishwanathapura
---
 drivers/gpu/drm/i915/Kconfig.debug            | 14 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 92 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h           | 14 +++
 4 files changed, 121 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 206882e154bc..257510a38b15 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -221,3 +221,17 @@ config DRM_I915_DEBUG_RUNTIME_PM
 	  driver loading, suspend and resume operations.

 	  If in doubt, say "N"
+
+config DRM_I915_DUMP_PPGTT
+	bool "Enable PPGTT Page Table dump support"
+	depends on DRM_I915
+	default n
+	help
+	  Choose this option to enable PPGTT page table dump support.
+	  The page table snapshot helps developers to debug page table
+	  related issues. This will affect performance and dumps a lot of
+	  information, so it is only recommended for developer debug.
+
+	  Recommended for driver developers only.
+
+	  If in doubt, say "N".
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index a7ac24de2017..2c09d4bdee6f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2678,6 +2678,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	intel_engine_pool_mark_active(eb.batch->private, eb.request);

 	trace_i915_request_queue(eb.request, eb.batch_flags);
+	ppgtt_dump(eb.context->vm, 0, eb.context->vm->total);
 	err = eb_submit(&eb);
 err_request:
 	add_to_client(eb.request, file);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 192674f03e4e..a473f43c5320 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1227,6 +1227,97 @@ static int gen8_ppgtt_alloc(struct i915_address_space *vm,
 	return err;
 }

+#ifdef CONFIG_DRM_I915_DUMP_PPGTT
+static void __gen8_ppgtt_dump(struct i915_address_space * const vm,
+			      struct i915_page_directory * const pd,
+			      u64 start, u64 end, int lvl)
+{
+	char *prefix[4] = { "\t\t\t\t", "\t\t\t", "\t\t", "\t"};
+	char *format = "%s [0x%03x] 0x%llx: 0x%llx\n";
+	unsigned int idx, len;
+	gen8_pte_t *vaddr;
+	unsigned int pdpe;
+	bool is_large;
+
+	GEM_BUG_ON(end > vm->total >> GEN8_PTE_SHIFT);
+
+	len = gen8_pd_range(start, end, lvl--, &idx);
+	GEM_BUG_ON(!len || (idx + len - 1) >> gen8_pd_shift(1));
+
+	spin_lock(&pd->lock);
+	GEM_BUG_ON(!atomic_read(px_used(pd))); /* Must be pinned! */
+	do {
+		struct i915_page_table *pt = pd->entry[idx];
+
+		if (!pt) {
+			start += (1 << gen8_pd_shift(lvl + 1));
+			continue;
+		}
+
+		vaddr = kmap_atomic_px(&pd->pt);
+		pdpe = gen8_pd_index(start, lvl + 1);
+		DRM_DEBUG_DRIVER(format, prefix[lvl + 1], pdpe,
+				 start, vaddr[pdpe]);
+		is_large = (vaddr[pdpe] & GEN8_PDE_PS_2M);
+		kunmap_atomic(vaddr);
+		if (is_large) {
+			start += (1 << gen8_pd_shift(lvl + 1));
+			continue;
+		}
+
+		if (lvl) {
+			atomic_inc(&pt->used);
+			spin_unlock(&pd->lock);
+
+			__gen8_ppgtt_dump(vm, as_pd(pt),
+					  start, end, lvl);
+
+			start += (1 << gen8_pd_shift(lvl + 1));
+			spin_lock(&pd->lock);
+			atomic_dec(&pt->used);
+			GEM_BUG_ON(!atomic_read(&pt->used));
+		} else {
+			unsigned int count = gen8_pt_count(start, end);
+
+			pdpe = gen8_pd_index(start, lvl);
+			vaddr = kmap_atomic_px(pt);
+			while (count) {
+				if (vaddr[pdpe] != vm->scratch[lvl].encode)
+					DRM_DEBUG_DRIVER(format, prefix[lvl],
+							 pdpe, start,
+							 vaddr[pdpe]);
+				start++;
+				count--;
+				pdpe++;
+			}
+
+			kunmap_atomic(vaddr);
+			GEM_BUG_ON(atomic_read(&pt->used) > I915_PDES);
+		}
+	} while (idx++, --len);
+	spin_unlock(&pd->lock);
+}
+
+static void gen8_ppgtt_dump(struct i915_address_space *vm,
+			    u64 start, u64 length)
+{
+	GEM_BUG_ON(!IS_ALIGNED(start, BIT_ULL(GEN8_PTE_SHIFT)));
+	GEM_BUG_ON(!IS_ALIGNED(length, BIT_ULL(GEN8_PTE_SHIFT)));
+	GEM_BUG_ON(range_overflows(start, length, vm->total));
+
+	start >>= GEN8_PTE_SHIFT;
+	length >>= GEN8_PTE_SHIFT;
+	GEM_BUG_ON(length == 0);
+
+	DRM_DEBUG_DRIVER("PPGTT dump: start 0x%llx length 0x%llx\n",
+			 start, length);
+	__gen8_ppgtt_dump(vm, i915_vm_to_ppgtt(vm)->pd,
+			  start, start + length, vm->top);
+}
+#else
+#define gen8_ppgtt_dump NULL
+#endif
+
 static inline struct sgt_dma {
 	struct scatterlist *sg;
 	dma_addr_t dma, max;
@@ -1596,6 +1687,7 @@ static struct i915_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915)
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
+	ppgtt->vm.dump_va_range = gen8_ppgtt_dump;

 	if (intel_vgpu_active(i915))
 		gen8_ppgtt_notify_vgt(ppgtt, true);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e06e6447e0d7..db3505263e6c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -367,6 +367,8 @@ struct i915_address_space {
 			   u64 start, u64 length);
 	void (*clear_range)(struct i915_address_space *vm,
 			    u64 start, u64 length);
+	void (*dump_va_range)(struct i915_address_space *vm,
+			      u64 start, u64 length);
 	void (*insert_page)(struct i915_address_space *vm,
 			    dma_addr_t addr,
 			    u64 offset,
@@ -684,6 +686,18 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,

 #define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)

+#ifdef CONFIG_DRM_I915_DUMP_PPGTT
+static inline void ppgtt_dump(struct i915_address_space *vm,
+			      u64 start, u64 length)
+{
+	if (vm->dump_va_range)
+		vm->dump_va_range(vm, start, length);
+}
+#else
+static inline void ppgtt_dump(struct i915_address_space *vm,
+			      u64 start, u64 length) { }
+#endif
+
 /* SVM UAPI */
 #define I915_GTT_SVM_READONLY  BIT(0)
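[Editor's decoder sketch] To make the dump above easier to read, here is an
illustrative helper (not part of the patch) that decodes one dump entry
under common gen8 PTE assumptions: bit 0 = present, bit 1 = writable,
bit 7 = 2M page size on directory entries (mirroring GEN8_PDE_PS_2M in the
dump code), and a 4K-aligned address field. The exact bit layout is
generation-specific, so treat the masks as assumptions:

	#include <stdint.h>
	#include <stdio.h>

	#define PTE_PRESENT	(1ull << 0)	/* assumed: gen8 "present" bit */
	#define PTE_RW		(1ull << 1)	/* assumed: gen8 writable bit */
	#define PDE_PS_2M	(1ull << 7)	/* mirrors GEN8_PDE_PS_2M above */
	#define PTE_ADDR_MASK	0xffffffffff000ull /* assumed addr field, bits 51:12 */

	static void decode_dump_entry(uint64_t vaddr, uint64_t entry)
	{
		/* Split the raw entry into its address and flag fields. */
		printf("va 0x%llx -> pa 0x%llx [%c%c%s]\n",
		       (unsigned long long)vaddr,
		       (unsigned long long)(entry & PTE_ADDR_MASK),
		       entry & PTE_PRESENT ? 'P' : '-',
		       entry & PTE_RW ? 'W' : '-',
		       entry & PDE_PS_2M ? " 2M" : "");
	}

	/*
	 * For the first entry of the example dump,
	 * decode_dump_entry(0x7f0000000, 0x6b0003) prints:
	 *	va 0x7f0000000 -> pa 0x6b0000 [PW]
	 */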