From patchwork Wed May 22 08:20:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955269 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C2922112C for ; Wed, 22 May 2019 08:21:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AFB0928A2E for ; Wed, 22 May 2019 08:21:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A42A728A51; Wed, 22 May 2019 08:21:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 472B4287C6 for ; Wed, 22 May 2019 08:21:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E478589789; Wed, 22 May 2019 08:21:10 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4CF6E897D0 for ; Wed, 22 May 2019 08:21:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636781-1500050 for ; Wed, 22 May 2019 09:21:04 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:52 +0100 Message-Id: <20190522082104.5117-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [CI 01/13] drm/i915: Split GEM object type definition to its own header X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP For convenience in avoiding inline spaghetti, keep the type definition as a separate header. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Acked-by: Rodrigo Vivi Acked-by: Jani Nikula Acked-by: Joonas Lahtinen --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/Makefile | 1 + drivers/gpu/drm/i915/gem/Makefile.header-test | 16 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 285 +++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 1 + drivers/gpu/drm/i915/i915_drv.h | 3 +- drivers/gpu/drm/i915/i915_gem_batch_pool.h | 3 +- drivers/gpu/drm/i915/i915_gem_gtt.h | 1 + drivers/gpu/drm/i915/i915_gem_object.h | 295 +----------------- 9 files changed, 312 insertions(+), 294 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/Makefile create mode 100644 drivers/gpu/drm/i915/gem/Makefile.header-test create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_types.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 68106fe35a04..96344c9a0726 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -85,6 +85,7 @@ gt-$(CONFIG_DRM_I915_SELFTEST) += \ i915-y += $(gt-y) # GEM (Graphics Execution Management) code +obj-y += gem/ i915-y += \ i915_active.o \ i915_cmd_parser.o \ diff --git a/drivers/gpu/drm/i915/gem/Makefile b/drivers/gpu/drm/i915/gem/Makefile new file mode 100644 index 000000000000..07e7b8b840ea --- /dev/null +++ b/drivers/gpu/drm/i915/gem/Makefile @@ -0,0 +1 @@ +include $(src)/Makefile.header-test # Extra header tests diff --git a/drivers/gpu/drm/i915/gem/Makefile.header-test b/drivers/gpu/drm/i915/gem/Makefile.header-test new file mode 100644 index 000000000000..61e06cbb4b32 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/Makefile.header-test @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: MIT +# Copyright © 2019 Intel Corporation + +# Test the headers are compilable as standalone units +header_test := $(notdir $(wildcard $(src)/*.h)) + +quiet_cmd_header_test = HDRTEST $@ + cmd_header_test = echo "\#include \"$( $@ + +header_test_%.c: %.h + $(call cmd,header_test) + +extra-$(CONFIG_DRM_I915_WERROR) += \ + $(foreach h,$(header_test),$(patsubst %.h,header_test_%.o,$(h))) + +clean-files += $(foreach h,$(header_test),$(patsubst %.h,header_test_%.c,$(h))) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h new file mode 100644 index 000000000000..fe3b2a2775f7 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -0,0 +1,285 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef __I915_GEM_OBJECT_TYPES_H__ +#define __I915_GEM_OBJECT_TYPES_H__ + +#include + +#include + +#include "i915_active.h" +#include "i915_selftest.h" + +struct drm_i915_gem_object; + +/* + * struct i915_lut_handle tracks the fast lookups from handle to vma used + * for execbuf. Although we use a radixtree for that mapping, in order to + * remove them as the object or context is closed, we need a secondary list + * and a translation entry (i915_lut_handle). + */ +struct i915_lut_handle { + struct list_head obj_link; + struct list_head ctx_link; + struct i915_gem_context *ctx; + u32 handle; +}; + +struct drm_i915_gem_object_ops { + unsigned int flags; +#define I915_GEM_OBJECT_HAS_STRUCT_PAGE BIT(0) +#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(1) +#define I915_GEM_OBJECT_IS_PROXY BIT(2) +#define I915_GEM_OBJECT_ASYNC_CANCEL BIT(3) + + /* Interface between the GEM object and its backing storage. + * get_pages() is called once prior to the use of the associated set + * of pages before to binding them into the GTT, and put_pages() is + * called after we no longer need them. As we expect there to be + * associated cost with migrating pages between the backing storage + * and making them available for the GPU (e.g. clflush), we may hold + * onto the pages after they are no longer referenced by the GPU + * in case they may be used again shortly (for example migrating the + * pages to a different memory domain within the GTT). put_pages() + * will therefore most likely be called when the object itself is + * being released or under memory pressure (where we attempt to + * reap pages for the shrinker). + */ + int (*get_pages)(struct drm_i915_gem_object *obj); + void (*put_pages)(struct drm_i915_gem_object *obj, + struct sg_table *pages); + + int (*pwrite)(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pwrite *arg); + + int (*dmabuf_export)(struct drm_i915_gem_object *obj); + void (*release)(struct drm_i915_gem_object *obj); +}; + +struct drm_i915_gem_object { + struct drm_gem_object base; + + const struct drm_i915_gem_object_ops *ops; + + struct { + /** + * @vma.lock: protect the list/tree of vmas + */ + spinlock_t lock; + + /** + * @vma.list: List of VMAs backed by this object + * + * The VMA on this list are ordered by type, all GGTT vma are + * placed at the head and all ppGTT vma are placed at the tail. + * The different types of GGTT vma are unordered between + * themselves, use the @vma.tree (which has a defined order + * between all VMA) to quickly find an exact match. + */ + struct list_head list; + + /** + * @vma.tree: Ordered tree of VMAs backed by this object + * + * All VMA created for this object are placed in the @vma.tree + * for fast retrieval via a binary search in + * i915_vma_instance(). They are also added to @vma.list for + * easy iteration. + */ + struct rb_root tree; + } vma; + + /** + * @lut_list: List of vma lookup entries in use for this object. + * + * If this object is closed, we need to remove all of its VMA from + * the fast lookup index in associated contexts; @lut_list provides + * this translation from object to context->handles_vma. + */ + struct list_head lut_list; + + /** Stolen memory for this object, instead of being backed by shmem. */ + struct drm_mm_node *stolen; + union { + struct rcu_head rcu; + struct llist_node freed; + }; + + /** + * Whether the object is currently in the GGTT mmap. + */ + unsigned int userfault_count; + struct list_head userfault_link; + + struct list_head batch_pool_link; + I915_SELFTEST_DECLARE(struct list_head st_link); + + unsigned long flags; + + /** + * Have we taken a reference for the object for incomplete GPU + * activity? + */ +#define I915_BO_ACTIVE_REF 0 + + /* + * Is the object to be mapped as read-only to the GPU + * Only honoured if hardware has relevant pte bit + */ + unsigned int cache_level:3; + unsigned int cache_coherent:2; +#define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) +#define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) + unsigned int cache_dirty:1; + + /** + * @read_domains: Read memory domains. + * + * These monitor which caches contain read/write data related to the + * object. When transitioning from one set of domains to another, + * the driver is called to ensure that caches are suitably flushed and + * invalidated. + */ + u16 read_domains; + + /** + * @write_domain: Corresponding unique write memory domain. + */ + u16 write_domain; + + atomic_t frontbuffer_bits; + unsigned int frontbuffer_ggtt_origin; /* write once */ + struct i915_active_request frontbuffer_write; + + /** Current tiling stride for the object, if it's tiled. */ + unsigned int tiling_and_stride; +#define FENCE_MINIMUM_STRIDE 128 /* See i915_tiling_ok() */ +#define TILING_MASK (FENCE_MINIMUM_STRIDE - 1) +#define STRIDE_MASK (~TILING_MASK) + + /** Count of VMA actually bound by this object */ + unsigned int bind_count; + unsigned int active_count; + /** Count of how many global VMA are currently pinned for use by HW */ + unsigned int pin_global; + + struct { + struct mutex lock; /* protects the pages and their use */ + atomic_t pages_pin_count; + + struct sg_table *pages; + void *mapping; + + /* TODO: whack some of this into the error state */ + struct i915_page_sizes { + /** + * The sg mask of the pages sg_table. i.e the mask of + * of the lengths for each sg entry. + */ + unsigned int phys; + + /** + * The gtt page sizes we are allowed to use given the + * sg mask and the supported page sizes. This will + * express the smallest unit we can use for the whole + * object, as well as the larger sizes we may be able + * to use opportunistically. + */ + unsigned int sg; + + /** + * The actual gtt page size usage. Since we can have + * multiple vma associated with this object we need to + * prevent any trampling of state, hence a copy of this + * struct also lives in each vma, therefore the gtt + * value here should only be read/write through the vma. + */ + unsigned int gtt; + } page_sizes; + + I915_SELFTEST_DECLARE(unsigned int page_mask); + + struct i915_gem_object_page_iter { + struct scatterlist *sg_pos; + unsigned int sg_idx; /* in pages, but 32bit eek! */ + + struct radix_tree_root radix; + struct mutex lock; /* protects this cache */ + } get_page; + + /** + * Element within i915->mm.unbound_list or i915->mm.bound_list, + * locked by i915->mm.obj_lock. + */ + struct list_head link; + + /** + * Advice: are the backing pages purgeable? + */ + unsigned int madv:2; + + /** + * This is set if the object has been written to since the + * pages were last acquired. + */ + bool dirty:1; + + /** + * This is set if the object has been pinned due to unknown + * swizzling. + */ + bool quirked:1; + } mm; + + /** Breadcrumb of last rendering to the buffer. + * There can only be one writer, but we allow for multiple readers. + * If there is a writer that necessarily implies that all other + * read requests are complete - but we may only be lazily clearing + * the read requests. A read request is naturally the most recent + * request on a ring, so we may have two different write and read + * requests on one ring where the write request is older than the + * read request. This allows for the CPU to read from an active + * buffer by only waiting for the write to complete. + */ + struct reservation_object *resv; + + /** References from framebuffers, locks out tiling changes. */ + unsigned int framebuffer_references; + + /** Record of address bit 17 of each page at last unbind. */ + unsigned long *bit_17; + + union { + struct i915_gem_userptr { + uintptr_t ptr; + + struct i915_mm_struct *mm; + struct i915_mmu_object *mmu_object; + struct work_struct *work; + } userptr; + + unsigned long scratch; + + void *gvt_info; + }; + + /** for phys allocated objects */ + struct drm_dma_handle *phys_handle; + + struct reservation_object __builtin_resv; +}; + +static inline struct drm_i915_gem_object * +to_intel_bo(struct drm_gem_object *gem) +{ + /* Assert that to_intel_bo(NULL) == NULL */ + BUILD_BUG_ON(offsetof(struct drm_i915_gem_object, base)); + + return container_of(gem, struct drm_i915_gem_object, base); +} + +#endif diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index f3fc2e8acc90..2c232501d973 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -29,6 +29,7 @@ #define I915_CMD_HASH_ORDER 9 struct dma_fence; +struct drm_i915_gem_object; struct drm_i915_reg_table; struct i915_gem_context; struct i915_request; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1ad3818d2676..64756f4b1e46 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -81,7 +81,6 @@ #include "i915_gem.h" #include "i915_gem_context.h" #include "i915_gem_fence_reg.h" -#include "i915_gem_object.h" #include "i915_gem_gtt.h" #include "i915_gpu_error.h" #include "i915_request.h" @@ -136,6 +135,8 @@ bool i915_error_injected(void); __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \ fmt, ##__VA_ARGS__) +struct drm_i915_gem_object; + enum hpd_pin { HPD_NONE = 0, HPD_TV = HPD_NONE, /* TV is known to be unreliable */ diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.h b/drivers/gpu/drm/i915/i915_gem_batch_pool.h index 56947daaaf65..feeeeeaa54d8 100644 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.h +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.h @@ -9,6 +9,7 @@ #include +struct drm_i915_gem_object; struct intel_engine_cs; struct i915_gem_batch_pool { @@ -19,7 +20,7 @@ struct i915_gem_batch_pool { void i915_gem_batch_pool_init(struct i915_gem_batch_pool *pool, struct intel_engine_cs *engine); void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); -struct drm_i915_gem_object* +struct drm_i915_gem_object * i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, size_t size); #endif /* I915_GEM_BATCH_POOL_H */ diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 98fc71053f7c..248a26b1bb04 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -61,6 +61,7 @@ struct drm_i915_file_private; struct drm_i915_fence_reg; +struct drm_i915_gem_object; struct i915_vma; typedef u32 gen6_pte_t; diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h index ca93a40c0c87..3666b0c5f6ca 100644 --- a/drivers/gpu/drm/i915/i915_gem_object.h +++ b/drivers/gpu/drm/i915/i915_gem_object.h @@ -1,308 +1,19 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #ifndef __I915_GEM_OBJECT_H__ #define __I915_GEM_OBJECT_H__ -#include - -#include #include #include #include #include -#include "i915_request.h" -#include "i915_selftest.h" - -struct drm_i915_gem_object; - -/* - * struct i915_lut_handle tracks the fast lookups from handle to vma used - * for execbuf. Although we use a radixtree for that mapping, in order to - * remove them as the object or context is closed, we need a secondary list - * and a translation entry (i915_lut_handle). - */ -struct i915_lut_handle { - struct list_head obj_link; - struct list_head ctx_link; - struct i915_gem_context *ctx; - u32 handle; -}; - -struct drm_i915_gem_object_ops { - unsigned int flags; -#define I915_GEM_OBJECT_HAS_STRUCT_PAGE BIT(0) -#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(1) -#define I915_GEM_OBJECT_IS_PROXY BIT(2) -#define I915_GEM_OBJECT_ASYNC_CANCEL BIT(3) - - /* Interface between the GEM object and its backing storage. - * get_pages() is called once prior to the use of the associated set - * of pages before to binding them into the GTT, and put_pages() is - * called after we no longer need them. As we expect there to be - * associated cost with migrating pages between the backing storage - * and making them available for the GPU (e.g. clflush), we may hold - * onto the pages after they are no longer referenced by the GPU - * in case they may be used again shortly (for example migrating the - * pages to a different memory domain within the GTT). put_pages() - * will therefore most likely be called when the object itself is - * being released or under memory pressure (where we attempt to - * reap pages for the shrinker). - */ - int (*get_pages)(struct drm_i915_gem_object *); - void (*put_pages)(struct drm_i915_gem_object *, struct sg_table *); - - int (*pwrite)(struct drm_i915_gem_object *, - const struct drm_i915_gem_pwrite *); - - int (*dmabuf_export)(struct drm_i915_gem_object *); - void (*release)(struct drm_i915_gem_object *); -}; - -struct drm_i915_gem_object { - struct drm_gem_object base; - - const struct drm_i915_gem_object_ops *ops; - - struct { - /** - * @vma.lock: protect the list/tree of vmas - */ - spinlock_t lock; - - /** - * @vma.list: List of VMAs backed by this object - * - * The VMA on this list are ordered by type, all GGTT vma are - * placed at the head and all ppGTT vma are placed at the tail. - * The different types of GGTT vma are unordered between - * themselves, use the @vma.tree (which has a defined order - * between all VMA) to quickly find an exact match. - */ - struct list_head list; - - /** - * @vma.tree: Ordered tree of VMAs backed by this object - * - * All VMA created for this object are placed in the @vma.tree - * for fast retrieval via a binary search in - * i915_vma_instance(). They are also added to @vma.list for - * easy iteration. - */ - struct rb_root tree; - } vma; - - /** - * @lut_list: List of vma lookup entries in use for this object. - * - * If this object is closed, we need to remove all of its VMA from - * the fast lookup index in associated contexts; @lut_list provides - * this translation from object to context->handles_vma. - */ - struct list_head lut_list; - - /** Stolen memory for this object, instead of being backed by shmem. */ - struct drm_mm_node *stolen; - union { - struct rcu_head rcu; - struct llist_node freed; - }; - - /** - * Whether the object is currently in the GGTT mmap. - */ - unsigned int userfault_count; - struct list_head userfault_link; - - struct list_head batch_pool_link; - I915_SELFTEST_DECLARE(struct list_head st_link); - - unsigned long flags; - - /** - * Have we taken a reference for the object for incomplete GPU - * activity? - */ -#define I915_BO_ACTIVE_REF 0 - - /* - * Is the object to be mapped as read-only to the GPU - * Only honoured if hardware has relevant pte bit - */ - unsigned int cache_level:3; - unsigned int cache_coherent:2; -#define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) -#define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) - unsigned int cache_dirty:1; - - /** - * @read_domains: Read memory domains. - * - * These monitor which caches contain read/write data related to the - * object. When transitioning from one set of domains to another, - * the driver is called to ensure that caches are suitably flushed and - * invalidated. - */ - u16 read_domains; - - /** - * @write_domain: Corresponding unique write memory domain. - */ - u16 write_domain; - - atomic_t frontbuffer_bits; - unsigned int frontbuffer_ggtt_origin; /* write once */ - struct i915_active_request frontbuffer_write; - - /** Current tiling stride for the object, if it's tiled. */ - unsigned int tiling_and_stride; -#define FENCE_MINIMUM_STRIDE 128 /* See i915_tiling_ok() */ -#define TILING_MASK (FENCE_MINIMUM_STRIDE-1) -#define STRIDE_MASK (~TILING_MASK) - - /** Count of VMA actually bound by this object */ - unsigned int bind_count; - unsigned int active_count; - /** Count of how many global VMA are currently pinned for use by HW */ - unsigned int pin_global; - - struct { - struct mutex lock; /* protects the pages and their use */ - atomic_t pages_pin_count; - - struct sg_table *pages; - void *mapping; - - /* TODO: whack some of this into the error state */ - struct i915_page_sizes { - /** - * The sg mask of the pages sg_table. i.e the mask of - * of the lengths for each sg entry. - */ - unsigned int phys; - - /** - * The gtt page sizes we are allowed to use given the - * sg mask and the supported page sizes. This will - * express the smallest unit we can use for the whole - * object, as well as the larger sizes we may be able - * to use opportunistically. - */ - unsigned int sg; - - /** - * The actual gtt page size usage. Since we can have - * multiple vma associated with this object we need to - * prevent any trampling of state, hence a copy of this - * struct also lives in each vma, therefore the gtt - * value here should only be read/write through the vma. - */ - unsigned int gtt; - } page_sizes; - - I915_SELFTEST_DECLARE(unsigned int page_mask); - - struct i915_gem_object_page_iter { - struct scatterlist *sg_pos; - unsigned int sg_idx; /* in pages, but 32bit eek! */ - - struct radix_tree_root radix; - struct mutex lock; /* protects this cache */ - } get_page; - - /** - * Element within i915->mm.unbound_list or i915->mm.bound_list, - * locked by i915->mm.obj_lock. - */ - struct list_head link; - - /** - * Advice: are the backing pages purgeable? - */ - unsigned int madv:2; - - /** - * This is set if the object has been written to since the - * pages were last acquired. - */ - bool dirty:1; - - /** - * This is set if the object has been pinned due to unknown - * swizzling. - */ - bool quirked:1; - } mm; - - /** Breadcrumb of last rendering to the buffer. - * There can only be one writer, but we allow for multiple readers. - * If there is a writer that necessarily implies that all other - * read requests are complete - but we may only be lazily clearing - * the read requests. A read request is naturally the most recent - * request on a ring, so we may have two different write and read - * requests on one ring where the write request is older than the - * read request. This allows for the CPU to read from an active - * buffer by only waiting for the write to complete. - */ - struct reservation_object *resv; - - /** References from framebuffers, locks out tiling changes. */ - unsigned int framebuffer_references; - - /** Record of address bit 17 of each page at last unbind. */ - unsigned long *bit_17; - - union { - struct i915_gem_userptr { - uintptr_t ptr; - - struct i915_mm_struct *mm; - struct i915_mmu_object *mmu_object; - struct work_struct *work; - } userptr; - - unsigned long scratch; - - void *gvt_info; - }; - - /** for phys allocated objects */ - struct drm_dma_handle *phys_handle; - - struct reservation_object __builtin_resv; -}; - -static inline struct drm_i915_gem_object * -to_intel_bo(struct drm_gem_object *gem) -{ - /* Assert that to_intel_bo(NULL) == NULL */ - BUILD_BUG_ON(offsetof(struct drm_i915_gem_object, base)); - - return container_of(gem, struct drm_i915_gem_object, base); -} +#include "gem/i915_gem_object_types.h" struct drm_i915_gem_object *i915_gem_object_alloc(void); void i915_gem_object_free(struct drm_i915_gem_object *obj); From patchwork Wed May 22 08:20:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955285 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 92E7D112C for ; Wed, 22 May 2019 08:21:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 85CC7289B7 for ; Wed, 22 May 2019 08:21:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7A7A528B50; Wed, 22 May 2019 08:21:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E9A0828B59 for ; Wed, 22 May 2019 08:21:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AADBC8982E; Wed, 22 May 2019 08:21:19 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id B17E9897D0 for ; Wed, 22 May 2019 08:21:14 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636782-1500050 for ; Wed, 22 May 2019 09:21:05 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:53 +0100 Message-Id: <20190522082104.5117-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 02/13] drm/i915: Pull GEM ioctls interface to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Declutter i915_drv/gem.h by moving the ioctl API into its own header. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_ioctls.h | 52 ++++++++++++++++++++++ drivers/gpu/drm/i915/i915_drv.c | 1 + drivers/gpu/drm/i915/i915_drv.h | 38 ---------------- drivers/gpu/drm/i915/i915_gem.c | 1 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 1 + drivers/gpu/drm/i915/i915_gem_tiling.c | 3 ++ drivers/gpu/drm/i915/i915_gem_userptr.c | 12 +++-- 7 files changed, 66 insertions(+), 42 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h new file mode 100644 index 000000000000..ddc7f2a52b3e --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h @@ -0,0 +1,52 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef I915_GEM_IOCTLS_H +#define I915_GEM_IOCTLS_H + +struct drm_device; +struct drm_file; + +int i915_gem_busy_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_create_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_madvise_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_mmap_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_pread_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_throttle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_userptr_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); + +#endif diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 83d2eb9e74cb..30afd8aef737 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -47,6 +47,7 @@ #include #include +#include "gem/i915_gem_ioctls.h" #include "gt/intel_gt_pm.h" #include "gt/intel_reset.h" #include "gt/intel_workarounds.h" diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 64756f4b1e46..753365e08e2f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2741,46 +2741,8 @@ static inline bool intel_vgpu_active(struct drm_i915_private *dev_priv) } /* i915_gem.c */ -int i915_gem_create_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_pread_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_mmap_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_busy_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data, - struct drm_file *file); -int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, - struct drm_file *file); -int i915_gem_throttle_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_madvise_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); int i915_gem_init_userptr(struct drm_i915_private *dev_priv); void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv); -int i915_gem_userptr_ioctl(struct drm_device *dev, void *data, - struct drm_file *file); -int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); -int i915_gem_wait_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv); void i915_gem_sanitize(struct drm_i915_private *i915); int i915_gem_init_early(struct drm_i915_private *dev_priv); void i915_gem_cleanup_early(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d3b7dac527dc..d75b05f14994 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -39,6 +39,7 @@ #include #include +#include "gem/i915_gem_ioctls.h" #include "gt/intel_engine_pm.h" #include "gt/intel_gt_pm.h" #include "gt/intel_mocs.h" diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 8b85c91c3ea4..908fddcc57c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -34,6 +34,7 @@ #include #include +#include "gem/i915_gem_ioctls.h" #include "gt/intel_gt_pm.h" #include "i915_drv.h" diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index a9b5329dae3b..7d5cc9ccf6dd 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -28,6 +28,9 @@ #include #include #include + +#include "gem/i915_gem_ioctls.h" + #include "i915_drv.h" /** diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 8079ea3af103..2c1b6bb7a040 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -22,16 +22,20 @@ * */ -#include -#include "i915_drv.h" -#include "i915_trace.h" -#include "intel_drv.h" #include #include #include #include #include +#include + +#include "gem/i915_gem_ioctls.h" + +#include "i915_drv.h" +#include "i915_trace.h" +#include "intel_drv.h" + struct i915_mm_struct { struct mm_struct *mm; struct drm_i915_private *i915; From patchwork Wed May 22 08:20:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955281 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1BC7B912 for ; Wed, 22 May 2019 08:21:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0F27928A28 for ; Wed, 22 May 2019 08:21:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0392628A32; Wed, 22 May 2019 08:21:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2D4811FFC8 for ; Wed, 22 May 2019 08:21:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5FF9389804; Wed, 22 May 2019 08:21:18 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id ECD09897E8 for ; Wed, 22 May 2019 08:21:13 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636783-1500050 for ; Wed, 22 May 2019 09:21:05 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:54 +0100 Message-Id: <20190522082104.5117-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 03/13] drm/i915: Move object->pages API to i915_gem_object.[ch] X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Currently the code for manipulating the pages on an object is still residing in i915_gem.c, move it to i915_gem_object.c Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 4 +- .../gpu/drm/i915/{ => gem}/i915_gem_object.c | 0 .../gpu/drm/i915/{ => gem}/i915_gem_object.h | 132 +++++++++++++++- drivers/gpu/drm/i915/i915_drv.h | 141 +----------------- drivers/gpu/drm/i915/i915_globals.c | 2 +- drivers/gpu/drm/i915/i915_vma.h | 2 +- drivers/gpu/drm/i915/intel_frontbuffer.h | 2 +- 7 files changed, 143 insertions(+), 140 deletions(-) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_object.c (100%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_object.h (58%) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 96344c9a0726..1c8bd0a5212d 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -86,7 +86,10 @@ i915-y += $(gt-y) # GEM (Graphics Execution Management) code obj-y += gem/ +gem-y += \ + gem/i915_gem_object.o i915-y += \ + $(gem-y) \ i915_active.o \ i915_cmd_parser.o \ i915_gem_batch_pool.o \ @@ -99,7 +102,6 @@ i915-y += \ i915_gem_gtt.o \ i915_gem_internal.o \ i915_gem.o \ - i915_gem_object.o \ i915_gem_pm.o \ i915_gem_render_state.o \ i915_gem_shrinker.o \ diff --git a/drivers/gpu/drm/i915/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c similarity index 100% rename from drivers/gpu/drm/i915/i915_gem_object.c rename to drivers/gpu/drm/i915/gem/i915_gem_object.c diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h similarity index 58% rename from drivers/gpu/drm/i915/i915_gem_object.h rename to drivers/gpu/drm/i915/gem/i915_gem_object.h index 3666b0c5f6ca..9dfd8b0db725 100644 --- a/drivers/gpu/drm/i915/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -13,7 +13,7 @@ #include -#include "gem/i915_gem_object_types.h" +#include "i915_gem_object_types.h" struct drm_i915_gem_object *i915_gem_object_alloc(void); void i915_gem_object_free(struct drm_i915_gem_object *obj); @@ -192,6 +192,136 @@ i915_gem_object_get_tile_row_size(const struct drm_i915_gem_object *obj) int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj, unsigned int tiling, unsigned int stride); +struct scatterlist * +i915_gem_object_get_sg(struct drm_i915_gem_object *obj, + unsigned int n, unsigned int *offset); + +struct page * +i915_gem_object_get_page(struct drm_i915_gem_object *obj, + unsigned int n); + +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, + unsigned int n); + +dma_addr_t +i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj, + unsigned long n, + unsigned int *len); + +dma_addr_t +i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, + unsigned long n); + +void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, + struct sg_table *pages, + unsigned int sg_page_sizes); +int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); + +static inline int __must_check +i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) +{ + might_lock(&obj->mm.lock); + + if (atomic_inc_not_zero(&obj->mm.pages_pin_count)) + return 0; + + return __i915_gem_object_get_pages(obj); +} + +static inline bool +i915_gem_object_has_pages(struct drm_i915_gem_object *obj) +{ + return !IS_ERR_OR_NULL(READ_ONCE(obj->mm.pages)); +} + +static inline void +__i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) +{ + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + + atomic_inc(&obj->mm.pages_pin_count); +} + +static inline bool +i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj) +{ + return atomic_read(&obj->mm.pages_pin_count); +} + +static inline void +__i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) +{ + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); + + atomic_dec(&obj->mm.pages_pin_count); +} + +static inline void +i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) +{ + __i915_gem_object_unpin_pages(obj); +} + +enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */ + I915_MM_NORMAL = 0, + I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */ +}; + +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, + enum i915_mm_subclass subclass); +void __i915_gem_object_truncate(struct drm_i915_gem_object *obj); + +enum i915_map_type { + I915_MAP_WB = 0, + I915_MAP_WC, +#define I915_MAP_OVERRIDE BIT(31) + I915_MAP_FORCE_WB = I915_MAP_WB | I915_MAP_OVERRIDE, + I915_MAP_FORCE_WC = I915_MAP_WC | I915_MAP_OVERRIDE, +}; + +/** + * i915_gem_object_pin_map - return a contiguous mapping of the entire object + * @obj: the object to map into kernel address space + * @type: the type of mapping, used to select pgprot_t + * + * Calls i915_gem_object_pin_pages() to prevent reaping of the object's + * pages and then returns a contiguous mapping of the backing storage into + * the kernel address space. Based on the @type of mapping, the PTE will be + * set to either WriteBack or WriteCombine (via pgprot_t). + * + * The caller is responsible for calling i915_gem_object_unpin_map() when the + * mapping is no longer required. + * + * Returns the pointer through which to access the mapped object, or an + * ERR_PTR() on error. + */ +void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, + enum i915_map_type type); + +void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, + unsigned long offset, + unsigned long size); +static inline void i915_gem_object_flush_map(struct drm_i915_gem_object *obj) +{ + __i915_gem_object_flush_map(obj, 0, obj->base.size); +} + +/** + * i915_gem_object_unpin_map - releases an earlier mapping + * @obj: the object to unmap + * + * After pinning the object and mapping its pages, once you are finished + * with your access, call i915_gem_object_unpin_map() to release the pin + * upon the mapping. Once the pin count reaches zero, that mapping may be + * removed. + */ +static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj) +{ + i915_gem_object_unpin_pages(obj); +} + static inline struct intel_engine_cs * i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 753365e08e2f..b56c5583f933 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2814,141 +2814,6 @@ static inline int __sg_page_count(const struct scatterlist *sg) return sg->length >> PAGE_SHIFT; } -struct scatterlist * -i915_gem_object_get_sg(struct drm_i915_gem_object *obj, - unsigned int n, unsigned int *offset); - -struct page * -i915_gem_object_get_page(struct drm_i915_gem_object *obj, - unsigned int n); - -struct page * -i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, - unsigned int n); - -dma_addr_t -i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj, - unsigned long n, - unsigned int *len); -dma_addr_t -i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, - unsigned long n); - -void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, - struct sg_table *pages, - unsigned int sg_page_sizes); -int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); - -static inline int __must_check -i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) -{ - might_lock(&obj->mm.lock); - - if (atomic_inc_not_zero(&obj->mm.pages_pin_count)) - return 0; - - return __i915_gem_object_get_pages(obj); -} - -static inline bool -i915_gem_object_has_pages(struct drm_i915_gem_object *obj) -{ - return !IS_ERR_OR_NULL(READ_ONCE(obj->mm.pages)); -} - -static inline void -__i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) -{ - GEM_BUG_ON(!i915_gem_object_has_pages(obj)); - - atomic_inc(&obj->mm.pages_pin_count); -} - -static inline bool -i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj) -{ - return atomic_read(&obj->mm.pages_pin_count); -} - -static inline void -__i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) -{ - GEM_BUG_ON(!i915_gem_object_has_pages(obj)); - GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); - - atomic_dec(&obj->mm.pages_pin_count); -} - -static inline void -i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) -{ - __i915_gem_object_unpin_pages(obj); -} - -enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */ - I915_MM_NORMAL = 0, - I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */ -}; - -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, - enum i915_mm_subclass subclass); -void __i915_gem_object_truncate(struct drm_i915_gem_object *obj); - -enum i915_map_type { - I915_MAP_WB = 0, - I915_MAP_WC, -#define I915_MAP_OVERRIDE BIT(31) - I915_MAP_FORCE_WB = I915_MAP_WB | I915_MAP_OVERRIDE, - I915_MAP_FORCE_WC = I915_MAP_WC | I915_MAP_OVERRIDE, -}; - -static inline enum i915_map_type -i915_coherent_map_type(struct drm_i915_private *i915) -{ - return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC; -} - -/** - * i915_gem_object_pin_map - return a contiguous mapping of the entire object - * @obj: the object to map into kernel address space - * @type: the type of mapping, used to select pgprot_t - * - * Calls i915_gem_object_pin_pages() to prevent reaping of the object's - * pages and then returns a contiguous mapping of the backing storage into - * the kernel address space. Based on the @type of mapping, the PTE will be - * set to either WriteBack or WriteCombine (via pgprot_t). - * - * The caller is responsible for calling i915_gem_object_unpin_map() when the - * mapping is no longer required. - * - * Returns the pointer through which to access the mapped object, or an - * ERR_PTR() on error. - */ -void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, - enum i915_map_type type); - -void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, - unsigned long offset, - unsigned long size); -static inline void i915_gem_object_flush_map(struct drm_i915_gem_object *obj) -{ - __i915_gem_object_flush_map(obj, 0, obj->base.size); -} - -/** - * i915_gem_object_unpin_map - releases an earlier mapping - * @obj: the object to unmap - * - * After pinning the object and mapping its pages, once you are finished - * with your access, call i915_gem_object_unpin_map() to release the pin - * upon the mapping. Once the pin count reaches zero, that mapping may be - * removed. - */ -static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj) -{ - i915_gem_object_unpin_pages(obj); -} - int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, unsigned int *needs_clflush); int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, @@ -3347,6 +3212,12 @@ static inline u32 i915_scratch_offset(const struct drm_i915_private *i915) return i915_ggtt_offset(i915->gt.scratch); } +static inline enum i915_map_type +i915_coherent_map_type(struct drm_i915_private *i915) +{ + return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC; +} + static inline void add_taint_for_CI(unsigned int taint) { /* diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c index 81e5c2ce336b..db52a58eadcc 100644 --- a/drivers/gpu/drm/i915/i915_globals.c +++ b/drivers/gpu/drm/i915/i915_globals.c @@ -9,7 +9,7 @@ #include "i915_active.h" #include "i915_gem_context.h" -#include "i915_gem_object.h" +#include "gem/i915_gem_object.h" #include "i915_globals.h" #include "i915_request.h" #include "i915_scheduler.h" diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 8543d2953cd1..754a762d90b4 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -32,7 +32,7 @@ #include "i915_gem_gtt.h" #include "i915_gem_fence_reg.h" -#include "i915_gem_object.h" +#include "gem/i915_gem_object.h" #include "i915_active.h" #include "i915_request.h" diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.h b/drivers/gpu/drm/i915/intel_frontbuffer.h index d5894666f658..5727320c8084 100644 --- a/drivers/gpu/drm/i915/intel_frontbuffer.h +++ b/drivers/gpu/drm/i915/intel_frontbuffer.h @@ -24,7 +24,7 @@ #ifndef __INTEL_FRONTBUFFER_H__ #define __INTEL_FRONTBUFFER_H__ -#include "i915_gem_object.h" +#include "gem/i915_gem_object.h" struct drm_i915_private; struct drm_i915_gem_object; From patchwork Wed May 22 08:20:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955293 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDFA5912 for ; Wed, 22 May 2019 08:21:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DB284287C6 for ; Wed, 22 May 2019 08:21:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF1D028A30; Wed, 22 May 2019 08:21:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 9BF3928A7A for ; Wed, 22 May 2019 08:21:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C8B1189817; Wed, 22 May 2019 08:21:19 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D6937897D0 for ; Wed, 22 May 2019 08:21:13 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636784-1500050 for ; Wed, 22 May 2019 09:21:05 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:55 +0100 Message-Id: <20190522082104.5117-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 04/13] drm/i915: Move shmem object setup to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Split the plain old shmem object into its own file to start decluttering i915_gem.c v2: Lose the confusing, hysterical raisins, suffix of _gtt. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 298 +++++++ drivers/gpu/drm/i915/gem/i915_gem_object.h | 41 +- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 511 +++++++++++ drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +- drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 2 +- drivers/gpu/drm/i915/gvt/cmd_parser.c | 21 +- drivers/gpu/drm/i915/i915_drv.h | 10 - drivers/gpu/drm/i915/i915_gem.c | 826 +----------------- drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/intel_fbdev.c | 2 +- drivers/gpu/drm/i915/intel_guc.c | 2 +- drivers/gpu/drm/i915/intel_uc_fw.c | 3 +- drivers/gpu/drm/i915/selftests/huge_pages.c | 6 +- .../gpu/drm/i915/selftests/i915_gem_dmabuf.c | 8 +- .../gpu/drm/i915/selftests/i915_gem_object.c | 4 +- 16 files changed, 882 insertions(+), 861 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_shmem.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 1c8bd0a5212d..625f9749355b 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -87,7 +87,8 @@ i915-y += $(gt-y) # GEM (Graphics Execution Management) code obj-y += gem/ gem-y += \ - gem/i915_gem_object.o + gem/i915_gem_object.o \ + gem/i915_gem_shmem.o i915-y += \ $(gem-y) \ i915_active.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index ac6a5ab84586..3000b26b4bbf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -25,6 +25,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" #include "i915_globals.h" +#include "intel_frontbuffer.h" static struct i915_global_object { struct i915_global base; @@ -41,6 +42,64 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj) return kmem_cache_free(global.slab_objects, obj); } +/* some bookkeeping */ +static void i915_gem_info_add_obj(struct drm_i915_private *i915, + u64 size) +{ + spin_lock(&i915->mm.object_stat_lock); + i915->mm.object_count++; + i915->mm.object_memory += size; + spin_unlock(&i915->mm.object_stat_lock); +} + +static void i915_gem_info_remove_obj(struct drm_i915_private *i915, + u64 size) +{ + spin_lock(&i915->mm.object_stat_lock); + i915->mm.object_count--; + i915->mm.object_memory -= size; + spin_unlock(&i915->mm.object_stat_lock); +} + +static void +frontbuffer_retire(struct i915_active_request *active, + struct i915_request *request) +{ + struct drm_i915_gem_object *obj = + container_of(active, typeof(*obj), frontbuffer_write); + + intel_fb_obj_flush(obj, ORIGIN_CS); +} + +void i915_gem_object_init(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_object_ops *ops) +{ + mutex_init(&obj->mm.lock); + + spin_lock_init(&obj->vma.lock); + INIT_LIST_HEAD(&obj->vma.list); + + INIT_LIST_HEAD(&obj->lut_list); + INIT_LIST_HEAD(&obj->batch_pool_link); + + init_rcu_head(&obj->rcu); + + obj->ops = ops; + + reservation_object_init(&obj->__builtin_resv); + obj->resv = &obj->__builtin_resv; + + obj->frontbuffer_ggtt_origin = ORIGIN_GTT; + i915_active_request_init(&obj->frontbuffer_write, + NULL, frontbuffer_retire); + + obj->mm.madv = I915_MADV_WILLNEED; + INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN); + mutex_init(&obj->mm.get_page.lock); + + i915_gem_info_add_obj(to_i915(obj->base.dev), obj->base.size); +} + /** * Mark up the object's coherency levels for a given cache_level * @obj: #drm_i915_gem_object @@ -63,6 +122,245 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE); } +void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file) +{ + struct drm_i915_private *i915 = to_i915(gem->dev); + struct drm_i915_gem_object *obj = to_intel_bo(gem); + struct drm_i915_file_private *fpriv = file->driver_priv; + struct i915_lut_handle *lut, *ln; + + mutex_lock(&i915->drm.struct_mutex); + + list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) { + struct i915_gem_context *ctx = lut->ctx; + struct i915_vma *vma; + + GEM_BUG_ON(ctx->file_priv == ERR_PTR(-EBADF)); + if (ctx->file_priv != fpriv) + continue; + + vma = radix_tree_delete(&ctx->handles_vma, lut->handle); + GEM_BUG_ON(vma->obj != obj); + + /* We allow the process to have multiple handles to the same + * vma, in the same fd namespace, by virtue of flink/open. + */ + GEM_BUG_ON(!vma->open_count); + if (!--vma->open_count && !i915_vma_is_ggtt(vma)) + i915_vma_close(vma); + + list_del(&lut->obj_link); + list_del(&lut->ctx_link); + + i915_lut_handle_free(lut); + __i915_gem_object_release_unless_active(obj); + } + + mutex_unlock(&i915->drm.struct_mutex); +} + +static bool discard_backing_storage(struct drm_i915_gem_object *obj) +{ + /* If we are the last user of the backing storage (be it shmemfs + * pages or stolen etc), we know that the pages are going to be + * immediately released. In this case, we can then skip copying + * back the contents from the GPU. + */ + + if (obj->mm.madv != I915_MADV_WILLNEED) + return false; + + if (!obj->base.filp) + return true; + + /* At first glance, this looks racy, but then again so would be + * userspace racing mmap against close. However, the first external + * reference to the filp can only be obtained through the + * i915_gem_mmap_ioctl() which safeguards us against the user + * acquiring such a reference whilst we are in the middle of + * freeing the object. + */ + return file_count(obj->base.filp) == 1; +} + +static void __i915_gem_free_objects(struct drm_i915_private *i915, + struct llist_node *freed) +{ + struct drm_i915_gem_object *obj, *on; + intel_wakeref_t wakeref; + + wakeref = intel_runtime_pm_get(i915); + llist_for_each_entry_safe(obj, on, freed, freed) { + struct i915_vma *vma, *vn; + + trace_i915_gem_object_destroy(obj); + + mutex_lock(&i915->drm.struct_mutex); + + GEM_BUG_ON(i915_gem_object_is_active(obj)); + list_for_each_entry_safe(vma, vn, &obj->vma.list, obj_link) { + GEM_BUG_ON(i915_vma_is_active(vma)); + vma->flags &= ~I915_VMA_PIN_MASK; + i915_vma_destroy(vma); + } + GEM_BUG_ON(!list_empty(&obj->vma.list)); + GEM_BUG_ON(!RB_EMPTY_ROOT(&obj->vma.tree)); + + /* This serializes freeing with the shrinker. Since the free + * is delayed, first by RCU then by the workqueue, we want the + * shrinker to be able to free pages of unreferenced objects, + * or else we may oom whilst there are plenty of deferred + * freed objects. + */ + if (i915_gem_object_has_pages(obj)) { + spin_lock(&i915->mm.obj_lock); + list_del_init(&obj->mm.link); + spin_unlock(&i915->mm.obj_lock); + } + + mutex_unlock(&i915->drm.struct_mutex); + + GEM_BUG_ON(obj->bind_count); + GEM_BUG_ON(obj->userfault_count); + GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits)); + GEM_BUG_ON(!list_empty(&obj->lut_list)); + + if (obj->ops->release) + obj->ops->release(obj); + + if (WARN_ON(i915_gem_object_has_pinned_pages(obj))) + atomic_set(&obj->mm.pages_pin_count, 0); + __i915_gem_object_put_pages(obj, I915_MM_NORMAL); + GEM_BUG_ON(i915_gem_object_has_pages(obj)); + + if (obj->base.import_attach) + drm_prime_gem_destroy(&obj->base, NULL); + + reservation_object_fini(&obj->__builtin_resv); + drm_gem_object_release(&obj->base); + i915_gem_info_remove_obj(i915, obj->base.size); + + bitmap_free(obj->bit_17); + i915_gem_object_free(obj); + + GEM_BUG_ON(!atomic_read(&i915->mm.free_count)); + atomic_dec(&i915->mm.free_count); + + if (on) + cond_resched(); + } + intel_runtime_pm_put(i915, wakeref); +} + +void i915_gem_flush_free_objects(struct drm_i915_private *i915) +{ + struct llist_node *freed; + + /* Free the oldest, most stale object to keep the free_list short */ + freed = NULL; + if (!llist_empty(&i915->mm.free_list)) { /* quick test for hotpath */ + /* Only one consumer of llist_del_first() allowed */ + spin_lock(&i915->mm.free_lock); + freed = llist_del_first(&i915->mm.free_list); + spin_unlock(&i915->mm.free_lock); + } + if (unlikely(freed)) { + freed->next = NULL; + __i915_gem_free_objects(i915, freed); + } +} + +static void __i915_gem_free_work(struct work_struct *work) +{ + struct drm_i915_private *i915 = + container_of(work, struct drm_i915_private, mm.free_work); + struct llist_node *freed; + + /* + * All file-owned VMA should have been released by this point through + * i915_gem_close_object(), or earlier by i915_gem_context_close(). + * However, the object may also be bound into the global GTT (e.g. + * older GPUs without per-process support, or for direct access through + * the GTT either for the user or for scanout). Those VMA still need to + * unbound now. + */ + + spin_lock(&i915->mm.free_lock); + while ((freed = llist_del_all(&i915->mm.free_list))) { + spin_unlock(&i915->mm.free_lock); + + __i915_gem_free_objects(i915, freed); + if (need_resched()) + return; + + spin_lock(&i915->mm.free_lock); + } + spin_unlock(&i915->mm.free_lock); +} + +static void __i915_gem_free_object_rcu(struct rcu_head *head) +{ + struct drm_i915_gem_object *obj = + container_of(head, typeof(*obj), rcu); + struct drm_i915_private *i915 = to_i915(obj->base.dev); + + /* + * We reuse obj->rcu for the freed list, so we had better not treat + * it like a rcu_head from this point forwards. And we expect all + * objects to be freed via this path. + */ + destroy_rcu_head(&obj->rcu); + + /* + * Since we require blocking on struct_mutex to unbind the freed + * object from the GPU before releasing resources back to the + * system, we can not do that directly from the RCU callback (which may + * be a softirq context), but must instead then defer that work onto a + * kthread. We use the RCU callback rather than move the freed object + * directly onto the work queue so that we can mix between using the + * worker and performing frees directly from subsequent allocations for + * crude but effective memory throttling. + */ + if (llist_add(&obj->freed, &i915->mm.free_list)) + queue_work(i915->wq, &i915->mm.free_work); +} + +void i915_gem_free_object(struct drm_gem_object *gem_obj) +{ + struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); + + if (obj->mm.quirked) + __i915_gem_object_unpin_pages(obj); + + if (discard_backing_storage(obj)) + obj->mm.madv = I915_MADV_DONTNEED; + + /* + * Before we free the object, make sure any pure RCU-only + * read-side critical sections are complete, e.g. + * i915_gem_busy_ioctl(). For the corresponding synchronized + * lookup see i915_gem_object_lookup_rcu(). + */ + atomic_inc(&to_i915(obj->base.dev)->mm.free_count); + call_rcu(&obj->rcu, __i915_gem_free_object_rcu); +} + +void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj) +{ + lockdep_assert_held(&obj->base.dev->struct_mutex); + + if (!i915_gem_object_has_active_reference(obj) && + i915_gem_object_is_active(obj)) + i915_gem_object_set_active_reference(obj); + else + i915_gem_object_put(obj); +} + +void i915_gem_init__objects(struct drm_i915_private *i915) +{ + INIT_WORK(&i915->mm.free_work, __i915_gem_free_work); +} + static void i915_global_objects_shrink(void) { kmem_cache_shrink(global.slab_objects); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 9dfd8b0db725..05ef6b0076ae 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -15,9 +15,29 @@ #include "i915_gem_object_types.h" +void i915_gem_init__objects(struct drm_i915_private *i915); + struct drm_i915_gem_object *i915_gem_object_alloc(void); void i915_gem_object_free(struct drm_i915_gem_object *obj); +void i915_gem_object_init(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_object_ops *ops); +struct drm_i915_gem_object * +i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size); +struct drm_i915_gem_object * +i915_gem_object_create_shmem_from_data(struct drm_i915_private *i915, + const void *data, size_t size); + +extern const struct drm_i915_gem_object_ops i915_gem_shmem_ops; +void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, + struct sg_table *pages, + bool needs_clflush); + +void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file); +void i915_gem_free_object(struct drm_gem_object *obj); + +void i915_gem_flush_free_objects(struct drm_i915_private *i915); + /** * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle * @filp: DRM file private date @@ -343,8 +363,23 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, unsigned int cache_level); void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); -void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, - struct sg_table *pages, - bool needs_clflush); +static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) +{ + if (obj->cache_dirty) + return false; + + if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE)) + return true; + + return obj->pin_global; /* currently in use by HW, keep flushed */ +} + +static inline void __start_cpu_write(struct drm_i915_gem_object *obj) +{ + obj->read_domains = I915_GEM_DOMAIN_CPU; + obj->write_domain = I915_GEM_DOMAIN_CPU; + if (cpu_write_needs_clflush(obj)) + obj->cache_dirty = true; +} #endif diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c new file mode 100644 index 000000000000..e2721bd9ab44 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -0,0 +1,511 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include +#include + +#include "i915_drv.h" +#include "i915_gem_object.h" + +/* + * Move pages to appropriate lru and release the pagevec, decrementing the + * ref count of those pages. + */ +static void check_release_pagevec(struct pagevec *pvec) +{ + check_move_unevictable_pages(pvec); + __pagevec_release(pvec); + cond_resched(); +} + +static int shmem_get_pages(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + const unsigned long page_count = obj->base.size / PAGE_SIZE; + unsigned long i; + struct address_space *mapping; + struct sg_table *st; + struct scatterlist *sg; + struct sgt_iter sgt_iter; + struct page *page; + unsigned long last_pfn = 0; /* suppress gcc warning */ + unsigned int max_segment = i915_sg_segment_size(); + unsigned int sg_page_sizes; + struct pagevec pvec; + gfp_t noreclaim; + int ret; + + /* + * Assert that the object is not currently in any GPU domain. As it + * wasn't in the GTT, there shouldn't be any way it could have been in + * a GPU cache + */ + GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); + GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); + + /* + * If there's no chance of allocating enough pages for the whole + * object, bail early. + */ + if (page_count > totalram_pages()) + return -ENOMEM; + + st = kmalloc(sizeof(*st), GFP_KERNEL); + if (!st) + return -ENOMEM; + +rebuild_st: + if (sg_alloc_table(st, page_count, GFP_KERNEL)) { + kfree(st); + return -ENOMEM; + } + + /* + * Get the list of pages out of our struct file. They'll be pinned + * at this point until we release them. + * + * Fail silently without starting the shrinker + */ + mapping = obj->base.filp->f_mapping; + mapping_set_unevictable(mapping); + noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM); + noreclaim |= __GFP_NORETRY | __GFP_NOWARN; + + sg = st->sgl; + st->nents = 0; + sg_page_sizes = 0; + for (i = 0; i < page_count; i++) { + const unsigned int shrink[] = { + (I915_SHRINK_BOUND | + I915_SHRINK_UNBOUND | + I915_SHRINK_PURGEABLE), + 0, + }, *s = shrink; + gfp_t gfp = noreclaim; + + do { + cond_resched(); + page = shmem_read_mapping_page_gfp(mapping, i, gfp); + if (!IS_ERR(page)) + break; + + if (!*s) { + ret = PTR_ERR(page); + goto err_sg; + } + + i915_gem_shrink(i915, 2 * page_count, NULL, *s++); + + /* + * We've tried hard to allocate the memory by reaping + * our own buffer, now let the real VM do its job and + * go down in flames if truly OOM. + * + * However, since graphics tend to be disposable, + * defer the oom here by reporting the ENOMEM back + * to userspace. + */ + if (!*s) { + /* reclaim and warn, but no oom */ + gfp = mapping_gfp_mask(mapping); + + /* + * Our bo are always dirty and so we require + * kswapd to reclaim our pages (direct reclaim + * does not effectively begin pageout of our + * buffers on its own). However, direct reclaim + * only waits for kswapd when under allocation + * congestion. So as a result __GFP_RECLAIM is + * unreliable and fails to actually reclaim our + * dirty pages -- unless you try over and over + * again with !__GFP_NORETRY. However, we still + * want to fail this allocation rather than + * trigger the out-of-memory killer and for + * this we want __GFP_RETRY_MAYFAIL. + */ + gfp |= __GFP_RETRY_MAYFAIL; + } + } while (1); + + if (!i || + sg->length >= max_segment || + page_to_pfn(page) != last_pfn + 1) { + if (i) { + sg_page_sizes |= sg->length; + sg = sg_next(sg); + } + st->nents++; + sg_set_page(sg, page, PAGE_SIZE, 0); + } else { + sg->length += PAGE_SIZE; + } + last_pfn = page_to_pfn(page); + + /* Check that the i965g/gm workaround works. */ + WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL)); + } + if (sg) { /* loop terminated early; short sg table */ + sg_page_sizes |= sg->length; + sg_mark_end(sg); + } + + /* Trim unused sg entries to avoid wasting memory. */ + i915_sg_trim(st); + + ret = i915_gem_gtt_prepare_pages(obj, st); + if (ret) { + /* + * DMA remapping failed? One possible cause is that + * it could not reserve enough large entries, asking + * for PAGE_SIZE chunks instead may be helpful. + */ + if (max_segment > PAGE_SIZE) { + for_each_sgt_page(page, sgt_iter, st) + put_page(page); + sg_free_table(st); + + max_segment = PAGE_SIZE; + goto rebuild_st; + } else { + dev_warn(&i915->drm.pdev->dev, + "Failed to DMA remap %lu pages\n", + page_count); + goto err_pages; + } + } + + if (i915_gem_object_needs_bit17_swizzle(obj)) + i915_gem_object_do_bit_17_swizzle(obj, st); + + __i915_gem_object_set_pages(obj, st, sg_page_sizes); + + return 0; + +err_sg: + sg_mark_end(sg); +err_pages: + mapping_clear_unevictable(mapping); + pagevec_init(&pvec); + for_each_sgt_page(page, sgt_iter, st) { + if (!pagevec_add(&pvec, page)) + check_release_pagevec(&pvec); + } + if (pagevec_count(&pvec)) + check_release_pagevec(&pvec); + sg_free_table(st); + kfree(st); + + /* + * shmemfs first checks if there is enough memory to allocate the page + * and reports ENOSPC should there be insufficient, along with the usual + * ENOMEM for a genuine allocation failure. + * + * We use ENOSPC in our driver to mean that we have run out of aperture + * space and so want to translate the error from shmemfs back to our + * usual understanding of ENOMEM. + */ + if (ret == -ENOSPC) + ret = -ENOMEM; + + return ret; +} + +void +__i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, + struct sg_table *pages, + bool needs_clflush) +{ + GEM_BUG_ON(obj->mm.madv == __I915_MADV_PURGED); + + if (obj->mm.madv == I915_MADV_DONTNEED) + obj->mm.dirty = false; + + if (needs_clflush && + (obj->read_domains & I915_GEM_DOMAIN_CPU) == 0 && + !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ)) + drm_clflush_sg(pages); + + __start_cpu_write(obj); +} + +static void +shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages) +{ + struct sgt_iter sgt_iter; + struct pagevec pvec; + struct page *page; + + __i915_gem_object_release_shmem(obj, pages, true); + + i915_gem_gtt_finish_pages(obj, pages); + + if (i915_gem_object_needs_bit17_swizzle(obj)) + i915_gem_object_save_bit_17_swizzle(obj, pages); + + mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping); + + pagevec_init(&pvec); + for_each_sgt_page(page, sgt_iter, pages) { + if (obj->mm.dirty) + set_page_dirty(page); + + if (obj->mm.madv == I915_MADV_WILLNEED) + mark_page_accessed(page); + + if (!pagevec_add(&pvec, page)) + check_release_pagevec(&pvec); + } + if (pagevec_count(&pvec)) + check_release_pagevec(&pvec); + obj->mm.dirty = false; + + sg_free_table(pages); + kfree(pages); +} + +static int +shmem_pwrite(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pwrite *arg) +{ + struct address_space *mapping = obj->base.filp->f_mapping; + char __user *user_data = u64_to_user_ptr(arg->data_ptr); + u64 remain, offset; + unsigned int pg; + + /* Caller already validated user args */ + GEM_BUG_ON(!access_ok(user_data, arg->size)); + + /* + * Before we instantiate/pin the backing store for our use, we + * can prepopulate the shmemfs filp efficiently using a write into + * the pagecache. We avoid the penalty of instantiating all the + * pages, important if the user is just writing to a few and never + * uses the object on the GPU, and using a direct write into shmemfs + * allows it to avoid the cost of retrieving a page (either swapin + * or clearing-before-use) before it is overwritten. + */ + if (i915_gem_object_has_pages(obj)) + return -ENODEV; + + if (obj->mm.madv != I915_MADV_WILLNEED) + return -EFAULT; + + /* + * Before the pages are instantiated the object is treated as being + * in the CPU domain. The pages will be clflushed as required before + * use, and we can freely write into the pages directly. If userspace + * races pwrite with any other operation; corruption will ensue - + * that is userspace's prerogative! + */ + + remain = arg->size; + offset = arg->offset; + pg = offset_in_page(offset); + + do { + unsigned int len, unwritten; + struct page *page; + void *data, *vaddr; + int err; + char c; + + len = PAGE_SIZE - pg; + if (len > remain) + len = remain; + + /* Prefault the user page to reduce potential recursion */ + err = __get_user(c, user_data); + if (err) + return err; + + err = __get_user(c, user_data + len - 1); + if (err) + return err; + + err = pagecache_write_begin(obj->base.filp, mapping, + offset, len, 0, + &page, &data); + if (err < 0) + return err; + + vaddr = kmap_atomic(page); + unwritten = __copy_from_user_inatomic(vaddr + pg, + user_data, + len); + kunmap_atomic(vaddr); + + err = pagecache_write_end(obj->base.filp, mapping, + offset, len, len - unwritten, + page, data); + if (err < 0) + return err; + + /* We don't handle -EFAULT, leave it to the caller to check */ + if (unwritten) + return -ENODEV; + + remain -= len; + user_data += len; + offset += len; + pg = 0; + } while (remain); + + return 0; +} + +const struct drm_i915_gem_object_ops i915_gem_shmem_ops = { + .flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE | + I915_GEM_OBJECT_IS_SHRINKABLE, + + .get_pages = shmem_get_pages, + .put_pages = shmem_put_pages, + + .pwrite = shmem_pwrite, +}; + +static int create_shmem(struct drm_i915_private *i915, + struct drm_gem_object *obj, + size_t size) +{ + unsigned long flags = VM_NORESERVE; + struct file *filp; + + drm_gem_private_object_init(&i915->drm, obj, size); + + if (i915->mm.gemfs) + filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size, + flags); + else + filp = shmem_file_setup("i915", size, flags); + if (IS_ERR(filp)) + return PTR_ERR(filp); + + obj->filp = filp; + return 0; +} + +struct drm_i915_gem_object * +i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size) +{ + struct drm_i915_gem_object *obj; + struct address_space *mapping; + unsigned int cache_level; + gfp_t mask; + int ret; + + /* There is a prevalence of the assumption that we fit the object's + * page count inside a 32bit _signed_ variable. Let's document this and + * catch if we ever need to fix it. In the meantime, if you do spot + * such a local variable, please consider fixing! + */ + if (size >> PAGE_SHIFT > INT_MAX) + return ERR_PTR(-E2BIG); + + if (overflows_type(size, obj->base.size)) + return ERR_PTR(-E2BIG); + + obj = i915_gem_object_alloc(); + if (!obj) + return ERR_PTR(-ENOMEM); + + ret = create_shmem(i915, &obj->base, size); + if (ret) + goto fail; + + mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; + if (IS_I965GM(i915) || IS_I965G(i915)) { + /* 965gm cannot relocate objects above 4GiB. */ + mask &= ~__GFP_HIGHMEM; + mask |= __GFP_DMA32; + } + + mapping = obj->base.filp->f_mapping; + mapping_set_gfp_mask(mapping, mask); + GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); + + i915_gem_object_init(obj, &i915_gem_shmem_ops); + + obj->write_domain = I915_GEM_DOMAIN_CPU; + obj->read_domains = I915_GEM_DOMAIN_CPU; + + if (HAS_LLC(i915)) + /* On some devices, we can have the GPU use the LLC (the CPU + * cache) for about a 10% performance improvement + * compared to uncached. Graphics requests other than + * display scanout are coherent with the CPU in + * accessing this cache. This means in this mode we + * don't need to clflush on the CPU side, and on the + * GPU side we only need to flush internal caches to + * get data visible to the CPU. + * + * However, we maintain the display planes as UC, and so + * need to rebind when first used as such. + */ + cache_level = I915_CACHE_LLC; + else + cache_level = I915_CACHE_NONE; + + i915_gem_object_set_cache_coherency(obj, cache_level); + + trace_i915_gem_object_create(obj); + + return obj; + +fail: + i915_gem_object_free(obj); + return ERR_PTR(ret); +} + +/* Allocate a new GEM object and fill it with the supplied data */ +struct drm_i915_gem_object * +i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv, + const void *data, size_t size) +{ + struct drm_i915_gem_object *obj; + struct file *file; + size_t offset; + int err; + + obj = i915_gem_object_create_shmem(dev_priv, round_up(size, PAGE_SIZE)); + if (IS_ERR(obj)) + return obj; + + GEM_BUG_ON(obj->write_domain != I915_GEM_DOMAIN_CPU); + + file = obj->base.filp; + offset = 0; + do { + unsigned int len = min_t(typeof(size), size, PAGE_SIZE); + struct page *page; + void *pgdata, *vaddr; + + err = pagecache_write_begin(file, file->f_mapping, + offset, len, 0, + &page, &pgdata); + if (err < 0) + goto fail; + + vaddr = kmap(page); + memcpy(vaddr, data, len); + kunmap(page); + + err = pagecache_write_end(file, file->f_mapping, + offset, len, len, + page, pgdata); + if (err < 0) + goto fail; + + size -= len; + data += len; + offset += len; + } while (size); + + return obj; + +fail: + i915_gem_object_put(obj); + return ERR_PTR(err); +} diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 13a90404e0f6..db0b9687b904 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1921,7 +1921,7 @@ static int lrc_setup_wa_ctx(struct intel_engine_cs *engine) struct i915_vma *vma; int err; - obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE); + obj = i915_gem_object_create_shmem(engine->i915, CTX_WA_BB_OBJ_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -3012,7 +3012,7 @@ static int execlists_context_deferred_alloc(struct intel_context *ce, */ context_size += LRC_HEADER_PAGES * PAGE_SIZE; - ctx_obj = i915_gem_object_create(engine->i915, context_size); + ctx_obj = i915_gem_object_create_shmem(engine->i915, context_size); if (IS_ERR(ctx_obj)) return PTR_ERR(ctx_obj); diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index f0d60affdba3..ac93080bd863 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1396,7 +1396,7 @@ alloc_context_vma(struct intel_engine_cs *engine) struct i915_vma *vma; int err; - obj = i915_gem_object_create(i915, engine->context_size); + obj = i915_gem_object_create_shmem(i915, engine->context_size); if (IS_ERR(obj)) return ERR_CAST(obj); diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c index 5cb59c0b4bbe..7584bf0aeaa4 100644 --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c @@ -1725,7 +1725,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) int ret = 0; struct intel_vgpu_mm *mm = (s->buf_addr_type == GTT_BUFFER) ? s->vgpu->gtt.ggtt_mm : s->workload->shadow_mm; - unsigned long gma_start_offset = 0; + unsigned long start_offset = 0; /* get the start gm address of the batch buffer */ gma = get_gma_bb_from_cmd(s, 1); @@ -1742,7 +1742,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) bb->ppgtt = (s->buf_addr_type == GTT_BUFFER) ? false : true; - /* the gma_start_offset stores the batch buffer's start gma's + /* the start_offset stores the batch buffer's start gma's * offset relative to page boundary. so for non-privileged batch * buffer, the shadowed gem object holds exactly the same page * layout as original gem object. This is for the convience of @@ -1754,10 +1754,11 @@ static int perform_bb_shadow(struct parser_exec_state *s) * that of shadowed page. */ if (bb->ppgtt) - gma_start_offset = gma & ~I915_GTT_PAGE_MASK; + start_offset = gma & ~I915_GTT_PAGE_MASK; - bb->obj = i915_gem_object_create(s->vgpu->gvt->dev_priv, - roundup(bb_size + gma_start_offset, PAGE_SIZE)); + bb->obj = i915_gem_object_create_shmem(s->vgpu->gvt->dev_priv, + round_up(bb_size + start_offset, + PAGE_SIZE)); if (IS_ERR(bb->obj)) { ret = PTR_ERR(bb->obj); goto err_free_bb; @@ -1780,7 +1781,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) ret = copy_gma_to_hva(s->vgpu, mm, gma, gma + bb_size, - bb->va + gma_start_offset); + bb->va + start_offset); if (ret < 0) { gvt_vgpu_err("fail to copy guest ring buffer\n"); ret = -EFAULT; @@ -1806,7 +1807,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) * buffer's gma in pair. After all, we don't want to pin the shadow * buffer here (too early). */ - s->ip_va = bb->va + gma_start_offset; + s->ip_va = bb->va + start_offset; s->ip_gma = gma; return 0; err_unmap: @@ -2829,9 +2830,9 @@ static int shadow_indirect_ctx(struct intel_shadow_wa_ctx *wa_ctx) int ret = 0; void *map; - obj = i915_gem_object_create(workload->vgpu->gvt->dev_priv, - roundup(ctx_size + CACHELINE_BYTES, - PAGE_SIZE)); + obj = i915_gem_object_create_shmem(workload->vgpu->gvt->dev_priv, + roundup(ctx_size + CACHELINE_BYTES, + PAGE_SIZE)); if (IS_ERR(obj)) return PTR_ERR(obj); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b56c5583f933..a541a524396f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2750,16 +2750,6 @@ void i915_gem_load_init_fences(struct drm_i915_private *dev_priv); int i915_gem_freeze(struct drm_i915_private *dev_priv); int i915_gem_freeze_late(struct drm_i915_private *dev_priv); -void i915_gem_object_init(struct drm_i915_gem_object *obj, - const struct drm_i915_gem_object_ops *ops); -struct drm_i915_gem_object * -i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size); -struct drm_i915_gem_object * -i915_gem_object_create_from_data(struct drm_i915_private *dev_priv, - const void *data, size_t size); -void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file); -void i915_gem_free_object(struct drm_gem_object *obj); - static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915) { if (!atomic_read(&i915->mm.free_count)) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d75b05f14994..f02a50999a7d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -58,19 +58,6 @@ #include "intel_frontbuffer.h" #include "intel_pm.h" -static void i915_gem_flush_free_objects(struct drm_i915_private *i915); - -static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) -{ - if (obj->cache_dirty) - return false; - - if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE)) - return true; - - return obj->pin_global; /* currently in use by HW, keep flushed */ -} - static int insert_mappable_node(struct i915_ggtt *ggtt, struct drm_mm_node *node, u32 size) @@ -88,25 +75,6 @@ remove_mappable_node(struct drm_mm_node *node) drm_mm_remove_node(node); } -/* some bookkeeping */ -static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv, - u64 size) -{ - spin_lock(&dev_priv->mm.object_stat_lock); - dev_priv->mm.object_count++; - dev_priv->mm.object_memory += size; - spin_unlock(&dev_priv->mm.object_stat_lock); -} - -static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv, - u64 size) -{ - spin_lock(&dev_priv->mm.object_stat_lock); - dev_priv->mm.object_count--; - dev_priv->mm.object_memory -= size; - spin_unlock(&dev_priv->mm.object_stat_lock); -} - int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data, struct drm_file *file) @@ -207,32 +175,6 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) return err; } -static void __start_cpu_write(struct drm_i915_gem_object *obj) -{ - obj->read_domains = I915_GEM_DOMAIN_CPU; - obj->write_domain = I915_GEM_DOMAIN_CPU; - if (cpu_write_needs_clflush(obj)) - obj->cache_dirty = true; -} - -void -__i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, - struct sg_table *pages, - bool needs_clflush) -{ - GEM_BUG_ON(obj->mm.madv == __I915_MADV_PURGED); - - if (obj->mm.madv == I915_MADV_DONTNEED) - obj->mm.dirty = false; - - if (needs_clflush && - (obj->read_domains & I915_GEM_DOMAIN_CPU) == 0 && - !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ)) - drm_clflush_sg(pages); - - __start_cpu_write(obj); -} - static void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, struct sg_table *pages) @@ -284,8 +226,6 @@ static const struct drm_i915_gem_object_ops i915_gem_phys_ops = { .release = i915_gem_object_release_phys, }; -static const struct drm_i915_gem_object_ops i915_gem_object_ops; - int i915_gem_object_unbind(struct drm_i915_gem_object *obj) { struct i915_vma *vma; @@ -542,7 +482,7 @@ i915_gem_create(struct drm_file *file, return -EINVAL; /* Allocate the new object */ - obj = i915_gem_object_create(dev_priv, size); + obj = i915_gem_object_create_shmem(dev_priv, size); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -2089,52 +2029,6 @@ void __i915_gem_object_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); } -/* - * Move pages to appropriate lru and release the pagevec, decrementing the - * ref count of those pages. - */ -static void check_release_pagevec(struct pagevec *pvec) -{ - check_move_unevictable_pages(pvec); - __pagevec_release(pvec); - cond_resched(); -} - -static void -i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj, - struct sg_table *pages) -{ - struct sgt_iter sgt_iter; - struct pagevec pvec; - struct page *page; - - __i915_gem_object_release_shmem(obj, pages, true); - i915_gem_gtt_finish_pages(obj, pages); - - if (i915_gem_object_needs_bit17_swizzle(obj)) - i915_gem_object_save_bit_17_swizzle(obj, pages); - - mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping); - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, pages) { - if (obj->mm.dirty) - set_page_dirty(page); - - if (obj->mm.madv == I915_MADV_WILLNEED) - mark_page_accessed(page); - - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); - obj->mm.dirty = false; - - sg_free_table(pages); - kfree(pages); -} - static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj) { struct radix_tree_iter iter; @@ -2250,196 +2144,6 @@ bool i915_sg_trim(struct sg_table *orig_st) return true; } -static int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); - const unsigned long page_count = obj->base.size / PAGE_SIZE; - unsigned long i; - struct address_space *mapping; - struct sg_table *st; - struct scatterlist *sg; - struct sgt_iter sgt_iter; - struct page *page; - unsigned long last_pfn = 0; /* suppress gcc warning */ - unsigned int max_segment = i915_sg_segment_size(); - unsigned int sg_page_sizes; - struct pagevec pvec; - gfp_t noreclaim; - int ret; - - /* - * Assert that the object is not currently in any GPU domain. As it - * wasn't in the GTT, there shouldn't be any way it could have been in - * a GPU cache - */ - GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); - GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); - - /* - * If there's no chance of allocating enough pages for the whole - * object, bail early. - */ - if (page_count > totalram_pages()) - return -ENOMEM; - - st = kmalloc(sizeof(*st), GFP_KERNEL); - if (st == NULL) - return -ENOMEM; - -rebuild_st: - if (sg_alloc_table(st, page_count, GFP_KERNEL)) { - kfree(st); - return -ENOMEM; - } - - /* - * Get the list of pages out of our struct file. They'll be pinned - * at this point until we release them. - * - * Fail silently without starting the shrinker - */ - mapping = obj->base.filp->f_mapping; - mapping_set_unevictable(mapping); - noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM); - noreclaim |= __GFP_NORETRY | __GFP_NOWARN; - - sg = st->sgl; - st->nents = 0; - sg_page_sizes = 0; - for (i = 0; i < page_count; i++) { - const unsigned int shrink[] = { - I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_PURGEABLE, - 0, - }, *s = shrink; - gfp_t gfp = noreclaim; - - do { - cond_resched(); - page = shmem_read_mapping_page_gfp(mapping, i, gfp); - if (!IS_ERR(page)) - break; - - if (!*s) { - ret = PTR_ERR(page); - goto err_sg; - } - - i915_gem_shrink(dev_priv, 2 * page_count, NULL, *s++); - - /* - * We've tried hard to allocate the memory by reaping - * our own buffer, now let the real VM do its job and - * go down in flames if truly OOM. - * - * However, since graphics tend to be disposable, - * defer the oom here by reporting the ENOMEM back - * to userspace. - */ - if (!*s) { - /* reclaim and warn, but no oom */ - gfp = mapping_gfp_mask(mapping); - - /* - * Our bo are always dirty and so we require - * kswapd to reclaim our pages (direct reclaim - * does not effectively begin pageout of our - * buffers on its own). However, direct reclaim - * only waits for kswapd when under allocation - * congestion. So as a result __GFP_RECLAIM is - * unreliable and fails to actually reclaim our - * dirty pages -- unless you try over and over - * again with !__GFP_NORETRY. However, we still - * want to fail this allocation rather than - * trigger the out-of-memory killer and for - * this we want __GFP_RETRY_MAYFAIL. - */ - gfp |= __GFP_RETRY_MAYFAIL; - } - } while (1); - - if (!i || - sg->length >= max_segment || - page_to_pfn(page) != last_pfn + 1) { - if (i) { - sg_page_sizes |= sg->length; - sg = sg_next(sg); - } - st->nents++; - sg_set_page(sg, page, PAGE_SIZE, 0); - } else { - sg->length += PAGE_SIZE; - } - last_pfn = page_to_pfn(page); - - /* Check that the i965g/gm workaround works. */ - WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL)); - } - if (sg) { /* loop terminated early; short sg table */ - sg_page_sizes |= sg->length; - sg_mark_end(sg); - } - - /* Trim unused sg entries to avoid wasting memory. */ - i915_sg_trim(st); - - ret = i915_gem_gtt_prepare_pages(obj, st); - if (ret) { - /* - * DMA remapping failed? One possible cause is that - * it could not reserve enough large entries, asking - * for PAGE_SIZE chunks instead may be helpful. - */ - if (max_segment > PAGE_SIZE) { - for_each_sgt_page(page, sgt_iter, st) - put_page(page); - sg_free_table(st); - - max_segment = PAGE_SIZE; - goto rebuild_st; - } else { - dev_warn(&dev_priv->drm.pdev->dev, - "Failed to DMA remap %lu pages\n", - page_count); - goto err_pages; - } - } - - if (i915_gem_object_needs_bit17_swizzle(obj)) - i915_gem_object_do_bit_17_swizzle(obj, st); - - __i915_gem_object_set_pages(obj, st, sg_page_sizes); - - return 0; - -err_sg: - sg_mark_end(sg); -err_pages: - mapping_clear_unevictable(mapping); - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, st) { - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); - sg_free_table(st); - kfree(st); - - /* - * shmemfs first checks if there is enough memory to allocate the page - * and reports ENOSPC should there be insufficient, along with the usual - * ENOMEM for a genuine allocation failure. - * - * We use ENOSPC in our driver to mean that we have run out of aperture - * space and so want to translate the error from shmemfs back to our - * usual understanding of ENOMEM. - */ - if (ret == -ENOSPC) - ret = -ENOMEM; - - return ret; -} - void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct sg_table *pages, unsigned int sg_page_sizes) @@ -2686,133 +2390,6 @@ void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, } } -static int -i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj, - const struct drm_i915_gem_pwrite *arg) -{ - struct address_space *mapping = obj->base.filp->f_mapping; - char __user *user_data = u64_to_user_ptr(arg->data_ptr); - u64 remain, offset; - unsigned int pg; - - /* Caller already validated user args */ - GEM_BUG_ON(!access_ok(user_data, arg->size)); - - /* - * Before we instantiate/pin the backing store for our use, we - * can prepopulate the shmemfs filp efficiently using a write into - * the pagecache. We avoid the penalty of instantiating all the - * pages, important if the user is just writing to a few and never - * uses the object on the GPU, and using a direct write into shmemfs - * allows it to avoid the cost of retrieving a page (either swapin - * or clearing-before-use) before it is overwritten. - */ - if (i915_gem_object_has_pages(obj)) - return -ENODEV; - - if (obj->mm.madv != I915_MADV_WILLNEED) - return -EFAULT; - - /* - * Before the pages are instantiated the object is treated as being - * in the CPU domain. The pages will be clflushed as required before - * use, and we can freely write into the pages directly. If userspace - * races pwrite with any other operation; corruption will ensue - - * that is userspace's prerogative! - */ - - remain = arg->size; - offset = arg->offset; - pg = offset_in_page(offset); - - do { - unsigned int len, unwritten; - struct page *page; - void *data, *vaddr; - int err; - char c; - - len = PAGE_SIZE - pg; - if (len > remain) - len = remain; - - /* Prefault the user page to reduce potential recursion */ - err = __get_user(c, user_data); - if (err) - return err; - - err = __get_user(c, user_data + len - 1); - if (err) - return err; - - err = pagecache_write_begin(obj->base.filp, mapping, - offset, len, 0, - &page, &data); - if (err < 0) - return err; - - vaddr = kmap_atomic(page); - unwritten = __copy_from_user_inatomic(vaddr + pg, - user_data, - len); - kunmap_atomic(vaddr); - - err = pagecache_write_end(obj->base.filp, mapping, - offset, len, len - unwritten, - page, data); - if (err < 0) - return err; - - /* We don't handle -EFAULT, leave it to the caller to check */ - if (unwritten) - return -ENODEV; - - remain -= len; - user_data += len; - offset += len; - pg = 0; - } while (remain); - - return 0; -} - -void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file) -{ - struct drm_i915_private *i915 = to_i915(gem->dev); - struct drm_i915_gem_object *obj = to_intel_bo(gem); - struct drm_i915_file_private *fpriv = file->driver_priv; - struct i915_lut_handle *lut, *ln; - - mutex_lock(&i915->drm.struct_mutex); - - list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) { - struct i915_gem_context *ctx = lut->ctx; - struct i915_vma *vma; - - GEM_BUG_ON(ctx->file_priv == ERR_PTR(-EBADF)); - if (ctx->file_priv != fpriv) - continue; - - vma = radix_tree_delete(&ctx->handles_vma, lut->handle); - GEM_BUG_ON(vma->obj != obj); - - /* We allow the process to have multiple handles to the same - * vma, in the same fd namespace, by virtue of flink/open. - */ - GEM_BUG_ON(!vma->open_count); - if (!--vma->open_count && !i915_vma_is_ggtt(vma)) - i915_vma_close(vma); - - list_del(&lut->obj_link); - list_del(&lut->ctx_link); - - i915_lut_handle_free(lut); - __i915_gem_object_release_unless_active(obj); - } - - mutex_unlock(&i915->drm.struct_mutex); -} - static unsigned long to_wait_timeout(s64 timeout_ns) { if (timeout_ns < 0) @@ -3811,348 +3388,6 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data, return err; } -static void -frontbuffer_retire(struct i915_active_request *active, - struct i915_request *request) -{ - struct drm_i915_gem_object *obj = - container_of(active, typeof(*obj), frontbuffer_write); - - intel_fb_obj_flush(obj, ORIGIN_CS); -} - -void i915_gem_object_init(struct drm_i915_gem_object *obj, - const struct drm_i915_gem_object_ops *ops) -{ - mutex_init(&obj->mm.lock); - - spin_lock_init(&obj->vma.lock); - INIT_LIST_HEAD(&obj->vma.list); - - INIT_LIST_HEAD(&obj->lut_list); - INIT_LIST_HEAD(&obj->batch_pool_link); - - init_rcu_head(&obj->rcu); - - obj->ops = ops; - - reservation_object_init(&obj->__builtin_resv); - obj->resv = &obj->__builtin_resv; - - obj->frontbuffer_ggtt_origin = ORIGIN_GTT; - i915_active_request_init(&obj->frontbuffer_write, - NULL, frontbuffer_retire); - - obj->mm.madv = I915_MADV_WILLNEED; - INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN); - mutex_init(&obj->mm.get_page.lock); - - i915_gem_info_add_obj(to_i915(obj->base.dev), obj->base.size); -} - -static const struct drm_i915_gem_object_ops i915_gem_object_ops = { - .flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE | - I915_GEM_OBJECT_IS_SHRINKABLE, - - .get_pages = i915_gem_object_get_pages_gtt, - .put_pages = i915_gem_object_put_pages_gtt, - - .pwrite = i915_gem_object_pwrite_gtt, -}; - -static int i915_gem_object_create_shmem(struct drm_device *dev, - struct drm_gem_object *obj, - size_t size) -{ - struct drm_i915_private *i915 = to_i915(dev); - unsigned long flags = VM_NORESERVE; - struct file *filp; - - drm_gem_private_object_init(dev, obj, size); - - if (i915->mm.gemfs) - filp = shmem_file_setup_with_mnt(i915->mm.gemfs, "i915", size, - flags); - else - filp = shmem_file_setup("i915", size, flags); - - if (IS_ERR(filp)) - return PTR_ERR(filp); - - obj->filp = filp; - - return 0; -} - -struct drm_i915_gem_object * -i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) -{ - struct drm_i915_gem_object *obj; - struct address_space *mapping; - unsigned int cache_level; - gfp_t mask; - int ret; - - /* There is a prevalence of the assumption that we fit the object's - * page count inside a 32bit _signed_ variable. Let's document this and - * catch if we ever need to fix it. In the meantime, if you do spot - * such a local variable, please consider fixing! - */ - if (size >> PAGE_SHIFT > INT_MAX) - return ERR_PTR(-E2BIG); - - if (overflows_type(size, obj->base.size)) - return ERR_PTR(-E2BIG); - - obj = i915_gem_object_alloc(); - if (obj == NULL) - return ERR_PTR(-ENOMEM); - - ret = i915_gem_object_create_shmem(&dev_priv->drm, &obj->base, size); - if (ret) - goto fail; - - mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; - if (IS_I965GM(dev_priv) || IS_I965G(dev_priv)) { - /* 965gm cannot relocate objects above 4GiB. */ - mask &= ~__GFP_HIGHMEM; - mask |= __GFP_DMA32; - } - - mapping = obj->base.filp->f_mapping; - mapping_set_gfp_mask(mapping, mask); - GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); - - i915_gem_object_init(obj, &i915_gem_object_ops); - - obj->write_domain = I915_GEM_DOMAIN_CPU; - obj->read_domains = I915_GEM_DOMAIN_CPU; - - if (HAS_LLC(dev_priv)) - /* On some devices, we can have the GPU use the LLC (the CPU - * cache) for about a 10% performance improvement - * compared to uncached. Graphics requests other than - * display scanout are coherent with the CPU in - * accessing this cache. This means in this mode we - * don't need to clflush on the CPU side, and on the - * GPU side we only need to flush internal caches to - * get data visible to the CPU. - * - * However, we maintain the display planes as UC, and so - * need to rebind when first used as such. - */ - cache_level = I915_CACHE_LLC; - else - cache_level = I915_CACHE_NONE; - - i915_gem_object_set_cache_coherency(obj, cache_level); - - trace_i915_gem_object_create(obj); - - return obj; - -fail: - i915_gem_object_free(obj); - return ERR_PTR(ret); -} - -static bool discard_backing_storage(struct drm_i915_gem_object *obj) -{ - /* If we are the last user of the backing storage (be it shmemfs - * pages or stolen etc), we know that the pages are going to be - * immediately released. In this case, we can then skip copying - * back the contents from the GPU. - */ - - if (obj->mm.madv != I915_MADV_WILLNEED) - return false; - - if (obj->base.filp == NULL) - return true; - - /* At first glance, this looks racy, but then again so would be - * userspace racing mmap against close. However, the first external - * reference to the filp can only be obtained through the - * i915_gem_mmap_ioctl() which safeguards us against the user - * acquiring such a reference whilst we are in the middle of - * freeing the object. - */ - return file_count(obj->base.filp) == 1; -} - -static void __i915_gem_free_objects(struct drm_i915_private *i915, - struct llist_node *freed) -{ - struct drm_i915_gem_object *obj, *on; - intel_wakeref_t wakeref; - - wakeref = intel_runtime_pm_get(i915); - llist_for_each_entry_safe(obj, on, freed, freed) { - struct i915_vma *vma, *vn; - - trace_i915_gem_object_destroy(obj); - - mutex_lock(&i915->drm.struct_mutex); - - GEM_BUG_ON(i915_gem_object_is_active(obj)); - list_for_each_entry_safe(vma, vn, &obj->vma.list, obj_link) { - GEM_BUG_ON(i915_vma_is_active(vma)); - vma->flags &= ~I915_VMA_PIN_MASK; - i915_vma_destroy(vma); - } - GEM_BUG_ON(!list_empty(&obj->vma.list)); - GEM_BUG_ON(!RB_EMPTY_ROOT(&obj->vma.tree)); - - /* This serializes freeing with the shrinker. Since the free - * is delayed, first by RCU then by the workqueue, we want the - * shrinker to be able to free pages of unreferenced objects, - * or else we may oom whilst there are plenty of deferred - * freed objects. - */ - if (i915_gem_object_has_pages(obj)) { - spin_lock(&i915->mm.obj_lock); - list_del_init(&obj->mm.link); - spin_unlock(&i915->mm.obj_lock); - } - - mutex_unlock(&i915->drm.struct_mutex); - - GEM_BUG_ON(obj->bind_count); - GEM_BUG_ON(obj->userfault_count); - GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits)); - GEM_BUG_ON(!list_empty(&obj->lut_list)); - - if (obj->ops->release) - obj->ops->release(obj); - - if (WARN_ON(i915_gem_object_has_pinned_pages(obj))) - atomic_set(&obj->mm.pages_pin_count, 0); - __i915_gem_object_put_pages(obj, I915_MM_NORMAL); - GEM_BUG_ON(i915_gem_object_has_pages(obj)); - - if (obj->base.import_attach) - drm_prime_gem_destroy(&obj->base, NULL); - - reservation_object_fini(&obj->__builtin_resv); - drm_gem_object_release(&obj->base); - i915_gem_info_remove_obj(i915, obj->base.size); - - bitmap_free(obj->bit_17); - i915_gem_object_free(obj); - - GEM_BUG_ON(!atomic_read(&i915->mm.free_count)); - atomic_dec(&i915->mm.free_count); - - if (on) - cond_resched(); - } - intel_runtime_pm_put(i915, wakeref); -} - -static void i915_gem_flush_free_objects(struct drm_i915_private *i915) -{ - struct llist_node *freed; - - /* Free the oldest, most stale object to keep the free_list short */ - freed = NULL; - if (!llist_empty(&i915->mm.free_list)) { /* quick test for hotpath */ - /* Only one consumer of llist_del_first() allowed */ - spin_lock(&i915->mm.free_lock); - freed = llist_del_first(&i915->mm.free_list); - spin_unlock(&i915->mm.free_lock); - } - if (unlikely(freed)) { - freed->next = NULL; - __i915_gem_free_objects(i915, freed); - } -} - -static void __i915_gem_free_work(struct work_struct *work) -{ - struct drm_i915_private *i915 = - container_of(work, struct drm_i915_private, mm.free_work); - struct llist_node *freed; - - /* - * All file-owned VMA should have been released by this point through - * i915_gem_close_object(), or earlier by i915_gem_context_close(). - * However, the object may also be bound into the global GTT (e.g. - * older GPUs without per-process support, or for direct access through - * the GTT either for the user or for scanout). Those VMA still need to - * unbound now. - */ - - spin_lock(&i915->mm.free_lock); - while ((freed = llist_del_all(&i915->mm.free_list))) { - spin_unlock(&i915->mm.free_lock); - - __i915_gem_free_objects(i915, freed); - if (need_resched()) - return; - - spin_lock(&i915->mm.free_lock); - } - spin_unlock(&i915->mm.free_lock); -} - -static void __i915_gem_free_object_rcu(struct rcu_head *head) -{ - struct drm_i915_gem_object *obj = - container_of(head, typeof(*obj), rcu); - struct drm_i915_private *i915 = to_i915(obj->base.dev); - - /* - * We reuse obj->rcu for the freed list, so we had better not treat - * it like a rcu_head from this point forwards. And we expect all - * objects to be freed via this path. - */ - destroy_rcu_head(&obj->rcu); - - /* - * Since we require blocking on struct_mutex to unbind the freed - * object from the GPU before releasing resources back to the - * system, we can not do that directly from the RCU callback (which may - * be a softirq context), but must instead then defer that work onto a - * kthread. We use the RCU callback rather than move the freed object - * directly onto the work queue so that we can mix between using the - * worker and performing frees directly from subsequent allocations for - * crude but effective memory throttling. - */ - if (llist_add(&obj->freed, &i915->mm.free_list)) - queue_work(i915->wq, &i915->mm.free_work); -} - -void i915_gem_free_object(struct drm_gem_object *gem_obj) -{ - struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); - - if (obj->mm.quirked) - __i915_gem_object_unpin_pages(obj); - - if (discard_backing_storage(obj)) - obj->mm.madv = I915_MADV_DONTNEED; - - /* - * Before we free the object, make sure any pure RCU-only - * read-side critical sections are complete, e.g. - * i915_gem_busy_ioctl(). For the corresponding synchronized - * lookup see i915_gem_object_lookup_rcu(). - */ - atomic_inc(&to_i915(obj->base.dev)->mm.free_count); - call_rcu(&obj->rcu, __i915_gem_free_object_rcu); -} - -void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj) -{ - lockdep_assert_held(&obj->base.dev->struct_mutex); - - if (!i915_gem_object_has_active_reference(obj) && - i915_gem_object_is_active(obj)) - i915_gem_object_set_active_reference(obj); - else - i915_gem_object_put(obj); -} - void i915_gem_sanitize(struct drm_i915_private *i915) { intel_wakeref_t wakeref; @@ -4749,7 +3984,7 @@ static void i915_gem_init__mm(struct drm_i915_private *i915) INIT_LIST_HEAD(&i915->mm.fence_list); INIT_LIST_HEAD(&i915->mm.userfault_list); - INIT_WORK(&i915->mm.free_work, __i915_gem_free_work); + i915_gem_init__objects(i915); } int i915_gem_init_early(struct drm_i915_private *dev_priv) @@ -4915,57 +4150,6 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old, } } -/* Allocate a new GEM object and fill it with the supplied data */ -struct drm_i915_gem_object * -i915_gem_object_create_from_data(struct drm_i915_private *dev_priv, - const void *data, size_t size) -{ - struct drm_i915_gem_object *obj; - struct file *file; - size_t offset; - int err; - - obj = i915_gem_object_create(dev_priv, round_up(size, PAGE_SIZE)); - if (IS_ERR(obj)) - return obj; - - GEM_BUG_ON(obj->write_domain != I915_GEM_DOMAIN_CPU); - - file = obj->base.filp; - offset = 0; - do { - unsigned int len = min_t(typeof(size), size, PAGE_SIZE); - struct page *page; - void *pgdata, *vaddr; - - err = pagecache_write_begin(file, file->f_mapping, - offset, len, 0, - &page, &pgdata); - if (err < 0) - goto fail; - - vaddr = kmap(page); - memcpy(vaddr, data, len); - kunmap(page); - - err = pagecache_write_end(file, file->f_mapping, - offset, len, len, - page, pgdata); - if (err < 0) - goto fail; - - size -= len; - data += len; - offset += len; - } while (size); - - return obj; - -fail: - i915_gem_object_put(obj); - return ERR_PTR(err); -} - struct scatterlist * i915_gem_object_get_sg(struct drm_i915_gem_object *obj, unsigned int n, @@ -5140,7 +4324,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) if (obj->ops == &i915_gem_phys_ops) return 0; - if (obj->ops != &i915_gem_object_ops) + if (obj->ops != &i915_gem_shmem_ops) return -EINVAL; err = i915_gem_object_unbind(obj); @@ -5176,12 +4360,12 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) __i915_gem_object_pin_pages(obj); if (!IS_ERR_OR_NULL(pages)) - i915_gem_object_ops.put_pages(obj, pages); + i915_gem_shmem_ops.put_pages(obj, pages); mutex_unlock(&obj->mm.lock); return 0; err_xfer: - obj->ops = &i915_gem_object_ops; + obj->ops = &i915_gem_shmem_ops; if (!IS_ERR_OR_NULL(pages)) { unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl); diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index c4995d5a16d2..379fd89a180f 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1510,7 +1510,7 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv) BUILD_BUG_ON_NOT_POWER_OF_2(OA_BUFFER_SIZE); BUILD_BUG_ON(OA_BUFFER_SIZE < SZ_128K || OA_BUFFER_SIZE > SZ_16M); - bo = i915_gem_object_create(dev_priv, OA_BUFFER_SIZE); + bo = i915_gem_object_create_shmem(dev_priv, OA_BUFFER_SIZE); if (IS_ERR(bo)) { DRM_ERROR("Failed to allocate OA buffer\n"); ret = PTR_ERR(bo); diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c index 89db71996148..0d3a6fa674e6 100644 --- a/drivers/gpu/drm/i915/intel_fbdev.c +++ b/drivers/gpu/drm/i915/intel_fbdev.c @@ -144,7 +144,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper, if (size * 2 < dev_priv->stolen_usable_size) obj = i915_gem_object_create_stolen(dev_priv, size); if (obj == NULL) - obj = i915_gem_object_create(dev_priv, size); + obj = i915_gem_object_create_shmem(dev_priv, size); if (IS_ERR(obj)) { DRM_ERROR("failed to allocate framebuffer\n"); ret = PTR_ERR(obj); diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c index c4ac29309fcc..98e45a500a15 100644 --- a/drivers/gpu/drm/i915/intel_guc.c +++ b/drivers/gpu/drm/i915/intel_guc.c @@ -690,7 +690,7 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size) u64 flags; int ret; - obj = i915_gem_object_create(dev_priv, size); + obj = i915_gem_object_create_shmem(dev_priv, size); if (IS_ERR(obj)) return ERR_CAST(obj); diff --git a/drivers/gpu/drm/i915/intel_uc_fw.c b/drivers/gpu/drm/i915/intel_uc_fw.c index b9cb6fea9332..3257a054cb0b 100644 --- a/drivers/gpu/drm/i915/intel_uc_fw.c +++ b/drivers/gpu/drm/i915/intel_uc_fw.c @@ -159,7 +159,8 @@ void intel_uc_fw_fetch(struct drm_i915_private *dev_priv, goto fail; } - obj = i915_gem_object_create_from_data(dev_priv, fw->data, fw->size); + obj = i915_gem_object_create_shmem_from_data(dev_priv, + fw->data, fw->size); if (IS_ERR(obj)) { err = PTR_ERR(obj); DRM_DEBUG_DRIVER("%s fw object_create err=%d\n", diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c index 1e1f83326a96..ce4ec87698f6 100644 --- a/drivers/gpu/drm/i915/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c @@ -1390,7 +1390,7 @@ static int igt_ppgtt_gemfs_huge(void *arg) for (i = 0; i < ARRAY_SIZE(sizes); ++i) { unsigned int size = sizes[i]; - obj = i915_gem_object_create(i915, size); + obj = i915_gem_object_create_shmem(i915, size); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -1568,7 +1568,7 @@ static int igt_tmpfs_fallback(void *arg) i915->mm.gemfs = NULL; - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) { err = PTR_ERR(obj); goto out_restore; @@ -1628,7 +1628,7 @@ static int igt_shrink_thp(void *arg) return 0; } - obj = i915_gem_object_create(i915, SZ_2M); + obj = i915_gem_object_create_shmem(i915, SZ_2M); if (IS_ERR(obj)) return PTR_ERR(obj); diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c index 2b943ee246c9..cc65a503e2f0 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c @@ -33,7 +33,7 @@ static int igt_dmabuf_export(void *arg) struct drm_i915_gem_object *obj; struct dma_buf *dmabuf; - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -57,7 +57,7 @@ static int igt_dmabuf_import_self(void *arg) struct dma_buf *dmabuf; int err; - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -232,7 +232,7 @@ static int igt_dmabuf_export_vmap(void *arg) void *ptr; int err; - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -279,7 +279,7 @@ static int igt_dmabuf_export_kmap(void *arg) void *ptr; int err; - obj = i915_gem_object_create(i915, 2*PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, 2 * PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c index b926d1cd165d..a0f59fb0d701 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c @@ -36,7 +36,7 @@ static int igt_gem_object(void *arg) /* Basic test to ensure we can create an object */ - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) { err = PTR_ERR(obj); pr_err("i915_gem_object_create failed, err=%d\n", err); @@ -59,7 +59,7 @@ static int igt_phys_object(void *arg) * i.e. exercise the i915_gem_object_phys API. */ - obj = i915_gem_object_create(i915, PAGE_SIZE); + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) { err = PTR_ERR(obj); pr_err("i915_gem_object_create failed, err=%d\n", err); From patchwork Wed May 22 08:20:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955289 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7E2C017E0 for ; Wed, 22 May 2019 08:21:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6BBF428B4D for ; Wed, 22 May 2019 08:21:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5FC5428A51; Wed, 22 May 2019 08:21:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 352EB28A28 for ; Wed, 22 May 2019 08:21:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C2CCB8982C; Wed, 22 May 2019 08:21:18 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1F32C897F6 for ; Wed, 22 May 2019 08:21:12 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636785-1500050 for ; Wed, 22 May 2019 09:21:05 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:56 +0100 Message-Id: <20190522082104.5117-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 05/13] drm/i915: Move phys objects to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c, this time the legacy physical object. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 2 + drivers/gpu/drm/i915/gem/i915_gem_object.h | 11 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_pages.c | 521 +++++++++++++ drivers/gpu/drm/i915/gem/i915_gem_phys.c | 211 ++++++ drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 61 ++ .../drm/i915/gem/selftests/i915_gem_phys.c | 80 ++ drivers/gpu/drm/i915/i915_drv.h | 2 - drivers/gpu/drm/i915/i915_gem.c | 712 +----------------- drivers/gpu/drm/i915/i915_gem_shrinker.c | 59 +- .../gpu/drm/i915/selftests/i915_gem_object.c | 54 -- .../drm/i915/selftests/i915_mock_selftests.h | 1 + 12 files changed, 895 insertions(+), 821 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_pages.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_phys.c create mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 625f9749355b..ba3b82f3cd49 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -88,6 +88,8 @@ i915-y += $(gt-y) obj-y += gem/ gem-y += \ gem/i915_gem_object.o \ + gem/i915_gem_pages.o \ + gem/i915_gem_phys.o \ gem/i915_gem_shmem.o i915-y += \ $(gem-y) \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 05ef6b0076ae..88495530ebd9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -33,11 +33,17 @@ void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, bool needs_clflush); +int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align); + void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file); void i915_gem_free_object(struct drm_gem_object *obj); void i915_gem_flush_free_objects(struct drm_i915_private *i915); +struct sg_table * +__i915_gem_object_unset_pages(struct drm_i915_gem_object *obj); +void i915_gem_object_truncate(struct drm_i915_gem_object *obj); + /** * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle * @filp: DRM file private date @@ -236,6 +242,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct sg_table *pages, unsigned int sg_page_sizes); + +int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj); int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check @@ -291,7 +299,8 @@ enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, enum i915_mm_subclass subclass); -void __i915_gem_object_truncate(struct drm_i915_gem_object *obj); +void i915_gem_object_truncate(struct drm_i915_gem_object *obj); +void i915_gem_object_writeback(struct drm_i915_gem_object *obj); enum i915_map_type { I915_MAP_WB = 0, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index fe3b2a2775f7..df8e29ee3943 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -52,6 +52,8 @@ struct drm_i915_gem_object_ops { int (*get_pages)(struct drm_i915_gem_object *obj); void (*put_pages)(struct drm_i915_gem_object *obj, struct sg_table *pages); + void (*truncate)(struct drm_i915_gem_object *obj); + void (*writeback)(struct drm_i915_gem_object *obj); int (*pwrite)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pwrite *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c new file mode 100644 index 000000000000..3879b3669dea --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -0,0 +1,521 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include "i915_drv.h" +#include "i915_gem_object.h" + +void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, + struct sg_table *pages, + unsigned int sg_page_sizes) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + unsigned long supported = INTEL_INFO(i915)->page_sizes; + int i; + + lockdep_assert_held(&obj->mm.lock); + + /* Make the pages coherent with the GPU (flushing any swapin). */ + if (obj->cache_dirty) { + obj->write_domain = 0; + if (i915_gem_object_has_struct_page(obj)) + drm_clflush_sg(pages); + obj->cache_dirty = false; + } + + obj->mm.get_page.sg_pos = pages->sgl; + obj->mm.get_page.sg_idx = 0; + + obj->mm.pages = pages; + + if (i915_gem_object_is_tiled(obj) && + i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) { + GEM_BUG_ON(obj->mm.quirked); + __i915_gem_object_pin_pages(obj); + obj->mm.quirked = true; + } + + GEM_BUG_ON(!sg_page_sizes); + obj->mm.page_sizes.phys = sg_page_sizes; + + /* + * Calculate the supported page-sizes which fit into the given + * sg_page_sizes. This will give us the page-sizes which we may be able + * to use opportunistically when later inserting into the GTT. For + * example if phys=2G, then in theory we should be able to use 1G, 2M, + * 64K or 4K pages, although in practice this will depend on a number of + * other factors. + */ + obj->mm.page_sizes.sg = 0; + for_each_set_bit(i, &supported, ilog2(I915_GTT_MAX_PAGE_SIZE) + 1) { + if (obj->mm.page_sizes.phys & ~0u << i) + obj->mm.page_sizes.sg |= BIT(i); + } + GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg)); + + spin_lock(&i915->mm.obj_lock); + list_add(&obj->mm.link, &i915->mm.unbound_list); + spin_unlock(&i915->mm.obj_lock); +} + +int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj) +{ + int err; + + if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) { + DRM_DEBUG("Attempting to obtain a purgeable object\n"); + return -EFAULT; + } + + err = obj->ops->get_pages(obj); + GEM_BUG_ON(!err && !i915_gem_object_has_pages(obj)); + + return err; +} + +/* Ensure that the associated pages are gathered from the backing storage + * and pinned into our object. i915_gem_object_pin_pages() may be called + * multiple times before they are released by a single call to + * i915_gem_object_unpin_pages() - once the pages are no longer referenced + * either as a result of memory pressure (reaping pages under the shrinker) + * or as the object is itself released. + */ +int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) +{ + int err; + + err = mutex_lock_interruptible(&obj->mm.lock); + if (err) + return err; + + if (unlikely(!i915_gem_object_has_pages(obj))) { + GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); + + err = ____i915_gem_object_get_pages(obj); + if (err) + goto unlock; + + smp_mb__before_atomic(); + } + atomic_inc(&obj->mm.pages_pin_count); + +unlock: + mutex_unlock(&obj->mm.lock); + return err; +} + +/* Immediately discard the backing storage */ +void i915_gem_object_truncate(struct drm_i915_gem_object *obj) +{ + drm_gem_free_mmap_offset(&obj->base); + if (obj->ops->truncate) + obj->ops->truncate(obj); +} + +/* Try to discard unwanted pages */ +void i915_gem_object_writeback(struct drm_i915_gem_object *obj) +{ + lockdep_assert_held(&obj->mm.lock); + GEM_BUG_ON(i915_gem_object_has_pages(obj)); + + if (obj->ops->writeback) + obj->ops->writeback(obj); +} + +static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj) +{ + struct radix_tree_iter iter; + void __rcu **slot; + + rcu_read_lock(); + radix_tree_for_each_slot(slot, &obj->mm.get_page.radix, &iter, 0) + radix_tree_delete(&obj->mm.get_page.radix, iter.index); + rcu_read_unlock(); +} + +struct sg_table * +__i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct sg_table *pages; + + pages = fetch_and_zero(&obj->mm.pages); + if (IS_ERR_OR_NULL(pages)) + return pages; + + spin_lock(&i915->mm.obj_lock); + list_del(&obj->mm.link); + spin_unlock(&i915->mm.obj_lock); + + if (obj->mm.mapping) { + void *ptr; + + ptr = page_mask_bits(obj->mm.mapping); + if (is_vmalloc_addr(ptr)) + vunmap(ptr); + else + kunmap(kmap_to_page(ptr)); + + obj->mm.mapping = NULL; + } + + __i915_gem_object_reset_page_iter(obj); + obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0; + + return pages; +} + +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, + enum i915_mm_subclass subclass) +{ + struct sg_table *pages; + int err; + + if (i915_gem_object_has_pinned_pages(obj)) + return -EBUSY; + + GEM_BUG_ON(obj->bind_count); + + /* May be called by shrinker from within get_pages() (on another bo) */ + mutex_lock_nested(&obj->mm.lock, subclass); + if (unlikely(atomic_read(&obj->mm.pages_pin_count))) { + err = -EBUSY; + goto unlock; + } + + /* + * ->put_pages might need to allocate memory for the bit17 swizzle + * array, hence protect them from being reaped by removing them from gtt + * lists early. + */ + pages = __i915_gem_object_unset_pages(obj); + + /* + * XXX Temporary hijinx to avoid updating all backends to handle + * NULL pages. In the future, when we have more asynchronous + * get_pages backends we should be better able to handle the + * cancellation of the async task in a more uniform manner. + */ + if (!pages && !i915_gem_object_needs_async_cancel(obj)) + pages = ERR_PTR(-EINVAL); + + if (!IS_ERR(pages)) + obj->ops->put_pages(obj, pages); + + err = 0; +unlock: + mutex_unlock(&obj->mm.lock); + + return err; +} + +/* The 'mapping' part of i915_gem_object_pin_map() below */ +static void *i915_gem_object_map(const struct drm_i915_gem_object *obj, + enum i915_map_type type) +{ + unsigned long n_pages = obj->base.size >> PAGE_SHIFT; + struct sg_table *sgt = obj->mm.pages; + struct sgt_iter sgt_iter; + struct page *page; + struct page *stack_pages[32]; + struct page **pages = stack_pages; + unsigned long i = 0; + pgprot_t pgprot; + void *addr; + + /* A single page can always be kmapped */ + if (n_pages == 1 && type == I915_MAP_WB) + return kmap(sg_page(sgt->sgl)); + + if (n_pages > ARRAY_SIZE(stack_pages)) { + /* Too big for stack -- allocate temporary array instead */ + pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL); + if (!pages) + return NULL; + } + + for_each_sgt_page(page, sgt_iter, sgt) + pages[i++] = page; + + /* Check that we have the expected number of pages */ + GEM_BUG_ON(i != n_pages); + + switch (type) { + default: + MISSING_CASE(type); + /* fallthrough to use PAGE_KERNEL anyway */ + case I915_MAP_WB: + pgprot = PAGE_KERNEL; + break; + case I915_MAP_WC: + pgprot = pgprot_writecombine(PAGE_KERNEL_IO); + break; + } + addr = vmap(pages, n_pages, 0, pgprot); + + if (pages != stack_pages) + kvfree(pages); + + return addr; +} + +/* get, pin, and map the pages of the object into kernel space */ +void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, + enum i915_map_type type) +{ + enum i915_map_type has_type; + bool pinned; + void *ptr; + int err; + + if (unlikely(!i915_gem_object_has_struct_page(obj))) + return ERR_PTR(-ENXIO); + + err = mutex_lock_interruptible(&obj->mm.lock); + if (err) + return ERR_PTR(err); + + pinned = !(type & I915_MAP_OVERRIDE); + type &= ~I915_MAP_OVERRIDE; + + if (!atomic_inc_not_zero(&obj->mm.pages_pin_count)) { + if (unlikely(!i915_gem_object_has_pages(obj))) { + GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); + + err = ____i915_gem_object_get_pages(obj); + if (err) + goto err_unlock; + + smp_mb__before_atomic(); + } + atomic_inc(&obj->mm.pages_pin_count); + pinned = false; + } + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + + ptr = page_unpack_bits(obj->mm.mapping, &has_type); + if (ptr && has_type != type) { + if (pinned) { + err = -EBUSY; + goto err_unpin; + } + + if (is_vmalloc_addr(ptr)) + vunmap(ptr); + else + kunmap(kmap_to_page(ptr)); + + ptr = obj->mm.mapping = NULL; + } + + if (!ptr) { + ptr = i915_gem_object_map(obj, type); + if (!ptr) { + err = -ENOMEM; + goto err_unpin; + } + + obj->mm.mapping = page_pack_bits(ptr, type); + } + +out_unlock: + mutex_unlock(&obj->mm.lock); + return ptr; + +err_unpin: + atomic_dec(&obj->mm.pages_pin_count); +err_unlock: + ptr = ERR_PTR(err); + goto out_unlock; +} + +void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, + unsigned long offset, + unsigned long size) +{ + enum i915_map_type has_type; + void *ptr; + + GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); + GEM_BUG_ON(range_overflows_t(typeof(obj->base.size), + offset, size, obj->base.size)); + + obj->mm.dirty = true; + + if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) + return; + + ptr = page_unpack_bits(obj->mm.mapping, &has_type); + if (has_type == I915_MAP_WC) + return; + + drm_clflush_virt_range(ptr + offset, size); + if (size == obj->base.size) { + obj->write_domain &= ~I915_GEM_DOMAIN_CPU; + obj->cache_dirty = false; + } +} + +struct scatterlist * +i915_gem_object_get_sg(struct drm_i915_gem_object *obj, + unsigned int n, + unsigned int *offset) +{ + struct i915_gem_object_page_iter *iter = &obj->mm.get_page; + struct scatterlist *sg; + unsigned int idx, count; + + might_sleep(); + GEM_BUG_ON(n >= obj->base.size >> PAGE_SHIFT); + GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); + + /* As we iterate forward through the sg, we record each entry in a + * radixtree for quick repeated (backwards) lookups. If we have seen + * this index previously, we will have an entry for it. + * + * Initial lookup is O(N), but this is amortized to O(1) for + * sequential page access (where each new request is consecutive + * to the previous one). Repeated lookups are O(lg(obj->base.size)), + * i.e. O(1) with a large constant! + */ + if (n < READ_ONCE(iter->sg_idx)) + goto lookup; + + mutex_lock(&iter->lock); + + /* We prefer to reuse the last sg so that repeated lookup of this + * (or the subsequent) sg are fast - comparing against the last + * sg is faster than going through the radixtree. + */ + + sg = iter->sg_pos; + idx = iter->sg_idx; + count = __sg_page_count(sg); + + while (idx + count <= n) { + void *entry; + unsigned long i; + int ret; + + /* If we cannot allocate and insert this entry, or the + * individual pages from this range, cancel updating the + * sg_idx so that on this lookup we are forced to linearly + * scan onwards, but on future lookups we will try the + * insertion again (in which case we need to be careful of + * the error return reporting that we have already inserted + * this index). + */ + ret = radix_tree_insert(&iter->radix, idx, sg); + if (ret && ret != -EEXIST) + goto scan; + + entry = xa_mk_value(idx); + for (i = 1; i < count; i++) { + ret = radix_tree_insert(&iter->radix, idx + i, entry); + if (ret && ret != -EEXIST) + goto scan; + } + + idx += count; + sg = ____sg_next(sg); + count = __sg_page_count(sg); + } + +scan: + iter->sg_pos = sg; + iter->sg_idx = idx; + + mutex_unlock(&iter->lock); + + if (unlikely(n < idx)) /* insertion completed by another thread */ + goto lookup; + + /* In case we failed to insert the entry into the radixtree, we need + * to look beyond the current sg. + */ + while (idx + count <= n) { + idx += count; + sg = ____sg_next(sg); + count = __sg_page_count(sg); + } + + *offset = n - idx; + return sg; + +lookup: + rcu_read_lock(); + + sg = radix_tree_lookup(&iter->radix, n); + GEM_BUG_ON(!sg); + + /* If this index is in the middle of multi-page sg entry, + * the radix tree will contain a value entry that points + * to the start of that range. We will return the pointer to + * the base page and the offset of this page within the + * sg entry's range. + */ + *offset = 0; + if (unlikely(xa_is_value(sg))) { + unsigned long base = xa_to_value(sg); + + sg = radix_tree_lookup(&iter->radix, base); + GEM_BUG_ON(!sg); + + *offset = n - base; + } + + rcu_read_unlock(); + + return sg; +} + +struct page * +i915_gem_object_get_page(struct drm_i915_gem_object *obj, unsigned int n) +{ + struct scatterlist *sg; + unsigned int offset; + + GEM_BUG_ON(!i915_gem_object_has_struct_page(obj)); + + sg = i915_gem_object_get_sg(obj, n, &offset); + return nth_page(sg_page(sg), offset); +} + +/* Like i915_gem_object_get_page(), but mark the returned page dirty */ +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, + unsigned int n) +{ + struct page *page; + + page = i915_gem_object_get_page(obj, n); + if (!obj->mm.dirty) + set_page_dirty(page); + + return page; +} + +dma_addr_t +i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj, + unsigned long n, + unsigned int *len) +{ + struct scatterlist *sg; + unsigned int offset; + + sg = i915_gem_object_get_sg(obj, n, &offset); + + if (len) + *len = sg_dma_len(sg) - (offset << PAGE_SHIFT); + + return sg_dma_address(sg) + (offset << PAGE_SHIFT); +} + +dma_addr_t +i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, + unsigned long n) +{ + return i915_gem_object_get_dma_address_len(obj, n, NULL); +} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c new file mode 100644 index 000000000000..1c0ce69f765b --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c @@ -0,0 +1,211 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include +#include +#include + +#include /* for drm_legacy.h! */ +#include +#include /* for drm_pci.h! */ +#include + +#include "i915_drv.h" +#include "i915_gem_object.h" + +static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) +{ + struct address_space *mapping = obj->base.filp->f_mapping; + struct drm_dma_handle *phys; + struct sg_table *st; + struct scatterlist *sg; + char *vaddr; + int i; + int err; + + if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj))) + return -EINVAL; + + /* Always aligning to the object size, allows a single allocation + * to handle all possible callers, and given typical object sizes, + * the alignment of the buddy allocation will naturally match. + */ + phys = drm_pci_alloc(obj->base.dev, + roundup_pow_of_two(obj->base.size), + roundup_pow_of_two(obj->base.size)); + if (!phys) + return -ENOMEM; + + vaddr = phys->vaddr; + for (i = 0; i < obj->base.size / PAGE_SIZE; i++) { + struct page *page; + char *src; + + page = shmem_read_mapping_page(mapping, i); + if (IS_ERR(page)) { + err = PTR_ERR(page); + goto err_phys; + } + + src = kmap_atomic(page); + memcpy(vaddr, src, PAGE_SIZE); + drm_clflush_virt_range(vaddr, PAGE_SIZE); + kunmap_atomic(src); + + put_page(page); + vaddr += PAGE_SIZE; + } + + i915_gem_chipset_flush(to_i915(obj->base.dev)); + + st = kmalloc(sizeof(*st), GFP_KERNEL); + if (!st) { + err = -ENOMEM; + goto err_phys; + } + + if (sg_alloc_table(st, 1, GFP_KERNEL)) { + kfree(st); + err = -ENOMEM; + goto err_phys; + } + + sg = st->sgl; + sg->offset = 0; + sg->length = obj->base.size; + + sg_dma_address(sg) = phys->busaddr; + sg_dma_len(sg) = obj->base.size; + + obj->phys_handle = phys; + + __i915_gem_object_set_pages(obj, st, sg->length); + + return 0; + +err_phys: + drm_pci_free(obj->base.dev, phys); + + return err; +} + +static void +i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, + struct sg_table *pages) +{ + __i915_gem_object_release_shmem(obj, pages, false); + + if (obj->mm.dirty) { + struct address_space *mapping = obj->base.filp->f_mapping; + char *vaddr = obj->phys_handle->vaddr; + int i; + + for (i = 0; i < obj->base.size / PAGE_SIZE; i++) { + struct page *page; + char *dst; + + page = shmem_read_mapping_page(mapping, i); + if (IS_ERR(page)) + continue; + + dst = kmap_atomic(page); + drm_clflush_virt_range(vaddr, PAGE_SIZE); + memcpy(dst, vaddr, PAGE_SIZE); + kunmap_atomic(dst); + + set_page_dirty(page); + if (obj->mm.madv == I915_MADV_WILLNEED) + mark_page_accessed(page); + put_page(page); + vaddr += PAGE_SIZE; + } + obj->mm.dirty = false; + } + + sg_free_table(pages); + kfree(pages); + + drm_pci_free(obj->base.dev, obj->phys_handle); +} + +static void +i915_gem_object_release_phys(struct drm_i915_gem_object *obj) +{ + i915_gem_object_unpin_pages(obj); +} + +static const struct drm_i915_gem_object_ops i915_gem_phys_ops = { + .get_pages = i915_gem_object_get_pages_phys, + .put_pages = i915_gem_object_put_pages_phys, + .release = i915_gem_object_release_phys, +}; + +int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) +{ + struct sg_table *pages; + int err; + + if (align > obj->base.size) + return -EINVAL; + + if (obj->ops == &i915_gem_phys_ops) + return 0; + + if (obj->ops != &i915_gem_shmem_ops) + return -EINVAL; + + err = i915_gem_object_unbind(obj); + if (err) + return err; + + mutex_lock(&obj->mm.lock); + + if (obj->mm.madv != I915_MADV_WILLNEED) { + err = -EFAULT; + goto err_unlock; + } + + if (obj->mm.quirked) { + err = -EFAULT; + goto err_unlock; + } + + if (obj->mm.mapping) { + err = -EBUSY; + goto err_unlock; + } + + pages = __i915_gem_object_unset_pages(obj); + + obj->ops = &i915_gem_phys_ops; + + err = ____i915_gem_object_get_pages(obj); + if (err) + goto err_xfer; + + /* Perma-pin (until release) the physical set of pages */ + __i915_gem_object_pin_pages(obj); + + if (!IS_ERR_OR_NULL(pages)) + i915_gem_shmem_ops.put_pages(obj, pages); + mutex_unlock(&obj->mm.lock); + return 0; + +err_xfer: + obj->ops = &i915_gem_shmem_ops; + if (!IS_ERR_OR_NULL(pages)) { + unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl); + + __i915_gem_object_set_pages(obj, pages, sg_page_sizes); + } +err_unlock: + mutex_unlock(&obj->mm.lock); + return err; +} + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/i915_gem_phys.c" +#endif diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index e2721bd9ab44..568164ca66fd 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -213,6 +213,65 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) return ret; } +static void +shmem_truncate(struct drm_i915_gem_object *obj) +{ + /* + * Our goal here is to return as much of the memory as + * is possible back to the system as we are called from OOM. + * To do this we must instruct the shmfs to drop all of its + * backing pages, *now*. + */ + shmem_truncate_range(file_inode(obj->base.filp), 0, (loff_t)-1); + obj->mm.madv = __I915_MADV_PURGED; + obj->mm.pages = ERR_PTR(-EFAULT); +} + +static void +shmem_writeback(struct drm_i915_gem_object *obj) +{ + struct address_space *mapping; + struct writeback_control wbc = { + .sync_mode = WB_SYNC_NONE, + .nr_to_write = SWAP_CLUSTER_MAX, + .range_start = 0, + .range_end = LLONG_MAX, + .for_reclaim = 1, + }; + unsigned long i; + + /* + * Leave mmapings intact (GTT will have been revoked on unbinding, + * leaving only CPU mmapings around) and add those pages to the LRU + * instead of invoking writeback so they are aged and paged out + * as normal. + */ + mapping = obj->base.filp->f_mapping; + + /* Begin writeback on each dirty page */ + for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) { + struct page *page; + + page = find_lock_entry(mapping, i); + if (!page || xa_is_value(page)) + continue; + + if (!page_mapped(page) && clear_page_dirty_for_io(page)) { + int ret; + + SetPageReclaim(page); + ret = mapping->a_ops->writepage(page, &wbc); + if (!PageWriteback(page)) + ClearPageReclaim(page); + if (!ret) + goto put; + } + unlock_page(page); +put: + put_page(page); + } +} + void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, @@ -362,6 +421,8 @@ const struct drm_i915_gem_object_ops i915_gem_shmem_ops = { .get_pages = shmem_get_pages, .put_pages = shmem_put_pages, + .truncate = shmem_truncate, + .writeback = shmem_writeback, .pwrite = shmem_pwrite, }; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c new file mode 100644 index 000000000000..ed64012e5d24 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c @@ -0,0 +1,80 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#include "i915_selftest.h" + +#include "selftests/mock_gem_device.h" + +static int mock_phys_object(void *arg) +{ + struct drm_i915_private *i915 = arg; + struct drm_i915_gem_object *obj; + int err; + + /* Create an object and bind it to a contiguous set of physical pages, + * i.e. exercise the i915_gem_object_phys API. + */ + + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); + if (IS_ERR(obj)) { + err = PTR_ERR(obj); + pr_err("i915_gem_object_create failed, err=%d\n", err); + goto out; + } + + mutex_lock(&i915->drm.struct_mutex); + err = i915_gem_object_attach_phys(obj, PAGE_SIZE); + mutex_unlock(&i915->drm.struct_mutex); + if (err) { + pr_err("i915_gem_object_attach_phys failed, err=%d\n", err); + goto out_obj; + } + + if (obj->ops != &i915_gem_phys_ops) { + pr_err("i915_gem_object_attach_phys did not create a phys object\n"); + err = -EINVAL; + goto out_obj; + } + + if (!atomic_read(&obj->mm.pages_pin_count)) { + pr_err("i915_gem_object_attach_phys did not pin its phys pages\n"); + err = -EINVAL; + goto out_obj; + } + + /* Make the object dirty so that put_pages must do copy back the data */ + mutex_lock(&i915->drm.struct_mutex); + err = i915_gem_object_set_to_gtt_domain(obj, true); + mutex_unlock(&i915->drm.struct_mutex); + if (err) { + pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n", + err); + goto out_obj; + } + +out_obj: + i915_gem_object_put(obj); +out: + return err; +} + +int i915_gem_phys_mock_selftests(void) +{ + static const struct i915_subtest tests[] = { + SUBTEST(mock_phys_object), + }; + struct drm_i915_private *i915; + int err; + + i915 = mock_gem_device(); + if (!i915) + return -ENOMEM; + + err = i915_subtests(tests, i915); + + drm_dev_put(&i915->drm); + return err; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a541a524396f..107447287b22 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2892,8 +2892,6 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, unsigned int flags); void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); -int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, - int align); int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file); void i915_gem_release(struct drm_device *dev, struct drm_file *file); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f02a50999a7d..e8e6f3d3082a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -26,7 +26,6 @@ */ #include -#include #include #include #include @@ -99,133 +98,6 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data, return 0; } -static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) -{ - struct address_space *mapping = obj->base.filp->f_mapping; - drm_dma_handle_t *phys; - struct sg_table *st; - struct scatterlist *sg; - char *vaddr; - int i; - int err; - - if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj))) - return -EINVAL; - - /* Always aligning to the object size, allows a single allocation - * to handle all possible callers, and given typical object sizes, - * the alignment of the buddy allocation will naturally match. - */ - phys = drm_pci_alloc(obj->base.dev, - roundup_pow_of_two(obj->base.size), - roundup_pow_of_two(obj->base.size)); - if (!phys) - return -ENOMEM; - - vaddr = phys->vaddr; - for (i = 0; i < obj->base.size / PAGE_SIZE; i++) { - struct page *page; - char *src; - - page = shmem_read_mapping_page(mapping, i); - if (IS_ERR(page)) { - err = PTR_ERR(page); - goto err_phys; - } - - src = kmap_atomic(page); - memcpy(vaddr, src, PAGE_SIZE); - drm_clflush_virt_range(vaddr, PAGE_SIZE); - kunmap_atomic(src); - - put_page(page); - vaddr += PAGE_SIZE; - } - - i915_gem_chipset_flush(to_i915(obj->base.dev)); - - st = kmalloc(sizeof(*st), GFP_KERNEL); - if (!st) { - err = -ENOMEM; - goto err_phys; - } - - if (sg_alloc_table(st, 1, GFP_KERNEL)) { - kfree(st); - err = -ENOMEM; - goto err_phys; - } - - sg = st->sgl; - sg->offset = 0; - sg->length = obj->base.size; - - sg_dma_address(sg) = phys->busaddr; - sg_dma_len(sg) = obj->base.size; - - obj->phys_handle = phys; - - __i915_gem_object_set_pages(obj, st, sg->length); - - return 0; - -err_phys: - drm_pci_free(obj->base.dev, phys); - - return err; -} - -static void -i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, - struct sg_table *pages) -{ - __i915_gem_object_release_shmem(obj, pages, false); - - if (obj->mm.dirty) { - struct address_space *mapping = obj->base.filp->f_mapping; - char *vaddr = obj->phys_handle->vaddr; - int i; - - for (i = 0; i < obj->base.size / PAGE_SIZE; i++) { - struct page *page; - char *dst; - - page = shmem_read_mapping_page(mapping, i); - if (IS_ERR(page)) - continue; - - dst = kmap_atomic(page); - drm_clflush_virt_range(vaddr, PAGE_SIZE); - memcpy(dst, vaddr, PAGE_SIZE); - kunmap_atomic(dst); - - set_page_dirty(page); - if (obj->mm.madv == I915_MADV_WILLNEED) - mark_page_accessed(page); - put_page(page); - vaddr += PAGE_SIZE; - } - obj->mm.dirty = false; - } - - sg_free_table(pages); - kfree(pages); - - drm_pci_free(obj->base.dev, obj->phys_handle); -} - -static void -i915_gem_object_release_phys(struct drm_i915_gem_object *obj) -{ - i915_gem_object_unpin_pages(obj); -} - -static const struct drm_i915_gem_object_ops i915_gem_phys_ops = { - .get_pages = i915_gem_object_get_pages_phys, - .put_pages = i915_gem_object_put_pages_phys, - .release = i915_gem_object_release_phys, -}; - int i915_gem_object_unbind(struct drm_i915_gem_object *obj) { struct i915_vma *vma; @@ -1961,11 +1833,6 @@ static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj) return err; } -static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj) -{ - drm_gem_free_mmap_offset(&obj->base); -} - int i915_gem_mmap_gtt(struct drm_file *file, struct drm_device *dev, @@ -2011,111 +1878,6 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data, return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset); } -/* Immediately discard the backing storage */ -void __i915_gem_object_truncate(struct drm_i915_gem_object *obj) -{ - i915_gem_object_free_mmap_offset(obj); - - if (obj->base.filp == NULL) - return; - - /* Our goal here is to return as much of the memory as - * is possible back to the system as we are called from OOM. - * To do this we must instruct the shmfs to drop all of its - * backing pages, *now*. - */ - shmem_truncate_range(file_inode(obj->base.filp), 0, (loff_t)-1); - obj->mm.madv = __I915_MADV_PURGED; - obj->mm.pages = ERR_PTR(-EFAULT); -} - -static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj) -{ - struct radix_tree_iter iter; - void __rcu **slot; - - rcu_read_lock(); - radix_tree_for_each_slot(slot, &obj->mm.get_page.radix, &iter, 0) - radix_tree_delete(&obj->mm.get_page.radix, iter.index); - rcu_read_unlock(); -} - -static struct sg_table * -__i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct sg_table *pages; - - pages = fetch_and_zero(&obj->mm.pages); - if (IS_ERR_OR_NULL(pages)) - return pages; - - spin_lock(&i915->mm.obj_lock); - list_del(&obj->mm.link); - spin_unlock(&i915->mm.obj_lock); - - if (obj->mm.mapping) { - void *ptr; - - ptr = page_mask_bits(obj->mm.mapping); - if (is_vmalloc_addr(ptr)) - vunmap(ptr); - else - kunmap(kmap_to_page(ptr)); - - obj->mm.mapping = NULL; - } - - __i915_gem_object_reset_page_iter(obj); - obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0; - - return pages; -} - -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj, - enum i915_mm_subclass subclass) -{ - struct sg_table *pages; - int ret; - - if (i915_gem_object_has_pinned_pages(obj)) - return -EBUSY; - - GEM_BUG_ON(obj->bind_count); - - /* May be called by shrinker from within get_pages() (on another bo) */ - mutex_lock_nested(&obj->mm.lock, subclass); - if (unlikely(atomic_read(&obj->mm.pages_pin_count))) { - ret = -EBUSY; - goto unlock; - } - - /* - * ->put_pages might need to allocate memory for the bit17 swizzle - * array, hence protect them from being reaped by removing them from gtt - * lists early. - */ - pages = __i915_gem_object_unset_pages(obj); - - /* - * XXX Temporary hijinx to avoid updating all backends to handle - * NULL pages. In the future, when we have more asynchronous - * get_pages backends we should be better able to handle the - * cancellation of the async task in a more uniform manner. - */ - if (!pages && !i915_gem_object_needs_async_cancel(obj)) - pages = ERR_PTR(-EINVAL); - - if (!IS_ERR(pages)) - obj->ops->put_pages(obj, pages); - - ret = 0; -unlock: - mutex_unlock(&obj->mm.lock); - - return ret; -} - bool i915_sg_trim(struct sg_table *orig_st) { struct sg_table new_st; @@ -2144,252 +1906,6 @@ bool i915_sg_trim(struct sg_table *orig_st) return true; } -void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, - struct sg_table *pages, - unsigned int sg_page_sizes) -{ - struct drm_i915_private *i915 = to_i915(obj->base.dev); - unsigned long supported = INTEL_INFO(i915)->page_sizes; - int i; - - lockdep_assert_held(&obj->mm.lock); - - /* Make the pages coherent with the GPU (flushing any swapin). */ - if (obj->cache_dirty) { - obj->write_domain = 0; - if (i915_gem_object_has_struct_page(obj)) - drm_clflush_sg(pages); - obj->cache_dirty = false; - } - - obj->mm.get_page.sg_pos = pages->sgl; - obj->mm.get_page.sg_idx = 0; - - obj->mm.pages = pages; - - if (i915_gem_object_is_tiled(obj) && - i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) { - GEM_BUG_ON(obj->mm.quirked); - __i915_gem_object_pin_pages(obj); - obj->mm.quirked = true; - } - - GEM_BUG_ON(!sg_page_sizes); - obj->mm.page_sizes.phys = sg_page_sizes; - - /* - * Calculate the supported page-sizes which fit into the given - * sg_page_sizes. This will give us the page-sizes which we may be able - * to use opportunistically when later inserting into the GTT. For - * example if phys=2G, then in theory we should be able to use 1G, 2M, - * 64K or 4K pages, although in practice this will depend on a number of - * other factors. - */ - obj->mm.page_sizes.sg = 0; - for_each_set_bit(i, &supported, ilog2(I915_GTT_MAX_PAGE_SIZE) + 1) { - if (obj->mm.page_sizes.phys & ~0u << i) - obj->mm.page_sizes.sg |= BIT(i); - } - GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg)); - - spin_lock(&i915->mm.obj_lock); - list_add(&obj->mm.link, &i915->mm.unbound_list); - spin_unlock(&i915->mm.obj_lock); -} - -static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj) -{ - int err; - - if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) { - DRM_DEBUG("Attempting to obtain a purgeable object\n"); - return -EFAULT; - } - - err = obj->ops->get_pages(obj); - GEM_BUG_ON(!err && !i915_gem_object_has_pages(obj)); - - return err; -} - -/* Ensure that the associated pages are gathered from the backing storage - * and pinned into our object. i915_gem_object_pin_pages() may be called - * multiple times before they are released by a single call to - * i915_gem_object_unpin_pages() - once the pages are no longer referenced - * either as a result of memory pressure (reaping pages under the shrinker) - * or as the object is itself released. - */ -int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) -{ - int err; - - err = mutex_lock_interruptible(&obj->mm.lock); - if (err) - return err; - - if (unlikely(!i915_gem_object_has_pages(obj))) { - GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); - - err = ____i915_gem_object_get_pages(obj); - if (err) - goto unlock; - - smp_mb__before_atomic(); - } - atomic_inc(&obj->mm.pages_pin_count); - -unlock: - mutex_unlock(&obj->mm.lock); - return err; -} - -/* The 'mapping' part of i915_gem_object_pin_map() below */ -static void *i915_gem_object_map(const struct drm_i915_gem_object *obj, - enum i915_map_type type) -{ - unsigned long n_pages = obj->base.size >> PAGE_SHIFT; - struct sg_table *sgt = obj->mm.pages; - struct sgt_iter sgt_iter; - struct page *page; - struct page *stack_pages[32]; - struct page **pages = stack_pages; - unsigned long i = 0; - pgprot_t pgprot; - void *addr; - - /* A single page can always be kmapped */ - if (n_pages == 1 && type == I915_MAP_WB) - return kmap(sg_page(sgt->sgl)); - - if (n_pages > ARRAY_SIZE(stack_pages)) { - /* Too big for stack -- allocate temporary array instead */ - pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL); - if (!pages) - return NULL; - } - - for_each_sgt_page(page, sgt_iter, sgt) - pages[i++] = page; - - /* Check that we have the expected number of pages */ - GEM_BUG_ON(i != n_pages); - - switch (type) { - default: - MISSING_CASE(type); - /* fallthrough to use PAGE_KERNEL anyway */ - case I915_MAP_WB: - pgprot = PAGE_KERNEL; - break; - case I915_MAP_WC: - pgprot = pgprot_writecombine(PAGE_KERNEL_IO); - break; - } - addr = vmap(pages, n_pages, 0, pgprot); - - if (pages != stack_pages) - kvfree(pages); - - return addr; -} - -/* get, pin, and map the pages of the object into kernel space */ -void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, - enum i915_map_type type) -{ - enum i915_map_type has_type; - bool pinned; - void *ptr; - int ret; - - if (unlikely(!i915_gem_object_has_struct_page(obj))) - return ERR_PTR(-ENXIO); - - ret = mutex_lock_interruptible(&obj->mm.lock); - if (ret) - return ERR_PTR(ret); - - pinned = !(type & I915_MAP_OVERRIDE); - type &= ~I915_MAP_OVERRIDE; - - if (!atomic_inc_not_zero(&obj->mm.pages_pin_count)) { - if (unlikely(!i915_gem_object_has_pages(obj))) { - GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); - - ret = ____i915_gem_object_get_pages(obj); - if (ret) - goto err_unlock; - - smp_mb__before_atomic(); - } - atomic_inc(&obj->mm.pages_pin_count); - pinned = false; - } - GEM_BUG_ON(!i915_gem_object_has_pages(obj)); - - ptr = page_unpack_bits(obj->mm.mapping, &has_type); - if (ptr && has_type != type) { - if (pinned) { - ret = -EBUSY; - goto err_unpin; - } - - if (is_vmalloc_addr(ptr)) - vunmap(ptr); - else - kunmap(kmap_to_page(ptr)); - - ptr = obj->mm.mapping = NULL; - } - - if (!ptr) { - ptr = i915_gem_object_map(obj, type); - if (!ptr) { - ret = -ENOMEM; - goto err_unpin; - } - - obj->mm.mapping = page_pack_bits(ptr, type); - } - -out_unlock: - mutex_unlock(&obj->mm.lock); - return ptr; - -err_unpin: - atomic_dec(&obj->mm.pages_pin_count); -err_unlock: - ptr = ERR_PTR(ret); - goto out_unlock; -} - -void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, - unsigned long offset, - unsigned long size) -{ - enum i915_map_type has_type; - void *ptr; - - GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); - GEM_BUG_ON(range_overflows_t(typeof(obj->base.size), - offset, size, obj->base.size)); - - obj->mm.dirty = true; - - if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) - return; - - ptr = page_unpack_bits(obj->mm.mapping, &has_type); - if (has_type == I915_MAP_WC) - return; - - drm_clflush_virt_range(ptr + offset, size); - if (size == obj->base.size) { - obj->write_domain &= ~I915_GEM_DOMAIN_CPU; - obj->cache_dirty = false; - } -} - static unsigned long to_wait_timeout(s64 timeout_ns) { if (timeout_ns < 0) @@ -3378,7 +2894,7 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data, /* if the object is no longer attached, discard its backing storage */ if (obj->mm.madv == I915_MADV_DONTNEED && !i915_gem_object_has_pages(obj)) - __i915_gem_object_truncate(obj); + i915_gem_object_truncate(obj); args->retained = obj->mm.madv != __I915_MADV_PURGED; mutex_unlock(&obj->mm.lock); @@ -4150,232 +3666,6 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old, } } -struct scatterlist * -i915_gem_object_get_sg(struct drm_i915_gem_object *obj, - unsigned int n, - unsigned int *offset) -{ - struct i915_gem_object_page_iter *iter = &obj->mm.get_page; - struct scatterlist *sg; - unsigned int idx, count; - - might_sleep(); - GEM_BUG_ON(n >= obj->base.size >> PAGE_SHIFT); - GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); - - /* As we iterate forward through the sg, we record each entry in a - * radixtree for quick repeated (backwards) lookups. If we have seen - * this index previously, we will have an entry for it. - * - * Initial lookup is O(N), but this is amortized to O(1) for - * sequential page access (where each new request is consecutive - * to the previous one). Repeated lookups are O(lg(obj->base.size)), - * i.e. O(1) with a large constant! - */ - if (n < READ_ONCE(iter->sg_idx)) - goto lookup; - - mutex_lock(&iter->lock); - - /* We prefer to reuse the last sg so that repeated lookup of this - * (or the subsequent) sg are fast - comparing against the last - * sg is faster than going through the radixtree. - */ - - sg = iter->sg_pos; - idx = iter->sg_idx; - count = __sg_page_count(sg); - - while (idx + count <= n) { - void *entry; - unsigned long i; - int ret; - - /* If we cannot allocate and insert this entry, or the - * individual pages from this range, cancel updating the - * sg_idx so that on this lookup we are forced to linearly - * scan onwards, but on future lookups we will try the - * insertion again (in which case we need to be careful of - * the error return reporting that we have already inserted - * this index). - */ - ret = radix_tree_insert(&iter->radix, idx, sg); - if (ret && ret != -EEXIST) - goto scan; - - entry = xa_mk_value(idx); - for (i = 1; i < count; i++) { - ret = radix_tree_insert(&iter->radix, idx + i, entry); - if (ret && ret != -EEXIST) - goto scan; - } - - idx += count; - sg = ____sg_next(sg); - count = __sg_page_count(sg); - } - -scan: - iter->sg_pos = sg; - iter->sg_idx = idx; - - mutex_unlock(&iter->lock); - - if (unlikely(n < idx)) /* insertion completed by another thread */ - goto lookup; - - /* In case we failed to insert the entry into the radixtree, we need - * to look beyond the current sg. - */ - while (idx + count <= n) { - idx += count; - sg = ____sg_next(sg); - count = __sg_page_count(sg); - } - - *offset = n - idx; - return sg; - -lookup: - rcu_read_lock(); - - sg = radix_tree_lookup(&iter->radix, n); - GEM_BUG_ON(!sg); - - /* If this index is in the middle of multi-page sg entry, - * the radix tree will contain a value entry that points - * to the start of that range. We will return the pointer to - * the base page and the offset of this page within the - * sg entry's range. - */ - *offset = 0; - if (unlikely(xa_is_value(sg))) { - unsigned long base = xa_to_value(sg); - - sg = radix_tree_lookup(&iter->radix, base); - GEM_BUG_ON(!sg); - - *offset = n - base; - } - - rcu_read_unlock(); - - return sg; -} - -struct page * -i915_gem_object_get_page(struct drm_i915_gem_object *obj, unsigned int n) -{ - struct scatterlist *sg; - unsigned int offset; - - GEM_BUG_ON(!i915_gem_object_has_struct_page(obj)); - - sg = i915_gem_object_get_sg(obj, n, &offset); - return nth_page(sg_page(sg), offset); -} - -/* Like i915_gem_object_get_page(), but mark the returned page dirty */ -struct page * -i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, - unsigned int n) -{ - struct page *page; - - page = i915_gem_object_get_page(obj, n); - if (!obj->mm.dirty) - set_page_dirty(page); - - return page; -} - -dma_addr_t -i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj, - unsigned long n, - unsigned int *len) -{ - struct scatterlist *sg; - unsigned int offset; - - sg = i915_gem_object_get_sg(obj, n, &offset); - - if (len) - *len = sg_dma_len(sg) - (offset << PAGE_SHIFT); - - return sg_dma_address(sg) + (offset << PAGE_SHIFT); -} - -dma_addr_t -i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj, - unsigned long n) -{ - return i915_gem_object_get_dma_address_len(obj, n, NULL); -} - - -int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) -{ - struct sg_table *pages; - int err; - - if (align > obj->base.size) - return -EINVAL; - - if (obj->ops == &i915_gem_phys_ops) - return 0; - - if (obj->ops != &i915_gem_shmem_ops) - return -EINVAL; - - err = i915_gem_object_unbind(obj); - if (err) - return err; - - mutex_lock(&obj->mm.lock); - - if (obj->mm.madv != I915_MADV_WILLNEED) { - err = -EFAULT; - goto err_unlock; - } - - if (obj->mm.quirked) { - err = -EFAULT; - goto err_unlock; - } - - if (obj->mm.mapping) { - err = -EBUSY; - goto err_unlock; - } - - pages = __i915_gem_object_unset_pages(obj); - - obj->ops = &i915_gem_phys_ops; - - err = ____i915_gem_object_get_pages(obj); - if (err) - goto err_xfer; - - /* Perma-pin (until release) the physical set of pages */ - __i915_gem_object_pin_pages(obj); - - if (!IS_ERR_OR_NULL(pages)) - i915_gem_shmem_ops.put_pages(obj, pages); - mutex_unlock(&obj->mm.lock); - return 0; - -err_xfer: - obj->ops = &i915_gem_shmem_ops; - if (!IS_ERR_OR_NULL(pages)) { - unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl); - - __i915_gem_object_set_pages(obj, pages, sg_page_sizes); - } -err_unlock: - mutex_unlock(&obj->mm.lock); - return err; -} - #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/scatterlist.c" #include "selftests/mock_gem_device.c" diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c index 588e3898b120..2c7aefb3e101 100644 --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c @@ -114,65 +114,18 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj) return !i915_gem_object_has_pages(obj); } -static void __start_writeback(struct drm_i915_gem_object *obj, - unsigned int flags) +static void try_to_writeback(struct drm_i915_gem_object *obj, + unsigned int flags) { - struct address_space *mapping; - struct writeback_control wbc = { - .sync_mode = WB_SYNC_NONE, - .nr_to_write = SWAP_CLUSTER_MAX, - .range_start = 0, - .range_end = LLONG_MAX, - .for_reclaim = 1, - }; - unsigned long i; - - lockdep_assert_held(&obj->mm.lock); - GEM_BUG_ON(i915_gem_object_has_pages(obj)); - switch (obj->mm.madv) { case I915_MADV_DONTNEED: - __i915_gem_object_truncate(obj); + i915_gem_object_truncate(obj); case __I915_MADV_PURGED: return; } - if (!obj->base.filp) - return; - - if (!(flags & I915_SHRINK_WRITEBACK)) - return; - - /* - * Leave mmapings intact (GTT will have been revoked on unbinding, - * leaving only CPU mmapings around) and add those pages to the LRU - * instead of invoking writeback so they are aged and paged out - * as normal. - */ - mapping = obj->base.filp->f_mapping; - - /* Begin writeback on each dirty page */ - for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) { - struct page *page; - - page = find_lock_entry(mapping, i); - if (!page || xa_is_value(page)) - continue; - - if (!page_mapped(page) && clear_page_dirty_for_io(page)) { - int ret; - - SetPageReclaim(page); - ret = mapping->a_ops->writepage(page, &wbc); - if (!PageWriteback(page)) - ClearPageReclaim(page); - if (!ret) - goto put; - } - unlock_page(page); -put: - put_page(page); - } + if (flags & I915_SHRINK_WRITEBACK) + i915_gem_object_writeback(obj); } /** @@ -315,7 +268,7 @@ i915_gem_shrink(struct drm_i915_private *i915, mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER); if (!i915_gem_object_has_pages(obj)) { - __start_writeback(obj, flags); + try_to_writeback(obj, flags); count += obj->base.size >> PAGE_SHIFT; } mutex_unlock(&obj->mm.lock); diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c index a0f59fb0d701..b98a286a8be5 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c @@ -49,59 +49,6 @@ static int igt_gem_object(void *arg) return err; } -static int igt_phys_object(void *arg) -{ - struct drm_i915_private *i915 = arg; - struct drm_i915_gem_object *obj; - int err; - - /* Create an object and bind it to a contiguous set of physical pages, - * i.e. exercise the i915_gem_object_phys API. - */ - - obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); - if (IS_ERR(obj)) { - err = PTR_ERR(obj); - pr_err("i915_gem_object_create failed, err=%d\n", err); - goto out; - } - - mutex_lock(&i915->drm.struct_mutex); - err = i915_gem_object_attach_phys(obj, PAGE_SIZE); - mutex_unlock(&i915->drm.struct_mutex); - if (err) { - pr_err("i915_gem_object_attach_phys failed, err=%d\n", err); - goto out_obj; - } - - if (obj->ops != &i915_gem_phys_ops) { - pr_err("i915_gem_object_attach_phys did not create a phys object\n"); - err = -EINVAL; - goto out_obj; - } - - if (!atomic_read(&obj->mm.pages_pin_count)) { - pr_err("i915_gem_object_attach_phys did not pin its phys pages\n"); - err = -EINVAL; - goto out_obj; - } - - /* Make the object dirty so that put_pages must do copy back the data */ - mutex_lock(&i915->drm.struct_mutex); - err = i915_gem_object_set_to_gtt_domain(obj, true); - mutex_unlock(&i915->drm.struct_mutex); - if (err) { - pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n", - err); - goto out_obj; - } - -out_obj: - i915_gem_object_put(obj); -out: - return err; -} - static int igt_gem_huge(void *arg) { const unsigned int nreal = 509; /* just to be awkward */ @@ -631,7 +578,6 @@ int i915_gem_object_mock_selftests(void) { static const struct i915_subtest tests[] = { SUBTEST(igt_gem_object), - SUBTEST(igt_phys_object), }; struct drm_i915_private *i915; int err; diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h index 88e5ab586337..510eb176bb2c 100644 --- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h @@ -18,6 +18,7 @@ selftest(engine, intel_engine_cs_mock_selftests) selftest(timelines, i915_timeline_mock_selftests) selftest(requests, i915_request_mock_selftests) selftest(objects, i915_gem_object_mock_selftests) +selftest(phys, i915_gem_phys_mock_selftests) selftest(dmabuf, i915_gem_dmabuf_mock_selftests) selftest(vma, i915_vma_mock_selftests) selftest(evict, i915_gem_evict_mock_selftests) From patchwork Wed May 22 08:20:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955279 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 226E3912 for ; Wed, 22 May 2019 08:21:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 18D7A202A5 for ; Wed, 22 May 2019 08:21:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0C34228A7A; Wed, 22 May 2019 08:21:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 97048289D7 for ; Wed, 22 May 2019 08:21:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 00199897F6; Wed, 22 May 2019 08:21:17 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id D388E897E8 for ; Wed, 22 May 2019 08:21:12 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636786-1500050 for ; Wed, 22 May 2019 09:21:05 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:57 +0100 Message-Id: <20190522082104.5117-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 06/13] drm/i915: Move mmap and friends to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c, now the turn of do_mmap and the faulthandlers Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_mman.c | 504 ++++++++++++++++ drivers/gpu/drm/i915/gem/i915_gem_object.c | 56 ++ drivers/gpu/drm/i915/gem/i915_gem_object.h | 7 + .../drm/i915/gem/selftests/i915_gem_mman.c | 503 ++++++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/i915_gem.c | 561 +----------------- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- .../gpu/drm/i915/selftests/i915_gem_object.c | 487 --------------- .../drm/i915/selftests/i915_live_selftests.h | 1 + 10 files changed, 1087 insertions(+), 1036 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_mman.c create mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index ba3b82f3cd49..d05757c52492 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -88,6 +88,7 @@ i915-y += $(gt-y) obj-y += gem/ gem-y += \ gem/i915_gem_object.o \ + gem/i915_gem_mman.o \ gem/i915_gem_pages.o \ gem/i915_gem_phys.o \ gem/i915_gem_shmem.o diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c new file mode 100644 index 000000000000..0da40183f63c --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -0,0 +1,504 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include +#include + +#include "i915_drv.h" +#include "i915_gem_gtt.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" +#include "i915_vma.h" +#include "intel_drv.h" + +static inline bool +__vma_matches(struct vm_area_struct *vma, struct file *filp, + unsigned long addr, unsigned long size) +{ + if (vma->vm_file != filp) + return false; + + return vma->vm_start == addr && + (vma->vm_end - vma->vm_start) == PAGE_ALIGN(size); +} + +/** + * i915_gem_mmap_ioctl - Maps the contents of an object, returning the address + * it is mapped to. + * @dev: drm device + * @data: ioctl data blob + * @file: drm file + * + * While the mapping holds a reference on the contents of the object, it doesn't + * imply a ref on the object itself. + * + * IMPORTANT: + * + * DRM driver writers who look a this function as an example for how to do GEM + * mmap support, please don't implement mmap support like here. The modern way + * to implement DRM mmap support is with an mmap offset ioctl (like + * i915_gem_mmap_gtt) and then using the mmap syscall on the DRM fd directly. + * That way debug tooling like valgrind will understand what's going on, hiding + * the mmap call in a driver private ioctl will break that. The i915 driver only + * does cpu mmaps this way because we didn't know better. + */ +int +i915_gem_mmap_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_mmap *args = data; + struct drm_i915_gem_object *obj; + unsigned long addr; + + if (args->flags & ~(I915_MMAP_WC)) + return -EINVAL; + + if (args->flags & I915_MMAP_WC && !boot_cpu_has(X86_FEATURE_PAT)) + return -ENODEV; + + obj = i915_gem_object_lookup(file, args->handle); + if (!obj) + return -ENOENT; + + /* prime objects have no backing filp to GEM mmap + * pages from. + */ + if (!obj->base.filp) { + addr = -ENXIO; + goto err; + } + + if (range_overflows(args->offset, args->size, (u64)obj->base.size)) { + addr = -EINVAL; + goto err; + } + + addr = vm_mmap(obj->base.filp, 0, args->size, + PROT_READ | PROT_WRITE, MAP_SHARED, + args->offset); + if (IS_ERR_VALUE(addr)) + goto err; + + if (args->flags & I915_MMAP_WC) { + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + + if (down_write_killable(&mm->mmap_sem)) { + addr = -EINTR; + goto err; + } + vma = find_vma(mm, addr); + if (vma && __vma_matches(vma, obj->base.filp, addr, args->size)) + vma->vm_page_prot = + pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); + else + addr = -ENOMEM; + up_write(&mm->mmap_sem); + if (IS_ERR_VALUE(addr)) + goto err; + + /* This may race, but that's ok, it only gets set */ + WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU); + } + i915_gem_object_put(obj); + + args->addr_ptr = (u64)addr; + return 0; + +err: + i915_gem_object_put(obj); + return addr; +} + +static unsigned int tile_row_pages(const struct drm_i915_gem_object *obj) +{ + return i915_gem_object_get_tile_row_size(obj) >> PAGE_SHIFT; +} + +/** + * i915_gem_mmap_gtt_version - report the current feature set for GTT mmaps + * + * A history of the GTT mmap interface: + * + * 0 - Everything had to fit into the GTT. Both parties of a memcpy had to + * aligned and suitable for fencing, and still fit into the available + * mappable space left by the pinned display objects. A classic problem + * we called the page-fault-of-doom where we would ping-pong between + * two objects that could not fit inside the GTT and so the memcpy + * would page one object in at the expense of the other between every + * single byte. + * + * 1 - Objects can be any size, and have any compatible fencing (X Y, or none + * as set via i915_gem_set_tiling() [DRM_I915_GEM_SET_TILING]). If the + * object is too large for the available space (or simply too large + * for the mappable aperture!), a view is created instead and faulted + * into userspace. (This view is aligned and sized appropriately for + * fenced access.) + * + * 2 - Recognise WC as a separate cache domain so that we can flush the + * delayed writes via GTT before performing direct access via WC. + * + * 3 - Remove implicit set-domain(GTT) and synchronisation on initial + * pagefault; swapin remains transparent. + * + * Restrictions: + * + * * snoopable objects cannot be accessed via the GTT. It can cause machine + * hangs on some architectures, corruption on others. An attempt to service + * a GTT page fault from a snoopable object will generate a SIGBUS. + * + * * the object must be able to fit into RAM (physical memory, though no + * limited to the mappable aperture). + * + * + * Caveats: + * + * * a new GTT page fault will synchronize rendering from the GPU and flush + * all data to system memory. Subsequent access will not be synchronized. + * + * * all mappings are revoked on runtime device suspend. + * + * * there are only 8, 16 or 32 fence registers to share between all users + * (older machines require fence register for display and blitter access + * as well). Contention of the fence registers will cause the previous users + * to be unmapped and any new access will generate new page faults. + * + * * running out of memory while servicing a fault may generate a SIGBUS, + * rather than the expected SIGSEGV. + */ +int i915_gem_mmap_gtt_version(void) +{ + return 3; +} + +static inline struct i915_ggtt_view +compute_partial_view(const struct drm_i915_gem_object *obj, + pgoff_t page_offset, + unsigned int chunk) +{ + struct i915_ggtt_view view; + + if (i915_gem_object_is_tiled(obj)) + chunk = roundup(chunk, tile_row_pages(obj)); + + view.type = I915_GGTT_VIEW_PARTIAL; + view.partial.offset = rounddown(page_offset, chunk); + view.partial.size = + min_t(unsigned int, chunk, + (obj->base.size >> PAGE_SHIFT) - view.partial.offset); + + /* If the partial covers the entire object, just create a normal VMA. */ + if (chunk >= obj->base.size >> PAGE_SHIFT) + view.type = I915_GGTT_VIEW_NORMAL; + + return view; +} + +/** + * i915_gem_fault - fault a page into the GTT + * @vmf: fault info + * + * The fault handler is set up by drm_gem_mmap() when a object is GTT mapped + * from userspace. The fault handler takes care of binding the object to + * the GTT (if needed), allocating and programming a fence register (again, + * only if needed based on whether the old reg is still valid or the object + * is tiled) and inserting a new PTE into the faulting process. + * + * Note that the faulting process may involve evicting existing objects + * from the GTT and/or fence registers to make room. So performance may + * suffer if the GTT working set is large or there are few fence registers + * left. + * + * The current feature set supported by i915_gem_fault() and thus GTT mmaps + * is exposed via I915_PARAM_MMAP_GTT_VERSION (see i915_gem_mmap_gtt_version). + */ +vm_fault_t i915_gem_fault(struct vm_fault *vmf) +{ +#define MIN_CHUNK_PAGES (SZ_1M >> PAGE_SHIFT) + struct vm_area_struct *area = vmf->vma; + struct drm_i915_gem_object *obj = to_intel_bo(area->vm_private_data); + struct drm_device *dev = obj->base.dev; + struct drm_i915_private *i915 = to_i915(dev); + struct i915_ggtt *ggtt = &i915->ggtt; + bool write = area->vm_flags & VM_WRITE; + intel_wakeref_t wakeref; + struct i915_vma *vma; + pgoff_t page_offset; + int srcu; + int ret; + + /* Sanity check that we allow writing into this object */ + if (i915_gem_object_is_readonly(obj) && write) + return VM_FAULT_SIGBUS; + + /* We don't use vmf->pgoff since that has the fake offset */ + page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT; + + trace_i915_gem_object_fault(obj, page_offset, true, write); + + ret = i915_gem_object_pin_pages(obj); + if (ret) + goto err; + + wakeref = intel_runtime_pm_get(i915); + + srcu = i915_reset_trylock(i915); + if (srcu < 0) { + ret = srcu; + goto err_rpm; + } + + ret = i915_mutex_lock_interruptible(dev); + if (ret) + goto err_reset; + + /* Access to snoopable pages through the GTT is incoherent. */ + if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(i915)) { + ret = -EFAULT; + goto err_unlock; + } + + /* Now pin it into the GTT as needed */ + vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, + PIN_MAPPABLE | + PIN_NONBLOCK | + PIN_NONFAULT); + if (IS_ERR(vma)) { + /* Use a partial view if it is bigger than available space */ + struct i915_ggtt_view view = + compute_partial_view(obj, page_offset, MIN_CHUNK_PAGES); + unsigned int flags; + + flags = PIN_MAPPABLE; + if (view.type == I915_GGTT_VIEW_NORMAL) + flags |= PIN_NONBLOCK; /* avoid warnings for pinned */ + + /* + * Userspace is now writing through an untracked VMA, abandon + * all hope that the hardware is able to track future writes. + */ + obj->frontbuffer_ggtt_origin = ORIGIN_CPU; + + vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); + if (IS_ERR(vma) && !view.type) { + flags = PIN_MAPPABLE; + view.type = I915_GGTT_VIEW_PARTIAL; + vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); + } + } + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err_unlock; + } + + ret = i915_vma_pin_fence(vma); + if (ret) + goto err_unpin; + + /* Finally, remap it using the new GTT offset */ + ret = remap_io_mapping(area, + area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT), + (ggtt->gmadr.start + vma->node.start) >> PAGE_SHIFT, + min_t(u64, vma->size, area->vm_end - area->vm_start), + &ggtt->iomap); + if (ret) + goto err_fence; + + /* Mark as being mmapped into userspace for later revocation */ + assert_rpm_wakelock_held(i915); + if (!i915_vma_set_userfault(vma) && !obj->userfault_count++) + list_add(&obj->userfault_link, &i915->mm.userfault_list); + GEM_BUG_ON(!obj->userfault_count); + + i915_vma_set_ggtt_write(vma); + +err_fence: + i915_vma_unpin_fence(vma); +err_unpin: + __i915_vma_unpin(vma); +err_unlock: + mutex_unlock(&dev->struct_mutex); +err_reset: + i915_reset_unlock(i915, srcu); +err_rpm: + intel_runtime_pm_put(i915, wakeref); + i915_gem_object_unpin_pages(obj); +err: + switch (ret) { + case -EIO: + /* + * We eat errors when the gpu is terminally wedged to avoid + * userspace unduly crashing (gl has no provisions for mmaps to + * fail). But any other -EIO isn't ours (e.g. swap in failure) + * and so needs to be reported. + */ + if (!i915_terminally_wedged(i915)) + return VM_FAULT_SIGBUS; + /* else: fall through */ + case -EAGAIN: + /* + * EAGAIN means the gpu is hung and we'll wait for the error + * handler to reset everything when re-faulting in + * i915_mutex_lock_interruptible. + */ + case 0: + case -ERESTARTSYS: + case -EINTR: + case -EBUSY: + /* + * EBUSY is ok: this just means that another thread + * already did the job. + */ + return VM_FAULT_NOPAGE; + case -ENOMEM: + return VM_FAULT_OOM; + case -ENOSPC: + case -EFAULT: + return VM_FAULT_SIGBUS; + default: + WARN_ONCE(ret, "unhandled error in %s: %i\n", __func__, ret); + return VM_FAULT_SIGBUS; + } +} + +void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj) +{ + struct i915_vma *vma; + + GEM_BUG_ON(!obj->userfault_count); + + obj->userfault_count = 0; + list_del(&obj->userfault_link); + drm_vma_node_unmap(&obj->base.vma_node, + obj->base.dev->anon_inode->i_mapping); + + for_each_ggtt_vma(vma, obj) + i915_vma_unset_userfault(vma); +} + +/** + * i915_gem_object_release_mmap - remove physical page mappings + * @obj: obj in question + * + * Preserve the reservation of the mmapping with the DRM core code, but + * relinquish ownership of the pages back to the system. + * + * It is vital that we remove the page mapping if we have mapped a tiled + * object through the GTT and then lose the fence register due to + * resource pressure. Similarly if the object has been moved out of the + * aperture, than pages mapped into userspace must be revoked. Removing the + * mapping will then trigger a page fault on the next user access, allowing + * fixup by i915_gem_fault(). + */ +void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + intel_wakeref_t wakeref; + + /* Serialisation between user GTT access and our code depends upon + * revoking the CPU's PTE whilst the mutex is held. The next user + * pagefault then has to wait until we release the mutex. + * + * Note that RPM complicates somewhat by adding an additional + * requirement that operations to the GGTT be made holding the RPM + * wakeref. + */ + lockdep_assert_held(&i915->drm.struct_mutex); + wakeref = intel_runtime_pm_get(i915); + + if (!obj->userfault_count) + goto out; + + __i915_gem_object_release_mmap(obj); + + /* Ensure that the CPU's PTE are revoked and there are not outstanding + * memory transactions from userspace before we return. The TLB + * flushing implied above by changing the PTE above *should* be + * sufficient, an extra barrier here just provides us with a bit + * of paranoid documentation about our requirement to serialise + * memory writes before touching registers / GSM. + */ + wmb(); + +out: + intel_runtime_pm_put(i915, wakeref); +} + +static int create_mmap_offset(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + int err; + + err = drm_gem_create_mmap_offset(&obj->base); + if (likely(!err)) + return 0; + + /* Attempt to reap some mmap space from dead objects */ + do { + err = i915_gem_wait_for_idle(i915, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + if (err) + break; + + i915_gem_drain_freed_objects(i915); + err = drm_gem_create_mmap_offset(&obj->base); + if (!err) + break; + + } while (flush_delayed_work(&i915->gem.retire_work)); + + return err; +} + +int +i915_gem_mmap_gtt(struct drm_file *file, + struct drm_device *dev, + u32 handle, + u64 *offset) +{ + struct drm_i915_gem_object *obj; + int ret; + + obj = i915_gem_object_lookup(file, handle); + if (!obj) + return -ENOENT; + + ret = create_mmap_offset(obj); + if (ret == 0) + *offset = drm_vma_node_offset_addr(&obj->base.vma_node); + + i915_gem_object_put(obj); + return ret; +} + +/** + * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing + * @dev: DRM device + * @data: GTT mapping ioctl data + * @file: GEM object info + * + * Simply returns the fake offset to userspace so it can mmap it. + * The mmap call will end up in drm_gem_mmap(), which will set things + * up so we can get faults in the handler above. + * + * The fault handler will take care of binding the object into the GTT + * (since it may have been evicted to make room for something), allocating + * a fence register, and mapping the appropriate aperture address into + * userspace. + */ +int +i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_mmap_gtt *args = data; + + return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset); +} + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/i915_gem_mman.c" +#endif diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 3000b26b4bbf..4ed28ac9ab3a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -24,6 +24,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" +#include "i915_gem_clflush.h" #include "i915_globals.h" #include "intel_frontbuffer.h" @@ -356,6 +357,61 @@ void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj) i915_gem_object_put(obj); } +static inline enum fb_op_origin +fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) +{ + return (domain == I915_GEM_DOMAIN_GTT ? + obj->frontbuffer_ggtt_origin : ORIGIN_CPU); +} + +static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj) +{ + return !(obj->cache_level == I915_CACHE_NONE || + obj->cache_level == I915_CACHE_WT); +} + +void +i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj, + unsigned int flush_domains) +{ + struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct i915_vma *vma; + + if (!(obj->write_domain & flush_domains)) + return; + + switch (obj->write_domain) { + case I915_GEM_DOMAIN_GTT: + i915_gem_flush_ggtt_writes(dev_priv); + + intel_fb_obj_flush(obj, + fb_write_origin(obj, I915_GEM_DOMAIN_GTT)); + + for_each_ggtt_vma(vma, obj) { + if (vma->iomap) + continue; + + i915_vma_unset_ggtt_write(vma); + } + break; + + case I915_GEM_DOMAIN_WC: + wmb(); + break; + + case I915_GEM_DOMAIN_CPU: + i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + break; + + case I915_GEM_DOMAIN_RENDER: + if (gpu_write_needs_clflush(obj)) + obj->cache_dirty = true; + break; + } + + obj->write_domain = 0; +} + void i915_gem_init__objects(struct drm_i915_private *i915) { INIT_WORK(&i915->mm.free_work, __i915_gem_free_work); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 88495530ebd9..07f487cbff79 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -351,6 +351,13 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj) i915_gem_object_unpin_pages(obj); } +void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj); +void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj); + +void +i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj, + unsigned int flush_domains); + static inline struct intel_engine_cs * i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c new file mode 100644 index 000000000000..87da01230179 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -0,0 +1,503 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#include + +#include "gt/intel_gt_pm.h" +#include "i915_selftest.h" +#include "selftests/huge_gem_object.h" +#include "selftests/igt_flush_test.h" + +struct tile { + unsigned int width; + unsigned int height; + unsigned int stride; + unsigned int size; + unsigned int tiling; + unsigned int swizzle; +}; + +static u64 swizzle_bit(unsigned int bit, u64 offset) +{ + return (offset & BIT_ULL(bit)) >> (bit - 6); +} + +static u64 tiled_offset(const struct tile *tile, u64 v) +{ + u64 x, y; + + if (tile->tiling == I915_TILING_NONE) + return v; + + y = div64_u64_rem(v, tile->stride, &x); + v = div64_u64_rem(y, tile->height, &y) * tile->stride * tile->height; + + if (tile->tiling == I915_TILING_X) { + v += y * tile->width; + v += div64_u64_rem(x, tile->width, &x) << tile->size; + v += x; + } else if (tile->width == 128) { + const unsigned int ytile_span = 16; + const unsigned int ytile_height = 512; + + v += y * ytile_span; + v += div64_u64_rem(x, ytile_span, &x) * ytile_height; + v += x; + } else { + const unsigned int ytile_span = 32; + const unsigned int ytile_height = 256; + + v += y * ytile_span; + v += div64_u64_rem(x, ytile_span, &x) * ytile_height; + v += x; + } + + switch (tile->swizzle) { + case I915_BIT_6_SWIZZLE_9: + v ^= swizzle_bit(9, v); + break; + case I915_BIT_6_SWIZZLE_9_10: + v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v); + break; + case I915_BIT_6_SWIZZLE_9_11: + v ^= swizzle_bit(9, v) ^ swizzle_bit(11, v); + break; + case I915_BIT_6_SWIZZLE_9_10_11: + v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v) ^ swizzle_bit(11, v); + break; + } + + return v; +} + +static int check_partial_mapping(struct drm_i915_gem_object *obj, + const struct tile *tile, + unsigned long end_time) +{ + const unsigned int nreal = obj->scratch / PAGE_SIZE; + const unsigned long npages = obj->base.size / PAGE_SIZE; + struct i915_vma *vma; + unsigned long page; + int err; + + if (igt_timeout(end_time, + "%s: timed out before tiling=%d stride=%d\n", + __func__, tile->tiling, tile->stride)) + return -EINTR; + + err = i915_gem_object_set_tiling(obj, tile->tiling, tile->stride); + if (err) { + pr_err("Failed to set tiling mode=%u, stride=%u, err=%d\n", + tile->tiling, tile->stride, err); + return err; + } + + GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling); + GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride); + + for_each_prime_number_from(page, 1, npages) { + struct i915_ggtt_view view = + compute_partial_view(obj, page, MIN_CHUNK_PAGES); + u32 __iomem *io; + struct page *p; + unsigned int n; + u64 offset; + u32 *cpu; + + GEM_BUG_ON(view.partial.size > nreal); + cond_resched(); + + err = i915_gem_object_set_to_gtt_domain(obj, true); + if (err) { + pr_err("Failed to flush to GTT write domain; err=%d\n", + err); + return err; + } + + vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE); + if (IS_ERR(vma)) { + pr_err("Failed to pin partial view: offset=%lu; err=%d\n", + page, (int)PTR_ERR(vma)); + return PTR_ERR(vma); + } + + n = page - view.partial.offset; + GEM_BUG_ON(n >= view.partial.size); + + io = i915_vma_pin_iomap(vma); + i915_vma_unpin(vma); + if (IS_ERR(io)) { + pr_err("Failed to iomap partial view: offset=%lu; err=%d\n", + page, (int)PTR_ERR(io)); + return PTR_ERR(io); + } + + iowrite32(page, io + n * PAGE_SIZE / sizeof(*io)); + i915_vma_unpin_iomap(vma); + + offset = tiled_offset(tile, page << PAGE_SHIFT); + if (offset >= obj->base.size) + continue; + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + + p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT); + cpu = kmap(p) + offset_in_page(offset); + drm_clflush_virt_range(cpu, sizeof(*cpu)); + if (*cpu != (u32)page) { + pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n", + page, n, + view.partial.offset, + view.partial.size, + vma->size >> PAGE_SHIFT, + tile->tiling ? tile_row_pages(obj) : 0, + vma->fence ? vma->fence->id : -1, tile->tiling, tile->stride, + offset >> PAGE_SHIFT, + (unsigned int)offset_in_page(offset), + offset, + (u32)page, *cpu); + err = -EINVAL; + } + *cpu = 0; + drm_clflush_virt_range(cpu, sizeof(*cpu)); + kunmap(p); + if (err) + return err; + + i915_vma_destroy(vma); + } + + return 0; +} + +static int igt_partial_tiling(void *arg) +{ + const unsigned int nreal = 1 << 12; /* largest tile row x2 */ + struct drm_i915_private *i915 = arg; + struct drm_i915_gem_object *obj; + intel_wakeref_t wakeref; + int tiling; + int err; + + /* We want to check the page mapping and fencing of a large object + * mmapped through the GTT. The object we create is larger than can + * possibly be mmaped as a whole, and so we must use partial GGTT vma. + * We then check that a write through each partial GGTT vma ends up + * in the right set of pages within the object, and with the expected + * tiling, which we verify by manual swizzling. + */ + + obj = huge_gem_object(i915, + nreal << PAGE_SHIFT, + (1 + next_prime_number(i915->ggtt.vm.total >> PAGE_SHIFT)) << PAGE_SHIFT); + if (IS_ERR(obj)) + return PTR_ERR(obj); + + err = i915_gem_object_pin_pages(obj); + if (err) { + pr_err("Failed to allocate %u pages (%lu total), err=%d\n", + nreal, obj->base.size / PAGE_SIZE, err); + goto out; + } + + mutex_lock(&i915->drm.struct_mutex); + wakeref = intel_runtime_pm_get(i915); + + if (1) { + IGT_TIMEOUT(end); + struct tile tile; + + tile.height = 1; + tile.width = 1; + tile.size = 0; + tile.stride = 0; + tile.swizzle = I915_BIT_6_SWIZZLE_NONE; + tile.tiling = I915_TILING_NONE; + + err = check_partial_mapping(obj, &tile, end); + if (err && err != -EINTR) + goto out_unlock; + } + + for (tiling = I915_TILING_X; tiling <= I915_TILING_Y; tiling++) { + IGT_TIMEOUT(end); + unsigned int max_pitch; + unsigned int pitch; + struct tile tile; + + if (i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) + /* + * The swizzling pattern is actually unknown as it + * varies based on physical address of each page. + * See i915_gem_detect_bit_6_swizzle(). + */ + break; + + tile.tiling = tiling; + switch (tiling) { + case I915_TILING_X: + tile.swizzle = i915->mm.bit_6_swizzle_x; + break; + case I915_TILING_Y: + tile.swizzle = i915->mm.bit_6_swizzle_y; + break; + } + + GEM_BUG_ON(tile.swizzle == I915_BIT_6_SWIZZLE_UNKNOWN); + if (tile.swizzle == I915_BIT_6_SWIZZLE_9_17 || + tile.swizzle == I915_BIT_6_SWIZZLE_9_10_17) + continue; + + if (INTEL_GEN(i915) <= 2) { + tile.height = 16; + tile.width = 128; + tile.size = 11; + } else if (tile.tiling == I915_TILING_Y && + HAS_128_BYTE_Y_TILING(i915)) { + tile.height = 32; + tile.width = 128; + tile.size = 12; + } else { + tile.height = 8; + tile.width = 512; + tile.size = 12; + } + + if (INTEL_GEN(i915) < 4) + max_pitch = 8192 / tile.width; + else if (INTEL_GEN(i915) < 7) + max_pitch = 128 * I965_FENCE_MAX_PITCH_VAL / tile.width; + else + max_pitch = 128 * GEN7_FENCE_MAX_PITCH_VAL / tile.width; + + for (pitch = max_pitch; pitch; pitch >>= 1) { + tile.stride = tile.width * pitch; + err = check_partial_mapping(obj, &tile, end); + if (err == -EINTR) + goto next_tiling; + if (err) + goto out_unlock; + + if (pitch > 2 && INTEL_GEN(i915) >= 4) { + tile.stride = tile.width * (pitch - 1); + err = check_partial_mapping(obj, &tile, end); + if (err == -EINTR) + goto next_tiling; + if (err) + goto out_unlock; + } + + if (pitch < max_pitch && INTEL_GEN(i915) >= 4) { + tile.stride = tile.width * (pitch + 1); + err = check_partial_mapping(obj, &tile, end); + if (err == -EINTR) + goto next_tiling; + if (err) + goto out_unlock; + } + } + + if (INTEL_GEN(i915) >= 4) { + for_each_prime_number(pitch, max_pitch) { + tile.stride = tile.width * pitch; + err = check_partial_mapping(obj, &tile, end); + if (err == -EINTR) + goto next_tiling; + if (err) + goto out_unlock; + } + } + +next_tiling: ; + } + +out_unlock: + intel_runtime_pm_put(i915, wakeref); + mutex_unlock(&i915->drm.struct_mutex); + i915_gem_object_unpin_pages(obj); +out: + i915_gem_object_put(obj); + return err; +} + +static int make_obj_busy(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct i915_request *rq; + struct i915_vma *vma; + int err; + + vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL); + if (IS_ERR(vma)) + return PTR_ERR(vma); + + err = i915_vma_pin(vma, 0, 0, PIN_USER); + if (err) + return err; + + rq = i915_request_create(i915->engine[RCS0]->kernel_context); + if (IS_ERR(rq)) { + i915_vma_unpin(vma); + return PTR_ERR(rq); + } + + err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + + i915_request_add(rq); + + __i915_gem_object_release_unless_active(obj); + i915_vma_unpin(vma); + + return err; +} + +static bool assert_mmap_offset(struct drm_i915_private *i915, + unsigned long size, + int expected) +{ + struct drm_i915_gem_object *obj; + int err; + + obj = i915_gem_object_create_internal(i915, size); + if (IS_ERR(obj)) + return PTR_ERR(obj); + + err = create_mmap_offset(obj); + i915_gem_object_put(obj); + + return err == expected; +} + +static void disable_retire_worker(struct drm_i915_private *i915) +{ + i915_gem_shrinker_unregister(i915); + + intel_gt_pm_get(i915); + + cancel_delayed_work_sync(&i915->gem.retire_work); + flush_work(&i915->gem.idle_work); +} + +static void restore_retire_worker(struct drm_i915_private *i915) +{ + intel_gt_pm_put(i915); + + mutex_lock(&i915->drm.struct_mutex); + igt_flush_test(i915, I915_WAIT_LOCKED); + mutex_unlock(&i915->drm.struct_mutex); + + i915_gem_shrinker_register(i915); +} + +static int igt_mmap_offset_exhaustion(void *arg) +{ + struct drm_i915_private *i915 = arg; + struct drm_mm *mm = &i915->drm.vma_offset_manager->vm_addr_space_mm; + struct drm_i915_gem_object *obj; + struct drm_mm_node resv, *hole; + u64 hole_start, hole_end; + int loop, err; + + /* Disable background reaper */ + disable_retire_worker(i915); + GEM_BUG_ON(!i915->gt.awake); + + /* Trim the device mmap space to only a page */ + memset(&resv, 0, sizeof(resv)); + drm_mm_for_each_hole(hole, mm, hole_start, hole_end) { + resv.start = hole_start; + resv.size = hole_end - hole_start - 1; /* PAGE_SIZE units */ + err = drm_mm_reserve_node(mm, &resv); + if (err) { + pr_err("Failed to trim VMA manager, err=%d\n", err); + goto out_park; + } + break; + } + + /* Just fits! */ + if (!assert_mmap_offset(i915, PAGE_SIZE, 0)) { + pr_err("Unable to insert object into single page hole\n"); + err = -EINVAL; + goto out; + } + + /* Too large */ + if (!assert_mmap_offset(i915, 2 * PAGE_SIZE, -ENOSPC)) { + pr_err("Unexpectedly succeeded in inserting too large object into single page hole\n"); + err = -EINVAL; + goto out; + } + + /* Fill the hole, further allocation attempts should then fail */ + obj = i915_gem_object_create_internal(i915, PAGE_SIZE); + if (IS_ERR(obj)) { + err = PTR_ERR(obj); + goto out; + } + + err = create_mmap_offset(obj); + if (err) { + pr_err("Unable to insert object into reclaimed hole\n"); + goto err_obj; + } + + if (!assert_mmap_offset(i915, PAGE_SIZE, -ENOSPC)) { + pr_err("Unexpectedly succeeded in inserting object into no holes!\n"); + err = -EINVAL; + goto err_obj; + } + + i915_gem_object_put(obj); + + /* Now fill with busy dead objects that we expect to reap */ + for (loop = 0; loop < 3; loop++) { + if (i915_terminally_wedged(i915)) + break; + + obj = i915_gem_object_create_internal(i915, PAGE_SIZE); + if (IS_ERR(obj)) { + err = PTR_ERR(obj); + goto out; + } + + mutex_lock(&i915->drm.struct_mutex); + err = make_obj_busy(obj); + mutex_unlock(&i915->drm.struct_mutex); + if (err) { + pr_err("[loop %d] Failed to busy the object\n", loop); + goto err_obj; + } + + /* NB we rely on the _active_ reference to access obj now */ + GEM_BUG_ON(!i915_gem_object_is_active(obj)); + err = create_mmap_offset(obj); + if (err) { + pr_err("[loop %d] create_mmap_offset failed with err=%d\n", + loop, err); + goto out; + } + } + +out: + drm_mm_remove_node(&resv); +out_park: + restore_retire_worker(i915); + return err; +err_obj: + i915_gem_object_put(obj); + goto out; +} + +int i915_gem_mman_live_selftests(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(igt_partial_tiling), + SUBTEST(igt_mmap_offset_exhaustion), + }; + + return i915_subtests(tests, i915); +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 107447287b22..c5b2e39a88a4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2795,7 +2795,6 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, u64 flags); int i915_gem_object_unbind(struct drm_i915_gem_object *obj); -void i915_gem_release_mmap(struct drm_i915_gem_object *obj); void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e8e6f3d3082a..2ed3dadfcc16 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -404,12 +404,6 @@ i915_gem_dumb_create(struct drm_file *file, &args->size, &args->handle); } -static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj) -{ - return !(obj->cache_level == I915_CACHE_NONE || - obj->cache_level == I915_CACHE_WT); -} - /** * Creates a new mm object and returns a handle to it. * @dev: drm device pointer @@ -429,13 +423,6 @@ i915_gem_create_ioctl(struct drm_device *dev, void *data, &args->size, &args->handle); } -static inline enum fb_op_origin -fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) -{ - return (domain == I915_GEM_DOMAIN_GTT ? - obj->frontbuffer_ggtt_origin : ORIGIN_CPU); -} - void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv) { intel_wakeref_t wakeref; @@ -475,47 +462,6 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv) } } -static void -flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains) -{ - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); - struct i915_vma *vma; - - if (!(obj->write_domain & flush_domains)) - return; - - switch (obj->write_domain) { - case I915_GEM_DOMAIN_GTT: - i915_gem_flush_ggtt_writes(dev_priv); - - intel_fb_obj_flush(obj, - fb_write_origin(obj, I915_GEM_DOMAIN_GTT)); - - for_each_ggtt_vma(vma, obj) { - if (vma->iomap) - continue; - - i915_vma_unset_ggtt_write(vma); - } - break; - - case I915_GEM_DOMAIN_WC: - wmb(); - break; - - case I915_GEM_DOMAIN_CPU: - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); - break; - - case I915_GEM_DOMAIN_RENDER: - if (gpu_write_needs_clflush(obj)) - obj->cache_dirty = true; - break; - } - - obj->write_domain = 0; -} - /* * Pins the specified object's pages and synchronizes the object with * GPU accesses. Sets needs_clflush to non-zero if the caller should @@ -552,7 +498,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, goto out; } - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); /* If we're not in the cpu read domain, set ourself into the gtt * read domain and manually flush cachelines (if required). This @@ -604,7 +550,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, goto out; } - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); /* If we're not in the cpu write domain, set ourself into the * gtt write domain and manually flush cachelines (as required). @@ -1207,6 +1153,13 @@ static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) spin_unlock(&i915->mm.obj_lock); } +static inline enum fb_op_origin +fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) +{ + return (domain == I915_GEM_DOMAIN_GTT ? + obj->frontbuffer_ggtt_origin : ORIGIN_CPU); +} + /** * Called when user space prepares to use an object with the CPU, either * through the mmap ioctl's mapping or a GTT mapping. @@ -1350,420 +1303,6 @@ i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, return 0; } -static inline bool -__vma_matches(struct vm_area_struct *vma, struct file *filp, - unsigned long addr, unsigned long size) -{ - if (vma->vm_file != filp) - return false; - - return vma->vm_start == addr && - (vma->vm_end - vma->vm_start) == PAGE_ALIGN(size); -} - -/** - * i915_gem_mmap_ioctl - Maps the contents of an object, returning the address - * it is mapped to. - * @dev: drm device - * @data: ioctl data blob - * @file: drm file - * - * While the mapping holds a reference on the contents of the object, it doesn't - * imply a ref on the object itself. - * - * IMPORTANT: - * - * DRM driver writers who look a this function as an example for how to do GEM - * mmap support, please don't implement mmap support like here. The modern way - * to implement DRM mmap support is with an mmap offset ioctl (like - * i915_gem_mmap_gtt) and then using the mmap syscall on the DRM fd directly. - * That way debug tooling like valgrind will understand what's going on, hiding - * the mmap call in a driver private ioctl will break that. The i915 driver only - * does cpu mmaps this way because we didn't know better. - */ -int -i915_gem_mmap_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_gem_mmap *args = data; - struct drm_i915_gem_object *obj; - unsigned long addr; - - if (args->flags & ~(I915_MMAP_WC)) - return -EINVAL; - - if (args->flags & I915_MMAP_WC && !boot_cpu_has(X86_FEATURE_PAT)) - return -ENODEV; - - obj = i915_gem_object_lookup(file, args->handle); - if (!obj) - return -ENOENT; - - /* prime objects have no backing filp to GEM mmap - * pages from. - */ - if (!obj->base.filp) { - addr = -ENXIO; - goto err; - } - - if (range_overflows(args->offset, args->size, (u64)obj->base.size)) { - addr = -EINVAL; - goto err; - } - - addr = vm_mmap(obj->base.filp, 0, args->size, - PROT_READ | PROT_WRITE, MAP_SHARED, - args->offset); - if (IS_ERR_VALUE(addr)) - goto err; - - if (args->flags & I915_MMAP_WC) { - struct mm_struct *mm = current->mm; - struct vm_area_struct *vma; - - if (down_write_killable(&mm->mmap_sem)) { - addr = -EINTR; - goto err; - } - vma = find_vma(mm, addr); - if (vma && __vma_matches(vma, obj->base.filp, addr, args->size)) - vma->vm_page_prot = - pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); - else - addr = -ENOMEM; - up_write(&mm->mmap_sem); - if (IS_ERR_VALUE(addr)) - goto err; - - /* This may race, but that's ok, it only gets set */ - WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU); - } - i915_gem_object_put(obj); - - args->addr_ptr = (u64)addr; - return 0; - -err: - i915_gem_object_put(obj); - return addr; -} - -static unsigned int tile_row_pages(const struct drm_i915_gem_object *obj) -{ - return i915_gem_object_get_tile_row_size(obj) >> PAGE_SHIFT; -} - -/** - * i915_gem_mmap_gtt_version - report the current feature set for GTT mmaps - * - * A history of the GTT mmap interface: - * - * 0 - Everything had to fit into the GTT. Both parties of a memcpy had to - * aligned and suitable for fencing, and still fit into the available - * mappable space left by the pinned display objects. A classic problem - * we called the page-fault-of-doom where we would ping-pong between - * two objects that could not fit inside the GTT and so the memcpy - * would page one object in at the expense of the other between every - * single byte. - * - * 1 - Objects can be any size, and have any compatible fencing (X Y, or none - * as set via i915_gem_set_tiling() [DRM_I915_GEM_SET_TILING]). If the - * object is too large for the available space (or simply too large - * for the mappable aperture!), a view is created instead and faulted - * into userspace. (This view is aligned and sized appropriately for - * fenced access.) - * - * 2 - Recognise WC as a separate cache domain so that we can flush the - * delayed writes via GTT before performing direct access via WC. - * - * 3 - Remove implicit set-domain(GTT) and synchronisation on initial - * pagefault; swapin remains transparent. - * - * Restrictions: - * - * * snoopable objects cannot be accessed via the GTT. It can cause machine - * hangs on some architectures, corruption on others. An attempt to service - * a GTT page fault from a snoopable object will generate a SIGBUS. - * - * * the object must be able to fit into RAM (physical memory, though no - * limited to the mappable aperture). - * - * - * Caveats: - * - * * a new GTT page fault will synchronize rendering from the GPU and flush - * all data to system memory. Subsequent access will not be synchronized. - * - * * all mappings are revoked on runtime device suspend. - * - * * there are only 8, 16 or 32 fence registers to share between all users - * (older machines require fence register for display and blitter access - * as well). Contention of the fence registers will cause the previous users - * to be unmapped and any new access will generate new page faults. - * - * * running out of memory while servicing a fault may generate a SIGBUS, - * rather than the expected SIGSEGV. - */ -int i915_gem_mmap_gtt_version(void) -{ - return 3; -} - -static inline struct i915_ggtt_view -compute_partial_view(const struct drm_i915_gem_object *obj, - pgoff_t page_offset, - unsigned int chunk) -{ - struct i915_ggtt_view view; - - if (i915_gem_object_is_tiled(obj)) - chunk = roundup(chunk, tile_row_pages(obj)); - - view.type = I915_GGTT_VIEW_PARTIAL; - view.partial.offset = rounddown(page_offset, chunk); - view.partial.size = - min_t(unsigned int, chunk, - (obj->base.size >> PAGE_SHIFT) - view.partial.offset); - - /* If the partial covers the entire object, just create a normal VMA. */ - if (chunk >= obj->base.size >> PAGE_SHIFT) - view.type = I915_GGTT_VIEW_NORMAL; - - return view; -} - -/** - * i915_gem_fault - fault a page into the GTT - * @vmf: fault info - * - * The fault handler is set up by drm_gem_mmap() when a object is GTT mapped - * from userspace. The fault handler takes care of binding the object to - * the GTT (if needed), allocating and programming a fence register (again, - * only if needed based on whether the old reg is still valid or the object - * is tiled) and inserting a new PTE into the faulting process. - * - * Note that the faulting process may involve evicting existing objects - * from the GTT and/or fence registers to make room. So performance may - * suffer if the GTT working set is large or there are few fence registers - * left. - * - * The current feature set supported by i915_gem_fault() and thus GTT mmaps - * is exposed via I915_PARAM_MMAP_GTT_VERSION (see i915_gem_mmap_gtt_version). - */ -vm_fault_t i915_gem_fault(struct vm_fault *vmf) -{ -#define MIN_CHUNK_PAGES (SZ_1M >> PAGE_SHIFT) - struct vm_area_struct *area = vmf->vma; - struct drm_i915_gem_object *obj = to_intel_bo(area->vm_private_data); - struct drm_device *dev = obj->base.dev; - struct drm_i915_private *dev_priv = to_i915(dev); - struct i915_ggtt *ggtt = &dev_priv->ggtt; - bool write = area->vm_flags & VM_WRITE; - intel_wakeref_t wakeref; - struct i915_vma *vma; - pgoff_t page_offset; - int srcu; - int ret; - - /* Sanity check that we allow writing into this object */ - if (i915_gem_object_is_readonly(obj) && write) - return VM_FAULT_SIGBUS; - - /* We don't use vmf->pgoff since that has the fake offset */ - page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT; - - trace_i915_gem_object_fault(obj, page_offset, true, write); - - ret = i915_gem_object_pin_pages(obj); - if (ret) - goto err; - - wakeref = intel_runtime_pm_get(dev_priv); - - srcu = i915_reset_trylock(dev_priv); - if (srcu < 0) { - ret = srcu; - goto err_rpm; - } - - ret = i915_mutex_lock_interruptible(dev); - if (ret) - goto err_reset; - - /* Access to snoopable pages through the GTT is incoherent. */ - if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(dev_priv)) { - ret = -EFAULT; - goto err_unlock; - } - - /* Now pin it into the GTT as needed */ - vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, - PIN_MAPPABLE | - PIN_NONBLOCK | - PIN_NONFAULT); - if (IS_ERR(vma)) { - /* Use a partial view if it is bigger than available space */ - struct i915_ggtt_view view = - compute_partial_view(obj, page_offset, MIN_CHUNK_PAGES); - unsigned int flags; - - flags = PIN_MAPPABLE; - if (view.type == I915_GGTT_VIEW_NORMAL) - flags |= PIN_NONBLOCK; /* avoid warnings for pinned */ - - /* - * Userspace is now writing through an untracked VMA, abandon - * all hope that the hardware is able to track future writes. - */ - obj->frontbuffer_ggtt_origin = ORIGIN_CPU; - - vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); - if (IS_ERR(vma) && !view.type) { - flags = PIN_MAPPABLE; - view.type = I915_GGTT_VIEW_PARTIAL; - vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); - } - } - if (IS_ERR(vma)) { - ret = PTR_ERR(vma); - goto err_unlock; - } - - ret = i915_vma_pin_fence(vma); - if (ret) - goto err_unpin; - - /* Finally, remap it using the new GTT offset */ - ret = remap_io_mapping(area, - area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT), - (ggtt->gmadr.start + vma->node.start) >> PAGE_SHIFT, - min_t(u64, vma->size, area->vm_end - area->vm_start), - &ggtt->iomap); - if (ret) - goto err_fence; - - /* Mark as being mmapped into userspace for later revocation */ - assert_rpm_wakelock_held(dev_priv); - if (!i915_vma_set_userfault(vma) && !obj->userfault_count++) - list_add(&obj->userfault_link, &dev_priv->mm.userfault_list); - GEM_BUG_ON(!obj->userfault_count); - - i915_vma_set_ggtt_write(vma); - -err_fence: - i915_vma_unpin_fence(vma); -err_unpin: - __i915_vma_unpin(vma); -err_unlock: - mutex_unlock(&dev->struct_mutex); -err_reset: - i915_reset_unlock(dev_priv, srcu); -err_rpm: - intel_runtime_pm_put(dev_priv, wakeref); - i915_gem_object_unpin_pages(obj); -err: - switch (ret) { - case -EIO: - /* - * We eat errors when the gpu is terminally wedged to avoid - * userspace unduly crashing (gl has no provisions for mmaps to - * fail). But any other -EIO isn't ours (e.g. swap in failure) - * and so needs to be reported. - */ - if (!i915_terminally_wedged(dev_priv)) - return VM_FAULT_SIGBUS; - /* else: fall through */ - case -EAGAIN: - /* - * EAGAIN means the gpu is hung and we'll wait for the error - * handler to reset everything when re-faulting in - * i915_mutex_lock_interruptible. - */ - case 0: - case -ERESTARTSYS: - case -EINTR: - case -EBUSY: - /* - * EBUSY is ok: this just means that another thread - * already did the job. - */ - return VM_FAULT_NOPAGE; - case -ENOMEM: - return VM_FAULT_OOM; - case -ENOSPC: - case -EFAULT: - return VM_FAULT_SIGBUS; - default: - WARN_ONCE(ret, "unhandled error in i915_gem_fault: %i\n", ret); - return VM_FAULT_SIGBUS; - } -} - -static void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj) -{ - struct i915_vma *vma; - - GEM_BUG_ON(!obj->userfault_count); - - obj->userfault_count = 0; - list_del(&obj->userfault_link); - drm_vma_node_unmap(&obj->base.vma_node, - obj->base.dev->anon_inode->i_mapping); - - for_each_ggtt_vma(vma, obj) - i915_vma_unset_userfault(vma); -} - -/** - * i915_gem_release_mmap - remove physical page mappings - * @obj: obj in question - * - * Preserve the reservation of the mmapping with the DRM core code, but - * relinquish ownership of the pages back to the system. - * - * It is vital that we remove the page mapping if we have mapped a tiled - * object through the GTT and then lose the fence register due to - * resource pressure. Similarly if the object has been moved out of the - * aperture, than pages mapped into userspace must be revoked. Removing the - * mapping will then trigger a page fault on the next user access, allowing - * fixup by i915_gem_fault(). - */ -void -i915_gem_release_mmap(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *i915 = to_i915(obj->base.dev); - intel_wakeref_t wakeref; - - /* Serialisation between user GTT access and our code depends upon - * revoking the CPU's PTE whilst the mutex is held. The next user - * pagefault then has to wait until we release the mutex. - * - * Note that RPM complicates somewhat by adding an additional - * requirement that operations to the GGTT be made holding the RPM - * wakeref. - */ - lockdep_assert_held(&i915->drm.struct_mutex); - wakeref = intel_runtime_pm_get(i915); - - if (!obj->userfault_count) - goto out; - - __i915_gem_object_release_mmap(obj); - - /* Ensure that the CPU's PTE are revoked and there are not outstanding - * memory transactions from userspace before we return. The TLB - * flushing implied above by changing the PTE above *should* be - * sufficient, an extra barrier here just provides us with a bit - * of paranoid documentation about our requirement to serialise - * memory writes before touching registers / GSM. - */ - wmb(); - -out: - intel_runtime_pm_put(i915, wakeref); -} - void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv) { struct drm_i915_gem_object *obj, *on; @@ -1806,78 +1345,6 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv) } } -static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); - int err; - - err = drm_gem_create_mmap_offset(&obj->base); - if (likely(!err)) - return 0; - - /* Attempt to reap some mmap space from dead objects */ - do { - err = i915_gem_wait_for_idle(dev_priv, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); - if (err) - break; - - i915_gem_drain_freed_objects(dev_priv); - err = drm_gem_create_mmap_offset(&obj->base); - if (!err) - break; - - } while (flush_delayed_work(&dev_priv->gem.retire_work)); - - return err; -} - -int -i915_gem_mmap_gtt(struct drm_file *file, - struct drm_device *dev, - u32 handle, - u64 *offset) -{ - struct drm_i915_gem_object *obj; - int ret; - - obj = i915_gem_object_lookup(file, handle); - if (!obj) - return -ENOENT; - - ret = i915_gem_object_create_mmap_offset(obj); - if (ret == 0) - *offset = drm_vma_node_offset_addr(&obj->base.vma_node); - - i915_gem_object_put(obj); - return ret; -} - -/** - * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing - * @dev: DRM device - * @data: GTT mapping ioctl data - * @file: GEM object info - * - * Simply returns the fake offset to userspace so it can mmap it. - * The mmap call will end up in drm_gem_mmap(), which will set things - * up so we can get faults in the handler above. - * - * The fault handler will take care of binding the object into the GTT - * (since it may have been evicted to make room for something), allocating - * a fence register, and mapping the appropriate aperture address into - * userspace. - */ -int -i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_gem_mmap_gtt *args = data; - - return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset); -} - bool i915_sg_trim(struct sg_table *orig_st) { struct sg_table new_st; @@ -2081,7 +1548,7 @@ static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj) * We manually flush the CPU domain so that we can override and * force the flush for the display, and perform it asyncrhonously. */ - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); if (obj->cache_dirty) i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE); obj->write_domain = 0; @@ -2135,7 +1602,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) if (ret) return ret; - flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); /* Serialise direct access to this object with the barriers for * coherent writes from the GPU, by effectively invalidating the @@ -2197,7 +1664,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) if (ret) return ret; - flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); /* Serialise direct access to this object with the barriers for * coherent writes from the GPU, by effectively invalidating the @@ -2306,7 +1773,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, * then double check if the GTT mapping is still * valid for that pointer access. */ - i915_gem_release_mmap(obj); + i915_gem_object_release_mmap(obj); /* As we no longer need a fence for GTT access, * we can relinquish it now (and so prevent having @@ -2561,7 +2028,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) if (ret) return ret; - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); /* Flush the CPU cache if it's still invalid. */ if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index 7d5cc9ccf6dd..86d6d92ccbc9 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -299,7 +299,7 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj, i915_gem_object_unlock(obj); /* Force the fence to be reacquired for GTT access */ - i915_gem_release_mmap(obj); + i915_gem_object_release_mmap(obj); /* Try to preallocate memory required to save swizzling on put-pages */ if (i915_gem_object_needs_bit17_swizzle(obj)) { diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c index b98a286a8be5..a3dd2f1be95b 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c @@ -89,491 +89,6 @@ static int igt_gem_huge(void *arg) return err; } -struct tile { - unsigned int width; - unsigned int height; - unsigned int stride; - unsigned int size; - unsigned int tiling; - unsigned int swizzle; -}; - -static u64 swizzle_bit(unsigned int bit, u64 offset) -{ - return (offset & BIT_ULL(bit)) >> (bit - 6); -} - -static u64 tiled_offset(const struct tile *tile, u64 v) -{ - u64 x, y; - - if (tile->tiling == I915_TILING_NONE) - return v; - - y = div64_u64_rem(v, tile->stride, &x); - v = div64_u64_rem(y, tile->height, &y) * tile->stride * tile->height; - - if (tile->tiling == I915_TILING_X) { - v += y * tile->width; - v += div64_u64_rem(x, tile->width, &x) << tile->size; - v += x; - } else if (tile->width == 128) { - const unsigned int ytile_span = 16; - const unsigned int ytile_height = 512; - - v += y * ytile_span; - v += div64_u64_rem(x, ytile_span, &x) * ytile_height; - v += x; - } else { - const unsigned int ytile_span = 32; - const unsigned int ytile_height = 256; - - v += y * ytile_span; - v += div64_u64_rem(x, ytile_span, &x) * ytile_height; - v += x; - } - - switch (tile->swizzle) { - case I915_BIT_6_SWIZZLE_9: - v ^= swizzle_bit(9, v); - break; - case I915_BIT_6_SWIZZLE_9_10: - v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v); - break; - case I915_BIT_6_SWIZZLE_9_11: - v ^= swizzle_bit(9, v) ^ swizzle_bit(11, v); - break; - case I915_BIT_6_SWIZZLE_9_10_11: - v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v) ^ swizzle_bit(11, v); - break; - } - - return v; -} - -static int check_partial_mapping(struct drm_i915_gem_object *obj, - const struct tile *tile, - unsigned long end_time) -{ - const unsigned int nreal = obj->scratch / PAGE_SIZE; - const unsigned long npages = obj->base.size / PAGE_SIZE; - struct i915_vma *vma; - unsigned long page; - int err; - - if (igt_timeout(end_time, - "%s: timed out before tiling=%d stride=%d\n", - __func__, tile->tiling, tile->stride)) - return -EINTR; - - err = i915_gem_object_set_tiling(obj, tile->tiling, tile->stride); - if (err) { - pr_err("Failed to set tiling mode=%u, stride=%u, err=%d\n", - tile->tiling, tile->stride, err); - return err; - } - - GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling); - GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride); - - for_each_prime_number_from(page, 1, npages) { - struct i915_ggtt_view view = - compute_partial_view(obj, page, MIN_CHUNK_PAGES); - u32 __iomem *io; - struct page *p; - unsigned int n; - u64 offset; - u32 *cpu; - - GEM_BUG_ON(view.partial.size > nreal); - cond_resched(); - - err = i915_gem_object_set_to_gtt_domain(obj, true); - if (err) { - pr_err("Failed to flush to GTT write domain; err=%d\n", - err); - return err; - } - - vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE); - if (IS_ERR(vma)) { - pr_err("Failed to pin partial view: offset=%lu; err=%d\n", - page, (int)PTR_ERR(vma)); - return PTR_ERR(vma); - } - - n = page - view.partial.offset; - GEM_BUG_ON(n >= view.partial.size); - - io = i915_vma_pin_iomap(vma); - i915_vma_unpin(vma); - if (IS_ERR(io)) { - pr_err("Failed to iomap partial view: offset=%lu; err=%d\n", - page, (int)PTR_ERR(io)); - return PTR_ERR(io); - } - - iowrite32(page, io + n * PAGE_SIZE/sizeof(*io)); - i915_vma_unpin_iomap(vma); - - offset = tiled_offset(tile, page << PAGE_SHIFT); - if (offset >= obj->base.size) - continue; - - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - - p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT); - cpu = kmap(p) + offset_in_page(offset); - drm_clflush_virt_range(cpu, sizeof(*cpu)); - if (*cpu != (u32)page) { - pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n", - page, n, - view.partial.offset, - view.partial.size, - vma->size >> PAGE_SHIFT, - tile->tiling ? tile_row_pages(obj) : 0, - vma->fence ? vma->fence->id : -1, tile->tiling, tile->stride, - offset >> PAGE_SHIFT, - (unsigned int)offset_in_page(offset), - offset, - (u32)page, *cpu); - err = -EINVAL; - } - *cpu = 0; - drm_clflush_virt_range(cpu, sizeof(*cpu)); - kunmap(p); - if (err) - return err; - - i915_vma_destroy(vma); - } - - return 0; -} - -static int igt_partial_tiling(void *arg) -{ - const unsigned int nreal = 1 << 12; /* largest tile row x2 */ - struct drm_i915_private *i915 = arg; - struct drm_i915_gem_object *obj; - intel_wakeref_t wakeref; - int tiling; - int err; - - /* We want to check the page mapping and fencing of a large object - * mmapped through the GTT. The object we create is larger than can - * possibly be mmaped as a whole, and so we must use partial GGTT vma. - * We then check that a write through each partial GGTT vma ends up - * in the right set of pages within the object, and with the expected - * tiling, which we verify by manual swizzling. - */ - - obj = huge_gem_object(i915, - nreal << PAGE_SHIFT, - (1 + next_prime_number(i915->ggtt.vm.total >> PAGE_SHIFT)) << PAGE_SHIFT); - if (IS_ERR(obj)) - return PTR_ERR(obj); - - err = i915_gem_object_pin_pages(obj); - if (err) { - pr_err("Failed to allocate %u pages (%lu total), err=%d\n", - nreal, obj->base.size / PAGE_SIZE, err); - goto out; - } - - mutex_lock(&i915->drm.struct_mutex); - wakeref = intel_runtime_pm_get(i915); - - if (1) { - IGT_TIMEOUT(end); - struct tile tile; - - tile.height = 1; - tile.width = 1; - tile.size = 0; - tile.stride = 0; - tile.swizzle = I915_BIT_6_SWIZZLE_NONE; - tile.tiling = I915_TILING_NONE; - - err = check_partial_mapping(obj, &tile, end); - if (err && err != -EINTR) - goto out_unlock; - } - - for (tiling = I915_TILING_X; tiling <= I915_TILING_Y; tiling++) { - IGT_TIMEOUT(end); - unsigned int max_pitch; - unsigned int pitch; - struct tile tile; - - if (i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) - /* - * The swizzling pattern is actually unknown as it - * varies based on physical address of each page. - * See i915_gem_detect_bit_6_swizzle(). - */ - break; - - tile.tiling = tiling; - switch (tiling) { - case I915_TILING_X: - tile.swizzle = i915->mm.bit_6_swizzle_x; - break; - case I915_TILING_Y: - tile.swizzle = i915->mm.bit_6_swizzle_y; - break; - } - - GEM_BUG_ON(tile.swizzle == I915_BIT_6_SWIZZLE_UNKNOWN); - if (tile.swizzle == I915_BIT_6_SWIZZLE_9_17 || - tile.swizzle == I915_BIT_6_SWIZZLE_9_10_17) - continue; - - if (INTEL_GEN(i915) <= 2) { - tile.height = 16; - tile.width = 128; - tile.size = 11; - } else if (tile.tiling == I915_TILING_Y && - HAS_128_BYTE_Y_TILING(i915)) { - tile.height = 32; - tile.width = 128; - tile.size = 12; - } else { - tile.height = 8; - tile.width = 512; - tile.size = 12; - } - - if (INTEL_GEN(i915) < 4) - max_pitch = 8192 / tile.width; - else if (INTEL_GEN(i915) < 7) - max_pitch = 128 * I965_FENCE_MAX_PITCH_VAL / tile.width; - else - max_pitch = 128 * GEN7_FENCE_MAX_PITCH_VAL / tile.width; - - for (pitch = max_pitch; pitch; pitch >>= 1) { - tile.stride = tile.width * pitch; - err = check_partial_mapping(obj, &tile, end); - if (err == -EINTR) - goto next_tiling; - if (err) - goto out_unlock; - - if (pitch > 2 && INTEL_GEN(i915) >= 4) { - tile.stride = tile.width * (pitch - 1); - err = check_partial_mapping(obj, &tile, end); - if (err == -EINTR) - goto next_tiling; - if (err) - goto out_unlock; - } - - if (pitch < max_pitch && INTEL_GEN(i915) >= 4) { - tile.stride = tile.width * (pitch + 1); - err = check_partial_mapping(obj, &tile, end); - if (err == -EINTR) - goto next_tiling; - if (err) - goto out_unlock; - } - } - - if (INTEL_GEN(i915) >= 4) { - for_each_prime_number(pitch, max_pitch) { - tile.stride = tile.width * pitch; - err = check_partial_mapping(obj, &tile, end); - if (err == -EINTR) - goto next_tiling; - if (err) - goto out_unlock; - } - } - -next_tiling: ; - } - -out_unlock: - intel_runtime_pm_put(i915, wakeref); - mutex_unlock(&i915->drm.struct_mutex); - i915_gem_object_unpin_pages(obj); -out: - i915_gem_object_put(obj); - return err; -} - -static int make_obj_busy(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct i915_request *rq; - struct i915_vma *vma; - int err; - - vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL); - if (IS_ERR(vma)) - return PTR_ERR(vma); - - err = i915_vma_pin(vma, 0, 0, PIN_USER); - if (err) - return err; - - rq = i915_request_create(i915->engine[RCS0]->kernel_context); - if (IS_ERR(rq)) { - i915_vma_unpin(vma); - return PTR_ERR(rq); - } - - err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); - - i915_request_add(rq); - - __i915_gem_object_release_unless_active(obj); - i915_vma_unpin(vma); - - return err; -} - -static bool assert_mmap_offset(struct drm_i915_private *i915, - unsigned long size, - int expected) -{ - struct drm_i915_gem_object *obj; - int err; - - obj = i915_gem_object_create_internal(i915, size); - if (IS_ERR(obj)) - return PTR_ERR(obj); - - err = i915_gem_object_create_mmap_offset(obj); - i915_gem_object_put(obj); - - return err == expected; -} - -static void disable_retire_worker(struct drm_i915_private *i915) -{ - i915_gem_shrinker_unregister(i915); - - intel_gt_pm_get(i915); - - cancel_delayed_work_sync(&i915->gem.retire_work); - flush_work(&i915->gem.idle_work); -} - -static void restore_retire_worker(struct drm_i915_private *i915) -{ - intel_gt_pm_put(i915); - - mutex_lock(&i915->drm.struct_mutex); - igt_flush_test(i915, I915_WAIT_LOCKED); - mutex_unlock(&i915->drm.struct_mutex); - - i915_gem_shrinker_register(i915); -} - -static int igt_mmap_offset_exhaustion(void *arg) -{ - struct drm_i915_private *i915 = arg; - struct drm_mm *mm = &i915->drm.vma_offset_manager->vm_addr_space_mm; - struct drm_i915_gem_object *obj; - struct drm_mm_node resv, *hole; - u64 hole_start, hole_end; - int loop, err; - - /* Disable background reaper */ - disable_retire_worker(i915); - GEM_BUG_ON(!i915->gt.awake); - - /* Trim the device mmap space to only a page */ - memset(&resv, 0, sizeof(resv)); - drm_mm_for_each_hole(hole, mm, hole_start, hole_end) { - resv.start = hole_start; - resv.size = hole_end - hole_start - 1; /* PAGE_SIZE units */ - err = drm_mm_reserve_node(mm, &resv); - if (err) { - pr_err("Failed to trim VMA manager, err=%d\n", err); - goto out_park; - } - break; - } - - /* Just fits! */ - if (!assert_mmap_offset(i915, PAGE_SIZE, 0)) { - pr_err("Unable to insert object into single page hole\n"); - err = -EINVAL; - goto out; - } - - /* Too large */ - if (!assert_mmap_offset(i915, 2*PAGE_SIZE, -ENOSPC)) { - pr_err("Unexpectedly succeeded in inserting too large object into single page hole\n"); - err = -EINVAL; - goto out; - } - - /* Fill the hole, further allocation attempts should then fail */ - obj = i915_gem_object_create_internal(i915, PAGE_SIZE); - if (IS_ERR(obj)) { - err = PTR_ERR(obj); - goto out; - } - - err = i915_gem_object_create_mmap_offset(obj); - if (err) { - pr_err("Unable to insert object into reclaimed hole\n"); - goto err_obj; - } - - if (!assert_mmap_offset(i915, PAGE_SIZE, -ENOSPC)) { - pr_err("Unexpectedly succeeded in inserting object into no holes!\n"); - err = -EINVAL; - goto err_obj; - } - - i915_gem_object_put(obj); - - /* Now fill with busy dead objects that we expect to reap */ - for (loop = 0; loop < 3; loop++) { - intel_wakeref_t wakeref; - - if (i915_terminally_wedged(i915)) - break; - - obj = i915_gem_object_create_internal(i915, PAGE_SIZE); - if (IS_ERR(obj)) { - err = PTR_ERR(obj); - goto out; - } - - err = 0; - mutex_lock(&i915->drm.struct_mutex); - with_intel_runtime_pm(i915, wakeref) - err = make_obj_busy(obj); - mutex_unlock(&i915->drm.struct_mutex); - if (err) { - pr_err("[loop %d] Failed to busy the object\n", loop); - goto err_obj; - } - - /* NB we rely on the _active_ reference to access obj now */ - GEM_BUG_ON(!i915_gem_object_is_active(obj)); - err = i915_gem_object_create_mmap_offset(obj); - if (err) { - pr_err("[loop %d] i915_gem_object_create_mmap_offset failed with err=%d\n", - loop, err); - goto out; - } - } - -out: - drm_mm_remove_node(&resv); -out_park: - restore_retire_worker(i915); - return err; -err_obj: - i915_gem_object_put(obj); - goto out; -} - int i915_gem_object_mock_selftests(void) { static const struct i915_subtest tests[] = { @@ -596,8 +111,6 @@ int i915_gem_object_live_selftests(struct drm_i915_private *i915) { static const struct i915_subtest tests[] = { SUBTEST(igt_gem_huge), - SUBTEST(igt_partial_tiling), - SUBTEST(igt_mmap_offset_exhaustion), }; return i915_subtests(tests, i915); diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index a54f590788a4..0c49d9db8ef7 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -16,6 +16,7 @@ selftest(timelines, i915_timeline_live_selftests) selftest(requests, i915_request_live_selftests) selftest(active, i915_active_live_selftests) selftest(objects, i915_gem_object_live_selftests) +selftest(mman, i915_gem_mman_live_selftests) selftest(dmabuf, i915_gem_dmabuf_live_selftests) selftest(vma, i915_vma_live_selftests) selftest(coherency, i915_gem_coherency_live_selftests) From patchwork Wed May 22 08:20:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955273 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 790EE912 for ; Wed, 22 May 2019 08:21:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 665DF28A2E for ; Wed, 22 May 2019 08:21:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5B06828A32; Wed, 22 May 2019 08:21:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 473D9289B7 for ; Wed, 22 May 2019 08:21:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A4C82897E3; Wed, 22 May 2019 08:21:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4A1C189789 for ; Wed, 22 May 2019 08:21:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636787-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:58 +0100 Message-Id: <20190522082104.5117-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 07/13] drm/i915: Move GEM domain management to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c, that of the read/write domains, perhaps the biggest of GEM's follies? Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_domain.c | 782 ++++++++++++++++++ drivers/gpu/drm/i915/gem/i915_gem_object.h | 29 + drivers/gpu/drm/i915/gvt/cmd_parser.c | 4 +- drivers/gpu/drm/i915/gvt/scheduler.c | 6 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 8 +- drivers/gpu/drm/i915/i915_drv.h | 26 - drivers/gpu/drm/i915/i915_gem.c | 777 +---------------- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- drivers/gpu/drm/i915/i915_gem_render_state.c | 4 +- drivers/gpu/drm/i915/selftests/huge_pages.c | 4 +- .../drm/i915/selftests/i915_gem_coherency.c | 8 +- .../gpu/drm/i915/selftests/i915_gem_context.c | 8 +- 13 files changed, 839 insertions(+), 822 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_domain.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index d05757c52492..29fc924ba819 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -87,6 +87,7 @@ i915-y += $(gt-y) # GEM (Graphics Execution Management) code obj-y += gem/ gem-y += \ + gem/i915_gem_domain.o \ gem/i915_gem_object.o \ gem/i915_gem_mman.o \ gem/i915_gem_pages.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c new file mode 100644 index 000000000000..bbc7fb758186 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -0,0 +1,782 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include "i915_drv.h" +#include "i915_gem_clflush.h" +#include "i915_gem_gtt.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" +#include "i915_vma.h" +#include "intel_frontbuffer.h" + +static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj) +{ + /* + * We manually flush the CPU domain so that we can override and + * force the flush for the display, and perform it asyncrhonously. + */ + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + if (obj->cache_dirty) + i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE); + obj->write_domain = 0; +} + +void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj) +{ + if (!READ_ONCE(obj->pin_global)) + return; + + mutex_lock(&obj->base.dev->struct_mutex); + __i915_gem_object_flush_for_display(obj); + mutex_unlock(&obj->base.dev->struct_mutex); +} + +/** + * Moves a single object to the WC read, and possibly write domain. + * @obj: object to act on + * @write: ask for write access or read only + * + * This function returns when the move is complete, including waiting on + * flushes to occur. + */ +int +i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) +{ + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + (write ? I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + if (obj->write_domain == I915_GEM_DOMAIN_WC) + return 0; + + /* Flush and acquire obj->pages so that we are coherent through + * direct access in memory with previous cached writes through + * shmemfs and that our cache domain tracking remains valid. + * For example, if the obj->filp was moved to swap without us + * being notified and releasing the pages, we would mistakenly + * continue to assume that the obj remained out of the CPU cached + * domain. + */ + ret = i915_gem_object_pin_pages(obj); + if (ret) + return ret; + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); + + /* Serialise direct access to this object with the barriers for + * coherent writes from the GPU, by effectively invalidating the + * WC domain upon first access. + */ + if ((obj->read_domains & I915_GEM_DOMAIN_WC) == 0) + mb(); + + /* It should now be out of any other write domains, and we can update + * the domain values for our changes. + */ + GEM_BUG_ON((obj->write_domain & ~I915_GEM_DOMAIN_WC) != 0); + obj->read_domains |= I915_GEM_DOMAIN_WC; + if (write) { + obj->read_domains = I915_GEM_DOMAIN_WC; + obj->write_domain = I915_GEM_DOMAIN_WC; + obj->mm.dirty = true; + } + + i915_gem_object_unpin_pages(obj); + return 0; +} + +/** + * Moves a single object to the GTT read, and possibly write domain. + * @obj: object to act on + * @write: ask for write access or read only + * + * This function returns when the move is complete, including waiting on + * flushes to occur. + */ +int +i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) +{ + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + (write ? I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + if (obj->write_domain == I915_GEM_DOMAIN_GTT) + return 0; + + /* Flush and acquire obj->pages so that we are coherent through + * direct access in memory with previous cached writes through + * shmemfs and that our cache domain tracking remains valid. + * For example, if the obj->filp was moved to swap without us + * being notified and releasing the pages, we would mistakenly + * continue to assume that the obj remained out of the CPU cached + * domain. + */ + ret = i915_gem_object_pin_pages(obj); + if (ret) + return ret; + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); + + /* Serialise direct access to this object with the barriers for + * coherent writes from the GPU, by effectively invalidating the + * GTT domain upon first access. + */ + if ((obj->read_domains & I915_GEM_DOMAIN_GTT) == 0) + mb(); + + /* It should now be out of any other write domains, and we can update + * the domain values for our changes. + */ + GEM_BUG_ON((obj->write_domain & ~I915_GEM_DOMAIN_GTT) != 0); + obj->read_domains |= I915_GEM_DOMAIN_GTT; + if (write) { + obj->read_domains = I915_GEM_DOMAIN_GTT; + obj->write_domain = I915_GEM_DOMAIN_GTT; + obj->mm.dirty = true; + } + + i915_gem_object_unpin_pages(obj); + return 0; +} + +/** + * Changes the cache-level of an object across all VMA. + * @obj: object to act on + * @cache_level: new cache level to set for the object + * + * After this function returns, the object will be in the new cache-level + * across all GTT and the contents of the backing storage will be coherent, + * with respect to the new cache-level. In order to keep the backing storage + * coherent for all users, we only allow a single cache level to be set + * globally on the object and prevent it from being changed whilst the + * hardware is reading from the object. That is if the object is currently + * on the scanout it will be set to uncached (or equivalent display + * cache coherency) and all non-MOCS GPU access will also be uncached so + * that all direct access to the scanout remains coherent. + */ +int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, + enum i915_cache_level cache_level) +{ + struct i915_vma *vma; + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + if (obj->cache_level == cache_level) + return 0; + + /* Inspect the list of currently bound VMA and unbind any that would + * be invalid given the new cache-level. This is principally to + * catch the issue of the CS prefetch crossing page boundaries and + * reading an invalid PTE on older architectures. + */ +restart: + list_for_each_entry(vma, &obj->vma.list, obj_link) { + if (!drm_mm_node_allocated(&vma->node)) + continue; + + if (i915_vma_is_pinned(vma)) { + DRM_DEBUG("can not change the cache level of pinned objects\n"); + return -EBUSY; + } + + if (!i915_vma_is_closed(vma) && + i915_gem_valid_gtt_space(vma, cache_level)) + continue; + + ret = i915_vma_unbind(vma); + if (ret) + return ret; + + /* As unbinding may affect other elements in the + * obj->vma_list (due to side-effects from retiring + * an active vma), play safe and restart the iterator. + */ + goto restart; + } + + /* We can reuse the existing drm_mm nodes but need to change the + * cache-level on the PTE. We could simply unbind them all and + * rebind with the correct cache-level on next use. However since + * we already have a valid slot, dma mapping, pages etc, we may as + * rewrite the PTE in the belief that doing so tramples upon less + * state and so involves less work. + */ + if (obj->bind_count) { + /* Before we change the PTE, the GPU must not be accessing it. + * If we wait upon the object, we know that all the bound + * VMA are no longer active. + */ + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + if (!HAS_LLC(to_i915(obj->base.dev)) && + cache_level != I915_CACHE_NONE) { + /* Access to snoopable pages through the GTT is + * incoherent and on some machines causes a hard + * lockup. Relinquish the CPU mmaping to force + * userspace to refault in the pages and we can + * then double check if the GTT mapping is still + * valid for that pointer access. + */ + i915_gem_object_release_mmap(obj); + + /* As we no longer need a fence for GTT access, + * we can relinquish it now (and so prevent having + * to steal a fence from someone else on the next + * fence request). Note GPU activity would have + * dropped the fence as all snoopable access is + * supposed to be linear. + */ + for_each_ggtt_vma(vma, obj) { + ret = i915_vma_put_fence(vma); + if (ret) + return ret; + } + } else { + /* We either have incoherent backing store and + * so no GTT access or the architecture is fully + * coherent. In such cases, existing GTT mmaps + * ignore the cache bit in the PTE and we can + * rewrite it without confusing the GPU or having + * to force userspace to fault back in its mmaps. + */ + } + + list_for_each_entry(vma, &obj->vma.list, obj_link) { + if (!drm_mm_node_allocated(&vma->node)) + continue; + + ret = i915_vma_bind(vma, cache_level, PIN_UPDATE); + if (ret) + return ret; + } + } + + list_for_each_entry(vma, &obj->vma.list, obj_link) + vma->node.color = cache_level; + i915_gem_object_set_cache_coherency(obj, cache_level); + obj->cache_dirty = true; /* Always invalidate stale cachelines */ + + return 0; +} + +int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_caching *args = data; + struct drm_i915_gem_object *obj; + int err = 0; + + rcu_read_lock(); + obj = i915_gem_object_lookup_rcu(file, args->handle); + if (!obj) { + err = -ENOENT; + goto out; + } + + switch (obj->cache_level) { + case I915_CACHE_LLC: + case I915_CACHE_L3_LLC: + args->caching = I915_CACHING_CACHED; + break; + + case I915_CACHE_WT: + args->caching = I915_CACHING_DISPLAY; + break; + + default: + args->caching = I915_CACHING_NONE; + break; + } +out: + rcu_read_unlock(); + return err; +} + +int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_private *i915 = to_i915(dev); + struct drm_i915_gem_caching *args = data; + struct drm_i915_gem_object *obj; + enum i915_cache_level level; + int ret = 0; + + switch (args->caching) { + case I915_CACHING_NONE: + level = I915_CACHE_NONE; + break; + case I915_CACHING_CACHED: + /* + * Due to a HW issue on BXT A stepping, GPU stores via a + * snooped mapping may leave stale data in a corresponding CPU + * cacheline, whereas normally such cachelines would get + * invalidated. + */ + if (!HAS_LLC(i915) && !HAS_SNOOP(i915)) + return -ENODEV; + + level = I915_CACHE_LLC; + break; + case I915_CACHING_DISPLAY: + level = HAS_WT(i915) ? I915_CACHE_WT : I915_CACHE_NONE; + break; + default: + return -EINVAL; + } + + obj = i915_gem_object_lookup(file, args->handle); + if (!obj) + return -ENOENT; + + /* + * The caching mode of proxy object is handled by its generator, and + * not allowed to be changed by userspace. + */ + if (i915_gem_object_is_proxy(obj)) { + ret = -ENXIO; + goto out; + } + + if (obj->cache_level == level) + goto out; + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto out; + + ret = i915_mutex_lock_interruptible(dev); + if (ret) + goto out; + + ret = i915_gem_object_set_cache_level(obj, level); + mutex_unlock(&dev->struct_mutex); + +out: + i915_gem_object_put(obj); + return ret; +} + +/* + * Prepare buffer for display plane (scanout, cursors, etc). Can be called from + * an uninterruptible phase (modesetting) and allows any flushes to be pipelined + * (for pageflips). We only flush the caches while preparing the buffer for + * display, the callers are responsible for frontbuffer flush. + */ +struct i915_vma * +i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, + u32 alignment, + const struct i915_ggtt_view *view, + unsigned int flags) +{ + struct i915_vma *vma; + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + /* Mark the global pin early so that we account for the + * display coherency whilst setting up the cache domains. + */ + obj->pin_global++; + + /* The display engine is not coherent with the LLC cache on gen6. As + * a result, we make sure that the pinning that is about to occur is + * done with uncached PTEs. This is lowest common denominator for all + * chipsets. + * + * However for gen6+, we could do better by using the GFDT bit instead + * of uncaching, which would allow us to flush all the LLC-cached data + * with that bit in the PTE to main memory with just one PIPE_CONTROL. + */ + ret = i915_gem_object_set_cache_level(obj, + HAS_WT(to_i915(obj->base.dev)) ? + I915_CACHE_WT : I915_CACHE_NONE); + if (ret) { + vma = ERR_PTR(ret); + goto err_unpin_global; + } + + /* As the user may map the buffer once pinned in the display plane + * (e.g. libkms for the bootup splash), we have to ensure that we + * always use map_and_fenceable for all scanout buffers. However, + * it may simply be too big to fit into mappable, in which case + * put it anyway and hope that userspace can cope (but always first + * try to preserve the existing ABI). + */ + vma = ERR_PTR(-ENOSPC); + if ((flags & PIN_MAPPABLE) == 0 && + (!view || view->type == I915_GGTT_VIEW_NORMAL)) + vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, + flags | + PIN_MAPPABLE | + PIN_NONBLOCK); + if (IS_ERR(vma)) + vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, flags); + if (IS_ERR(vma)) + goto err_unpin_global; + + vma->display_alignment = max_t(u64, vma->display_alignment, alignment); + + __i915_gem_object_flush_for_display(obj); + + /* It should now be out of any other write domains, and we can update + * the domain values for our changes. + */ + obj->read_domains |= I915_GEM_DOMAIN_GTT; + + return vma; + +err_unpin_global: + obj->pin_global--; + return vma; +} + +static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct list_head *list; + struct i915_vma *vma; + + GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); + + mutex_lock(&i915->ggtt.vm.mutex); + for_each_ggtt_vma(vma, obj) { + if (!drm_mm_node_allocated(&vma->node)) + continue; + + list_move_tail(&vma->vm_link, &vma->vm->bound_list); + } + mutex_unlock(&i915->ggtt.vm.mutex); + + spin_lock(&i915->mm.obj_lock); + list = obj->bind_count ? &i915->mm.bound_list : &i915->mm.unbound_list; + list_move_tail(&obj->mm.link, list); + spin_unlock(&i915->mm.obj_lock); +} + +void +i915_gem_object_unpin_from_display_plane(struct i915_vma *vma) +{ + lockdep_assert_held(&vma->vm->i915->drm.struct_mutex); + + if (WARN_ON(vma->obj->pin_global == 0)) + return; + + if (--vma->obj->pin_global == 0) + vma->display_alignment = I915_GTT_MIN_ALIGNMENT; + + /* Bump the LRU to try and avoid premature eviction whilst flipping */ + i915_gem_object_bump_inactive_ggtt(vma->obj); + + i915_vma_unpin(vma); +} + +/** + * Moves a single object to the CPU read, and possibly write domain. + * @obj: object to act on + * @write: requesting write or read-only access + * + * This function returns when the move is complete, including waiting on + * flushes to occur. + */ +int +i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) +{ + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + (write ? I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + + /* Flush the CPU cache if it's still invalid. */ + if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { + i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + obj->read_domains |= I915_GEM_DOMAIN_CPU; + } + + /* It should now be out of any other write domains, and we can update + * the domain values for our changes. + */ + GEM_BUG_ON(obj->write_domain & ~I915_GEM_DOMAIN_CPU); + + /* If we're writing through the CPU, then the GPU read domains will + * need to be invalidated at next use. + */ + if (write) + __start_cpu_write(obj); + + return 0; +} + +static inline enum fb_op_origin +fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) +{ + return (domain == I915_GEM_DOMAIN_GTT ? + obj->frontbuffer_ggtt_origin : ORIGIN_CPU); +} + +/** + * Called when user space prepares to use an object with the CPU, either + * through the mmap ioctl's mapping or a GTT mapping. + * @dev: drm device + * @data: ioctl data blob + * @file: drm file + */ +int +i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_set_domain *args = data; + struct drm_i915_gem_object *obj; + u32 read_domains = args->read_domains; + u32 write_domain = args->write_domain; + int err; + + /* Only handle setting domains to types used by the CPU. */ + if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS) + return -EINVAL; + + /* + * Having something in the write domain implies it's in the read + * domain, and only that read domain. Enforce that in the request. + */ + if (write_domain && read_domains != write_domain) + return -EINVAL; + + if (!read_domains) + return 0; + + obj = i915_gem_object_lookup(file, args->handle); + if (!obj) + return -ENOENT; + + /* + * Already in the desired write domain? Nothing for us to do! + * + * We apply a little bit of cunning here to catch a broader set of + * no-ops. If obj->write_domain is set, we must be in the same + * obj->read_domains, and only that domain. Therefore, if that + * obj->write_domain matches the request read_domains, we are + * already in the same read/write domain and can skip the operation, + * without having to further check the requested write_domain. + */ + if (READ_ONCE(obj->write_domain) == read_domains) { + err = 0; + goto out; + } + + /* + * Try to flush the object off the GPU without holding the lock. + * We will repeat the flush holding the lock in the normal manner + * to catch cases where we are gazumped. + */ + err = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_PRIORITY | + (write_domain ? I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); + if (err) + goto out; + + /* + * Proxy objects do not control access to the backing storage, ergo + * they cannot be used as a means to manipulate the cache domain + * tracking for that backing storage. The proxy object is always + * considered to be outside of any cache domain. + */ + if (i915_gem_object_is_proxy(obj)) { + err = -ENXIO; + goto out; + } + + /* + * Flush and acquire obj->pages so that we are coherent through + * direct access in memory with previous cached writes through + * shmemfs and that our cache domain tracking remains valid. + * For example, if the obj->filp was moved to swap without us + * being notified and releasing the pages, we would mistakenly + * continue to assume that the obj remained out of the CPU cached + * domain. + */ + err = i915_gem_object_pin_pages(obj); + if (err) + goto out; + + err = i915_mutex_lock_interruptible(dev); + if (err) + goto out_unpin; + + if (read_domains & I915_GEM_DOMAIN_WC) + err = i915_gem_object_set_to_wc_domain(obj, write_domain); + else if (read_domains & I915_GEM_DOMAIN_GTT) + err = i915_gem_object_set_to_gtt_domain(obj, write_domain); + else + err = i915_gem_object_set_to_cpu_domain(obj, write_domain); + + /* And bump the LRU for this access */ + i915_gem_object_bump_inactive_ggtt(obj); + + mutex_unlock(&dev->struct_mutex); + + if (write_domain != 0) + intel_fb_obj_invalidate(obj, + fb_write_origin(obj, write_domain)); + +out_unpin: + i915_gem_object_unpin_pages(obj); +out: + i915_gem_object_put(obj); + return err; +} + +/* + * Pins the specified object's pages and synchronizes the object with + * GPU accesses. Sets needs_clflush to non-zero if the caller should + * flush the object from the CPU cache. + */ +int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, + unsigned int *needs_clflush) +{ + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + *needs_clflush = 0; + if (!i915_gem_object_has_struct_page(obj)) + return -ENODEV; + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED, + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + ret = i915_gem_object_pin_pages(obj); + if (ret) + return ret; + + if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || + !static_cpu_has(X86_FEATURE_CLFLUSH)) { + ret = i915_gem_object_set_to_cpu_domain(obj, false); + if (ret) + goto err_unpin; + else + goto out; + } + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + + /* If we're not in the cpu read domain, set ourself into the gtt + * read domain and manually flush cachelines (if required). This + * optimizes for the case when the gpu will dirty the data + * anyway again before the next pread happens. + */ + if (!obj->cache_dirty && + !(obj->read_domains & I915_GEM_DOMAIN_CPU)) + *needs_clflush = CLFLUSH_BEFORE; + +out: + /* return with the pages pinned */ + return 0; + +err_unpin: + i915_gem_object_unpin_pages(obj); + return ret; +} + +int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, + unsigned int *needs_clflush) +{ + int ret; + + lockdep_assert_held(&obj->base.dev->struct_mutex); + + *needs_clflush = 0; + if (!i915_gem_object_has_struct_page(obj)) + return -ENODEV; + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_LOCKED | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + if (ret) + return ret; + + ret = i915_gem_object_pin_pages(obj); + if (ret) + return ret; + + if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || + !static_cpu_has(X86_FEATURE_CLFLUSH)) { + ret = i915_gem_object_set_to_cpu_domain(obj, true); + if (ret) + goto err_unpin; + else + goto out; + } + + i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + + /* If we're not in the cpu write domain, set ourself into the + * gtt write domain and manually flush cachelines (as required). + * This optimizes for the case when the gpu will use the data + * right away and we therefore have to clflush anyway. + */ + if (!obj->cache_dirty) { + *needs_clflush |= CLFLUSH_AFTER; + + /* + * Same trick applies to invalidate partially written + * cachelines read before writing. + */ + if (!(obj->read_domains & I915_GEM_DOMAIN_CPU)) + *needs_clflush |= CLFLUSH_BEFORE; + } + +out: + intel_fb_obj_invalidate(obj, ORIGIN_CPU); + obj->mm.dirty = true; + /* return with the pages pinned */ + return 0; + +err_unpin: + i915_gem_object_unpin_pages(obj); + return ret; +} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 07f487cbff79..8cf082abb0ab 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -15,6 +15,8 @@ #include "i915_gem_object_types.h" +#include "i915_gem_gtt.h" + void i915_gem_init__objects(struct drm_i915_private *i915); struct drm_i915_gem_object *i915_gem_object_alloc(void); @@ -358,6 +360,20 @@ void i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains); +int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, + unsigned int *needs_clflush); +int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, + unsigned int *needs_clflush); +#define CLFLUSH_BEFORE BIT(0) +#define CLFLUSH_AFTER BIT(1) +#define CLFLUSH_FLAGS (CLFLUSH_BEFORE | CLFLUSH_AFTER) + +static inline void +i915_gem_object_finish_access(struct drm_i915_gem_object *obj) +{ + i915_gem_object_unpin_pages(obj); +} + static inline struct intel_engine_cs * i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj) { @@ -379,6 +395,19 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, unsigned int cache_level); void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); +int __must_check +i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); +int __must_check +i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); +int __must_check +i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); +struct i915_vma * __must_check +i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, + u32 alignment, + const struct i915_ggtt_view *view, + unsigned int flags); +void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); + static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) { if (obj->cache_dirty) diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c index 7584bf0aeaa4..e3608b170105 100644 --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c @@ -1764,7 +1764,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) goto err_free_bb; } - ret = i915_gem_obj_prepare_shmem_write(bb->obj, &bb->clflush); + ret = i915_gem_object_prepare_write(bb->obj, &bb->clflush); if (ret) goto err_free_obj; @@ -1813,7 +1813,7 @@ static int perform_bb_shadow(struct parser_exec_state *s) err_unmap: i915_gem_object_unpin_map(bb->obj); err_finish_shmem_access: - i915_gem_obj_finish_shmem_access(bb->obj); + i915_gem_object_finish_access(bb->obj); err_free_obj: i915_gem_object_put(bb->obj); err_free_bb: diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index 38897d241f5f..3a691447f76c 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -482,7 +482,7 @@ static int prepare_shadow_batch_buffer(struct intel_vgpu_workload *workload) bb->obj->base.size); bb->clflush &= ~CLFLUSH_AFTER; } - i915_gem_obj_finish_shmem_access(bb->obj); + i915_gem_object_finish_access(bb->obj); bb->accessing = false; } else { @@ -510,7 +510,7 @@ static int prepare_shadow_batch_buffer(struct intel_vgpu_workload *workload) if (ret) goto err; - i915_gem_obj_finish_shmem_access(bb->obj); + i915_gem_object_finish_access(bb->obj); bb->accessing = false; ret = i915_vma_move_to_active(bb->vma, @@ -588,7 +588,7 @@ static void release_shadow_batch_buffer(struct intel_vgpu_workload *workload) list_for_each_entry_safe(bb, pos, &workload->shadow_bb, list) { if (bb->obj) { if (bb->accessing) - i915_gem_obj_finish_shmem_access(bb->obj); + i915_gem_object_finish_access(bb->obj); if (bb->va && !IS_ERR(bb->va)) i915_gem_object_unpin_map(bb->obj); diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index e9fadcb4d592..c893bd4eb2c8 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1058,11 +1058,11 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, void *dst, *src; int ret; - ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush); + ret = i915_gem_object_prepare_read(src_obj, &src_needs_clflush); if (ret) return ERR_PTR(ret); - ret = i915_gem_obj_prepare_shmem_write(dst_obj, &dst_needs_clflush); + ret = i915_gem_object_prepare_write(dst_obj, &dst_needs_clflush); if (ret) { dst = ERR_PTR(ret); goto unpin_src; @@ -1120,9 +1120,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, *needs_clflush_after = dst_needs_clflush & CLFLUSH_AFTER; unpin_dst: - i915_gem_obj_finish_shmem_access(dst_obj); + i915_gem_object_finish_access(dst_obj); unpin_src: - i915_gem_obj_finish_shmem_access(src_obj); + i915_gem_object_finish_access(src_obj); return dst; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c5b2e39a88a4..91689a23d9e6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2803,20 +2803,6 @@ static inline int __sg_page_count(const struct scatterlist *sg) return sg->length >> PAGE_SHIFT; } -int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, - unsigned int *needs_clflush); -int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, - unsigned int *needs_clflush); -#define CLFLUSH_BEFORE BIT(0) -#define CLFLUSH_AFTER BIT(1) -#define CLFLUSH_FLAGS (CLFLUSH_BEFORE | CLFLUSH_AFTER) - -static inline void -i915_gem_obj_finish_shmem_access(struct drm_i915_gem_object *obj) -{ - i915_gem_object_unpin_pages(obj); -} - static inline int __must_check i915_mutex_lock_interruptible(struct drm_device *dev) { @@ -2879,18 +2865,6 @@ int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, const struct i915_sched_attr *attr); #define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX) -int __must_check -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); -struct i915_vma * __must_check -i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, - u32 alignment, - const struct i915_ggtt_view *view, - unsigned int flags); -void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file); void i915_gem_release(struct drm_device *dev, struct drm_file *file); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2ed3dadfcc16..92e555e1c1ba 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -462,123 +462,6 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv) } } -/* - * Pins the specified object's pages and synchronizes the object with - * GPU accesses. Sets needs_clflush to non-zero if the caller should - * flush the object from the CPU cache. - */ -int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, - unsigned int *needs_clflush) -{ - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - *needs_clflush = 0; - if (!i915_gem_object_has_struct_page(obj)) - return -ENODEV; - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; - - if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, false); - if (ret) - goto err_unpin; - else - goto out; - } - - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - - /* If we're not in the cpu read domain, set ourself into the gtt - * read domain and manually flush cachelines (if required). This - * optimizes for the case when the gpu will dirty the data - * anyway again before the next pread happens. - */ - if (!obj->cache_dirty && - !(obj->read_domains & I915_GEM_DOMAIN_CPU)) - *needs_clflush = CLFLUSH_BEFORE; - -out: - /* return with the pages pinned */ - return 0; - -err_unpin: - i915_gem_object_unpin_pages(obj); - return ret; -} - -int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, - unsigned int *needs_clflush) -{ - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - *needs_clflush = 0; - if (!i915_gem_object_has_struct_page(obj)) - return -ENODEV; - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; - - if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, true); - if (ret) - goto err_unpin; - else - goto out; - } - - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - - /* If we're not in the cpu write domain, set ourself into the - * gtt write domain and manually flush cachelines (as required). - * This optimizes for the case when the gpu will use the data - * right away and we therefore have to clflush anyway. - */ - if (!obj->cache_dirty) { - *needs_clflush |= CLFLUSH_AFTER; - - /* - * Same trick applies to invalidate partially written - * cachelines read before writing. - */ - if (!(obj->read_domains & I915_GEM_DOMAIN_CPU)) - *needs_clflush |= CLFLUSH_BEFORE; - } - -out: - intel_fb_obj_invalidate(obj, ORIGIN_CPU); - obj->mm.dirty = true; - /* return with the pages pinned */ - return 0; - -err_unpin: - i915_gem_object_unpin_pages(obj); - return ret; -} - static int shmem_pread(struct page *page, int offset, int len, char __user *user_data, bool needs_clflush) @@ -612,7 +495,7 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj, if (ret) return ret; - ret = i915_gem_obj_prepare_shmem_read(obj, &needs_clflush); + ret = i915_gem_object_prepare_read(obj, &needs_clflush); mutex_unlock(&obj->base.dev->struct_mutex); if (ret) return ret; @@ -634,7 +517,7 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj, offset = 0; } - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return ret; } @@ -1009,7 +892,7 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj, if (ret) return ret; - ret = i915_gem_obj_prepare_shmem_write(obj, &needs_clflush); + ret = i915_gem_object_prepare_write(obj, &needs_clflush); mutex_unlock(&i915->drm.struct_mutex); if (ret) return ret; @@ -1041,7 +924,7 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj, } intel_fb_obj_flush(obj, ORIGIN_CPU); - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return ret; } @@ -1130,150 +1013,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, return ret; } -static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) -{ - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct list_head *list; - struct i915_vma *vma; - - GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj)); - - mutex_lock(&i915->ggtt.vm.mutex); - for_each_ggtt_vma(vma, obj) { - if (!drm_mm_node_allocated(&vma->node)) - continue; - - list_move_tail(&vma->vm_link, &vma->vm->bound_list); - } - mutex_unlock(&i915->ggtt.vm.mutex); - - spin_lock(&i915->mm.obj_lock); - list = obj->bind_count ? &i915->mm.bound_list : &i915->mm.unbound_list; - list_move_tail(&obj->mm.link, list); - spin_unlock(&i915->mm.obj_lock); -} - -static inline enum fb_op_origin -fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain) -{ - return (domain == I915_GEM_DOMAIN_GTT ? - obj->frontbuffer_ggtt_origin : ORIGIN_CPU); -} - -/** - * Called when user space prepares to use an object with the CPU, either - * through the mmap ioctl's mapping or a GTT mapping. - * @dev: drm device - * @data: ioctl data blob - * @file: drm file - */ -int -i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_gem_set_domain *args = data; - struct drm_i915_gem_object *obj; - u32 read_domains = args->read_domains; - u32 write_domain = args->write_domain; - int err; - - /* Only handle setting domains to types used by the CPU. */ - if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS) - return -EINVAL; - - /* - * Having something in the write domain implies it's in the read - * domain, and only that read domain. Enforce that in the request. - */ - if (write_domain && read_domains != write_domain) - return -EINVAL; - - if (!read_domains) - return 0; - - obj = i915_gem_object_lookup(file, args->handle); - if (!obj) - return -ENOENT; - - /* - * Already in the desired write domain? Nothing for us to do! - * - * We apply a little bit of cunning here to catch a broader set of - * no-ops. If obj->write_domain is set, we must be in the same - * obj->read_domains, and only that domain. Therefore, if that - * obj->write_domain matches the request read_domains, we are - * already in the same read/write domain and can skip the operation, - * without having to further check the requested write_domain. - */ - if (READ_ONCE(obj->write_domain) == read_domains) { - err = 0; - goto out; - } - - /* - * Try to flush the object off the GPU without holding the lock. - * We will repeat the flush holding the lock in the normal manner - * to catch cases where we are gazumped. - */ - err = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - (write_domain ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (err) - goto out; - - /* - * Proxy objects do not control access to the backing storage, ergo - * they cannot be used as a means to manipulate the cache domain - * tracking for that backing storage. The proxy object is always - * considered to be outside of any cache domain. - */ - if (i915_gem_object_is_proxy(obj)) { - err = -ENXIO; - goto out; - } - - /* - * Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - err = i915_gem_object_pin_pages(obj); - if (err) - goto out; - - err = i915_mutex_lock_interruptible(dev); - if (err) - goto out_unpin; - - if (read_domains & I915_GEM_DOMAIN_WC) - err = i915_gem_object_set_to_wc_domain(obj, write_domain); - else if (read_domains & I915_GEM_DOMAIN_GTT) - err = i915_gem_object_set_to_gtt_domain(obj, write_domain); - else - err = i915_gem_object_set_to_cpu_domain(obj, write_domain); - - /* And bump the LRU for this access */ - i915_gem_object_bump_inactive_ggtt(obj); - - mutex_unlock(&dev->struct_mutex); - - if (write_domain != 0) - intel_fb_obj_invalidate(obj, - fb_write_origin(obj, write_domain)); - -out_unpin: - i915_gem_object_unpin_pages(obj); -out: - i915_gem_object_put(obj); - return err; -} - /** * Called when user space has done writes to this buffer * @dev: drm device @@ -1542,514 +1281,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, return 0; } -static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj) -{ - /* - * We manually flush the CPU domain so that we can override and - * force the flush for the display, and perform it asyncrhonously. - */ - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - if (obj->cache_dirty) - i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE); - obj->write_domain = 0; -} - -void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj) -{ - if (!READ_ONCE(obj->pin_global)) - return; - - mutex_lock(&obj->base.dev->struct_mutex); - __i915_gem_object_flush_for_display(obj); - mutex_unlock(&obj->base.dev->struct_mutex); -} - -/** - * Moves a single object to the WC read, and possibly write domain. - * @obj: object to act on - * @write: ask for write access or read only - * - * This function returns when the move is complete, including waiting on - * flushes to occur. - */ -int -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) -{ - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - if (obj->write_domain == I915_GEM_DOMAIN_WC) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; - - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); - - /* Serialise direct access to this object with the barriers for - * coherent writes from the GPU, by effectively invalidating the - * WC domain upon first access. - */ - if ((obj->read_domains & I915_GEM_DOMAIN_WC) == 0) - mb(); - - /* It should now be out of any other write domains, and we can update - * the domain values for our changes. - */ - GEM_BUG_ON((obj->write_domain & ~I915_GEM_DOMAIN_WC) != 0); - obj->read_domains |= I915_GEM_DOMAIN_WC; - if (write) { - obj->read_domains = I915_GEM_DOMAIN_WC; - obj->write_domain = I915_GEM_DOMAIN_WC; - obj->mm.dirty = true; - } - - i915_gem_object_unpin_pages(obj); - return 0; -} - -/** - * Moves a single object to the GTT read, and possibly write domain. - * @obj: object to act on - * @write: ask for write access or read only - * - * This function returns when the move is complete, including waiting on - * flushes to occur. - */ -int -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) -{ - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - if (obj->write_domain == I915_GEM_DOMAIN_GTT) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; - - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); - - /* Serialise direct access to this object with the barriers for - * coherent writes from the GPU, by effectively invalidating the - * GTT domain upon first access. - */ - if ((obj->read_domains & I915_GEM_DOMAIN_GTT) == 0) - mb(); - - /* It should now be out of any other write domains, and we can update - * the domain values for our changes. - */ - GEM_BUG_ON((obj->write_domain & ~I915_GEM_DOMAIN_GTT) != 0); - obj->read_domains |= I915_GEM_DOMAIN_GTT; - if (write) { - obj->read_domains = I915_GEM_DOMAIN_GTT; - obj->write_domain = I915_GEM_DOMAIN_GTT; - obj->mm.dirty = true; - } - - i915_gem_object_unpin_pages(obj); - return 0; -} - -/** - * Changes the cache-level of an object across all VMA. - * @obj: object to act on - * @cache_level: new cache level to set for the object - * - * After this function returns, the object will be in the new cache-level - * across all GTT and the contents of the backing storage will be coherent, - * with respect to the new cache-level. In order to keep the backing storage - * coherent for all users, we only allow a single cache level to be set - * globally on the object and prevent it from being changed whilst the - * hardware is reading from the object. That is if the object is currently - * on the scanout it will be set to uncached (or equivalent display - * cache coherency) and all non-MOCS GPU access will also be uncached so - * that all direct access to the scanout remains coherent. - */ -int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, - enum i915_cache_level cache_level) -{ - struct i915_vma *vma; - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - if (obj->cache_level == cache_level) - return 0; - - /* Inspect the list of currently bound VMA and unbind any that would - * be invalid given the new cache-level. This is principally to - * catch the issue of the CS prefetch crossing page boundaries and - * reading an invalid PTE on older architectures. - */ -restart: - list_for_each_entry(vma, &obj->vma.list, obj_link) { - if (!drm_mm_node_allocated(&vma->node)) - continue; - - if (i915_vma_is_pinned(vma)) { - DRM_DEBUG("can not change the cache level of pinned objects\n"); - return -EBUSY; - } - - if (!i915_vma_is_closed(vma) && - i915_gem_valid_gtt_space(vma, cache_level)) - continue; - - ret = i915_vma_unbind(vma); - if (ret) - return ret; - - /* As unbinding may affect other elements in the - * obj->vma_list (due to side-effects from retiring - * an active vma), play safe and restart the iterator. - */ - goto restart; - } - - /* We can reuse the existing drm_mm nodes but need to change the - * cache-level on the PTE. We could simply unbind them all and - * rebind with the correct cache-level on next use. However since - * we already have a valid slot, dma mapping, pages etc, we may as - * rewrite the PTE in the belief that doing so tramples upon less - * state and so involves less work. - */ - if (obj->bind_count) { - /* Before we change the PTE, the GPU must not be accessing it. - * If we wait upon the object, we know that all the bound - * VMA are no longer active. - */ - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - if (!HAS_LLC(to_i915(obj->base.dev)) && - cache_level != I915_CACHE_NONE) { - /* Access to snoopable pages through the GTT is - * incoherent and on some machines causes a hard - * lockup. Relinquish the CPU mmaping to force - * userspace to refault in the pages and we can - * then double check if the GTT mapping is still - * valid for that pointer access. - */ - i915_gem_object_release_mmap(obj); - - /* As we no longer need a fence for GTT access, - * we can relinquish it now (and so prevent having - * to steal a fence from someone else on the next - * fence request). Note GPU activity would have - * dropped the fence as all snoopable access is - * supposed to be linear. - */ - for_each_ggtt_vma(vma, obj) { - ret = i915_vma_put_fence(vma); - if (ret) - return ret; - } - } else { - /* We either have incoherent backing store and - * so no GTT access or the architecture is fully - * coherent. In such cases, existing GTT mmaps - * ignore the cache bit in the PTE and we can - * rewrite it without confusing the GPU or having - * to force userspace to fault back in its mmaps. - */ - } - - list_for_each_entry(vma, &obj->vma.list, obj_link) { - if (!drm_mm_node_allocated(&vma->node)) - continue; - - ret = i915_vma_bind(vma, cache_level, PIN_UPDATE); - if (ret) - return ret; - } - } - - list_for_each_entry(vma, &obj->vma.list, obj_link) - vma->node.color = cache_level; - i915_gem_object_set_cache_coherency(obj, cache_level); - obj->cache_dirty = true; /* Always invalidate stale cachelines */ - - return 0; -} - -int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_gem_caching *args = data; - struct drm_i915_gem_object *obj; - int err = 0; - - rcu_read_lock(); - obj = i915_gem_object_lookup_rcu(file, args->handle); - if (!obj) { - err = -ENOENT; - goto out; - } - - switch (obj->cache_level) { - case I915_CACHE_LLC: - case I915_CACHE_L3_LLC: - args->caching = I915_CACHING_CACHED; - break; - - case I915_CACHE_WT: - args->caching = I915_CACHING_DISPLAY; - break; - - default: - args->caching = I915_CACHING_NONE; - break; - } -out: - rcu_read_unlock(); - return err; -} - -int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_private *i915 = to_i915(dev); - struct drm_i915_gem_caching *args = data; - struct drm_i915_gem_object *obj; - enum i915_cache_level level; - int ret = 0; - - switch (args->caching) { - case I915_CACHING_NONE: - level = I915_CACHE_NONE; - break; - case I915_CACHING_CACHED: - /* - * Due to a HW issue on BXT A stepping, GPU stores via a - * snooped mapping may leave stale data in a corresponding CPU - * cacheline, whereas normally such cachelines would get - * invalidated. - */ - if (!HAS_LLC(i915) && !HAS_SNOOP(i915)) - return -ENODEV; - - level = I915_CACHE_LLC; - break; - case I915_CACHING_DISPLAY: - level = HAS_WT(i915) ? I915_CACHE_WT : I915_CACHE_NONE; - break; - default: - return -EINVAL; - } - - obj = i915_gem_object_lookup(file, args->handle); - if (!obj) - return -ENOENT; - - /* - * The caching mode of proxy object is handled by its generator, and - * not allowed to be changed by userspace. - */ - if (i915_gem_object_is_proxy(obj)) { - ret = -ENXIO; - goto out; - } - - if (obj->cache_level == level) - goto out; - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); - if (ret) - goto out; - - ret = i915_mutex_lock_interruptible(dev); - if (ret) - goto out; - - ret = i915_gem_object_set_cache_level(obj, level); - mutex_unlock(&dev->struct_mutex); - -out: - i915_gem_object_put(obj); - return ret; -} - -/* - * Prepare buffer for display plane (scanout, cursors, etc). Can be called from - * an uninterruptible phase (modesetting) and allows any flushes to be pipelined - * (for pageflips). We only flush the caches while preparing the buffer for - * display, the callers are responsible for frontbuffer flush. - */ -struct i915_vma * -i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, - u32 alignment, - const struct i915_ggtt_view *view, - unsigned int flags) -{ - struct i915_vma *vma; - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - /* Mark the global pin early so that we account for the - * display coherency whilst setting up the cache domains. - */ - obj->pin_global++; - - /* The display engine is not coherent with the LLC cache on gen6. As - * a result, we make sure that the pinning that is about to occur is - * done with uncached PTEs. This is lowest common denominator for all - * chipsets. - * - * However for gen6+, we could do better by using the GFDT bit instead - * of uncaching, which would allow us to flush all the LLC-cached data - * with that bit in the PTE to main memory with just one PIPE_CONTROL. - */ - ret = i915_gem_object_set_cache_level(obj, - HAS_WT(to_i915(obj->base.dev)) ? - I915_CACHE_WT : I915_CACHE_NONE); - if (ret) { - vma = ERR_PTR(ret); - goto err_unpin_global; - } - - /* As the user may map the buffer once pinned in the display plane - * (e.g. libkms for the bootup splash), we have to ensure that we - * always use map_and_fenceable for all scanout buffers. However, - * it may simply be too big to fit into mappable, in which case - * put it anyway and hope that userspace can cope (but always first - * try to preserve the existing ABI). - */ - vma = ERR_PTR(-ENOSPC); - if ((flags & PIN_MAPPABLE) == 0 && - (!view || view->type == I915_GGTT_VIEW_NORMAL)) - vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, - flags | - PIN_MAPPABLE | - PIN_NONBLOCK); - if (IS_ERR(vma)) - vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, flags); - if (IS_ERR(vma)) - goto err_unpin_global; - - vma->display_alignment = max_t(u64, vma->display_alignment, alignment); - - __i915_gem_object_flush_for_display(obj); - - /* It should now be out of any other write domains, and we can update - * the domain values for our changes. - */ - obj->read_domains |= I915_GEM_DOMAIN_GTT; - - return vma; - -err_unpin_global: - obj->pin_global--; - return vma; -} - -void -i915_gem_object_unpin_from_display_plane(struct i915_vma *vma) -{ - lockdep_assert_held(&vma->vm->i915->drm.struct_mutex); - - if (WARN_ON(vma->obj->pin_global == 0)) - return; - - if (--vma->obj->pin_global == 0) - vma->display_alignment = I915_GTT_MIN_ALIGNMENT; - - /* Bump the LRU to try and avoid premature eviction whilst flipping */ - i915_gem_object_bump_inactive_ggtt(vma->obj); - - i915_vma_unpin(vma); -} - -/** - * Moves a single object to the CPU read, and possibly write domain. - * @obj: object to act on - * @write: requesting write or read-only access - * - * This function returns when the move is complete, including waiting on - * flushes to occur. - */ -int -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) -{ - int ret; - - lockdep_assert_held(&obj->base.dev->struct_mutex); - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - - i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - - /* Flush the CPU cache if it's still invalid. */ - if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); - obj->read_domains |= I915_GEM_DOMAIN_CPU; - } - - /* It should now be out of any other write domains, and we can update - * the domain values for our changes. - */ - GEM_BUG_ON(obj->write_domain & ~I915_GEM_DOMAIN_CPU); - - /* If we're writing through the CPU, then the GPU read domains will - * need to be invalidated at next use. - */ - if (write) - __start_cpu_write(obj); - - return 0; -} - /* Throttle our rendering by waiting until the ring has completed our requests * emitted over 20 msec ago. * diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 908fddcc57c3..699f3f180d8a 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1026,7 +1026,7 @@ static void reloc_cache_reset(struct reloc_cache *cache) mb(); kunmap_atomic(vaddr); - i915_gem_obj_finish_shmem_access((struct drm_i915_gem_object *)cache->node.mm); + i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm); } else { wmb(); io_mapping_unmap_atomic((void __iomem *)vaddr); @@ -1058,7 +1058,7 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj, unsigned int flushes; int err; - err = i915_gem_obj_prepare_shmem_write(obj, &flushes); + err = i915_gem_object_prepare_write(obj, &flushes); if (err) return ERR_PTR(err); diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 9440024c763f..f3b42b026fff 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -84,7 +84,7 @@ static int render_state_setup(struct intel_render_state *so, u32 *d; int ret; - ret = i915_gem_obj_prepare_shmem_write(so->obj, &needs_clflush); + ret = i915_gem_object_prepare_write(so->obj, &needs_clflush); if (ret) return ret; @@ -166,7 +166,7 @@ static int render_state_setup(struct intel_render_state *so, ret = 0; out: - i915_gem_obj_finish_shmem_access(so->obj); + i915_gem_object_finish_access(so->obj); return ret; err: diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c index ce4ec87698f6..b22b8249dfbd 100644 --- a/drivers/gpu/drm/i915/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c @@ -1017,7 +1017,7 @@ static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val) unsigned long n; int err; - err = i915_gem_obj_prepare_shmem_read(obj, &needs_flush); + err = i915_gem_object_prepare_read(obj, &needs_flush); if (err) return err; @@ -1038,7 +1038,7 @@ static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val) kunmap_atomic(ptr); } - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return err; } diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c index 046a38743152..cb25b5fc8027 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c @@ -37,7 +37,7 @@ static int cpu_set(struct drm_i915_gem_object *obj, u32 *cpu; int err; - err = i915_gem_obj_prepare_shmem_write(obj, &needs_clflush); + err = i915_gem_object_prepare_write(obj, &needs_clflush); if (err) return err; @@ -54,7 +54,7 @@ static int cpu_set(struct drm_i915_gem_object *obj, drm_clflush_virt_range(cpu, sizeof(*cpu)); kunmap_atomic(map); - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return 0; } @@ -69,7 +69,7 @@ static int cpu_get(struct drm_i915_gem_object *obj, u32 *cpu; int err; - err = i915_gem_obj_prepare_shmem_read(obj, &needs_clflush); + err = i915_gem_object_prepare_read(obj, &needs_clflush); if (err) return err; @@ -83,7 +83,7 @@ static int cpu_get(struct drm_i915_gem_object *obj, *v = *cpu; kunmap_atomic(map); - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return 0; } diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c index 34ac5cc6d59f..c69c6d9a998b 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c @@ -354,7 +354,7 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value) unsigned int n, m, need_flush; int err; - err = i915_gem_obj_prepare_shmem_write(obj, &need_flush); + err = i915_gem_object_prepare_write(obj, &need_flush); if (err) return err; @@ -369,7 +369,7 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value) kunmap_atomic(map); } - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); obj->read_domains = I915_GEM_DOMAIN_GTT | I915_GEM_DOMAIN_CPU; obj->write_domain = 0; return 0; @@ -381,7 +381,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj, unsigned int n, m, needs_flush; int err; - err = i915_gem_obj_prepare_shmem_read(obj, &needs_flush); + err = i915_gem_object_prepare_read(obj, &needs_flush); if (err) return err; @@ -419,7 +419,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj, break; } - i915_gem_obj_finish_shmem_access(obj); + i915_gem_object_finish_access(obj); return err; } From patchwork Wed May 22 08:20:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955287 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EF81912 for ; Wed, 22 May 2019 08:21:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0C9A028B5B for ; Wed, 22 May 2019 08:21:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 008B828A48; Wed, 22 May 2019 08:21:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 17C1328A31 for ; Wed, 22 May 2019 08:21:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5477189803; Wed, 22 May 2019 08:21:18 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1A2A0897FB for ; Wed, 22 May 2019 08:21:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636788-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:20:59 +0100 Message-Id: <20190522082104.5117-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 08/13] drm/i915: Move more GEM objects under gem/ X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the theme of separating out the GEM clutter. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Makefile | 26 +++++------ drivers/gpu/drm/i915/Makefile.header-test | 2 - .../gpu/drm/i915/{ => gem}/i915_gem_clflush.c | 24 ++-------- drivers/gpu/drm/i915/gem/i915_gem_clflush.h | 20 +++++++++ .../gpu/drm/i915/{ => gem}/i915_gem_context.c | 27 ++--------- .../gpu/drm/i915/{ => gem}/i915_gem_context.h | 22 +-------- .../i915/{ => gem}/i915_gem_context_types.h | 0 .../gpu/drm/i915/{ => gem}/i915_gem_dmabuf.c | 27 ++--------- .../drm/i915/{ => gem}/i915_gem_execbuffer.c | 30 +++---------- .../drm/i915/{ => gem}/i915_gem_internal.c | 30 +++++-------- drivers/gpu/drm/i915/gem/i915_gem_object.c | 10 ++++- drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.c | 2 +- drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.h | 0 .../drm/i915/{ => gem}/i915_gem_shrinker.c | 23 +--------- .../gpu/drm/i915/{ => gem}/i915_gem_stolen.c | 31 +++---------- .../gpu/drm/i915/{ => gem}/i915_gem_tiling.c | 30 +++---------- .../gpu/drm/i915/{ => gem}/i915_gem_userptr.c | 27 ++--------- drivers/gpu/drm/i915/{ => gem}/i915_gemfs.c | 22 +-------- drivers/gpu/drm/i915/gem/i915_gemfs.h | 16 +++++++ .../{ => gem}/selftests/huge_gem_object.c | 22 +-------- .../drm/i915/gem/selftests/huge_gem_object.h | 27 +++++++++++ .../drm/i915/{ => gem}/selftests/huge_pages.c | 35 +++++---------- .../{ => gem}/selftests/i915_gem_coherency.c | 26 ++--------- .../{ => gem}/selftests/i915_gem_context.c | 40 +++++------------ .../{ => gem}/selftests/i915_gem_dmabuf.c | 26 ++--------- .../drm/i915/gem/selftests/i915_gem_mman.c | 2 +- .../{ => gem}/selftests/i915_gem_object.c | 28 +++--------- .../i915/{ => gem}/selftests/igt_gem_utils.c | 6 +-- .../i915/{ => gem}/selftests/igt_gem_utils.h | 0 .../i915/{ => gem}/selftests/mock_context.c | 24 ++-------- .../gpu/drm/i915/gem/selftests/mock_context.h | 24 ++++++++++ .../i915/{ => gem}/selftests/mock_dmabuf.c | 22 +-------- .../gpu/drm/i915/gem/selftests/mock_dmabuf.h | 22 +++++++++ .../{ => gem}/selftests/mock_gem_object.h | 7 ++- drivers/gpu/drm/i915/gt/intel_context.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 ++ drivers/gpu/drm/i915/gt/intel_lrc.c | 2 + drivers/gpu/drm/i915/gt/intel_lrc.h | 14 +++--- drivers/gpu/drm/i915/gt/intel_reset.c | 2 + drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 3 ++ drivers/gpu/drm/i915/gt/mock_engine.c | 3 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 7 +-- drivers/gpu/drm/i915/gt/selftest_lrc.c | 7 ++- .../gpu/drm/i915/gt/selftest_workarounds.c | 6 ++- drivers/gpu/drm/i915/gvt/mmio_context.c | 1 + drivers/gpu/drm/i915/gvt/scheduler.c | 5 ++- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_drv.c | 1 + drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gem.c | 11 ++--- drivers/gpu/drm/i915/i915_gem_clflush.h | 36 --------------- drivers/gpu/drm/i915/i915_gem_evict.c | 2 + drivers/gpu/drm/i915/i915_gemfs.h | 34 -------------- drivers/gpu/drm/i915/i915_globals.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 + drivers/gpu/drm/i915/i915_perf.c | 2 + drivers/gpu/drm/i915/i915_request.c | 3 ++ drivers/gpu/drm/i915/intel_display.c | 1 - drivers/gpu/drm/i915/intel_guc_submission.c | 2 + drivers/gpu/drm/i915/intel_overlay.c | 2 + .../gpu/drm/i915/selftests/huge_gem_object.h | 45 ------------------- drivers/gpu/drm/i915/selftests/i915_active.c | 4 +- drivers/gpu/drm/i915/selftests/i915_gem.c | 8 ++-- .../gpu/drm/i915/selftests/i915_gem_evict.c | 8 ++-- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 5 ++- drivers/gpu/drm/i915/selftests/i915_request.c | 6 ++- .../gpu/drm/i915/selftests/i915_timeline.c | 4 +- drivers/gpu/drm/i915/selftests/i915_vma.c | 5 ++- .../gpu/drm/i915/selftests/igt_flush_test.c | 6 ++- drivers/gpu/drm/i915/selftests/igt_spinner.c | 3 +- drivers/gpu/drm/i915/selftests/igt_spinner.h | 9 ++-- drivers/gpu/drm/i915/selftests/intel_guc.c | 3 +- drivers/gpu/drm/i915/selftests/mock_context.h | 42 ----------------- drivers/gpu/drm/i915/selftests/mock_dmabuf.h | 41 ----------------- .../gpu/drm/i915/selftests/mock_gem_device.c | 5 ++- drivers/gpu/drm/i915/selftests/mock_request.c | 2 +- 76 files changed, 337 insertions(+), 698 deletions(-) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_clflush.c (78%) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_clflush.h rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context.c (98%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context.h (85%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context_types.h (100%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_dmabuf.c (86%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_execbuffer.c (98%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_internal.c (81%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.c (99%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.h (100%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_shrinker.c (93%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_stolen.c (93%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_tiling.c (90%) rename drivers/gpu/drm/i915/{ => gem}/i915_gem_userptr.c (94%) rename drivers/gpu/drm/i915/{ => gem}/i915_gemfs.c (51%) create mode 100644 drivers/gpu/drm/i915/gem/i915_gemfs.h rename drivers/gpu/drm/i915/{ => gem}/selftests/huge_gem_object.c (70%) create mode 100644 drivers/gpu/drm/i915/gem/selftests/huge_gem_object.h rename drivers/gpu/drm/i915/{ => gem}/selftests/huge_pages.c (97%) rename drivers/gpu/drm/i915/{ => gem}/selftests/i915_gem_coherency.c (87%) rename drivers/gpu/drm/i915/{ => gem}/selftests/i915_gem_context.c (96%) rename drivers/gpu/drm/i915/{ => gem}/selftests/i915_gem_dmabuf.c (87%) rename drivers/gpu/drm/i915/{ => gem}/selftests/i915_gem_object.c (61%) rename drivers/gpu/drm/i915/{ => gem}/selftests/igt_gem_utils.c (87%) rename drivers/gpu/drm/i915/{ => gem}/selftests/igt_gem_utils.h (100%) rename drivers/gpu/drm/i915/{ => gem}/selftests/mock_context.c (63%) create mode 100644 drivers/gpu/drm/i915/gem/selftests/mock_context.h rename drivers/gpu/drm/i915/{ => gem}/selftests/mock_dmabuf.c (73%) create mode 100644 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.h rename drivers/gpu/drm/i915/{ => gem}/selftests/mock_gem_object.h (65%) delete mode 100644 drivers/gpu/drm/i915/i915_gem_clflush.h delete mode 100644 drivers/gpu/drm/i915/i915_gemfs.h delete mode 100644 drivers/gpu/drm/i915/selftests/huge_gem_object.h delete mode 100644 drivers/gpu/drm/i915/selftests/mock_context.h delete mode 100644 drivers/gpu/drm/i915/selftests/mock_dmabuf.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 29fc924ba819..0c39ef967cf0 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -87,33 +87,33 @@ i915-y += $(gt-y) # GEM (Graphics Execution Management) code obj-y += gem/ gem-y += \ + gem/i915_gem_clflush.o \ + gem/i915_gem_context.o \ + gem/i915_gem_dmabuf.o \ gem/i915_gem_domain.o \ + gem/i915_gem_execbuffer.o \ + gem/i915_gem_internal.o \ gem/i915_gem_object.o \ gem/i915_gem_mman.o \ gem/i915_gem_pages.o \ gem/i915_gem_phys.o \ - gem/i915_gem_shmem.o + gem/i915_gem_pm.o \ + gem/i915_gem_shmem.o \ + gem/i915_gem_shrinker.o \ + gem/i915_gem_stolen.o \ + gem/i915_gem_tiling.o \ + gem/i915_gem_userptr.o \ + gem/i915_gemfs.o i915-y += \ $(gem-y) \ i915_active.o \ i915_cmd_parser.o \ i915_gem_batch_pool.o \ - i915_gem_clflush.o \ - i915_gem_context.o \ - i915_gem_dmabuf.o \ i915_gem_evict.o \ - i915_gem_execbuffer.o \ i915_gem_fence_reg.o \ i915_gem_gtt.o \ - i915_gem_internal.o \ i915_gem.o \ - i915_gem_pm.o \ i915_gem_render_state.o \ - i915_gem_shrinker.o \ - i915_gem_stolen.o \ - i915_gem_tiling.o \ - i915_gem_userptr.o \ - i915_gemfs.o \ i915_globals.o \ i915_query.o \ i915_request.o \ @@ -198,10 +198,10 @@ i915-y += dvo_ch7017.o \ # Post-mortem debug and GPU hang state capture i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o i915-$(CONFIG_DRM_I915_SELFTEST) += \ + gem/selftests/igt_gem_utils.o \ selftests/i915_random.o \ selftests/i915_selftest.o \ selftests/igt_flush_test.o \ - selftests/igt_gem_utils.o \ selftests/igt_live_test.o \ selftests/igt_reset.o \ selftests/igt_spinner.o diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test index 2ca4a5fa3186..752ad8ce2411 100644 --- a/drivers/gpu/drm/i915/Makefile.header-test +++ b/drivers/gpu/drm/i915/Makefile.header-test @@ -6,8 +6,6 @@ header_test := \ i915_active_types.h \ i915_debugfs.h \ i915_drv.h \ - i915_gem_context_types.h \ - i915_gem_pm.h \ i915_irq.h \ i915_params.h \ i915_priolist_types.h \ diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c similarity index 78% rename from drivers/gpu/drm/i915/i915_gem_clflush.c rename to drivers/gpu/drm/i915/gem/i915_gem_clflush.c index 8e74c23cbd91..45d238d784fc 100644 --- a/drivers/gpu/drm/i915/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -1,30 +1,12 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #include "i915_drv.h" -#include "intel_frontbuffer.h" #include "i915_gem_clflush.h" +#include "intel_frontbuffer.h" static DEFINE_SPINLOCK(clflush_lock); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h new file mode 100644 index 000000000000..e6c382973129 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h @@ -0,0 +1,20 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef __I915_GEM_CLFLUSH_H__ +#define __I915_GEM_CLFLUSH_H__ + +#include + +struct drm_i915_private; +struct drm_i915_gem_object; + +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, + unsigned int flags); +#define I915_CLFLUSH_FORCE BIT(0) +#define I915_CLFLUSH_SYNC BIT(1) + +#endif /* __I915_GEM_CLFLUSH_H__ */ diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c similarity index 98% rename from drivers/gpu/drm/i915/i915_gem_context.c rename to drivers/gpu/drm/i915/gem/i915_gem_context.c index 5d2f8ba92b59..5dcdf6540f43 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1,28 +1,7 @@ /* - * Copyright © 2011-2012 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - * Authors: - * Ben Widawsky + * SPDX-License-Identifier: MIT * + * Copyright © 2011-2012 Intel Corporation */ /* @@ -92,7 +71,7 @@ #include "gt/intel_lrc_reg.h" -#include "i915_drv.h" +#include "i915_gem_context.h" #include "i915_globals.h" #include "i915_trace.h" #include "i915_user_extensions.h" diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h similarity index 85% rename from drivers/gpu/drm/i915/i915_gem_context.h rename to drivers/gpu/drm/i915/gem/i915_gem_context.h index 9ad4a6362438..630392c77e48 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -1,25 +1,7 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #ifndef __I915_GEM_CONTEXT_H__ diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h similarity index 100% rename from drivers/gpu/drm/i915/i915_gem_context_types.h rename to drivers/gpu/drm/i915/gem/i915_gem_context_types.h diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c similarity index 86% rename from drivers/gpu/drm/i915/i915_gem_dmabuf.c rename to drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 5a101a9462d8..600fc926f81e 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -1,34 +1,15 @@ /* - * Copyright 2012 Red Hat Inc - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. + * SPDX-License-Identifier: MIT * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER - * DEALINGS IN THE SOFTWARE. - * - * Authors: - * Dave Airlie + * Copyright 2012 Red Hat Inc */ #include +#include #include - #include "i915_drv.h" +#include "i915_gem_object.h" static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf) { diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c similarity index 98% rename from drivers/gpu/drm/i915/i915_gem_execbuffer.c rename to drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 699f3f180d8a..09e64bf33842 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1,29 +1,7 @@ /* - * Copyright © 2008,2010 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - * Authors: - * Eric Anholt - * Chris Wilson + * SPDX-License-Identifier: MIT * + * Copyright © 2008,2010 Intel Corporation */ #include @@ -35,10 +13,12 @@ #include #include "gem/i915_gem_ioctls.h" +#include "gt/intel_context.h" #include "gt/intel_gt_pm.h" -#include "i915_drv.h" +#include "i915_gem_ioctls.h" #include "i915_gem_clflush.h" +#include "i915_gem_context.h" #include "i915_trace.h" #include "intel_drv.h" #include "intel_frontbuffer.h" diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c similarity index 81% rename from drivers/gpu/drm/i915/i915_gem_internal.c rename to drivers/gpu/drm/i915/gem/i915_gem_internal.c index ab627ed1269c..46f5e8ded00c 100644 --- a/drivers/gpu/drm/i915/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -1,29 +1,19 @@ /* - * Copyright © 2014-2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2014-2016 Intel Corporation */ +#include +#include +#include + #include + #include "i915_drv.h" +#include "i915_gem.h" +#include "i915_gem_object.h" +#include "i915_utils.h" #define QUIET (__GFP_NORETRY | __GFP_NOWARN) #define MAYFAIL (__GFP_RETRY_MAYFAIL | __GFP_NOWARN) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 4ed28ac9ab3a..457e694a5c3f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -23,8 +23,9 @@ */ #include "i915_drv.h" -#include "i915_gem_object.h" #include "i915_gem_clflush.h" +#include "i915_gem_context.h" +#include "i915_gem_object.h" #include "i915_globals.h" #include "intel_frontbuffer.h" @@ -442,3 +443,10 @@ int __init i915_global_objects_init(void) i915_global_register(&global.base); return 0; } + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/huge_gem_object.c" +#include "selftests/huge_pages.c" +#include "selftests/i915_gem_object.c" +#include "selftests/i915_gem_coherency.c" +#endif diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c similarity index 99% rename from drivers/gpu/drm/i915/i915_gem_pm.c rename to drivers/gpu/drm/i915/gem/i915_gem_pm.c index fa9c2ebd966a..be75a52a7ab0 100644 --- a/drivers/gpu/drm/i915/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -4,10 +4,10 @@ * Copyright © 2019 Intel Corporation */ +#include "gem/i915_gem_pm.h" #include "gt/intel_gt_pm.h" #include "i915_drv.h" -#include "i915_gem_pm.h" #include "i915_globals.h" static void i915_gem_park(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/gem/i915_gem_pm.h similarity index 100% rename from drivers/gpu/drm/i915/i915_gem_pm.h rename to drivers/gpu/drm/i915/gem/i915_gem_pm.h diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c similarity index 93% rename from drivers/gpu/drm/i915/i915_gem_shrinker.c rename to drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 2c7aefb3e101..cd42299f019a 100644 --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -1,25 +1,7 @@ /* - * Copyright © 2008-2015 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2008-2015 Intel Corporation */ #include @@ -32,7 +14,6 @@ #include #include -#include "i915_drv.h" #include "i915_trace.h" static bool shrinker_lock(struct drm_i915_private *i915, diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c similarity index 93% rename from drivers/gpu/drm/i915/i915_gem_stolen.c rename to drivers/gpu/drm/i915/gem/i915_gem_stolen.c index 0a8082cfc761..9080a736663a 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c @@ -1,32 +1,15 @@ /* - * Copyright © 2008-2012 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - * Authors: - * Eric Anholt - * Chris Wilson + * SPDX-License-Identifier: MIT * + * Copyright © 2008-2012 Intel Corporation */ +#include +#include + +#include #include + #include "i915_drv.h" /* diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c similarity index 90% rename from drivers/gpu/drm/i915/i915_gem_tiling.c rename to drivers/gpu/drm/i915/gem/i915_gem_tiling.c index 86d6d92ccbc9..ca0c2f451742 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c @@ -1,37 +1,17 @@ /* - * Copyright © 2008 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - * Authors: - * Eric Anholt + * SPDX-License-Identifier: MIT * + * Copyright © 2008 Intel Corporation */ #include #include #include -#include "gem/i915_gem_ioctls.h" - #include "i915_drv.h" +#include "i915_gem.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" /** * DOC: buffer object tiling diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c similarity index 94% rename from drivers/gpu/drm/i915/i915_gem_userptr.c rename to drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 2c1b6bb7a040..ccac73b72597 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -1,25 +1,7 @@ /* - * Copyright © 2012-2014 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2012-2014 Intel Corporation */ #include @@ -30,9 +12,8 @@ #include -#include "gem/i915_gem_ioctls.h" - -#include "i915_drv.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" #include "i915_trace.h" #include "intel_drv.h" diff --git a/drivers/gpu/drm/i915/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c similarity index 51% rename from drivers/gpu/drm/i915/i915_gemfs.c rename to drivers/gpu/drm/i915/gem/i915_gemfs.c index 888b7d3f04c3..099f3397aada 100644 --- a/drivers/gpu/drm/i915/i915_gemfs.c +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c @@ -1,25 +1,7 @@ /* - * Copyright © 2017 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2017 Intel Corporation */ #include diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.h b/drivers/gpu/drm/i915/gem/i915_gemfs.h new file mode 100644 index 000000000000..2a1e59af3e4a --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.h @@ -0,0 +1,16 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2017 Intel Corporation + */ + +#ifndef __I915_GEMFS_H__ +#define __I915_GEMFS_H__ + +struct drm_i915_private; + +int i915_gemfs_init(struct drm_i915_private *i915); + +void i915_gemfs_fini(struct drm_i915_private *i915); + +#endif diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c similarity index 70% rename from drivers/gpu/drm/i915/selftests/huge_gem_object.c rename to drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c index 419fd4d6a8f0..824f3761314c 100644 --- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c @@ -1,25 +1,7 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #include "huge_gem_object.h" diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.h b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.h new file mode 100644 index 000000000000..549c1394bcdc --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.h @@ -0,0 +1,27 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef __HUGE_GEM_OBJECT_H +#define __HUGE_GEM_OBJECT_H + +struct drm_i915_gem_object * +huge_gem_object(struct drm_i915_private *i915, + phys_addr_t phys_size, + dma_addr_t dma_size); + +static inline phys_addr_t +huge_gem_object_phys_size(struct drm_i915_gem_object *obj) +{ + return obj->scratch; +} + +static inline dma_addr_t +huge_gem_object_dma_size(struct drm_i915_gem_object *obj) +{ + return obj->base.size; +} + +#endif /* !__HUGE_GEM_OBJECT_H */ diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c similarity index 97% rename from drivers/gpu/drm/i915/selftests/huge_pages.c rename to drivers/gpu/drm/i915/gem/selftests/huge_pages.c index b22b8249dfbd..7b437f06a9be 100644 --- a/drivers/gpu/drm/i915/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -1,34 +1,21 @@ /* - * Copyright © 2017 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2017 Intel Corporation */ -#include "../i915_selftest.h" - #include +#include "i915_selftest.h" + +#include "gem/i915_gem_pm.h" + #include "igt_gem_utils.h" -#include "mock_drm.h" -#include "i915_random.h" +#include "mock_context.h" + +#include "selftests/mock_drm.h" +#include "selftests/mock_gem_device.h" +#include "selftests/i915_random.h" static const unsigned int page_sizes[] = { I915_GTT_PAGE_SIZE_2M, diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c similarity index 87% rename from drivers/gpu/drm/i915/selftests/i915_gem_coherency.c rename to drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index cb25b5fc8027..5495875b48b3 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -1,31 +1,13 @@ /* - * Copyright © 2017 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2017 Intel Corporation */ #include -#include "../i915_selftest.h" -#include "i915_random.h" +#include "i915_selftest.h" +#include "selftests/i915_random.h" static int cpu_set(struct drm_i915_gem_object *obj, unsigned long offset, diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c similarity index 96% rename from drivers/gpu/drm/i915/selftests/i915_gem_context.c rename to drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index c69c6d9a998b..653ae08a277f 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -1,42 +1,26 @@ /* - * Copyright © 2017 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2017 Intel Corporation */ #include +#include "gem/i915_gem_pm.h" #include "gt/intel_reset.h" #include "i915_selftest.h" -#include "i915_random.h" -#include "igt_flush_test.h" -#include "igt_gem_utils.h" -#include "igt_live_test.h" -#include "igt_reset.h" -#include "igt_spinner.h" +#include "gem/selftests/igt_gem_utils.h" +#include "selftests/i915_random.h" +#include "selftests/igt_flush_test.h" +#include "selftests/igt_live_test.h" +#include "selftests/igt_reset.h" +#include "selftests/igt_spinner.h" +#include "selftests/mock_drm.h" +#include "selftests/mock_gem_device.h" -#include "mock_drm.h" -#include "mock_gem_device.h" #include "huge_gem_object.h" +#include "igt_gem_utils.h" #define DW_PER_PAGE (PAGE_SIZE / sizeof(u32)) diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c similarity index 87% rename from drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c rename to drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c index cc65a503e2f0..b7431712de66 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c @@ -1,31 +1,13 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ -#include "../i915_selftest.h" +#include "i915_selftest.h" -#include "mock_gem_device.h" #include "mock_dmabuf.h" +#include "selftests/mock_gem_device.h" static int igt_dmabuf_export(void *arg) { diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 87da01230179..12c90d8fe0fb 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -7,8 +7,8 @@ #include #include "gt/intel_gt_pm.h" +#include "huge_gem_object.h" #include "i915_selftest.h" -#include "selftests/huge_gem_object.h" #include "selftests/igt_flush_test.h" struct tile { diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c similarity index 61% rename from drivers/gpu/drm/i915/selftests/i915_gem_object.c rename to drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c index a3dd2f1be95b..2b6db6f799de 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c @@ -1,32 +1,14 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ -#include "../i915_selftest.h" +#include "i915_selftest.h" -#include "igt_flush_test.h" -#include "mock_gem_device.h" #include "huge_gem_object.h" +#include "selftests/igt_flush_test.h" +#include "selftests/mock_gem_device.h" static int igt_gem_object(void *arg) { diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c similarity index 87% rename from drivers/gpu/drm/i915/selftests/igt_gem_utils.c rename to drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c index 16891b1a3e50..b232e6d2cd92 100644 --- a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c @@ -6,11 +6,11 @@ #include "igt_gem_utils.h" +#include "gem/i915_gem_context.h" +#include "gem/i915_gem_pm.h" #include "gt/intel_context.h" -#include "../i915_gem_context.h" -#include "../i915_gem_pm.h" -#include "../i915_request.h" +#include "i915_request.h" struct i915_request * igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h similarity index 100% rename from drivers/gpu/drm/i915/selftests/igt_gem_utils.h rename to drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.h diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c similarity index 63% rename from drivers/gpu/drm/i915/selftests/mock_context.c rename to drivers/gpu/drm/i915/gem/selftests/mock_context.c index 10e67c931ed1..68d50da035e6 100644 --- a/drivers/gpu/drm/i915/selftests/mock_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c @@ -1,29 +1,11 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #include "mock_context.h" -#include "mock_gtt.h" +#include "selftests/mock_gtt.h" struct i915_gem_context * mock_context(struct drm_i915_private *i915, diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.h b/drivers/gpu/drm/i915/gem/selftests/mock_context.h new file mode 100644 index 000000000000..0b926653914f --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.h @@ -0,0 +1,24 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef __MOCK_CONTEXT_H +#define __MOCK_CONTEXT_H + +void mock_init_contexts(struct drm_i915_private *i915); + +struct i915_gem_context * +mock_context(struct drm_i915_private *i915, + const char *name); + +void mock_context_close(struct i915_gem_context *ctx); + +struct i915_gem_context * +live_context(struct drm_i915_private *i915, struct drm_file *file); + +struct i915_gem_context *kernel_context(struct drm_i915_private *i915); +void kernel_context_close(struct i915_gem_context *ctx); + +#endif /* !__MOCK_CONTEXT_H */ diff --git a/drivers/gpu/drm/i915/selftests/mock_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c similarity index 73% rename from drivers/gpu/drm/i915/selftests/mock_dmabuf.c rename to drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c index ca682caf1062..b9e059d4328a 100644 --- a/drivers/gpu/drm/i915/selftests/mock_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c @@ -1,25 +1,7 @@ /* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. + * SPDX-License-Identifier: MIT * + * Copyright © 2016 Intel Corporation */ #include "mock_dmabuf.h" diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.h b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.h new file mode 100644 index 000000000000..f0f8bbd82dfc --- /dev/null +++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.h @@ -0,0 +1,22 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef __MOCK_DMABUF_H__ +#define __MOCK_DMABUF_H__ + +#include + +struct mock_dmabuf { + int npages; + struct page *pages[]; +}; + +static struct mock_dmabuf *to_mock(struct dma_buf *buf) +{ + return buf->priv; +} + +#endif /* !__MOCK_DMABUF_H__ */ diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_object.h b/drivers/gpu/drm/i915/gem/selftests/mock_gem_object.h similarity index 65% rename from drivers/gpu/drm/i915/selftests/mock_gem_object.h rename to drivers/gpu/drm/i915/gem/selftests/mock_gem_object.h index 20acdbee7bd0..370360b4a148 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gem_object.h +++ b/drivers/gpu/drm/i915/gem/selftests/mock_gem_object.h @@ -1,4 +1,9 @@ -/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + #ifndef __MOCK_GEM_OBJECT_H__ #define __MOCK_GEM_OBJECT_H__ diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 5b31e1e05ddd..c78ec0b58e77 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -4,8 +4,10 @@ * Copyright © 2019 Intel Corporation */ +#include "gem/i915_gem_context.h" +#include "gem/i915_gem_pm.h" + #include "i915_drv.h" -#include "i915_gem_context.h" #include "i915_globals.h" #include "intel_context.h" diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4c3753c1b573..ac5882c36cf5 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -24,10 +24,13 @@ #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "intel_engine.h" #include "intel_engine_pm.h" +#include "intel_context.h" #include "intel_lrc.h" #include "intel_reset.h" diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index db0b9687b904..bf16cb510758 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -133,6 +133,8 @@ */ #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "i915_gem_render_state.h" #include "i915_vgpu.h" diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h index e029aee87adf..c2bba82bcc16 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h @@ -24,7 +24,15 @@ #ifndef _INTEL_LRC_H_ #define _INTEL_LRC_H_ -#include "intel_engine.h" +#include + +struct drm_printer; + +struct drm_i915_private; +struct i915_gem_context; +struct i915_request; +struct intel_context; +struct intel_engine_cs; /* Execlists regs */ #define RING_ELSP(base) _MMIO((base) + 0x230) @@ -96,10 +104,6 @@ int intel_execlists_submission_init(struct intel_engine_cs *engine); */ #define LRC_HEADER_PAGES LRC_PPHWSP_PN -struct drm_printer; - -struct drm_i915_private; - void intel_execlists_set_default_submission(struct intel_engine_cs *engine); void intel_lr_context_reset(struct intel_engine_cs *engine, diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 464369bc55ad..091671c5bbb5 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -7,6 +7,8 @@ #include #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "i915_gpu_error.h" #include "i915_irq.h" diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index ac93080bd863..66d5a52d505c 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -31,9 +31,12 @@ #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "i915_gem_render_state.h" #include "i915_trace.h" +#include "intel_context.h" #include "intel_reset.h" #include "intel_workarounds.h" diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 2941916b37bf..6d7562769eb2 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -22,8 +22,9 @@ * */ +#include "gem/i915_gem_context.h" + #include "i915_drv.h" -#include "i915_gem_context.h" #include "intel_context.h" #include "intel_engine_pm.h" diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index dab3d30c9c73..4b4589b92521 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -24,18 +24,19 @@ #include +#include "gem/i915_gem_context.h" #include "intel_engine_pm.h" #include "i915_selftest.h" #include "selftests/i915_random.h" #include "selftests/igt_flush_test.h" -#include "selftests/igt_gem_utils.h" #include "selftests/igt_reset.h" #include "selftests/igt_wedge_me.h" - -#include "selftests/mock_context.h" #include "selftests/mock_drm.h" +#include "gem/selftests/mock_context.h" +#include "gem/selftests/igt_gem_utils.h" + #define IGT_IDLE_TIMEOUT 50 /* ms; time to wait after flushing between tests */ struct hang { diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index a8c50900e2d4..dfacc46ae7d3 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -6,15 +6,18 @@ #include +#include "gem/i915_gem_pm.h" #include "gt/intel_reset.h" + #include "i915_selftest.h" #include "selftests/i915_random.h" #include "selftests/igt_flush_test.h" -#include "selftests/igt_gem_utils.h" #include "selftests/igt_live_test.h" #include "selftests/igt_spinner.h" #include "selftests/lib_sw_fence.h" -#include "selftests/mock_context.h" + +#include "gem/selftests/igt_gem_utils.h" +#include "gem/selftests/mock_context.h" static int live_sanitycheck(void *arg) { diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 9f7680b9984b..40b6df911d8d 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -4,17 +4,19 @@ * Copyright © 2018 Intel Corporation */ +#include "gem/i915_gem_pm.h" #include "i915_selftest.h" #include "intel_reset.h" #include "selftests/igt_flush_test.h" -#include "selftests/igt_gem_utils.h" #include "selftests/igt_reset.h" #include "selftests/igt_spinner.h" #include "selftests/igt_wedge_me.h" -#include "selftests/mock_context.h" #include "selftests/mock_drm.h" +#include "gem/selftests/igt_gem_utils.h" +#include "gem/selftests/mock_context.h" + static const struct wo_register { enum intel_platform platform; u32 reg; diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c index 96e1edf21b3f..2998999e8568 100644 --- a/drivers/gpu/drm/i915/gvt/mmio_context.c +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c @@ -34,6 +34,7 @@ */ #include "i915_drv.h" +#include "gt/intel_context.h" #include "gvt.h" #include "trace.h" diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index 3a691447f76c..d66bf77f55fd 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -35,8 +35,11 @@ #include +#include "gem/i915_gem_context.h" +#include "gem/i915_gem_pm.h" +#include "gt/intel_context.h" + #include "i915_drv.h" -#include "i915_gem_pm.h" #include "gvt.h" #define RING_CTX_OFF(x) \ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 633a08c0f907..9684fd846cbc 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -32,10 +32,10 @@ #include #include +#include "gem/i915_gem_context.h" #include "gt/intel_reset.h" #include "i915_debugfs.h" -#include "i915_gem_context.h" #include "i915_irq.h" #include "intel_csr.h" #include "intel_dp.h" diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 30afd8aef737..15975519407b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -47,6 +47,7 @@ #include #include +#include "gem/i915_gem_context.h" #include "gem/i915_gem_ioctls.h" #include "gt/intel_gt_pm.h" #include "gt/intel_reset.h" diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 91689a23d9e6..eba7eff74184 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -79,7 +79,7 @@ #include "intel_wopcm.h" #include "i915_gem.h" -#include "i915_gem_context.h" +#include "gem/i915_gem_context_types.h" #include "i915_gem_fence_reg.h" #include "i915_gem_gtt.h" #include "i915_gpu_error.h" diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 92e555e1c1ba..879dd382d3e8 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -38,7 +38,11 @@ #include #include +#include "gem/i915_gem_clflush.h" +#include "gem/i915_gem_context.h" #include "gem/i915_gem_ioctls.h" +#include "gem/i915_gem_pm.h" +#include "gem/i915_gemfs.h" #include "gt/intel_engine_pm.h" #include "gt/intel_gt_pm.h" #include "gt/intel_mocs.h" @@ -46,9 +50,6 @@ #include "gt/intel_workarounds.h" #include "i915_drv.h" -#include "i915_gem_clflush.h" -#include "i915_gemfs.h" -#include "i915_gem_pm.h" #include "i915_trace.h" #include "i915_vgpu.h" @@ -2367,9 +2368,5 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old, #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/scatterlist.c" #include "selftests/mock_gem_device.c" -#include "selftests/huge_gem_object.c" -#include "selftests/huge_pages.c" -#include "selftests/i915_gem_object.c" -#include "selftests/i915_gem_coherency.c" #include "selftests/i915_gem.c" #endif diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.h b/drivers/gpu/drm/i915/i915_gem_clflush.h deleted file mode 100644 index f390247561b3..000000000000 --- a/drivers/gpu/drm/i915/i915_gem_clflush.h +++ /dev/null @@ -1,36 +0,0 @@ -/* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#ifndef __I915_GEM_CLFLUSH_H__ -#define __I915_GEM_CLFLUSH_H__ - -struct drm_i915_private; -struct drm_i915_gem_object; - -bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, - unsigned int flags); -#define I915_CLFLUSH_FORCE BIT(0) -#define I915_CLFLUSH_SYNC BIT(1) - -#endif /* __I915_GEM_CLFLUSH_H__ */ diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 0bdb3e072ba5..a5783c4cb98b 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -28,6 +28,8 @@ #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "intel_drv.h" #include "i915_trace.h" diff --git a/drivers/gpu/drm/i915/i915_gemfs.h b/drivers/gpu/drm/i915/i915_gemfs.h deleted file mode 100644 index cca8bdc5b93e..000000000000 --- a/drivers/gpu/drm/i915/i915_gemfs.h +++ /dev/null @@ -1,34 +0,0 @@ -/* - * Copyright © 2017 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#ifndef __I915_GEMFS_H__ -#define __I915_GEMFS_H__ - -struct drm_i915_private; - -int i915_gemfs_init(struct drm_i915_private *i915); - -void i915_gemfs_fini(struct drm_i915_private *i915); - -#endif diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c index db52a58eadcc..2d5fcba98841 100644 --- a/drivers/gpu/drm/i915/i915_globals.c +++ b/drivers/gpu/drm/i915/i915_globals.c @@ -8,7 +8,7 @@ #include #include "i915_active.h" -#include "i915_gem_context.h" +#include "gem/i915_gem_context.h" #include "gem/i915_gem_object.h" #include "i915_globals.h" #include "i915_request.h" diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4f85cbdddb0d..c86865a34972 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -36,6 +36,8 @@ #include +#include "gem/i915_gem_context.h" + #include "i915_drv.h" #include "i915_gpu_error.h" #include "intel_atomic.h" diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 379fd89a180f..2e33a9b4eae7 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -195,6 +195,8 @@ #include #include +#include "gem/i915_gem_context.h" +#include "gem/i915_gem_pm.h" #include "gt/intel_lrc_reg.h" #include "i915_drv.h" diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 18b34b0bf872..da1e6984a8cc 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -29,6 +29,9 @@ #include #include +#include "gem/i915_gem_context.h" +#include "gt/intel_context.h" + #include "i915_active.h" #include "i915_drv.h" #include "i915_globals.h" diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 012ad08f38c3..e00de2237fd8 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -45,7 +45,6 @@ #include #include "i915_drv.h" -#include "i915_gem_clflush.h" #include "i915_trace.h" #include "intel_acpi.h" #include "intel_atomic.h" diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index ea0e3734d37c..389a9131d413 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -26,6 +26,8 @@ #include "gt/intel_engine_pm.h" #include "gt/intel_lrc_reg.h" +#include "gt/intel_context.h" +#include "gem/i915_gem_context.h" #include "intel_guc_submission.h" #include "i915_drv.h" diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index b64b45d9b538..80dcd879fc58 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -29,6 +29,8 @@ #include #include +#include "gem/i915_gem_pm.h" + #include "i915_drv.h" #include "i915_reg.h" #include "intel_drv.h" diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.h b/drivers/gpu/drm/i915/selftests/huge_gem_object.h deleted file mode 100644 index a6133a9e8029..000000000000 --- a/drivers/gpu/drm/i915/selftests/huge_gem_object.h +++ /dev/null @@ -1,45 +0,0 @@ -/* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#ifndef __HUGE_GEM_OBJECT_H -#define __HUGE_GEM_OBJECT_H - -struct drm_i915_gem_object * -huge_gem_object(struct drm_i915_private *i915, - phys_addr_t phys_size, - dma_addr_t dma_size); - -static inline phys_addr_t -huge_gem_object_phys_size(struct drm_i915_gem_object *obj) -{ - return obj->scratch; -} - -static inline dma_addr_t -huge_gem_object_dma_size(struct drm_i915_gem_object *obj) -{ - return obj->base.size; -} - -#endif /* !__HUGE_GEM_OBJECT_H */ diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c index eee838dc0634..cc1ca4be1a00 100644 --- a/drivers/gpu/drm/i915/selftests/i915_active.c +++ b/drivers/gpu/drm/i915/selftests/i915_active.c @@ -4,7 +4,9 @@ * Copyright © 2018 Intel Corporation */ -#include "../i915_selftest.h" +#include "gem/i915_gem_pm.h" + +#include "i915_selftest.h" #include "igt_flush_test.h" #include "lib_sw_fence.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c index c6a9bff85311..83643929416c 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c @@ -6,11 +6,13 @@ #include -#include "../i915_selftest.h" +#include "gem/selftests/igt_gem_utils.h" +#include "gem/selftests/mock_context.h" + +#include "i915_selftest.h" -#include "igt_gem_utils.h" #include "igt_flush_test.h" -#include "mock_context.h" +#include "mock_drm.h" static int switch_to_context(struct drm_i915_private *i915, struct i915_gem_context *ctx) diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c index 4fc6e5445dd1..1d8235303edf 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c @@ -22,11 +22,13 @@ * */ -#include "../i915_selftest.h" +#include "gem/i915_gem_pm.h" +#include "gem/selftests/igt_gem_utils.h" +#include "gem/selftests/mock_context.h" + +#include "i915_selftest.h" -#include "igt_gem_utils.h" #include "lib_sw_fence.h" -#include "mock_context.h" #include "mock_drm.h" #include "mock_gem_device.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index 9cca66e4420a..f1e95eaf6923 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -25,10 +25,11 @@ #include #include -#include "../i915_selftest.h" +#include "gem/selftests/mock_context.h" + #include "i915_random.h" +#include "i915_selftest.h" -#include "mock_context.h" #include "mock_drm.h" #include "mock_gem_device.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index b60591531e4a..4fd5356c6577 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -24,12 +24,14 @@ #include -#include "../i915_selftest.h" +#include "gem/i915_gem_pm.h" +#include "gem/selftests/mock_context.h" + #include "i915_random.h" +#include "i915_selftest.h" #include "igt_live_test.h" #include "lib_sw_fence.h" -#include "mock_context.h" #include "mock_drm.h" #include "mock_gem_device.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c index ff9ebe50fae8..acb2cc5136b7 100644 --- a/drivers/gpu/drm/i915/selftests/i915_timeline.c +++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c @@ -6,8 +6,10 @@ #include -#include "../i915_selftest.h" +#include "gem/i915_gem_pm.h" + #include "i915_random.h" +#include "i915_selftest.h" #include "igt_flush_test.h" #include "mock_gem_device.h" diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c index 0027c1fac336..425b76133850 100644 --- a/drivers/gpu/drm/i915/selftests/i915_vma.c +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c @@ -24,10 +24,11 @@ #include -#include "../i915_selftest.h" +#include "gem/selftests/mock_context.h" + +#include "i915_selftest.h" #include "mock_gem_device.h" -#include "mock_context.h" #include "mock_gtt.h" static bool assert_vma(struct i915_vma *vma, diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c index e42f3c58536a..5bfd1b2626a2 100644 --- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c +++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c @@ -4,9 +4,11 @@ * Copyright © 2018 Intel Corporation */ -#include "../i915_drv.h" +#include "gem/i915_gem_context.h" + +#include "i915_drv.h" +#include "i915_selftest.h" -#include "../i915_selftest.h" #include "igt_flush_test.h" int igt_flush_test(struct drm_i915_private *i915, unsigned int flags) diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c index ece8a8a0d3b0..38d6f1b10c54 100644 --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c @@ -4,7 +4,8 @@ * Copyright © 2018 Intel Corporation */ -#include "igt_gem_utils.h" +#include "gem/selftests/igt_gem_utils.h" + #include "igt_spinner.h" int igt_spinner_init(struct igt_spinner *spin, struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.h b/drivers/gpu/drm/i915/selftests/igt_spinner.h index d312e7cdab68..34a88ac9b47a 100644 --- a/drivers/gpu/drm/i915/selftests/igt_spinner.h +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.h @@ -7,13 +7,12 @@ #ifndef __I915_SELFTESTS_IGT_SPINNER_H__ #define __I915_SELFTESTS_IGT_SPINNER_H__ -#include "../i915_selftest.h" - +#include "gem/i915_gem_context.h" #include "gt/intel_engine.h" -#include "../i915_drv.h" -#include "../i915_request.h" -#include "../i915_gem_context.h" +#include "i915_drv.h" +#include "i915_request.h" +#include "i915_selftest.h" struct igt_spinner { struct drm_i915_private *i915; diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c index b05a21eaa8f4..7fd0321e0947 100644 --- a/drivers/gpu/drm/i915/selftests/intel_guc.c +++ b/drivers/gpu/drm/i915/selftests/intel_guc.c @@ -22,7 +22,8 @@ * */ -#include "../i915_selftest.h" +#include "i915_selftest.h" +#include "gem/i915_gem_pm.h" /* max doorbell number + negative test for each client type */ #define ATTEMPTS (GUC_NUM_DOORBELLS + GUC_CLIENT_PRIORITY_NUM) diff --git a/drivers/gpu/drm/i915/selftests/mock_context.h b/drivers/gpu/drm/i915/selftests/mock_context.h deleted file mode 100644 index 29b9d60a158b..000000000000 --- a/drivers/gpu/drm/i915/selftests/mock_context.h +++ /dev/null @@ -1,42 +0,0 @@ -/* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#ifndef __MOCK_CONTEXT_H -#define __MOCK_CONTEXT_H - -void mock_init_contexts(struct drm_i915_private *i915); - -struct i915_gem_context * -mock_context(struct drm_i915_private *i915, - const char *name); - -void mock_context_close(struct i915_gem_context *ctx); - -struct i915_gem_context * -live_context(struct drm_i915_private *i915, struct drm_file *file); - -struct i915_gem_context *kernel_context(struct drm_i915_private *i915); -void kernel_context_close(struct i915_gem_context *ctx); - -#endif /* !__MOCK_CONTEXT_H */ diff --git a/drivers/gpu/drm/i915/selftests/mock_dmabuf.h b/drivers/gpu/drm/i915/selftests/mock_dmabuf.h deleted file mode 100644 index ec80613159b9..000000000000 --- a/drivers/gpu/drm/i915/selftests/mock_dmabuf.h +++ /dev/null @@ -1,41 +0,0 @@ - -/* - * Copyright © 2016 Intel Corporation - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - * - */ - -#ifndef __MOCK_DMABUF_H__ -#define __MOCK_DMABUF_H__ - -#include - -struct mock_dmabuf { - int npages; - struct page *pages[]; -}; - -static struct mock_dmabuf *to_mock(struct dma_buf *buf) -{ - return buf->priv; -} - -#endif /* !__MOCK_DMABUF_H__ */ diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c index 9fd02025d382..e25b74a27f83 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c @@ -27,13 +27,14 @@ #include "gt/mock_engine.h" -#include "mock_context.h" #include "mock_request.h" #include "mock_gem_device.h" -#include "mock_gem_object.h" #include "mock_gtt.h" #include "mock_uncore.h" +#include "gem/selftests/mock_context.h" +#include "gem/selftests/mock_gem_object.h" + void mock_device_flush(struct drm_i915_private *i915) { struct intel_engine_cs *engine; diff --git a/drivers/gpu/drm/i915/selftests/mock_request.c b/drivers/gpu/drm/i915/selftests/mock_request.c index b99f7576153c..9390fc09984b 100644 --- a/drivers/gpu/drm/i915/selftests/mock_request.c +++ b/drivers/gpu/drm/i915/selftests/mock_request.c @@ -22,9 +22,9 @@ * */ +#include "gem/selftests/igt_gem_utils.h" #include "gt/mock_engine.h" -#include "igt_gem_utils.h" #include "mock_request.h" struct i915_request * From patchwork Wed May 22 08:21:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955291 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9AD3F912 for ; Wed, 22 May 2019 08:21:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8D8981FFC8 for ; Wed, 22 May 2019 08:21:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 822C428A28; Wed, 22 May 2019 08:21:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5F851289B7 for ; Wed, 22 May 2019 08:21:28 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E51308984E; Wed, 22 May 2019 08:21:19 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id F0928897D0 for ; Wed, 22 May 2019 08:21:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636789-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:21:00 +0100 Message-Id: <20190522082104.5117-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 09/13] drm/i915: Pull scatterlist utils out of i915_gem.h X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Out scatterlist utility routines can be pulled out of i915_gem.h for a bit more decluttering. v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the caller. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 + drivers/gpu/drm/i915/gem/i915_gem_internal.c | 1 + drivers/gpu/drm/i915/gem/i915_gem_pages.c | 1 + drivers/gpu/drm/i915/gem/i915_gem_phys.c | 1 + drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 1 + drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 1 + .../drm/i915/gem/selftests/huge_gem_object.c | 2 + .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 1 + drivers/gpu/drm/i915/i915_drv.h | 110 --------------- drivers/gpu/drm/i915/i915_gem.c | 30 +---- drivers/gpu/drm/i915/i915_gem_fence_reg.c | 2 + drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +- drivers/gpu/drm/i915/i915_gem_gtt.h | 4 +- drivers/gpu/drm/i915/i915_gpu_error.c | 1 + drivers/gpu/drm/i915/i915_scatterlist.c | 39 ++++++ drivers/gpu/drm/i915/i915_scatterlist.h | 127 ++++++++++++++++++ drivers/gpu/drm/i915/selftests/i915_vma.c | 1 + drivers/gpu/drm/i915/selftests/scatterlist.c | 3 +- 19 files changed, 188 insertions(+), 142 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_scatterlist.c create mode 100644 drivers/gpu/drm/i915/i915_scatterlist.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 0c39ef967cf0..9f5f4acacae5 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -44,6 +44,7 @@ i915-y += i915_drv.o \ i915_irq.o \ i915_params.o \ i915_pci.o \ + i915_scatterlist.o \ i915_suspend.o \ i915_sysfs.o \ intel_csr.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 600fc926f81e..625397deb701 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -10,6 +10,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c index 46f5e8ded00c..5a9880a8e60b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -13,6 +13,7 @@ #include "i915_drv.h" #include "i915_gem.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" #include "i915_utils.h" #define QUIET (__GFP_NORETRY | __GFP_NOWARN) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 3879b3669dea..e53860147f21 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -6,6 +6,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct sg_table *pages, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c index 1c0ce69f765b..2deac933cf59 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c @@ -15,6 +15,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 568164ca66fd..665f22ebf8e8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -9,6 +9,7 @@ #include "i915_drv.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" /* * Move pages to appropriate lru and release the pagevec, decrementing the diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index ccac73b72597..cfa990edb351 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -14,6 +14,7 @@ #include "i915_gem_ioctls.h" #include "i915_gem_object.h" +#include "i915_scatterlist.h" #include "i915_trace.h" #include "intel_drv.h" diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c index 824f3761314c..3c5d17b2b670 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c @@ -4,6 +4,8 @@ * Copyright © 2016 Intel Corporation */ +#include "i915_scatterlist.h" + #include "huge_gem_object.h" static void huge_free_pages(struct drm_i915_gem_object *obj, diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c index b7431712de66..e3a64edef918 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c @@ -4,6 +4,7 @@ * Copyright © 2016 Intel Corporation */ +#include "i915_drv.h" #include "i915_selftest.h" #include "mock_dmabuf.h" diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index eba7eff74184..89fbf9927e6c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2158,111 +2158,6 @@ enum hdmi_force_audio { GENMASK(INTEL_FRONTBUFFER_BITS_PER_PIPE * ((pipe) + 1) - 1, \ INTEL_FRONTBUFFER_BITS_PER_PIPE * (pipe)) -/* - * Optimised SGL iterator for GEM objects - */ -static __always_inline struct sgt_iter { - struct scatterlist *sgp; - union { - unsigned long pfn; - dma_addr_t dma; - }; - unsigned int curr; - unsigned int max; -} __sgt_iter(struct scatterlist *sgl, bool dma) { - struct sgt_iter s = { .sgp = sgl }; - - if (s.sgp) { - s.max = s.curr = s.sgp->offset; - s.max += s.sgp->length; - if (dma) - s.dma = sg_dma_address(s.sgp); - else - s.pfn = page_to_pfn(sg_page(s.sgp)); - } - - return s; -} - -static inline struct scatterlist *____sg_next(struct scatterlist *sg) -{ - ++sg; - if (unlikely(sg_is_chain(sg))) - sg = sg_chain_ptr(sg); - return sg; -} - -/** - * __sg_next - return the next scatterlist entry in a list - * @sg: The current sg entry - * - * Description: - * If the entry is the last, return NULL; otherwise, step to the next - * element in the array (@sg@+1). If that's a chain pointer, follow it; - * otherwise just return the pointer to the current element. - **/ -static inline struct scatterlist *__sg_next(struct scatterlist *sg) -{ - return sg_is_last(sg) ? NULL : ____sg_next(sg); -} - -/** - * for_each_sgt_dma - iterate over the DMA addresses of the given sg_table - * @__dmap: DMA address (output) - * @__iter: 'struct sgt_iter' (iterator state, internal) - * @__sgt: sg_table to iterate over (input) - */ -#define for_each_sgt_dma(__dmap, __iter, __sgt) \ - for ((__iter) = __sgt_iter((__sgt)->sgl, true); \ - ((__dmap) = (__iter).dma + (__iter).curr); \ - (((__iter).curr += I915_GTT_PAGE_SIZE) >= (__iter).max) ? \ - (__iter) = __sgt_iter(__sg_next((__iter).sgp), true), 0 : 0) - -/** - * for_each_sgt_page - iterate over the pages of the given sg_table - * @__pp: page pointer (output) - * @__iter: 'struct sgt_iter' (iterator state, internal) - * @__sgt: sg_table to iterate over (input) - */ -#define for_each_sgt_page(__pp, __iter, __sgt) \ - for ((__iter) = __sgt_iter((__sgt)->sgl, false); \ - ((__pp) = (__iter).pfn == 0 ? NULL : \ - pfn_to_page((__iter).pfn + ((__iter).curr >> PAGE_SHIFT))); \ - (((__iter).curr += PAGE_SIZE) >= (__iter).max) ? \ - (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0) - -bool i915_sg_trim(struct sg_table *orig_st); - -static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg) -{ - unsigned int page_sizes; - - page_sizes = 0; - while (sg) { - GEM_BUG_ON(sg->offset); - GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE)); - page_sizes |= sg->length; - sg = __sg_next(sg); - } - - return page_sizes; -} - -static inline unsigned int i915_sg_segment_size(void) -{ - unsigned int size = swiotlb_max_segment(); - - if (size == 0) - return SCATTERLIST_MAX_SEGMENT; - - size = rounddown(size, PAGE_SIZE); - /* swiotlb_max_segment_size can return 1 byte when it means one page. */ - if (size < PAGE_SIZE) - size = PAGE_SIZE; - - return size; -} - #define INTEL_INFO(dev_priv) (&(dev_priv)->__info) #define RUNTIME_INFO(dev_priv) (&(dev_priv)->__runtime) #define DRIVER_CAPS(dev_priv) (&(dev_priv)->caps) @@ -2798,11 +2693,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj); void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv); -static inline int __sg_page_count(const struct scatterlist *sg) -{ - return sg->length >> PAGE_SHIFT; -} - static inline int __must_check i915_mutex_lock_interruptible(struct drm_device *dev) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 879dd382d3e8..91df93542a68 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -50,6 +50,7 @@ #include "gt/intel_workarounds.h" #include "i915_drv.h" +#include "i915_scatterlist.h" #include "i915_trace.h" #include "i915_vgpu.h" @@ -1085,34 +1086,6 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv) } } -bool i915_sg_trim(struct sg_table *orig_st) -{ - struct sg_table new_st; - struct scatterlist *sg, *new_sg; - unsigned int i; - - if (orig_st->nents == orig_st->orig_nents) - return false; - - if (sg_alloc_table(&new_st, orig_st->nents, GFP_KERNEL | __GFP_NOWARN)) - return false; - - new_sg = new_st.sgl; - for_each_sg(orig_st->sgl, sg, orig_st->nents, i) { - sg_set_page(new_sg, sg_page(sg), sg->length, 0); - sg_dma_address(new_sg) = sg_dma_address(sg); - sg_dma_len(new_sg) = sg_dma_len(sg); - - new_sg = sg_next(new_sg); - } - GEM_BUG_ON(new_sg); /* Should walk exactly nents and hit the end */ - - sg_free_table(orig_st); - - *orig_st = new_st; - return true; -} - static unsigned long to_wait_timeout(s64 timeout_ns) { if (timeout_ns < 0) @@ -2366,7 +2339,6 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old, } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) -#include "selftests/scatterlist.c" #include "selftests/mock_gem_device.c" #include "selftests/i915_gem.c" #endif diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c index 3084f52e3372..2e9e32330aaa 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c +++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c @@ -22,7 +22,9 @@ */ #include + #include "i915_drv.h" +#include "i915_scatterlist.h" /** * DOC: fence register handling diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 9ed41aefb456..2af762a36ab2 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -36,8 +36,9 @@ #include #include "i915_drv.h" -#include "i915_vgpu.h" +#include "i915_scatterlist.h" #include "i915_trace.h" +#include "i915_vgpu.h" #include "intel_drv.h" #include "intel_frontbuffer.h" diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 248a26b1bb04..516b89115898 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -40,6 +40,7 @@ #include "gt/intel_reset.h" #include "i915_request.h" +#include "i915_scatterlist.h" #include "i915_selftest.h" #include "i915_timeline.h" @@ -162,7 +163,8 @@ typedef u64 gen8_ppgtt_pml4e_t; #define GEN8_PDE_IPS_64K BIT(11) #define GEN8_PDE_PS_2M BIT(7) -struct sg_table; +#define for_each_sgt_dma(__dmap, __iter, __sgt) \ + __for_each_sgt_dma(__dmap, __iter, __sgt, I915_GTT_PAGE_SIZE) struct intel_remapped_plane_info { /* in gtt pages */ diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c86865a34972..707811256501 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -40,6 +40,7 @@ #include "i915_drv.h" #include "i915_gpu_error.h" +#include "i915_scatterlist.h" #include "intel_atomic.h" #include "intel_csr.h" #include "intel_overlay.h" diff --git a/drivers/gpu/drm/i915/i915_scatterlist.c b/drivers/gpu/drm/i915/i915_scatterlist.c new file mode 100644 index 000000000000..cc6b3846a8c7 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_scatterlist.c @@ -0,0 +1,39 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#include "i915_scatterlist.h" + +bool i915_sg_trim(struct sg_table *orig_st) +{ + struct sg_table new_st; + struct scatterlist *sg, *new_sg; + unsigned int i; + + if (orig_st->nents == orig_st->orig_nents) + return false; + + if (sg_alloc_table(&new_st, orig_st->nents, GFP_KERNEL | __GFP_NOWARN)) + return false; + + new_sg = new_st.sgl; + for_each_sg(orig_st->sgl, sg, orig_st->nents, i) { + sg_set_page(new_sg, sg_page(sg), sg->length, 0); + sg_dma_address(new_sg) = sg_dma_address(sg); + sg_dma_len(new_sg) = sg_dma_len(sg); + + new_sg = sg_next(new_sg); + } + GEM_BUG_ON(new_sg); /* Should walk exactly nents and hit the end */ + + sg_free_table(orig_st); + + *orig_st = new_st; + return true; +} + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftests/scatterlist.c" +#endif diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h new file mode 100644 index 000000000000..6617963df9ed --- /dev/null +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -0,0 +1,127 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#ifndef I915_SCATTERLIST_H +#define I915_SCATTERLIST_H + +#include +#include +#include + +#include "i915_gem.h" + +/* + * Optimised SGL iterator for GEM objects + */ +static __always_inline struct sgt_iter { + struct scatterlist *sgp; + union { + unsigned long pfn; + dma_addr_t dma; + }; + unsigned int curr; + unsigned int max; +} __sgt_iter(struct scatterlist *sgl, bool dma) { + struct sgt_iter s = { .sgp = sgl }; + + if (s.sgp) { + s.max = s.curr = s.sgp->offset; + s.max += s.sgp->length; + if (dma) + s.dma = sg_dma_address(s.sgp); + else + s.pfn = page_to_pfn(sg_page(s.sgp)); + } + + return s; +} + +static inline int __sg_page_count(const struct scatterlist *sg) +{ + return sg->length >> PAGE_SHIFT; +} + +static inline struct scatterlist *____sg_next(struct scatterlist *sg) +{ + ++sg; + if (unlikely(sg_is_chain(sg))) + sg = sg_chain_ptr(sg); + return sg; +} + +/** + * __sg_next - return the next scatterlist entry in a list + * @sg: The current sg entry + * + * Description: + * If the entry is the last, return NULL; otherwise, step to the next + * element in the array (@sg@+1). If that's a chain pointer, follow it; + * otherwise just return the pointer to the current element. + **/ +static inline struct scatterlist *__sg_next(struct scatterlist *sg) +{ + return sg_is_last(sg) ? NULL : ____sg_next(sg); +} + +/** + * __for_each_sgt_dma - iterate over the DMA addresses of the given sg_table + * @__dmap: DMA address (output) + * @__iter: 'struct sgt_iter' (iterator state, internal) + * @__sgt: sg_table to iterate over (input) + * @__step: step size + */ +#define __for_each_sgt_dma(__dmap, __iter, __sgt, __step) \ + for ((__iter) = __sgt_iter((__sgt)->sgl, true); \ + ((__dmap) = (__iter).dma + (__iter).curr); \ + (((__iter).curr += (__step)) >= (__iter).max) ? \ + (__iter) = __sgt_iter(__sg_next((__iter).sgp), true), 0 : 0) + +/** + * for_each_sgt_page - iterate over the pages of the given sg_table + * @__pp: page pointer (output) + * @__iter: 'struct sgt_iter' (iterator state, internal) + * @__sgt: sg_table to iterate over (input) + */ +#define for_each_sgt_page(__pp, __iter, __sgt) \ + for ((__iter) = __sgt_iter((__sgt)->sgl, false); \ + ((__pp) = (__iter).pfn == 0 ? NULL : \ + pfn_to_page((__iter).pfn + ((__iter).curr >> PAGE_SHIFT))); \ + (((__iter).curr += PAGE_SIZE) >= (__iter).max) ? \ + (__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0 : 0) + +static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg) +{ + unsigned int page_sizes; + + page_sizes = 0; + while (sg) { + GEM_BUG_ON(sg->offset); + GEM_BUG_ON(!IS_ALIGNED(sg->length, PAGE_SIZE)); + page_sizes |= sg->length; + sg = __sg_next(sg); + } + + return page_sizes; +} + +static inline unsigned int i915_sg_segment_size(void) +{ + unsigned int size = swiotlb_max_segment(); + + if (size == 0) + return SCATTERLIST_MAX_SEGMENT; + + size = rounddown(size, PAGE_SIZE); + /* swiotlb_max_segment_size can return 1 byte when it means one page. */ + if (size < PAGE_SIZE) + size = PAGE_SIZE; + + return size; +} + +bool i915_sg_trim(struct sg_table *orig_st); + +#endif diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c index 425b76133850..6919207883f6 100644 --- a/drivers/gpu/drm/i915/selftests/i915_vma.c +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c @@ -26,6 +26,7 @@ #include "gem/selftests/mock_context.h" +#include "i915_scatterlist.h" #include "i915_selftest.h" #include "mock_gem_device.h" diff --git a/drivers/gpu/drm/i915/selftests/scatterlist.c b/drivers/gpu/drm/i915/selftests/scatterlist.c index cd6d2a16071f..d599186d5b71 100644 --- a/drivers/gpu/drm/i915/selftests/scatterlist.c +++ b/drivers/gpu/drm/i915/selftests/scatterlist.c @@ -24,7 +24,8 @@ #include #include -#include "../i915_selftest.h" +#include "i915_selftest.h" +#include "i915_utils.h" #define PFN_BIAS (1 << 10) From patchwork Wed May 22 08:21:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955275 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2FFF6112C for ; Wed, 22 May 2019 08:21:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1BD9F28B4D for ; Wed, 22 May 2019 08:21:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 10D0228A51; Wed, 22 May 2019 08:21:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8E9AF28A51 for ; Wed, 22 May 2019 08:21:19 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BF992897D0; Wed, 22 May 2019 08:21:17 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4158E897E8 for ; Wed, 22 May 2019 08:21:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636790-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:21:01 +0100 Message-Id: <20190522082104.5117-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 10/13] drm/i915: Move GEM object domain management from struct_mutex to local X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 10 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 70 +++++----- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 123 ++++++++++++------ drivers/gpu/drm/i915/gem/i915_gem_fence.c | 96 ++++++++++++++ drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 + drivers/gpu/drm/i915/gem/i915_gem_object.h | 14 ++ drivers/gpu/drm/i915/gem/i915_gem_pm.c | 7 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 12 +- .../i915/gem/selftests/i915_gem_coherency.c | 12 ++ .../drm/i915/gem/selftests/i915_gem_context.c | 20 +++ .../drm/i915/gem/selftests/i915_gem_mman.c | 6 + .../drm/i915/gem/selftests/i915_gem_phys.c | 4 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 4 + drivers/gpu/drm/i915/gt/selftest_lrc.c | 2 + .../gpu/drm/i915/gt/selftest_workarounds.c | 6 + drivers/gpu/drm/i915/gvt/cmd_parser.c | 2 + drivers/gpu/drm/i915/gvt/scheduler.c | 8 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 23 ++-- drivers/gpu/drm/i915/i915_gem.c | 122 +++++++++-------- drivers/gpu/drm/i915/i915_gem_gtt.c | 5 +- drivers/gpu/drm/i915/i915_gem_render_state.c | 2 + drivers/gpu/drm/i915/i915_vma.c | 8 +- drivers/gpu/drm/i915/i915_vma.h | 12 ++ drivers/gpu/drm/i915/intel_display.c | 5 + drivers/gpu/drm/i915/intel_guc_log.c | 6 +- drivers/gpu/drm/i915/intel_overlay.c | 25 ++-- drivers/gpu/drm/i915/intel_uc_fw.c | 6 +- drivers/gpu/drm/i915/selftests/i915_request.c | 4 + drivers/gpu/drm/i915/selftests/igt_spinner.c | 2 + 31 files changed, 443 insertions(+), 180 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_fence.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 9f5f4acacae5..e5348c355987 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -93,6 +93,7 @@ gem-y += \ gem/i915_gem_dmabuf.o \ gem/i915_gem_domain.o \ gem/i915_gem_execbuffer.o \ + gem/i915_gem_fence.o \ gem/i915_gem_internal.o \ gem/i915_gem_object.o \ gem/i915_gem_mman.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index 45d238d784fc..537aa2337cc8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -95,6 +95,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, { struct clflush *clflush; + assert_object_held(obj); + /* * Stolen memory is always coherent with the GPU as it is explicitly * marked as wc by the system, or the system is cache-coherent. @@ -144,9 +146,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, true, I915_FENCE_TIMEOUT, I915_FENCE_GFP); - reservation_object_lock(obj->resv, NULL); reservation_object_add_excl_fence(obj->resv, &clflush->dma); - reservation_object_unlock(obj->resv); i915_sw_fence_commit(&clflush->wait); } else if (obj->mm.pages) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 625397deb701..a93e233cfaa9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -151,7 +151,6 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct * static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); - struct drm_device *dev = obj->base.dev; bool write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE); int err; @@ -159,12 +158,12 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire if (err) return err; - err = i915_mutex_lock_interruptible(dev); + err = i915_gem_object_lock_interruptible(obj); if (err) goto out; err = i915_gem_object_set_to_cpu_domain(obj, write); - mutex_unlock(&dev->struct_mutex); + i915_gem_object_unlock(obj); out: i915_gem_object_unpin_pages(obj); @@ -174,19 +173,18 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); - struct drm_device *dev = obj->base.dev; int err; err = i915_gem_object_pin_pages(obj); if (err) return err; - err = i915_mutex_lock_interruptible(dev); + err = i915_gem_object_lock_interruptible(obj); if (err) goto out; err = i915_gem_object_set_to_gtt_domain(obj, false); - mutex_unlock(&dev->struct_mutex); + i915_gem_object_unlock(obj); out: i915_gem_object_unpin_pages(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index bbc7fb758186..cce96e6c6e52 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -29,9 +29,9 @@ void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj) if (!READ_ONCE(obj->pin_global)) return; - mutex_lock(&obj->base.dev->struct_mutex); + i915_gem_object_lock(obj); __i915_gem_object_flush_for_display(obj); - mutex_unlock(&obj->base.dev->struct_mutex); + i915_gem_object_unlock(obj); } /** @@ -47,11 +47,10 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) { int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); + assert_object_held(obj); ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | (write ? I915_WAIT_ALL : 0), MAX_SCHEDULE_TIMEOUT); if (ret) @@ -109,11 +108,10 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) { int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); + assert_object_held(obj); ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | (write ? I915_WAIT_ALL : 0), MAX_SCHEDULE_TIMEOUT); if (ret) @@ -179,7 +177,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); + assert_object_held(obj); if (obj->cache_level == cache_level) return 0; @@ -228,7 +226,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, */ ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT); if (ret) @@ -372,12 +369,16 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, if (ret) goto out; - ret = i915_mutex_lock_interruptible(dev); + ret = mutex_lock_interruptible(&i915->drm.struct_mutex); if (ret) goto out; - ret = i915_gem_object_set_cache_level(obj, level); - mutex_unlock(&dev->struct_mutex); + ret = i915_gem_object_lock_interruptible(obj); + if (ret == 0) { + ret = i915_gem_object_set_cache_level(obj, level); + i915_gem_object_unlock(obj); + } + mutex_unlock(&i915->drm.struct_mutex); out: i915_gem_object_put(obj); @@ -399,7 +400,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, struct i915_vma *vma; int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); + assert_object_held(obj); /* Mark the global pin early so that we account for the * display coherency whilst setting up the cache domains. @@ -484,16 +485,18 @@ static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj) void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma) { - lockdep_assert_held(&vma->vm->i915->drm.struct_mutex); + struct drm_i915_gem_object *obj = vma->obj; + + assert_object_held(obj); - if (WARN_ON(vma->obj->pin_global == 0)) + if (WARN_ON(obj->pin_global == 0)) return; - if (--vma->obj->pin_global == 0) + if (--obj->pin_global == 0) vma->display_alignment = I915_GTT_MIN_ALIGNMENT; /* Bump the LRU to try and avoid premature eviction whilst flipping */ - i915_gem_object_bump_inactive_ggtt(vma->obj); + i915_gem_object_bump_inactive_ggtt(obj); i915_vma_unpin(vma); } @@ -511,11 +514,10 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) { int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); + assert_object_held(obj); ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | (write ? I915_WAIT_ALL : 0), MAX_SCHEDULE_TIMEOUT); if (ret) @@ -637,7 +639,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (err) goto out; - err = i915_mutex_lock_interruptible(dev); + err = i915_gem_object_lock_interruptible(obj); if (err) goto out_unpin; @@ -651,7 +653,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, /* And bump the LRU for this access */ i915_gem_object_bump_inactive_ggtt(obj); - mutex_unlock(&dev->struct_mutex); + i915_gem_object_unlock(obj); if (write_domain != 0) intel_fb_obj_invalidate(obj, @@ -674,22 +676,23 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, { int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); - *needs_clflush = 0; if (!i915_gem_object_has_struct_page(obj)) return -ENODEV; + ret = i915_gem_object_lock_interruptible(obj); + if (ret) + return ret; + ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED, + I915_WAIT_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT); if (ret) - return ret; + goto err_unlock; ret = i915_gem_object_pin_pages(obj); if (ret) - return ret; + goto err_unlock; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || !static_cpu_has(X86_FEATURE_CLFLUSH)) { @@ -717,6 +720,8 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, err_unpin: i915_gem_object_unpin_pages(obj); +err_unlock: + i915_gem_object_unlock(obj); return ret; } @@ -725,23 +730,24 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, { int ret; - lockdep_assert_held(&obj->base.dev->struct_mutex); - *needs_clflush = 0; if (!i915_gem_object_has_struct_page(obj)) return -ENODEV; + ret = i915_gem_object_lock_interruptible(obj); + if (ret) + return ret; + ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT); if (ret) - return ret; + goto err_unlock; ret = i915_gem_object_pin_pages(obj); if (ret) - return ret; + goto err_unlock; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || !static_cpu_has(X86_FEATURE_CLFLUSH)) { @@ -778,5 +784,7 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, err_unpin: i915_gem_object_unpin_pages(obj); +err_unlock: + i915_gem_object_unlock(obj); return ret; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 09e64bf33842..ed522fdfbe7f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1075,7 +1075,9 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, if (use_cpu_reloc(cache, obj)) return NULL; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_unlock(obj); if (err) return ERR_PTR(err); @@ -1164,6 +1166,26 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes) *addr = value; } +static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma) +{ + struct drm_i915_gem_object *obj = vma->obj; + int err; + + i915_vma_lock(vma); + + if (obj->cache_dirty & ~obj->cache_coherent) + i915_gem_clflush_object(obj, 0); + obj->write_domain = 0; + + err = i915_request_await_object(rq, vma->obj, true); + if (err == 0) + err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + + i915_vma_unlock(vma); + + return err; +} + static int __reloc_gpu_alloc(struct i915_execbuffer *eb, struct i915_vma *vma, unsigned int len) @@ -1175,15 +1197,6 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, u32 *cmd; int err; - if (DBG_FORCE_RELOC == FORCE_GPU_RELOC) { - obj = vma->obj; - if (obj->cache_dirty & ~obj->cache_coherent) - i915_gem_clflush_object(obj, 0); - obj->write_domain = 0; - } - - GEM_BUG_ON(vma->obj->write_domain & I915_GEM_DOMAIN_CPU); - obj = i915_gem_batch_pool_get(&eb->engine->batch_pool, PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -1212,7 +1225,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, goto err_unpin; } - err = i915_request_await_object(rq, vma->obj, true); + err = reloc_move_to_gpu(rq, vma); if (err) goto err_request; @@ -1220,14 +1233,12 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, batch->node.start, PAGE_SIZE, cache->gen > 5 ? 0 : I915_DISPATCH_SECURE); if (err) - goto err_request; + goto skip_request; + i915_vma_lock(batch); GEM_BUG_ON(!reservation_object_test_signaled_rcu(batch->resv, true)); err = i915_vma_move_to_active(batch, rq, 0); - if (err) - goto skip_request; - - err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(batch); if (err) goto skip_request; @@ -1837,24 +1848,59 @@ static int eb_relocate(struct i915_execbuffer *eb) static int eb_move_to_gpu(struct i915_execbuffer *eb) { const unsigned int count = eb->buffer_count; + struct ww_acquire_ctx acquire; unsigned int i; - int err; + int err = 0; + + ww_acquire_init(&acquire, &reservation_ww_class); for (i = 0; i < count; i++) { + struct i915_vma *vma = eb->vma[i]; + + err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire); + if (!err) + continue; + + GEM_BUG_ON(err == -EALREADY); /* No duplicate vma */ + + if (err == -EDEADLK) { + GEM_BUG_ON(i == 0); + do { + int j = i - 1; + + ww_mutex_unlock(&eb->vma[j]->resv->lock); + + swap(eb->flags[i], eb->flags[j]); + swap(eb->vma[i], eb->vma[j]); + eb->vma[i]->exec_flags = &eb->flags[i]; + } while (--i); + GEM_BUG_ON(vma != eb->vma[0]); + vma->exec_flags = &eb->flags[0]; + + err = ww_mutex_lock_slow_interruptible(&vma->resv->lock, + &acquire); + } + if (err) + break; + } + ww_acquire_done(&acquire); + + while (i--) { unsigned int flags = eb->flags[i]; struct i915_vma *vma = eb->vma[i]; struct drm_i915_gem_object *obj = vma->obj; + assert_vma_held(vma); + if (flags & EXEC_OBJECT_CAPTURE) { struct i915_capture_list *capture; capture = kmalloc(sizeof(*capture), GFP_KERNEL); - if (unlikely(!capture)) - return -ENOMEM; - - capture->next = eb->request->capture_list; - capture->vma = eb->vma[i]; - eb->request->capture_list = capture; + if (capture) { + capture->next = eb->request->capture_list; + capture->vma = vma; + eb->request->capture_list = capture; + } } /* @@ -1874,24 +1920,15 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) flags &= ~EXEC_OBJECT_ASYNC; } - if (flags & EXEC_OBJECT_ASYNC) - continue; - - err = i915_request_await_object - (eb->request, obj, flags & EXEC_OBJECT_WRITE); - if (err) - return err; - } + if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) { + err = i915_request_await_object + (eb->request, obj, flags & EXEC_OBJECT_WRITE); + } - for (i = 0; i < count; i++) { - unsigned int flags = eb->flags[i]; - struct i915_vma *vma = eb->vma[i]; + if (err == 0) + err = i915_vma_move_to_active(vma, eb->request, flags); - err = i915_vma_move_to_active(vma, eb->request, flags); - if (unlikely(err)) { - i915_request_skip(eb->request, err); - return err; - } + i915_vma_unlock(vma); __eb_unreserve_vma(vma, flags); vma->exec_flags = NULL; @@ -1899,12 +1936,20 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) if (unlikely(flags & __EXEC_OBJECT_HAS_REF)) i915_vma_put(vma); } + ww_acquire_fini(&acquire); + + if (unlikely(err)) + goto err_skip; + eb->exec = NULL; /* Unconditionally flush any chipset caches (for streaming writes). */ i915_gem_chipset_flush(eb->i915); - return 0; + +err_skip: + i915_request_skip(eb->request, err); + return err; } static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_fence.c b/drivers/gpu/drm/i915/gem/i915_gem_fence.c new file mode 100644 index 000000000000..57dbc0862713 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_fence.c @@ -0,0 +1,96 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#include "i915_drv.h" +#include "i915_gem_object.h" + +struct stub_fence { + struct dma_fence dma; + struct i915_sw_fence chain; +}; + +static int __i915_sw_fence_call +stub_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) +{ + struct stub_fence *stub = container_of(fence, typeof(*stub), chain); + + switch (state) { + case FENCE_COMPLETE: + dma_fence_signal(&stub->dma); + break; + + case FENCE_FREE: + dma_fence_put(&stub->dma); + break; + } + + return NOTIFY_DONE; +} + +static const char *stub_driver_name(struct dma_fence *fence) +{ + return DRIVER_NAME; +} + +static const char *stub_timeline_name(struct dma_fence *fence) +{ + return "object"; +} + +static void stub_release(struct dma_fence *fence) +{ + struct stub_fence *stub = container_of(fence, typeof(*stub), dma); + + i915_sw_fence_fini(&stub->chain); + + BUILD_BUG_ON(offsetof(typeof(*stub), dma)); + dma_fence_free(&stub->dma); +} + +static const struct dma_fence_ops stub_fence_ops = { + .get_driver_name = stub_driver_name, + .get_timeline_name = stub_timeline_name, + .release = stub_release, +}; + +struct dma_fence * +i915_gem_object_lock_fence(struct drm_i915_gem_object *obj) +{ + struct stub_fence *stub; + + assert_object_held(obj); + + stub = kmalloc(sizeof(*stub), GFP_KERNEL); + if (!stub) + return NULL; + + i915_sw_fence_init(&stub->chain, stub_notify); + dma_fence_init(&stub->dma, &stub_fence_ops, &stub->chain.wait.lock, + to_i915(obj->base.dev)->mm.unordered_timeline, + 0); + + if (i915_sw_fence_await_reservation(&stub->chain, + obj->resv, NULL, + true, I915_FENCE_TIMEOUT, + I915_FENCE_GFP) < 0) + goto err; + + reservation_object_add_excl_fence(obj->resv, &stub->dma); + + return &stub->dma; + +err: + stub_release(&stub->dma); + return NULL; +} + +void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj, + struct dma_fence *fence) +{ + struct stub_fence *stub = container_of(fence, typeof(*stub), dma); + + i915_sw_fence_commit(&stub->chain); +} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 457e694a5c3f..a6a3452d2b3e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -378,6 +378,8 @@ i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj, struct drm_i915_private *dev_priv = to_i915(obj->base.dev); struct i915_vma *vma; + assert_object_held(obj); + if (!(obj->write_domain & flush_domains)) return; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 8cf082abb0ab..b0488517e945 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -99,16 +99,29 @@ i915_gem_object_put(struct drm_i915_gem_object *obj) __drm_gem_object_put(&obj->base); } +#define assert_object_held(obj) reservation_object_assert_held((obj)->resv) + static inline void i915_gem_object_lock(struct drm_i915_gem_object *obj) { reservation_object_lock(obj->resv, NULL); } +static inline int +i915_gem_object_lock_interruptible(struct drm_i915_gem_object *obj) +{ + return reservation_object_lock_interruptible(obj->resv, NULL); +} + static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) { reservation_object_unlock(obj->resv); } +struct dma_fence * +i915_gem_object_lock_fence(struct drm_i915_gem_object *obj); +void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj, + struct dma_fence *fence); + static inline void i915_gem_object_set_readonly(struct drm_i915_gem_object *obj) { @@ -372,6 +385,7 @@ static inline void i915_gem_object_finish_access(struct drm_i915_gem_object *obj) { i915_gem_object_unpin_pages(obj); + i915_gem_object_unlock(obj); } static inline struct intel_engine_cs * diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index be75a52a7ab0..111b687f14b5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -186,12 +186,13 @@ void i915_gem_suspend_late(struct drm_i915_private *i915) * machine in an unusable condition. */ - mutex_lock(&i915->drm.struct_mutex); for (phase = phases; *phase; phase++) { - list_for_each_entry(obj, *phase, mm.link) + list_for_each_entry(obj, *phase, mm.link) { + i915_gem_object_lock(obj); WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false)); + i915_gem_object_unlock(obj); + } } - mutex_unlock(&i915->drm.struct_mutex); intel_uc_sanitize(i915); i915_gem_sanitize(i915); diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 7b437f06a9be..465e0e1d4aa3 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -960,10 +960,6 @@ static int gpu_write(struct i915_vma *vma, GEM_BUG_ON(!intel_engine_can_store_dword(engine)); - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); - if (err) - return err; - batch = gpu_write_dw(vma, dword * sizeof(u32), value); if (IS_ERR(batch)) return PTR_ERR(batch); @@ -974,13 +970,19 @@ static int gpu_write(struct i915_vma *vma, goto err_batch; } + i915_vma_lock(batch); err = i915_vma_move_to_active(batch, rq, 0); + i915_vma_unlock(batch); if (err) goto err_request; i915_gem_object_set_active_reference(batch->obj); - err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_lock(vma); + err = i915_gem_object_set_to_gtt_domain(vma->obj, false); + if (err == 0) + err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); if (err) goto err_request; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index 5495875b48b3..b5c5dd034d5c 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -78,7 +78,9 @@ static int gtt_set(struct drm_i915_gem_object *obj, u32 __iomem *map; int err; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_unlock(obj); if (err) return err; @@ -105,7 +107,9 @@ static int gtt_get(struct drm_i915_gem_object *obj, u32 __iomem *map; int err; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_unlock(obj); if (err) return err; @@ -131,7 +135,9 @@ static int wc_set(struct drm_i915_gem_object *obj, u32 *map; int err; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_wc_domain(obj, true); + i915_gem_object_unlock(obj); if (err) return err; @@ -152,7 +158,9 @@ static int wc_get(struct drm_i915_gem_object *obj, u32 *map; int err; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_wc_domain(obj, false); + i915_gem_object_unlock(obj); if (err) return err; @@ -176,7 +184,9 @@ static int gpu_set(struct drm_i915_gem_object *obj, u32 *cs; int err; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_unlock(obj); if (err) return err; @@ -215,7 +225,9 @@ static int gpu_set(struct drm_i915_gem_object *obj, } intel_ring_advance(rq, cs); + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); i915_vma_unpin(vma); i915_request_add(rq); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 653ae08a277f..72eedd6c2a0a 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -209,7 +209,9 @@ gpu_fill_dw(struct i915_vma *vma, u64 offset, unsigned long count, u32 value) i915_gem_object_flush_map(obj); i915_gem_object_unpin_map(obj); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_unlock(obj); if (err) goto err; @@ -261,7 +263,9 @@ static int gpu_fill(struct drm_i915_gem_object *obj, if (IS_ERR(vma)) return PTR_ERR(vma); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_unlock(obj); if (err) return err; @@ -302,11 +306,15 @@ static int gpu_fill(struct drm_i915_gem_object *obj, if (err) goto err_request; + i915_vma_lock(batch); err = i915_vma_move_to_active(batch, rq, 0); + i915_vma_unlock(batch); if (err) goto skip_request; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); if (err) goto skip_request; @@ -754,7 +762,9 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, if (IS_ERR(vma)) return PTR_ERR(vma); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_unlock(obj); if (err) return err; @@ -780,11 +790,15 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, if (err) goto err_request; + i915_vma_lock(batch); err = i915_vma_move_to_active(batch, rq, 0); + i915_vma_unlock(batch); if (err) goto skip_request; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); if (err) goto skip_request; @@ -1345,7 +1359,9 @@ static int write_to_scratch(struct i915_gem_context *ctx, if (err) goto err_request; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, 0); + i915_vma_unlock(vma); if (err) goto skip_request; @@ -1440,7 +1456,9 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto err_request; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); if (err) goto skip_request; @@ -1449,7 +1467,9 @@ static int read_from_scratch(struct i915_gem_context *ctx, i915_request_add(rq); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_cpu_domain(obj, false); + i915_gem_object_unlock(obj); if (err) goto err; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 12c90d8fe0fb..297f8864d392 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -110,7 +110,9 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj, GEM_BUG_ON(view.partial.size > nreal); cond_resched(); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_unlock(obj); if (err) { pr_err("Failed to flush to GTT write domain; err=%d\n", err); @@ -142,7 +144,9 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj, if (offset >= obj->base.size) continue; + i915_gem_object_lock(obj); i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + i915_gem_object_unlock(obj); p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT); cpu = kmap(p) + offset_in_page(offset); @@ -344,7 +348,9 @@ static int make_obj_busy(struct drm_i915_gem_object *obj) return PTR_ERR(rq); } + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); i915_request_add(rq); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c index ed64012e5d24..94a15e3f6db8 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c @@ -46,9 +46,9 @@ static int mock_phys_object(void *arg) } /* Make the object dirty so that put_pages must do copy back the data */ - mutex_lock(&i915->drm.struct_mutex); + i915_gem_object_lock(obj); err = i915_gem_object_set_to_gtt_domain(obj, true); - mutex_unlock(&i915->drm.struct_mutex); + i915_gem_object_unlock(obj); if (err) { pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n", err); diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 4b4589b92521..45745e36b5f4 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -115,7 +115,9 @@ static int move_to_active(struct i915_vma *vma, { int err; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, flags); + i915_vma_unlock(vma); if (err) return err; @@ -1298,7 +1300,9 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915, } } + i915_vma_lock(arg.vma); err = i915_vma_move_to_active(arg.vma, rq, flags); + i915_vma_unlock(arg.vma); if (flags & EXEC_OBJECT_NEEDS_FENCE) i915_vma_unpin_fence(arg.vma); diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index dfacc46ae7d3..d162b0d3ec8e 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -1108,11 +1108,13 @@ static int smoke_submit(struct preempt_smoke *smoke, } if (vma) { + i915_vma_lock(vma); err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0); if (!err) err = i915_vma_move_to_active(vma, rq, 0); + i915_vma_unlock(vma); } i915_request_add(rq); diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 40b6df911d8d..d4ba5296b2c8 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -111,7 +111,9 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine) goto err_pin; } + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(vma); if (err) goto err_req; @@ -188,8 +190,10 @@ static int check_whitelist(struct i915_gem_context *ctx, return PTR_ERR(results); err = 0; + i915_gem_object_lock(results); igt_wedge_on_timeout(&wedge, ctx->i915, HZ / 5) /* a safety net! */ err = i915_gem_object_set_to_cpu_domain(results, false); + i915_gem_object_unlock(results); if (i915_terminally_wedged(ctx->i915)) err = -EIO; if (err) @@ -360,7 +364,9 @@ static struct i915_vma *create_batch(struct i915_gem_context *ctx) if (err) goto err_obj; + i915_gem_object_lock(obj); err = i915_gem_object_set_to_wc_domain(obj, true); + i915_gem_object_unlock(obj); if (err) goto err_obj; diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c index e3608b170105..75cb98da6cc8 100644 --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c @@ -2844,7 +2844,9 @@ static int shadow_indirect_ctx(struct intel_shadow_wa_ctx *wa_ctx) goto put_obj; } + i915_gem_object_lock(obj); ret = i915_gem_object_set_to_cpu_domain(obj, false); + i915_gem_object_unlock(obj); if (ret) { gvt_vgpu_err("failed to set shadow indirect ctx to CPU\n"); goto unmap_src; diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index d66bf77f55fd..8a00e2d0c81c 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -509,18 +509,18 @@ static int prepare_shadow_batch_buffer(struct intel_vgpu_workload *workload) } ret = i915_gem_object_set_to_gtt_domain(bb->obj, - false); + false); if (ret) goto err; - i915_gem_object_finish_access(bb->obj); - bb->accessing = false; - ret = i915_vma_move_to_active(bb->vma, workload->req, 0); if (ret) goto err; + + i915_gem_object_finish_access(bb->obj); + bb->accessing = false; } } return 0; diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index c893bd4eb2c8..a28bcd2d7c09 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1058,19 +1058,20 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, void *dst, *src; int ret; - ret = i915_gem_object_prepare_read(src_obj, &src_needs_clflush); + ret = i915_gem_object_prepare_write(dst_obj, &dst_needs_clflush); if (ret) return ERR_PTR(ret); - ret = i915_gem_object_prepare_write(dst_obj, &dst_needs_clflush); - if (ret) { - dst = ERR_PTR(ret); - goto unpin_src; - } - dst = i915_gem_object_pin_map(dst_obj, I915_MAP_FORCE_WB); + i915_gem_object_finish_access(dst_obj); if (IS_ERR(dst)) - goto unpin_dst; + return dst; + + ret = i915_gem_object_prepare_read(src_obj, &src_needs_clflush); + if (ret) { + i915_gem_object_unpin_map(dst_obj); + return ERR_PTR(ret); + } src = ERR_PTR(-ENODEV); if (src_needs_clflush && @@ -1116,13 +1117,11 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, } } + i915_gem_object_finish_access(src_obj); + /* dst_obj is returned with vmap pinned */ *needs_clflush_after = dst_needs_clflush & CLFLUSH_AFTER; -unpin_dst: - i915_gem_object_finish_access(dst_obj); -unpin_src: - i915_gem_object_finish_access(src_obj); return dst; } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 91df93542a68..d0bcfef5608c 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -104,19 +104,10 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj) { struct i915_vma *vma; LIST_HEAD(still_in_list); - int ret; + int ret = 0; lockdep_assert_held(&obj->base.dev->struct_mutex); - /* Closed vma are removed from the obj->vma_list - but they may - * still have an active binding on the object. To remove those we - * must wait for all rendering to complete to the object (as unbinding - * must anyway), and retire the requests. - */ - ret = i915_gem_object_set_to_cpu_domain(obj, false); - if (ret) - return ret; - spin_lock(&obj->vma.lock); while (!ret && (vma = list_first_entry_or_null(&obj->vma.list, struct i915_vma, @@ -139,29 +130,17 @@ i915_gem_object_wait_fence(struct dma_fence *fence, unsigned int flags, long timeout) { - struct i915_request *rq; - BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1); if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) return timeout; - if (!dma_fence_is_i915(fence)) - return dma_fence_wait_timeout(fence, - flags & I915_WAIT_INTERRUPTIBLE, - timeout); + if (dma_fence_is_i915(fence)) + return i915_request_wait(to_request(fence), flags, timeout); - rq = to_request(fence); - if (i915_request_completed(rq)) - goto out; - - timeout = i915_request_wait(rq, flags, timeout); - -out: - if (flags & I915_WAIT_LOCKED && i915_request_completed(rq)) - i915_request_retire_upto(rq); - - return timeout; + return dma_fence_wait_timeout(fence, + flags & I915_WAIT_INTERRUPTIBLE, + timeout); } static long @@ -487,21 +466,22 @@ static int i915_gem_shmem_pread(struct drm_i915_gem_object *obj, struct drm_i915_gem_pread *args) { - char __user *user_data; - u64 remain; unsigned int needs_clflush; unsigned int idx, offset; + struct dma_fence *fence; + char __user *user_data; + u64 remain; int ret; - ret = mutex_lock_interruptible(&obj->base.dev->struct_mutex); - if (ret) - return ret; - ret = i915_gem_object_prepare_read(obj, &needs_clflush); - mutex_unlock(&obj->base.dev->struct_mutex); if (ret) return ret; + fence = i915_gem_object_lock_fence(obj); + i915_gem_object_finish_access(obj); + if (!fence) + return -ENOMEM; + remain = args->size; user_data = u64_to_user_ptr(args->data_ptr); offset = offset_in_page(args->offset); @@ -519,7 +499,7 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj, offset = 0; } - i915_gem_object_finish_access(obj); + i915_gem_object_unlock_fence(obj, fence); return ret; } @@ -555,8 +535,9 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj, struct i915_ggtt *ggtt = &i915->ggtt; intel_wakeref_t wakeref; struct drm_mm_node node; - struct i915_vma *vma; + struct dma_fence *fence; void __user *user_data; + struct i915_vma *vma; u64 remain, offset; int ret; @@ -585,11 +566,24 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj, GEM_BUG_ON(!node.allocated); } - ret = i915_gem_object_set_to_gtt_domain(obj, false); + mutex_unlock(&i915->drm.struct_mutex); + + ret = i915_gem_object_lock_interruptible(obj); if (ret) goto out_unpin; - mutex_unlock(&i915->drm.struct_mutex); + ret = i915_gem_object_set_to_gtt_domain(obj, false); + if (ret) { + i915_gem_object_unlock(obj); + goto out_unpin; + } + + fence = i915_gem_object_lock_fence(obj); + i915_gem_object_unlock(obj); + if (!fence) { + ret = -ENOMEM; + goto out_unpin; + } user_data = u64_to_user_ptr(args->data_ptr); remain = args->size; @@ -627,8 +621,9 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj, offset += page_length; } - mutex_lock(&i915->drm.struct_mutex); + i915_gem_object_unlock_fence(obj, fence); out_unpin: + mutex_lock(&i915->drm.struct_mutex); if (node.allocated) { wmb(); ggtt->vm.clear_range(&ggtt->vm, node.start, node.size); @@ -739,6 +734,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, struct i915_ggtt *ggtt = &i915->ggtt; intel_wakeref_t wakeref; struct drm_mm_node node; + struct dma_fence *fence; struct i915_vma *vma; u64 remain, offset; void __user *user_data; @@ -786,11 +782,24 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, GEM_BUG_ON(!node.allocated); } - ret = i915_gem_object_set_to_gtt_domain(obj, true); + mutex_unlock(&i915->drm.struct_mutex); + + ret = i915_gem_object_lock_interruptible(obj); if (ret) goto out_unpin; - mutex_unlock(&i915->drm.struct_mutex); + ret = i915_gem_object_set_to_gtt_domain(obj, true); + if (ret) { + i915_gem_object_unlock(obj); + goto out_unpin; + } + + fence = i915_gem_object_lock_fence(obj); + i915_gem_object_unlock(obj); + if (!fence) { + ret = -ENOMEM; + goto out_unpin; + } intel_fb_obj_invalidate(obj, ORIGIN_CPU); @@ -835,8 +844,9 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj, } intel_fb_obj_flush(obj, ORIGIN_CPU); - mutex_lock(&i915->drm.struct_mutex); + i915_gem_object_unlock_fence(obj, fence); out_unpin: + mutex_lock(&i915->drm.struct_mutex); if (node.allocated) { wmb(); ggtt->vm.clear_range(&ggtt->vm, node.start, node.size); @@ -882,23 +892,23 @@ static int i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pwrite *args) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - void __user *user_data; - u64 remain; unsigned int partial_cacheline_write; unsigned int needs_clflush; unsigned int offset, idx; + struct dma_fence *fence; + void __user *user_data; + u64 remain; int ret; - ret = mutex_lock_interruptible(&i915->drm.struct_mutex); - if (ret) - return ret; - ret = i915_gem_object_prepare_write(obj, &needs_clflush); - mutex_unlock(&i915->drm.struct_mutex); if (ret) return ret; + fence = i915_gem_object_lock_fence(obj); + i915_gem_object_finish_access(obj); + if (!fence) + return -ENOMEM; + /* If we don't overwrite a cacheline completely we need to be * careful to have up-to-date data by first clflushing. Don't * overcomplicate things and flush the entire patch. @@ -926,7 +936,8 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj, } intel_fb_obj_flush(obj, ORIGIN_CPU); - i915_gem_object_finish_access(obj); + i915_gem_object_unlock_fence(obj, fence); + return ret; } @@ -1805,7 +1816,9 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915) if (err) goto err_active; + i915_gem_object_lock(state->obj); err = i915_gem_object_set_to_cpu_domain(state->obj, false); + i915_gem_object_unlock(state->obj); if (err) goto err_active; @@ -2252,12 +2265,13 @@ int i915_gem_freeze_late(struct drm_i915_private *i915) i915_gem_shrink(i915, -1UL, NULL, I915_SHRINK_UNBOUND); i915_gem_drain_freed_objects(i915); - mutex_lock(&i915->drm.struct_mutex); for (phase = phases; *phase; phase++) { - list_for_each_entry(obj, *phase, mm.link) + list_for_each_entry(obj, *phase, mm.link) { + i915_gem_object_lock(obj); WARN_ON(i915_gem_object_set_to_cpu_domain(obj, true)); + i915_gem_object_unlock(obj); + } } - mutex_unlock(&i915->drm.struct_mutex); return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 2af762a36ab2..5747f785c0f6 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3519,8 +3519,11 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv) WARN_ON(i915_vma_bind(vma, obj ? obj->cache_level : 0, PIN_UPDATE)); - if (obj) + if (obj) { + i915_gem_object_lock(obj); WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false)); + i915_gem_object_unlock(obj); + } lock: mutex_lock(&ggtt->vm.mutex); diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index f3b42b026fff..706ed71468e8 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -222,7 +222,9 @@ int i915_gem_render_state_emit(struct i915_request *rq) goto err_unpin; } + i915_vma_lock(so.vma); err = i915_vma_move_to_active(so.vma, rq, 0); + i915_vma_unlock(so.vma); err_unpin: i915_vma_unpin(so.vma); err_vma: diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index cf405ffda045..db94d7b6c5a6 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -840,13 +840,14 @@ void i915_vma_destroy(struct i915_vma *vma) { lockdep_assert_held(&vma->vm->i915->drm.struct_mutex); - GEM_BUG_ON(i915_vma_is_active(vma)); GEM_BUG_ON(i915_vma_is_pinned(vma)); if (i915_vma_is_closed(vma)) list_del(&vma->closed_link); WARN_ON(i915_vma_unbind(vma)); + GEM_BUG_ON(i915_vma_is_active(vma)); + __i915_vma_destroy(vma); } @@ -908,12 +909,10 @@ static void export_fence(struct i915_vma *vma, * handle an error right now. Worst case should be missed * synchronisation leading to rendering corruption. */ - reservation_object_lock(resv, NULL); if (flags & EXEC_OBJECT_WRITE) reservation_object_add_excl_fence(resv, &rq->fence); else if (reservation_object_reserve_shared(resv, 1) == 0) reservation_object_add_shared_fence(resv, &rq->fence); - reservation_object_unlock(resv); } int i915_vma_move_to_active(struct i915_vma *vma, @@ -922,7 +921,8 @@ int i915_vma_move_to_active(struct i915_vma *vma, { struct drm_i915_gem_object *obj = vma->obj; - lockdep_assert_held(&rq->i915->drm.struct_mutex); + assert_vma_held(vma); + assert_object_held(obj); GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); /* diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 754a762d90b4..2657c99fe187 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -298,6 +298,18 @@ void i915_vma_close(struct i915_vma *vma); void i915_vma_reopen(struct i915_vma *vma); void i915_vma_destroy(struct i915_vma *vma); +#define assert_vma_held(vma) reservation_object_assert_held((vma)->resv) + +static inline void i915_vma_lock(struct i915_vma *vma) +{ + reservation_object_lock(vma->resv, NULL); +} + +static inline void i915_vma_unlock(struct i915_vma *vma) +{ + reservation_object_unlock(vma->resv); +} + int __i915_vma_do_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags); static inline int __must_check diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index e00de2237fd8..8643637b154b 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2112,6 +2112,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, * pin/unpin/fence and not more. */ wakeref = intel_runtime_pm_get(dev_priv); + i915_gem_object_lock(obj); atomic_inc(&dev_priv->gpu_error.pending_fb_pin); @@ -2166,6 +2167,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, err: atomic_dec(&dev_priv->gpu_error.pending_fb_pin); + i915_gem_object_unlock(obj); intel_runtime_pm_put(dev_priv, wakeref); return vma; } @@ -2174,9 +2176,12 @@ void intel_unpin_fb_vma(struct i915_vma *vma, unsigned long flags) { lockdep_assert_held(&vma->vm->i915->drm.struct_mutex); + i915_gem_object_lock(vma->obj); if (flags & PLANE_HAS_FENCE) i915_vma_unpin_fence(vma); i915_gem_object_unpin_from_display_plane(vma); + i915_gem_object_unlock(vma->obj); + i915_vma_put(vma); } diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c index 7146524264dd..67eadc82c396 100644 --- a/drivers/gpu/drm/i915/intel_guc_log.c +++ b/drivers/gpu/drm/i915/intel_guc_log.c @@ -343,8 +343,6 @@ static void capture_logs_work(struct work_struct *work) static int guc_log_map(struct intel_guc_log *log) { - struct intel_guc *guc = log_to_guc(log); - struct drm_i915_private *dev_priv = guc_to_i915(guc); void *vaddr; int ret; @@ -353,9 +351,9 @@ static int guc_log_map(struct intel_guc_log *log) if (!log->vma) return -ENODEV; - mutex_lock(&dev_priv->drm.struct_mutex); + i915_gem_object_lock(log->vma->obj); ret = i915_gem_object_set_to_wc_domain(log->vma->obj, true); - mutex_unlock(&dev_priv->drm.struct_mutex); + i915_gem_object_unlock(log->vma->obj); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 80dcd879fc58..a2ac06a08715 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -765,8 +765,10 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, atomic_inc(&dev_priv->gpu_error.pending_fb_pin); + i915_gem_object_lock(new_bo); vma = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL, PIN_MAPPABLE); + i915_gem_object_unlock(new_bo); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto out_pin_section; @@ -1305,15 +1307,20 @@ int intel_overlay_attrs_ioctl(struct drm_device *dev, void *data, static int get_registers(struct intel_overlay *overlay, bool use_phys) { + struct drm_i915_private *i915 = overlay->i915; struct drm_i915_gem_object *obj; struct i915_vma *vma; int err; - obj = i915_gem_object_create_stolen(overlay->i915, PAGE_SIZE); + mutex_lock(&i915->drm.struct_mutex); + + obj = i915_gem_object_create_stolen(i915, PAGE_SIZE); if (obj == NULL) - obj = i915_gem_object_create_internal(overlay->i915, PAGE_SIZE); - if (IS_ERR(obj)) - return PTR_ERR(obj); + obj = i915_gem_object_create_internal(i915, PAGE_SIZE); + if (IS_ERR(obj)) { + err = PTR_ERR(obj); + goto err_unlock; + } vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE); if (IS_ERR(vma)) { @@ -1334,10 +1341,13 @@ static int get_registers(struct intel_overlay *overlay, bool use_phys) } overlay->reg_bo = obj; + mutex_unlock(&i915->drm.struct_mutex); return 0; err_put_bo: i915_gem_object_put(obj); +err_unlock: + mutex_unlock(&i915->drm.struct_mutex); return err; } @@ -1363,18 +1373,16 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv) INIT_ACTIVE_REQUEST(&overlay->last_flip); - mutex_lock(&dev_priv->drm.struct_mutex); - ret = get_registers(overlay, OVERLAY_NEEDS_PHYSICAL(dev_priv)); if (ret) goto out_free; + i915_gem_object_lock(overlay->reg_bo); ret = i915_gem_object_set_to_gtt_domain(overlay->reg_bo, true); + i915_gem_object_unlock(overlay->reg_bo); if (ret) goto out_reg_bo; - mutex_unlock(&dev_priv->drm.struct_mutex); - memset_io(overlay->regs, 0, sizeof(struct overlay_registers)); update_polyphase_filter(overlay->regs); update_reg_attrs(overlay, overlay->regs); @@ -1386,7 +1394,6 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv) out_reg_bo: i915_gem_object_put(overlay->reg_bo); out_free: - mutex_unlock(&dev_priv->drm.struct_mutex); kfree(overlay); } diff --git a/drivers/gpu/drm/i915/intel_uc_fw.c b/drivers/gpu/drm/i915/intel_uc_fw.c index 3257a054cb0b..e0042650726a 100644 --- a/drivers/gpu/drm/i915/intel_uc_fw.c +++ b/drivers/gpu/drm/i915/intel_uc_fw.c @@ -246,15 +246,13 @@ int intel_uc_fw_upload(struct intel_uc_fw *uc_fw, intel_uc_fw_type_repr(uc_fw->type), intel_uc_fw_status_repr(uc_fw->load_status)); - intel_uc_fw_ggtt_bind(uc_fw); - /* Call custom loader */ + intel_uc_fw_ggtt_bind(uc_fw); err = xfer(uc_fw); + intel_uc_fw_ggtt_unbind(uc_fw); if (err) goto fail; - intel_uc_fw_ggtt_unbind(uc_fw); - uc_fw->load_status = INTEL_UC_FIRMWARE_SUCCESS; DRM_DEBUG_DRIVER("%s fw load %s\n", intel_uc_fw_type_repr(uc_fw->type), diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index 4fd5356c6577..2c5479ca1f69 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -874,7 +874,9 @@ static int live_all_engines(void *arg) i915_gem_object_set_active_reference(batch->obj); } + i915_vma_lock(batch); err = i915_vma_move_to_active(batch, request[id], 0); + i915_vma_unlock(batch); GEM_BUG_ON(err); i915_request_get(request[id]); @@ -989,7 +991,9 @@ static int live_sequential_engines(void *arg) GEM_BUG_ON(err); request[id]->batch = batch; + i915_vma_lock(batch); err = i915_vma_move_to_active(batch, request[id], 0); + i915_vma_unlock(batch); GEM_BUG_ON(err); i915_gem_object_set_active_reference(batch->obj); diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c index 38d6f1b10c54..15c0f0af9658 100644 --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c @@ -76,7 +76,9 @@ static int move_to_active(struct i915_vma *vma, { int err; + i915_vma_lock(vma); err = i915_vma_move_to_active(vma, rq, flags); + i915_vma_unlock(vma); if (err) return err; From patchwork Wed May 22 08:21:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955277 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5481A912 for ; Wed, 22 May 2019 08:21:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4675F28B59 for ; Wed, 22 May 2019 08:21:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3A87828A7A; Wed, 22 May 2019 08:21:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1E9F928B5D for ; Wed, 22 May 2019 08:21:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1739989801; Wed, 22 May 2019 08:21:18 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2321D897D0 for ; Wed, 22 May 2019 08:21:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636791-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:21:02 +0100 Message-Id: <20190522082104.5117-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 11/13] drm/i915: Move GEM object waiting to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c by moving the object wait decomposition into its own file. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 + drivers/gpu/drm/i915/gem/i915_gem_wait.c | 277 +++++++++++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 7 - drivers/gpu/drm/i915/i915_gem.c | 254 ------------------- drivers/gpu/drm/i915/i915_utils.h | 10 - 6 files changed, 286 insertions(+), 271 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_wait.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e5348c355987..a4cc2f7f9bc6 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -105,6 +105,7 @@ gem-y += \ gem/i915_gem_stolen.o \ gem/i915_gem_tiling.o \ gem/i915_gem_userptr.o \ + gem/i915_gem_wait.o \ gem/i915_gemfs.o i915-y += \ $(gem-y) \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index b0488517e945..5233ec3a056d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -441,4 +441,12 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj) obj->cache_dirty = true; } +int i915_gem_object_wait(struct drm_i915_gem_object *obj, + unsigned int flags, + long timeout); +int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, + unsigned int flags, + const struct i915_sched_attr *attr); +#define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX) + #endif diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c new file mode 100644 index 000000000000..fed5c751ef37 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -0,0 +1,277 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2016 Intel Corporation + */ + +#include +#include + +#include "gt/intel_engine.h" + +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" + +static long +i915_gem_object_wait_fence(struct dma_fence *fence, + unsigned int flags, + long timeout) +{ + BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1); + + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) + return timeout; + + if (dma_fence_is_i915(fence)) + return i915_request_wait(to_request(fence), flags, timeout); + + return dma_fence_wait_timeout(fence, + flags & I915_WAIT_INTERRUPTIBLE, + timeout); +} + +static long +i915_gem_object_wait_reservation(struct reservation_object *resv, + unsigned int flags, + long timeout) +{ + unsigned int seq = __read_seqcount_begin(&resv->seq); + struct dma_fence *excl; + bool prune_fences = false; + + if (flags & I915_WAIT_ALL) { + struct dma_fence **shared; + unsigned int count, i; + int ret; + + ret = reservation_object_get_fences_rcu(resv, + &excl, &count, &shared); + if (ret) + return ret; + + for (i = 0; i < count; i++) { + timeout = i915_gem_object_wait_fence(shared[i], + flags, timeout); + if (timeout < 0) + break; + + dma_fence_put(shared[i]); + } + + for (; i < count; i++) + dma_fence_put(shared[i]); + kfree(shared); + + /* + * If both shared fences and an exclusive fence exist, + * then by construction the shared fences must be later + * than the exclusive fence. If we successfully wait for + * all the shared fences, we know that the exclusive fence + * must all be signaled. If all the shared fences are + * signaled, we can prune the array and recover the + * floating references on the fences/requests. + */ + prune_fences = count && timeout >= 0; + } else { + excl = reservation_object_get_excl_rcu(resv); + } + + if (excl && timeout >= 0) + timeout = i915_gem_object_wait_fence(excl, flags, timeout); + + dma_fence_put(excl); + + /* + * Opportunistically prune the fences iff we know they have *all* been + * signaled and that the reservation object has not been changed (i.e. + * no new fences have been added). + */ + if (prune_fences && !__read_seqcount_retry(&resv->seq, seq)) { + if (reservation_object_trylock(resv)) { + if (!__read_seqcount_retry(&resv->seq, seq)) + reservation_object_add_excl_fence(resv, NULL); + reservation_object_unlock(resv); + } + } + + return timeout; +} + +static void __fence_set_priority(struct dma_fence *fence, + const struct i915_sched_attr *attr) +{ + struct i915_request *rq; + struct intel_engine_cs *engine; + + if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence)) + return; + + rq = to_request(fence); + engine = rq->engine; + + local_bh_disable(); + rcu_read_lock(); /* RCU serialisation for set-wedged protection */ + if (engine->schedule) + engine->schedule(rq, attr); + rcu_read_unlock(); + local_bh_enable(); /* kick the tasklets if queues were reprioritised */ +} + +static void fence_set_priority(struct dma_fence *fence, + const struct i915_sched_attr *attr) +{ + /* Recurse once into a fence-array */ + if (dma_fence_is_array(fence)) { + struct dma_fence_array *array = to_dma_fence_array(fence); + int i; + + for (i = 0; i < array->num_fences; i++) + __fence_set_priority(array->fences[i], attr); + } else { + __fence_set_priority(fence, attr); + } +} + +int +i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, + unsigned int flags, + const struct i915_sched_attr *attr) +{ + struct dma_fence *excl; + + if (flags & I915_WAIT_ALL) { + struct dma_fence **shared; + unsigned int count, i; + int ret; + + ret = reservation_object_get_fences_rcu(obj->resv, + &excl, &count, &shared); + if (ret) + return ret; + + for (i = 0; i < count; i++) { + fence_set_priority(shared[i], attr); + dma_fence_put(shared[i]); + } + + kfree(shared); + } else { + excl = reservation_object_get_excl_rcu(obj->resv); + } + + if (excl) { + fence_set_priority(excl, attr); + dma_fence_put(excl); + } + return 0; +} + +/** + * Waits for rendering to the object to be completed + * @obj: i915 gem object + * @flags: how to wait (under a lock, for all rendering or just for writes etc) + * @timeout: how long to wait + */ +int +i915_gem_object_wait(struct drm_i915_gem_object *obj, + unsigned int flags, + long timeout) +{ + might_sleep(); + GEM_BUG_ON(timeout < 0); + + timeout = i915_gem_object_wait_reservation(obj->resv, flags, timeout); + return timeout < 0 ? timeout : 0; +} + +static inline unsigned long nsecs_to_jiffies_timeout(const u64 n) +{ + /* nsecs_to_jiffies64() does not guard against overflow */ + if (NSEC_PER_SEC % HZ && + div_u64(n, NSEC_PER_SEC) >= MAX_JIFFY_OFFSET / HZ) + return MAX_JIFFY_OFFSET; + + return min_t(u64, MAX_JIFFY_OFFSET, nsecs_to_jiffies64(n) + 1); +} + +static unsigned long to_wait_timeout(s64 timeout_ns) +{ + if (timeout_ns < 0) + return MAX_SCHEDULE_TIMEOUT; + + if (timeout_ns == 0) + return 0; + + return nsecs_to_jiffies_timeout(timeout_ns); +} + +/** + * i915_gem_wait_ioctl - implements DRM_IOCTL_I915_GEM_WAIT + * @dev: drm device pointer + * @data: ioctl data blob + * @file: drm file pointer + * + * Returns 0 if successful, else an error is returned with the remaining time in + * the timeout parameter. + * -ETIME: object is still busy after timeout + * -ERESTARTSYS: signal interrupted the wait + * -ENONENT: object doesn't exist + * Also possible, but rare: + * -EAGAIN: incomplete, restart syscall + * -ENOMEM: damn + * -ENODEV: Internal IRQ fail + * -E?: The add request failed + * + * The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any + * non-zero timeout parameter the wait ioctl will wait for the given number of + * nanoseconds on an object becoming unbusy. Since the wait itself does so + * without holding struct_mutex the object may become re-busied before this + * function completes. A similar but shorter * race condition exists in the busy + * ioctl + */ +int +i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct drm_i915_gem_wait *args = data; + struct drm_i915_gem_object *obj; + ktime_t start; + long ret; + + if (args->flags != 0) + return -EINVAL; + + obj = i915_gem_object_lookup(file, args->bo_handle); + if (!obj) + return -ENOENT; + + start = ktime_get(); + + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_PRIORITY | + I915_WAIT_ALL, + to_wait_timeout(args->timeout_ns)); + + if (args->timeout_ns > 0) { + args->timeout_ns -= ktime_to_ns(ktime_sub(ktime_get(), start)); + if (args->timeout_ns < 0) + args->timeout_ns = 0; + + /* + * Apparently ktime isn't accurate enough and occasionally has a + * bit of mismatch in the jiffies<->nsecs<->ktime loop. So patch + * things up to make the test happy. We allow up to 1 jiffy. + * + * This is a regression from the timespec->ktime conversion. + */ + if (ret == -ETIME && !nsecs_to_jiffies(args->timeout_ns)) + args->timeout_ns = 0; + + /* Asked to wait beyond the jiffie/scheduler precision? */ + if (ret == -ETIME && args->timeout_ns) + ret = -EAGAIN; + } + + i915_gem_object_put(obj); + return ret; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 89fbf9927e6c..4db5ae14698b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2747,13 +2747,6 @@ void i915_gem_suspend(struct drm_i915_private *dev_priv); void i915_gem_suspend_late(struct drm_i915_private *dev_priv); void i915_gem_resume(struct drm_i915_private *dev_priv); vm_fault_t i915_gem_fault(struct vm_fault *vmf); -int i915_gem_object_wait(struct drm_i915_gem_object *obj, - unsigned int flags, - long timeout); -int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, - unsigned int flags, - const struct i915_sched_attr *attr); -#define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX) int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file); void i915_gem_release(struct drm_device *dev, struct drm_file *file); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d0bcfef5608c..597624bd91e6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -125,178 +125,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj) return ret; } -static long -i915_gem_object_wait_fence(struct dma_fence *fence, - unsigned int flags, - long timeout) -{ - BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1); - - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) - return timeout; - - if (dma_fence_is_i915(fence)) - return i915_request_wait(to_request(fence), flags, timeout); - - return dma_fence_wait_timeout(fence, - flags & I915_WAIT_INTERRUPTIBLE, - timeout); -} - -static long -i915_gem_object_wait_reservation(struct reservation_object *resv, - unsigned int flags, - long timeout) -{ - unsigned int seq = __read_seqcount_begin(&resv->seq); - struct dma_fence *excl; - bool prune_fences = false; - - if (flags & I915_WAIT_ALL) { - struct dma_fence **shared; - unsigned int count, i; - int ret; - - ret = reservation_object_get_fences_rcu(resv, - &excl, &count, &shared); - if (ret) - return ret; - - for (i = 0; i < count; i++) { - timeout = i915_gem_object_wait_fence(shared[i], - flags, timeout); - if (timeout < 0) - break; - - dma_fence_put(shared[i]); - } - - for (; i < count; i++) - dma_fence_put(shared[i]); - kfree(shared); - - /* - * If both shared fences and an exclusive fence exist, - * then by construction the shared fences must be later - * than the exclusive fence. If we successfully wait for - * all the shared fences, we know that the exclusive fence - * must all be signaled. If all the shared fences are - * signaled, we can prune the array and recover the - * floating references on the fences/requests. - */ - prune_fences = count && timeout >= 0; - } else { - excl = reservation_object_get_excl_rcu(resv); - } - - if (excl && timeout >= 0) - timeout = i915_gem_object_wait_fence(excl, flags, timeout); - - dma_fence_put(excl); - - /* - * Opportunistically prune the fences iff we know they have *all* been - * signaled and that the reservation object has not been changed (i.e. - * no new fences have been added). - */ - if (prune_fences && !__read_seqcount_retry(&resv->seq, seq)) { - if (reservation_object_trylock(resv)) { - if (!__read_seqcount_retry(&resv->seq, seq)) - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); - } - } - - return timeout; -} - -static void __fence_set_priority(struct dma_fence *fence, - const struct i915_sched_attr *attr) -{ - struct i915_request *rq; - struct intel_engine_cs *engine; - - if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence)) - return; - - rq = to_request(fence); - engine = rq->engine; - - local_bh_disable(); - rcu_read_lock(); /* RCU serialisation for set-wedged protection */ - if (engine->schedule) - engine->schedule(rq, attr); - rcu_read_unlock(); - local_bh_enable(); /* kick the tasklets if queues were reprioritised */ -} - -static void fence_set_priority(struct dma_fence *fence, - const struct i915_sched_attr *attr) -{ - /* Recurse once into a fence-array */ - if (dma_fence_is_array(fence)) { - struct dma_fence_array *array = to_dma_fence_array(fence); - int i; - - for (i = 0; i < array->num_fences; i++) - __fence_set_priority(array->fences[i], attr); - } else { - __fence_set_priority(fence, attr); - } -} - -int -i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, - unsigned int flags, - const struct i915_sched_attr *attr) -{ - struct dma_fence *excl; - - if (flags & I915_WAIT_ALL) { - struct dma_fence **shared; - unsigned int count, i; - int ret; - - ret = reservation_object_get_fences_rcu(obj->resv, - &excl, &count, &shared); - if (ret) - return ret; - - for (i = 0; i < count; i++) { - fence_set_priority(shared[i], attr); - dma_fence_put(shared[i]); - } - - kfree(shared); - } else { - excl = reservation_object_get_excl_rcu(obj->resv); - } - - if (excl) { - fence_set_priority(excl, attr); - dma_fence_put(excl); - } - return 0; -} - -/** - * Waits for rendering to the object to be completed - * @obj: i915 gem object - * @flags: how to wait (under a lock, for all rendering or just for writes etc) - * @timeout: how long to wait - */ -int -i915_gem_object_wait(struct drm_i915_gem_object *obj, - unsigned int flags, - long timeout) -{ - might_sleep(); - GEM_BUG_ON(timeout < 0); - - timeout = i915_gem_object_wait_reservation(obj->resv, flags, timeout); - return timeout < 0 ? timeout : 0; -} - static int i915_gem_phys_pwrite(struct drm_i915_gem_object *obj, struct drm_i915_gem_pwrite *args, @@ -1097,88 +925,6 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv) } } -static unsigned long to_wait_timeout(s64 timeout_ns) -{ - if (timeout_ns < 0) - return MAX_SCHEDULE_TIMEOUT; - - if (timeout_ns == 0) - return 0; - - return nsecs_to_jiffies_timeout(timeout_ns); -} - -/** - * i915_gem_wait_ioctl - implements DRM_IOCTL_I915_GEM_WAIT - * @dev: drm device pointer - * @data: ioctl data blob - * @file: drm file pointer - * - * Returns 0 if successful, else an error is returned with the remaining time in - * the timeout parameter. - * -ETIME: object is still busy after timeout - * -ERESTARTSYS: signal interrupted the wait - * -ENONENT: object doesn't exist - * Also possible, but rare: - * -EAGAIN: incomplete, restart syscall - * -ENOMEM: damn - * -ENODEV: Internal IRQ fail - * -E?: The add request failed - * - * The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any - * non-zero timeout parameter the wait ioctl will wait for the given number of - * nanoseconds on an object becoming unbusy. Since the wait itself does so - * without holding struct_mutex the object may become re-busied before this - * function completes. A similar but shorter * race condition exists in the busy - * ioctl - */ -int -i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) -{ - struct drm_i915_gem_wait *args = data; - struct drm_i915_gem_object *obj; - ktime_t start; - long ret; - - if (args->flags != 0) - return -EINVAL; - - obj = i915_gem_object_lookup(file, args->bo_handle); - if (!obj) - return -ENOENT; - - start = ktime_get(); - - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - I915_WAIT_ALL, - to_wait_timeout(args->timeout_ns)); - - if (args->timeout_ns > 0) { - args->timeout_ns -= ktime_to_ns(ktime_sub(ktime_get(), start)); - if (args->timeout_ns < 0) - args->timeout_ns = 0; - - /* - * Apparently ktime isn't accurate enough and occasionally has a - * bit of mismatch in the jiffies<->nsecs<->ktime loop. So patch - * things up to make the test happy. We allow up to 1 jiffy. - * - * This is a regression from the timespec->ktime conversion. - */ - if (ret == -ETIME && !nsecs_to_jiffies(args->timeout_ns)) - args->timeout_ns = 0; - - /* Asked to wait beyond the jiffie/scheduler precision? */ - if (ret == -ETIME && args->timeout_ns) - ret = -EAGAIN; - } - - i915_gem_object_put(obj); - return ret; -} - static int wait_for_engines(struct drm_i915_private *i915) { if (wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT)) { diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index e52866084891..2987219a6300 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -220,16 +220,6 @@ static inline unsigned long msecs_to_jiffies_timeout(const unsigned int m) return min_t(unsigned long, MAX_JIFFY_OFFSET, j + 1); } -static inline unsigned long nsecs_to_jiffies_timeout(const u64 n) -{ - /* nsecs_to_jiffies64() does not guard against overflow */ - if (NSEC_PER_SEC % HZ && - div_u64(n, NSEC_PER_SEC) >= MAX_JIFFY_OFFSET / HZ) - return MAX_JIFFY_OFFSET; - - return min_t(u64, MAX_JIFFY_OFFSET, nsecs_to_jiffies64(n) + 1); -} - /* * If you need to wait X milliseconds between events A and B, but event B * doesn't happen exactly after event A, you record the timestamp (jiffies) of From patchwork Wed May 22 08:21:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955271 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B987B912 for ; Wed, 22 May 2019 08:21:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA749202A5 for ; Wed, 22 May 2019 08:21:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9E88D28A19; Wed, 22 May 2019 08:21:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CB832287C6 for ; Wed, 22 May 2019 08:21:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3065E897FD; Wed, 22 May 2019 08:21:12 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 678D4897D0 for ; Wed, 22 May 2019 08:21:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636792-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:21:03 +0100 Message-Id: <20190522082104.5117-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 12/13] drm/i915: Move GEM object busy checking to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c by moving the object busy checking into its own file. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_busy.c | 138 +++++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem.c | 128 --------------------- 3 files changed, 139 insertions(+), 128 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_busy.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index a4cc2f7f9bc6..865d7b51c297 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -88,6 +88,7 @@ i915-y += $(gt-y) # GEM (Graphics Execution Management) code obj-y += gem/ gem-y += \ + gem/i915_gem_busy.o \ gem/i915_gem_clflush.o \ gem/i915_gem_context.o \ gem/i915_gem_dmabuf.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c new file mode 100644 index 000000000000..5a5eda3003e9 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -0,0 +1,138 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include "gt/intel_engine.h" + +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" + +static __always_inline u32 __busy_read_flag(u8 id) +{ + if (id == (u8)I915_ENGINE_CLASS_INVALID) + return 0xffff0000u; + + GEM_BUG_ON(id >= 16); + return 0x10000u << id; +} + +static __always_inline u32 __busy_write_id(u8 id) +{ + /* + * The uABI guarantees an active writer is also amongst the read + * engines. This would be true if we accessed the activity tracking + * under the lock, but as we perform the lookup of the object and + * its activity locklessly we can not guarantee that the last_write + * being active implies that we have set the same engine flag from + * last_read - hence we always set both read and write busy for + * last_write. + */ + if (id == (u8)I915_ENGINE_CLASS_INVALID) + return 0xffffffffu; + + return (id + 1) | __busy_read_flag(id); +} + +static __always_inline unsigned int +__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u8 id)) +{ + const struct i915_request *rq; + + /* + * We have to check the current hw status of the fence as the uABI + * guarantees forward progress. We could rely on the idle worker + * to eventually flush us, but to minimise latency just ask the + * hardware. + * + * Note we only report on the status of native fences. + */ + if (!dma_fence_is_i915(fence)) + return 0; + + /* opencode to_request() in order to avoid const warnings */ + rq = container_of(fence, const struct i915_request, fence); + if (i915_request_completed(rq)) + return 0; + + /* Beware type-expansion follies! */ + BUILD_BUG_ON(!typecheck(u8, rq->engine->uabi_class)); + return flag(rq->engine->uabi_class); +} + +static __always_inline unsigned int +busy_check_reader(const struct dma_fence *fence) +{ + return __busy_set_if_active(fence, __busy_read_flag); +} + +static __always_inline unsigned int +busy_check_writer(const struct dma_fence *fence) +{ + if (!fence) + return 0; + + return __busy_set_if_active(fence, __busy_write_id); +} + +int +i915_gem_busy_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_busy *args = data; + struct drm_i915_gem_object *obj; + struct reservation_object_list *list; + unsigned int seq; + int err; + + err = -ENOENT; + rcu_read_lock(); + obj = i915_gem_object_lookup_rcu(file, args->handle); + if (!obj) + goto out; + + /* + * A discrepancy here is that we do not report the status of + * non-i915 fences, i.e. even though we may report the object as idle, + * a call to set-domain may still stall waiting for foreign rendering. + * This also means that wait-ioctl may report an object as busy, + * where busy-ioctl considers it idle. + * + * We trade the ability to warn of foreign fences to report on which + * i915 engines are active for the object. + * + * Alternatively, we can trade that extra information on read/write + * activity with + * args->busy = + * !reservation_object_test_signaled_rcu(obj->resv, true); + * to report the overall busyness. This is what the wait-ioctl does. + * + */ +retry: + seq = raw_read_seqcount(&obj->resv->seq); + + /* Translate the exclusive fence to the READ *and* WRITE engine */ + args->busy = busy_check_writer(rcu_dereference(obj->resv->fence_excl)); + + /* Translate shared fences to READ set of engines */ + list = rcu_dereference(obj->resv->fence); + if (list) { + unsigned int shared_count = list->shared_count, i; + + for (i = 0; i < shared_count; ++i) { + struct dma_fence *fence = + rcu_dereference(list->shared[i]); + + args->busy |= busy_check_reader(fence); + } + } + + if (args->busy && read_seqcount_retry(&obj->resv->seq, seq)) + goto retry; + + err = 0; +out: + rcu_read_unlock(); + return err; +} diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 597624bd91e6..823459082be6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1142,134 +1142,6 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, return vma; } -static __always_inline u32 __busy_read_flag(u8 id) -{ - if (id == (u8)I915_ENGINE_CLASS_INVALID) - return 0xffff0000u; - - GEM_BUG_ON(id >= 16); - return 0x10000u << id; -} - -static __always_inline u32 __busy_write_id(u8 id) -{ - /* - * The uABI guarantees an active writer is also amongst the read - * engines. This would be true if we accessed the activity tracking - * under the lock, but as we perform the lookup of the object and - * its activity locklessly we can not guarantee that the last_write - * being active implies that we have set the same engine flag from - * last_read - hence we always set both read and write busy for - * last_write. - */ - if (id == (u8)I915_ENGINE_CLASS_INVALID) - return 0xffffffffu; - - return (id + 1) | __busy_read_flag(id); -} - -static __always_inline unsigned int -__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u8 id)) -{ - const struct i915_request *rq; - - /* - * We have to check the current hw status of the fence as the uABI - * guarantees forward progress. We could rely on the idle worker - * to eventually flush us, but to minimise latency just ask the - * hardware. - * - * Note we only report on the status of native fences. - */ - if (!dma_fence_is_i915(fence)) - return 0; - - /* opencode to_request() in order to avoid const warnings */ - rq = container_of(fence, const struct i915_request, fence); - if (i915_request_completed(rq)) - return 0; - - /* Beware type-expansion follies! */ - BUILD_BUG_ON(!typecheck(u8, rq->engine->uabi_class)); - return flag(rq->engine->uabi_class); -} - -static __always_inline unsigned int -busy_check_reader(const struct dma_fence *fence) -{ - return __busy_set_if_active(fence, __busy_read_flag); -} - -static __always_inline unsigned int -busy_check_writer(const struct dma_fence *fence) -{ - if (!fence) - return 0; - - return __busy_set_if_active(fence, __busy_write_id); -} - -int -i915_gem_busy_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_i915_gem_busy *args = data; - struct drm_i915_gem_object *obj; - struct reservation_object_list *list; - unsigned int seq; - int err; - - err = -ENOENT; - rcu_read_lock(); - obj = i915_gem_object_lookup_rcu(file, args->handle); - if (!obj) - goto out; - - /* - * A discrepancy here is that we do not report the status of - * non-i915 fences, i.e. even though we may report the object as idle, - * a call to set-domain may still stall waiting for foreign rendering. - * This also means that wait-ioctl may report an object as busy, - * where busy-ioctl considers it idle. - * - * We trade the ability to warn of foreign fences to report on which - * i915 engines are active for the object. - * - * Alternatively, we can trade that extra information on read/write - * activity with - * args->busy = - * !reservation_object_test_signaled_rcu(obj->resv, true); - * to report the overall busyness. This is what the wait-ioctl does. - * - */ -retry: - seq = raw_read_seqcount(&obj->resv->seq); - - /* Translate the exclusive fence to the READ *and* WRITE engine */ - args->busy = busy_check_writer(rcu_dereference(obj->resv->fence_excl)); - - /* Translate shared fences to READ set of engines */ - list = rcu_dereference(obj->resv->fence); - if (list) { - unsigned int shared_count = list->shared_count, i; - - for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence = - rcu_dereference(list->shared[i]); - - args->busy |= busy_check_reader(fence); - } - } - - if (args->busy && read_seqcount_retry(&obj->resv->seq, seq)) - goto retry; - - err = 0; -out: - rcu_read_unlock(); - return err; -} - int i915_gem_throttle_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) From patchwork Wed May 22 08:21:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10955283 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B1167112C for ; Wed, 22 May 2019 08:21:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A4B5E28B4D for ; Wed, 22 May 2019 08:21:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9997328A48; Wed, 22 May 2019 08:21:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0B54428A26 for ; Wed, 22 May 2019 08:21:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 31A1289836; Wed, 22 May 2019 08:21:19 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4926F89789 for ; Wed, 22 May 2019 08:21:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16636793-1500050 for ; Wed, 22 May 2019 09:21:06 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Wed, 22 May 2019 09:21:04 +0100 Message-Id: <20190522082104.5117-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190522082104.5117-1-chris@chris-wilson.co.uk> References: <20190522082104.5117-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [CI 13/13] drm/i915: Move GEM client throttling to its own file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Continuing the decluttering of i915_gem.c by moving the client self throttling into its own file. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_throttle.c | 73 ++++++++++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 6 -- drivers/gpu/drm/i915/i915_gem.c | 58 ---------------- 4 files changed, 74 insertions(+), 64 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_throttle.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 865d7b51c297..78d36eaff070 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -104,6 +104,7 @@ gem-y += \ gem/i915_gem_shmem.o \ gem/i915_gem_shrinker.o \ gem/i915_gem_stolen.o \ + gem/i915_gem_throttle.o \ gem/i915_gem_tiling.o \ gem/i915_gem_userptr.o \ gem/i915_gem_wait.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_throttle.c b/drivers/gpu/drm/i915/gem/i915_gem_throttle.c new file mode 100644 index 000000000000..adb3074d9ce2 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_throttle.c @@ -0,0 +1,73 @@ +/* + * SPDX-License-Identifier: MIT + * + * Copyright © 2014-2016 Intel Corporation + */ + +#include + +#include + +#include "i915_drv.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_object.h" + +/* + * 20ms is a fairly arbitrary limit (greater than the average frame time) + * chosen to prevent the CPU getting more than a frame ahead of the GPU + * (when using lax throttling for the frontbuffer). We also use it to + * offer free GPU waitboosts for severely congested workloads. + */ +#define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20) + +/* + * Throttle our rendering by waiting until the ring has completed our requests + * emitted over 20 msec ago. + * + * Note that if we were to use the current jiffies each time around the loop, + * we wouldn't escape the function with any frames outstanding if the time to + * render a frame was over 20ms. + * + * This should get us reasonable parallelism between CPU and GPU but also + * relatively low latency when blocking on a particular request to finish. + */ +int +i915_gem_throttle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_file_private *file_priv = file->driver_priv; + unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES; + struct i915_request *request, *target = NULL; + long ret; + + /* ABI: return -EIO if already wedged */ + ret = i915_terminally_wedged(to_i915(dev)); + if (ret) + return ret; + + spin_lock(&file_priv->mm.lock); + list_for_each_entry(request, &file_priv->mm.request_list, client_link) { + if (time_after_eq(request->emitted_jiffies, recent_enough)) + break; + + if (target) { + list_del(&target->client_link); + target->file_priv = NULL; + } + + target = request; + } + if (target) + i915_request_get(target); + spin_unlock(&file_priv->mm.lock); + + if (!target) + return 0; + + ret = i915_request_wait(target, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + i915_request_put(target); + + return ret < 0 ? ret : 0; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4db5ae14698b..25e29ee03935 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -212,12 +212,6 @@ struct drm_i915_file_private { struct { spinlock_t lock; struct list_head request_list; -/* 20ms is a fairly arbitrary limit (greater than the average frame time) - * chosen to prevent the CPU getting more than a frame ahead of the GPU - * (when using lax throttling for the frontbuffer). We also use it to - * offer free GPU waitboosts for severely congested workloads. - */ -#define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20) } mm; struct idr context_idr; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 823459082be6..811b4db087e9 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1012,57 +1012,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, return 0; } -/* Throttle our rendering by waiting until the ring has completed our requests - * emitted over 20 msec ago. - * - * Note that if we were to use the current jiffies each time around the loop, - * we wouldn't escape the function with any frames outstanding if the time to - * render a frame was over 20ms. - * - * This should get us reasonable parallelism between CPU and GPU but also - * relatively low latency when blocking on a particular request to finish. - */ -static int -i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file) -{ - struct drm_i915_private *dev_priv = to_i915(dev); - struct drm_i915_file_private *file_priv = file->driver_priv; - unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES; - struct i915_request *request, *target = NULL; - long ret; - - /* ABI: return -EIO if already wedged */ - ret = i915_terminally_wedged(dev_priv); - if (ret) - return ret; - - spin_lock(&file_priv->mm.lock); - list_for_each_entry(request, &file_priv->mm.request_list, client_link) { - if (time_after_eq(request->emitted_jiffies, recent_enough)) - break; - - if (target) { - list_del(&target->client_link); - target->file_priv = NULL; - } - - target = request; - } - if (target) - i915_request_get(target); - spin_unlock(&file_priv->mm.lock); - - if (target == NULL) - return 0; - - ret = i915_request_wait(target, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); - i915_request_put(target); - - return ret < 0 ? ret : 0; -} - struct i915_vma * i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, @@ -1142,13 +1091,6 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, return vma; } -int -i915_gem_throttle_ioctl(struct drm_device *dev, void *data, - struct drm_file *file_priv) -{ - return i915_gem_ring_throttle(dev, file_priv); -} - int i915_gem_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv)