From patchwork Thu Nov 5 11:56:28 2015
From: Chris Wilson
To: Imre Deak
Date: Thu, 5 Nov 2015 11:56:28 +0000
Message-ID: <20151105115628.GZ669@nuc-i3427.alporthouse.com>
References: <1446665132-22491-1-git-send-email-imre.deak@intel.com> <20151104205741.GA3478@nuc-i3427.alporthouse.com> <1446722912.26194.14.camel@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915: get runtime PM reference around GEM set_caching IOCTL
In-Reply-To: <1446722912.26194.14.camel@intel.com>

On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote:
> On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote:
> > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote:
> > > After Damien's D3 fix I started to get runtime suspend residency for the
> > > first time and that revealed a breakage on the set_caching IOCTL path
> > > that accesses the HW but doesn't take an RPM ref. Fix this up.
> >
> > Why here (and in so many random places) and not around the PTE write
> > itself?
>
> Imo we should take the RPM ref outside of any of our locks. Otherwise we
> need hacks like we have currently in the runtime suspend handler to work
> around lock inversions. It works now, but we couldn't do the same trick
> if we needed to take struct_mutex for example in the resume handler too
> for some reason.

On the other hand, it leads to hard-to-track-down bugs and improper
documentation. Neither solution is perfect.

Note that since intel_runtime_suspend() has ample barriers of its own, you
can avoid the struct_mutex if you have a dedicated dev_priv->mm.fault_list.
Something along the lines of:

The tricky part is reviewing the i915_gem_release_mmap() callers to
ensure that they have the right barrier. If you make
i915_gem_release_mmap() assert it holds an rpm ref, and then make
i915_gem_release_all_mmaps() unlink the node directly, that should do
the trick.
-Chris

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 86734be..fe91ce5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -180,11 +180,11 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
-	if (obj->pin_display || obj->fault_mappable) {
+	if (obj->pin_display || !list_empty(&obj->fault_link)) {
 		char s[3], *t = s;
 		if (obj->pin_display)
 			*t++ = 'p';
-		if (obj->fault_mappable)
+		if (!list_empty(&obj->fault_link))
 			*t++ = 'f';
 		*t = '\0';
 		seq_printf(m, " (%s mappable)", s);
@@ -474,7 +474,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	size = count = mappable_size = mappable_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if (obj->fault_mappable) {
+		if (!list_empty(&obj->fault_link)) {
 			size += i915_gem_obj_ggtt_size(obj);
 			++count;
 		}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1d88745..179594e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1447,28 +1447,10 @@ static int intel_runtime_suspend(struct device *device)
 	DRM_DEBUG_KMS("Suspending device\n");

 	/*
-	 * We could deadlock here in case another thread holding struct_mutex
-	 * calls RPM suspend concurrently, since the RPM suspend will wait
-	 * first for this RPM suspend to finish. In this case the concurrent
-	 * RPM resume will be followed by its RPM suspend counterpart. Still
-	 * for consistency return -EAGAIN, which will reschedule this suspend.
-	 */
-	if (!mutex_trylock(&dev->struct_mutex)) {
-		DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
-		/*
-		 * Bump the expiration timestamp, otherwise the suspend won't
-		 * be rescheduled.
-		 */
-		pm_runtime_mark_last_busy(device);
-
-		return -EAGAIN;
-	}
-	/*
 	 * We are safe here against re-faults, since the fault handler takes
 	 * an RPM reference.
 	 */
 	i915_gem_release_all_mmaps(dev_priv);
-	mutex_unlock(&dev->struct_mutex);

 	intel_suspend_gt_powersave(dev);
 	intel_runtime_pm_disable_interrupts(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 55611d8..5e4904a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,8 @@ struct i915_gem_mm {
 	 */
 	struct list_head unbound_list;

+	struct list_head fault_list;
+
 	/** Usable portion of the GTT for GEM */
 	unsigned long stolen_base; /* limited to low memory (32-bit) */
@@ -2025,6 +2027,8 @@ struct drm_i915_gem_object {
 	struct list_head batch_pool_link;

+	struct list_head fault_link;
+
 	/**
 	 * This is set if the object is on the active lists (has pending
 	 * rendering and so a non-zero seqno), and is not set if it is on
@@ -2069,13 +2073,6 @@ struct drm_i915_gem_object {
 	 */
 	unsigned int map_and_fenceable:1;
-	/**
-	 * Whether the current gtt mapping needs to be mappable (and isn't just
-	 * mappable by accident). Track pin and fault separate for a more
-	 * accurate mappable working set.
-	 */
-	unsigned int fault_mappable:1;
-
 	/*
 	 * Is the object to be mapped as read-only to the GPU
 	 * Only honoured if hardware has relevant pte bit
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 407b6b3..a90d1d8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1814,9 +1814,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 			break;
 		}
-		obj->fault_mappable = true;
+		if (list_empty(&obj->fault_link))
+			list_add(&obj->fault_link, &dev_priv->mm.fault_list);
 	} else {
-		if (!obj->fault_mappable) {
+		if (list_empty(&obj->fault_link)) {
 			unsigned long size = min_t(unsigned long,
 						   vma->vm_end - vma->vm_start,
 						   obj->base.size);
@@ -1830,7 +1831,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 				break;
 			}
-			obj->fault_mappable = true;
+			list_add(&obj->fault_link, &dev_priv->mm.fault_list);
 		} else
 			ret = vm_insert_pfn(vma,
 					    (unsigned long)vmf->virtual_address,
@@ -1903,20 +1904,20 @@ out:
 void
 i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 {
-	if (!obj->fault_mappable)
+	if (list_empty(&obj->fault_link))
 		return;

 	drm_vma_node_unmap(&obj->base.vma_node,
 			   obj->base.dev->anon_inode->i_mapping);
-	obj->fault_mappable = false;
+	list_del_init(&obj->fault_link);
 }

 void
 i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj, *n;

-	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
+	list_for_each_entry_safe(obj, n, &dev_priv->mm.fault_list, fault_link)
 		i915_gem_release_mmap(obj);
 }
@@ -4224,6 +4229,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->obj_exec_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
+	INIT_LIST_HEAD(&obj->fault_link);

 	obj->ops = ops;
@@ -4858,6 +4864,7 @@ i915_gem_load(struct drm_device *dev)
 	INIT_LIST_HEAD(&dev_priv->context_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
+	INIT_LIST_HEAD(&dev_priv->mm.fault_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
 	for (i = 0; i < I915_NUM_RINGS; i++)
 		init_ring_lists(&dev_priv->ring[i]);