From patchwork Thu Aug 4 19:52:15 2016
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 9264207
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Thu, 4 Aug 2016 20:52:15 +0100
Message-Id: <1470340352-16118-3-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1470340352-16118-1-git-send-email-chris@chris-wilson.co.uk>
References: <1470340352-16118-1-git-send-email-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 02/19] drm/i915: Convert non-blocking waits for requests over to using RCU

We can completely avoid taking the struct_mutex around the non-blocking
waits by switching over to the RCU request management (trading the mutex
for an RCU read lock and some complex atomic operations). The improvement
is that we gain further contention reduction, and overall the code becomes
simpler due to the reduced mutex dancing.

v2: Move the i915_gem_fault tracepoint to the start of the function,
before the unlocked wait.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem.c | 114 +++++++++++++++++-----------------------
 1 file changed, 48 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index aa40d6c35c30..c600f2366d2c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -349,24 +349,20 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-/* A nonblocking variant of the above wait. This is a highly dangerous routine
- * as the object state may change during this call.
+/* A nonblocking variant of the above wait. Must be called prior to
+ * acquiring the mutex for the object, as the object state may change
+ * during this call. A reference must be held by the caller for the object.
  */
 static __must_check int
-i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
-					    struct intel_rps_client *rps,
-					    bool readonly)
+__unsafe_wait_rendering(struct drm_i915_gem_object *obj,
+			struct intel_rps_client *rps,
+			bool readonly)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
 	struct i915_gem_active *active;
 	unsigned long active_mask;
-	int ret, i, n = 0;
-
-	lockdep_assert_held(&dev->struct_mutex);
-	GEM_BUG_ON(!to_i915(dev)->mm.interruptible);
+	int idx;
 
-	active_mask = i915_gem_object_get_active(obj);
+	active_mask = __I915_BO_ACTIVE(obj);
 	if (!active_mask)
 		return 0;
 
@@ -377,25 +373,16 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 		active = &obj->last_write;
 	}
 
-	for_each_active(active_mask, i) {
-		struct drm_i915_gem_request *req;
+	for_each_active(active_mask, idx) {
+		int ret;
 
-		req = i915_gem_active_get(&active[i],
-					  &obj->base.dev->struct_mutex);
-		if (req)
-			requests[n++] = req;
+		ret = i915_gem_active_wait_unlocked(&active[idx],
+						    true, NULL, rps);
+		if (ret)
+			return ret;
 	}
 
-	mutex_unlock(&dev->struct_mutex);
-	ret = 0;
-	for (i = 0; ret == 0 && i < n; i++)
-		ret = i915_wait_request(requests[i], true, NULL, rps);
-	mutex_lock(&dev->struct_mutex);
-
-	for (i = 0; i < n; i++)
-		i915_gem_request_put(requests[i]);
-
-	return ret;
+	return 0;
 }
 
 static struct intel_rps_client *to_rps_client(struct drm_file *file)
@@ -1467,10 +1454,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	int ret;
 
 	/* Only handle setting domains to types used by the CPU. */
-	if (write_domain & I915_GEM_GPU_DOMAINS)
-		return -EINVAL;
-
-	if (read_domains & I915_GEM_GPU_DOMAINS)
+	if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
 		return -EINVAL;
 
 	/* Having something in the write domain implies it's in the read
@@ -1479,25 +1463,21 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (write_domain != 0 && read_domains != write_domain)
 		return -EINVAL;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	obj = i915_gem_object_lookup(file, args->handle);
-	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
-	}
+	if (!obj)
+		return -ENOENT;
 
 	/* Try to flush the object off the GPU without holding the lock.
 	 * We will repeat the flush holding the lock in the normal manner
 	 * to catch cases where we are gazumped.
 	 */
-	ret = i915_gem_object_wait_rendering__nonblocking(obj,
-							  to_rps_client(file),
-							  !write_domain);
+	ret = __unsafe_wait_rendering(obj, to_rps_client(file), !write_domain);
+	if (ret)
+		goto err;
+
+	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
-		goto unref;
+		goto err;
 
 	if (read_domains & I915_GEM_DOMAIN_GTT)
 		ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0);
@@ -1507,11 +1487,13 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (write_domain != 0)
 		intel_fb_obj_invalidate(obj, write_origin(obj, write_domain));
 
-unref:
 	i915_gem_object_put(obj);
-unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return ret;
+
+err:
+	i915_gem_object_put_unlocked(obj);
+	return ret;
 }
 
 /**
@@ -1648,36 +1630,36 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct i915_ggtt_view view = i915_ggtt_view_normal;
+	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
 	pgoff_t page_offset;
 	unsigned long pfn;
-	int ret = 0;
-	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
-
-	intel_runtime_pm_get(dev_priv);
+	int ret;
 
 	/* We don't use vmf->pgoff since that has the fake offset */
 	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
 		PAGE_SHIFT;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		goto out;
-
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
 	/* Try to flush the object off the GPU first without holding the lock.
-	 * Upon reacquiring the lock, we will perform our sanity checks and then
+	 * Upon acquiring the lock, we will perform our sanity checks and then
 	 * repeat the flush holding the lock in the normal manner to catch cases
 	 * where we are gazumped.
 	 */
-	ret = i915_gem_object_wait_rendering__nonblocking(obj, NULL, !write);
+	ret = __unsafe_wait_rendering(obj, NULL, !write);
 	if (ret)
-		goto unlock;
+		goto err;
+
+	intel_runtime_pm_get(dev_priv);
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		goto err_rpm;
 
 	/* Access to snoopable pages through the GTT is incoherent. */
 	if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(dev)) {
 		ret = -EFAULT;
-		goto unlock;
+		goto err_unlock;
 	}
 
 	/* Use a partial view if the object is bigger than the aperture. */
@@ -1698,15 +1680,15 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	/* Now pin it into the GTT if needed */
 	ret = i915_gem_object_ggtt_pin(obj, &view, 0, 0, PIN_MAPPABLE);
 	if (ret)
-		goto unlock;
+		goto err_unlock;
 
 	ret = i915_gem_object_set_to_gtt_domain(obj, write);
 	if (ret)
-		goto unpin;
+		goto err_unpin;
 
 	ret = i915_gem_object_get_fence(obj);
 	if (ret)
-		goto unpin;
+		goto err_unpin;
 
 	/* Finally, remap it using the new GTT offset */
 	pfn = ggtt->mappable_base +
@@ -1751,11 +1733,13 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 				    (unsigned long)vmf->virtual_address,
 				    pfn + page_offset);
 	}
-unpin:
+err_unpin:
 	i915_gem_object_ggtt_unpin_view(obj, &view);
-unlock:
+err_unlock:
 	mutex_unlock(&dev->struct_mutex);
-out:
+err_rpm:
+	intel_runtime_pm_put(dev_priv);
+err:
 	switch (ret) {
 	case -EIO:
 		/*
@@ -1796,8 +1780,6 @@ out:
 		ret = VM_FAULT_SIGBUS;
 		break;
 	}
-
-	intel_runtime_pm_put(dev_priv);
 	return ret;
 }
 
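For readers who want the shape of the trade described in the commit message
(an RCU read lock plus a zero-safe reference grab instead of holding
struct_mutex across the wait), here is a minimal, hypothetical sketch. It is
not the i915 implementation: the names sketch_request, sketch_get_active,
sketch_wait_unlocked, sketch_request_wait and sketch_request_free are invented
for illustration; only the generic kernel primitives (rcu_read_lock(),
rcu_dereference(), kref_get_unless_zero(), kref_put()) are real.

/* Hypothetical sketch: sample the active request under rcu_read_lock(),
 * pin it with a reference that refuses already-dead requests, then sleep
 * on that private reference without holding any mutex.
 */
#include <linux/kref.h>
#include <linux/rcupdate.h>

struct sketch_request {
	struct kref ref;	/* request lifetime */
	struct rcu_head rcu;	/* assumed to be freed via RCU elsewhere */
	/* ... completion state ... */
};

/* Assumed to exist elsewhere in this sketch: sleep until the request
 * completes, and the kref release callback.
 */
int sketch_request_wait(struct sketch_request *rq);
void sketch_request_free(struct kref *ref);

static struct sketch_request *
sketch_get_active(struct sketch_request __rcu **slot)
{
	struct sketch_request *rq;

	rcu_read_lock();
	rq = rcu_dereference(*slot);
	if (rq && !kref_get_unless_zero(&rq->ref))
		rq = NULL;	/* already retired and being freed */
	rcu_read_unlock();

	return rq;
}

static int sketch_wait_unlocked(struct sketch_request __rcu **slot)
{
	struct sketch_request *rq;
	int err;

	rq = sketch_get_active(slot);
	if (!rq)
		return 0;	/* nothing outstanding to wait for */

	err = sketch_request_wait(rq);
	kref_put(&rq->ref, sketch_request_free);
	return err;
}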