From patchwork Fri Mar 10 00:09:42 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 9614369 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id F050C602B4 for ; Fri, 10 Mar 2017 00:09:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E2A20286B4 for ; Fri, 10 Mar 2017 00:09:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D7474286EB; Fri, 10 Mar 2017 00:09:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5C380286B4 for ; Fri, 10 Mar 2017 00:09:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 217836E086; Fri, 10 Mar 2017 00:09:47 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wr0-x241.google.com (mail-wr0-x241.google.com [IPv6:2a00:1450:400c:c0c::241]) by gabe.freedesktop.org (Postfix) with ESMTPS id DE49E6E086 for ; Fri, 10 Mar 2017 00:09:45 +0000 (UTC) Received: by mail-wr0-x241.google.com with SMTP id u108so9764891wrb.2 for ; Thu, 09 Mar 2017 16:09:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id; bh=Z6oz3MipTlYwuzMhDyoK/77xLKRa3QuaTFrhYMZ7JiA=; b=olRQGtTe9XmtLw8dG8KaOljgEURJsdgiQHOaxLC/5w3lKI2k+0jj2Da+ej7jT9LnKR pqnMNQGlonWliKH8x918ezu4cb9k7gTSSftNGUqTanH/NiOYKApXD9hwGYBUSyy4T4wd XNgxkNt6hUJQbbQWAZ1Iw8kyVsQ67OZeYi8EPjgi2Ksu0ZmiIeAvarDeTRGQy1GHB8+R U068nmW+2jbYfC3sQaiB7M8bFlHeDz1z5Sp39NqMTHzgzZ6Sz3Ytit9r5xcqEFFUB5g1 VoNyFOLjuidWZkkCoA/MILSc1qhRLdNPLFyBcdKi367WWhuk5YG7aMSOjIawqCGZzg2B fD4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id; bh=Z6oz3MipTlYwuzMhDyoK/77xLKRa3QuaTFrhYMZ7JiA=; b=QXfv4bjAKoKNNlPJ6v6n5XWqJggvd6QpjoHvNb2qv3zuCXyQqlq8NmzJtRC7Pqjheo vC+tvrUlPu8rYQRX4Ntdd0fQXfZ0S/73Brw9PtpT4b9XjOlaSHo6DIuV7qruM8kuwh3z ujHCzh3TOukX7d1/+U8M6txcCF9xMQxWpSYYEJJTNPMcGSvPsOluBtpckB92NvmsuLoV EIUBIqFj5whKNHLRM1Y0rWYW0Ujvkn/ey2tczA9i4PANps6+e3KSONBF+vVwVX+jVkOe Yg20wWjYQ4S/aFysIsQSLH9ejIjwvjAwq5qJoUj7BbUMVxT0k6We2W6y+PTYVEUHQgeT prNg== X-Gm-Message-State: AMke39lOljOK2Tlyn6+oyNiUidiTn6CergCm9kfmJvEj7hKEVXR4zIIBVcJAIi789VjEcQ== X-Received: by 10.223.162.130 with SMTP id s2mr14244819wra.149.1489104584555; Thu, 09 Mar 2017 16:09:44 -0800 (PST) Received: from haswell.alporthouse.com ([78.156.65.138]) by smtp.gmail.com with ESMTPSA id i82sm1581926wmf.1.2017.03.09.16.09.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Mar 2017 16:09:44 -0800 (PST) From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Fri, 10 Mar 2017 00:09:42 +0000 Message-Id: <20170310000942.11661-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.11.0 Subject: [Intel-gfx] [PATCH] drm/i915: Move whole object to CPU domain for coherent shmem access X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP If the object is coherent, we can simply update the cache domain on the whole object rather than calculate the before/after clflushes. The advantage is that we then get correct tracking of ellided flushes when changing coherency later. Testcase: igt/gem_pwrite_snooped Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 6 +++--- drivers/gpu/drm/i915/i915_gem.c | 45 +++++++++++++++++++++-------------------- 2 files changed, 26 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3002996ddbed..8de104b63209 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3353,9 +3353,9 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, unsigned int *needs_clflush); int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, unsigned int *needs_clflush); -#define CLFLUSH_BEFORE 0x1 -#define CLFLUSH_AFTER 0x2 -#define CLFLUSH_FLAGS (CLFLUSH_BEFORE | CLFLUSH_AFTER) +#define CLFLUSH_BEFORE BIT(0) +#define CLFLUSH_AFTER BIT(1) +#define CLFLUSH_FLAGS (CLFLUSH_BEFORE | CLFLUSH_AFTER) static inline void i915_gem_obj_finish_shmem_access(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index aca1eaddafb4..202bb850f260 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -788,6 +788,15 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, if (ret) return ret; + if (i915_gem_object_is_coherent(obj) || + !static_cpu_has(X86_FEATURE_CLFLUSH)) { + ret = i915_gem_object_set_to_cpu_domain(obj, false); + if (ret) + goto err_unpin; + else + goto out; + } + i915_gem_object_flush_gtt_write_domain(obj); /* If we're not in the cpu read domain, set ourself into the gtt @@ -796,16 +805,9 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, * anyway again before the next pread happens. */ if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) - *needs_clflush = !i915_gem_object_is_coherent(obj); - - if (*needs_clflush && !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, false); - if (ret) - goto err_unpin; - - *needs_clflush = 0; - } + *needs_clflush = CLFLUSH_BEFORE; +out: /* return with the pages pinned */ return 0; @@ -838,6 +840,15 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, if (ret) return ret; + if (i915_gem_object_is_coherent(obj) || + !static_cpu_has(X86_FEATURE_CLFLUSH)) { + ret = i915_gem_object_set_to_cpu_domain(obj, true); + if (ret) + goto err_unpin; + else + goto out; + } + i915_gem_object_flush_gtt_write_domain(obj); /* If we're not in the cpu write domain, set ourself into the @@ -846,25 +857,15 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, * right away and we therefore have to clflush anyway. */ if (obj->base.write_domain != I915_GEM_DOMAIN_CPU) - *needs_clflush |= cpu_write_needs_clflush(obj) << 1; + *needs_clflush |= CLFLUSH_AFTER; /* Same trick applies to invalidate partially written cachelines read * before writing. */ if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) - *needs_clflush |= !i915_gem_object_is_coherent(obj); - - if (*needs_clflush && !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, true); - if (ret) - goto err_unpin; - - *needs_clflush = 0; - } - - if ((*needs_clflush & CLFLUSH_AFTER) == 0) - obj->cache_dirty = true; + *needs_clflush |= CLFLUSH_BEFORE; +out: intel_fb_obj_invalidate(obj, ORIGIN_CPU); obj->mm.dirty = true; /* return with the pages pinned */