From patchwork Sat Dec 7 17:01:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11277685 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 33B2014B7 for ; Sat, 7 Dec 2019 17:01:31 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1C49320637 for ; Sat, 7 Dec 2019 17:01:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1C49320637 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D9ADE6E1F4; Sat, 7 Dec 2019 17:01:28 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 52DB16E1F1 for ; Sat, 7 Dec 2019 17:01:24 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 19496909-1500050 for multiple; Sat, 07 Dec 2019 17:01:11 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Sat, 7 Dec 2019 17:01:07 +0000 Message-Id: <20191207170110.2200142-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20191207170110.2200142-1-chris@chris-wilson.co.uk> References: <20191207170110.2200142-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/8] drm/i915: Align start for memcpy_from_wc X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" The movntqda requires 16-byte alignment for the source pointer. Avoid falling back to clflush if the source pointer is misaligned by doing the doing a small uncached memcpy to fixup the alignments. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_cmd_parser.c | 30 +++++++++++++++++--------- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 6cf4e336461b..2977316d64ae 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1132,8 +1132,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, { unsigned int src_needs_clflush; unsigned int dst_needs_clflush; - void *dst, *src; - int ret; + void *dst, *src, *ptr; + int ret, len; ret = i915_gem_object_prepare_write(dst_obj, &dst_needs_clflush); if (ret) @@ -1150,19 +1150,30 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, return ERR_PTR(ret); } + ptr = dst; src = ERR_PTR(-ENODEV); - if (src_needs_clflush && - i915_can_memcpy_from_wc(NULL, offset, 0)) { + if (src_needs_clflush && i915_has_memcpy_from_wc()) { src = i915_gem_object_pin_map(src_obj, I915_MAP_WC); if (!IS_ERR(src)) { - i915_memcpy_from_wc(dst, - src + offset, - ALIGN(length, 16)); + src += offset; + + if (!IS_ALIGNED(offset, 16)) { + len = min(ALIGN(offset, 16) - offset, length); + + memcpy(ptr, src, len); + + offset += len; + length -= len; + ptr += len; + src += len; + } + GEM_BUG_ON(!IS_ALIGNED((unsigned long)src, 16)); + + i915_memcpy_from_wc(ptr, src, ALIGN(length, 16)); i915_gem_object_unpin_map(src_obj); } } if (IS_ERR(src)) { - void *ptr; int x, n; /* @@ -1177,10 +1188,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj, length = round_up(length, boot_cpu_data.x86_clflush_size); - ptr = dst; x = offset_in_page(offset); for (n = offset >> PAGE_SHIFT; length; n++) { - int len = min_t(int, length, PAGE_SIZE - x); + len = min_t(int, length, PAGE_SIZE - x); src = kmap_atomic(i915_gem_object_get_page(src_obj, n)); if (src_needs_clflush)