From patchwork Sun Oct 15 16:30:42 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Noralf Trønnes
X-Patchwork-Id: 10007185
From: Noralf Trønnes
To:
 dri-devel@lists.freedesktop.org
Subject: [PATCH v2 8/8] drm/tinydrm: Relax buffer line prefetch
Date: Sun, 15 Oct 2017 18:30:42 +0200
Message-Id: <20171015163042.35017-9-noralf@tronnes.org>
X-Mailer: git-send-email 2.14.2
In-Reply-To: <20171015163042.35017-1-noralf@tronnes.org>
References: <20171015163042.35017-1-noralf@tronnes.org>
Cc: daniel.vetter@ffwll.ch, matt@gatt.is

vmalloc BOs give us cached reads, so there is no need to prefetch in
that case.

Prefetching gives a ~20% speedup on a cma buffer using the mi0283qt
driver on a Raspberry Pi 1.

Signed-off-by: Noralf Trønnes
---
 drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c | 54 ++++++++++++++------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
index ee9a8f305b26..bca905213cdd 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
@@ -15,6 +15,8 @@
 #include
 #include
 
+#include
+#include
 #include
 #include
 
@@ -115,22 +117,25 @@ void tinydrm_swab16(u16 *dst, void *vaddr, struct drm_framebuffer *fb,
 		    struct drm_clip_rect *clip)
 {
 	size_t len = (clip->x2 - clip->x1) * sizeof(u16);
+	u16 *src, *buf = NULL;
 	unsigned int x, y;
-	u16 *src, *buf;
 
 	/*
-	 * The cma memory is write-combined so reads are uncached.
-	 * Speed up by fetching one line at a time.
+	 * Imported buffers are likely to be write-combined with uncached
+	 * reads. Speed up by fetching one line at a time.
+	 * prefetch_range() was tried, but didn't give any noticeable speedup
+	 * on the Raspberry Pi 1.
 	 */
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return;
+	if (drm_gem_fb_get_obj(fb, 0)->import_attach)
+		buf = kmalloc(len, GFP_KERNEL);
 
 	for (y = clip->y1; y < clip->y2; y++) {
 		src = vaddr + (y * fb->pitches[0]);
 		src += clip->x1;
-		memcpy(buf, src, len);
-		src = buf;
+		if (buf) {
+			memcpy(buf, src, len);
+			src = buf;
+		}
 		for (x = clip->x1; x < clip->x2; x++)
 			*dst++ = swab16(*src++);
 	}
@@ -155,19 +160,21 @@ void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
 				struct drm_clip_rect *clip, bool swap)
 {
 	size_t len = (clip->x2 - clip->x1) * sizeof(u32);
+	u32 *src, *buf = NULL;
 	unsigned int x, y;
-	u32 *src, *buf;
 	u16 val16;
 
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return;
+	/* See tinydrm_swab16() for an explanation */
+	if (drm_gem_fb_get_obj(fb, 0)->import_attach)
+		buf = kmalloc(len, GFP_KERNEL);
 
 	for (y = clip->y1; y < clip->y2; y++) {
 		src = vaddr + (y * fb->pitches[0]);
 		src += clip->x1;
-		memcpy(buf, src, len);
-		src = buf;
+		if (buf) {
+			memcpy(buf, src, len);
+			src = buf;
+		}
 		for (x = clip->x1; x < clip->x2; x++) {
 			val16 = ((*src & 0x00F80000) >> 8) |
 				((*src & 0x0000FC00) >> 5) |
@@ -205,24 +212,23 @@ void tinydrm_xrgb8888_to_gray8(u8 *dst, void *vaddr, struct drm_framebuffer *fb,
 {
 	unsigned int len = (clip->x2 - clip->x1) * sizeof(u32);
 	unsigned int x, y;
-	void *buf;
+	void *buf = NULL;
 	u32 *src;
 
 	if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB8888))
 		return;
-	/*
-	 * The cma memory is write-combined so reads are uncached.
-	 * Speed up by fetching one line at a time.
-	 */
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return;
+
+	/* See tinydrm_swab16() for an explanation */
+	if (drm_gem_fb_get_obj(fb, 0)->import_attach)
+		buf = kmalloc(len, GFP_KERNEL);
 
 	for (y = clip->y1; y < clip->y2; y++) {
 		src = vaddr + (y * fb->pitches[0]);
 		src += clip->x1;
-		memcpy(buf, src, len);
-		src = buf;
+		if (buf) {
+			memcpy(buf, src, len);
+			src = buf;
+		}
 		for (x = clip->x1; x < clip->x2; x++) {
 			u8 r = (*src & 0x00ff0000) >> 16;
 			u8 g = (*src & 0x0000ff00) >> 8;