From patchwork Tue Nov 20 11:37:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sharat Masetty
X-Patchwork-Id: 10690303
From: Sharat Masetty
To: freedreno@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
 jcrouse@codeaurora.org, Sharat Masetty
Subject: [PATCH 4/4] drm/msm: Optimize adreno_show_object()
Date: Tue, 20 Nov 2018 17:07:31 +0530
Message-Id: <1542713851-14375-4-git-send-email-smasetty@codeaurora.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1542713851-14375-1-git-send-email-smasetty@codeaurora.org>
References: <1542713851-14375-1-git-send-email-smasetty@codeaurora.org>
X-Mailing-List: linux-arm-msm@vger.kernel.org

When userspace reads the crashstate dump, the read-side implementation in
the driver currently ascii85-encodes all the binary buffers, and it does
so on every read system call.
A userspace tool like cat typically reads one page at a time, so the
number of read calls grows with the size of the data captured by the
driver, and the full encoding is repeated on each of those calls. This
does not scale well with large captures.

This patch encodes each buffer only once in the read path, which gives an
immediate >10X speedup in crashstate save time.

Signed-off-by: Sharat Masetty
Reviewed-by: Jordan Crouse
---
Changes from v1:
  Addressed comments from Jordan Crouse

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 80 ++++++++++++++++++++++++---------
 drivers/gpu/drm/msm/msm_gpu.h           |  1 +
 2 files changed, 60 insertions(+), 21 deletions(-)

-- 
1.9.1

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index bbf8d3e..7749967 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -441,11 +441,15 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
 {
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
+	for (i = 0; i < ARRAY_SIZE(state->ring); i++) {
 		kvfree(state->ring[i].bo.data);
+		kvfree(state->ring[i].bo.encoded);
+	}
 
-	for (i = 0; state->bos && i < state->nr_bos; i++)
+	for (i = 0; state->bos && i < state->nr_bos; i++) {
 		kvfree(state->bos[i].data);
+		kvfree(state->bos[i].encoded);
+	}
 
 	kfree(state->bos);
 	kfree(state->comm);
@@ -472,34 +476,70 @@ int adreno_gpu_state_put(struct msm_gpu_state *state)
 
 #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
 
-static void adreno_show_object(struct drm_printer *p, u32 *ptr, int len)
+static char *adreno_gpu_ascii85_encode(u32 *src, size_t len)
 {
-	char out[ASCII85_BUFSZ];
-	long l, datalen, i;
+	void *buf;
+	size_t buf_itr = 0;
+	long i, l;
 
-	if (!ptr || !len)
-		return;
+	if (!len)
+		return NULL;
+
+	l = ascii85_encode_len(len);
 
 	/*
-	 * Only dump the non-zero part of the buffer - rarely will any data
-	 * completely fill the entire allocated size of the buffer
+	 * ascii85 outputs either a 5 byte string or a 1 byte string. So we
+	 * account for the worst case of 5 bytes per dword plus the 1 for '\0'
 	 */
-	for (datalen = 0, i = 0; i < len >> 2; i++) {
-		if (ptr[i])
-			datalen = (i << 2) + 1;
+	buf = kvmalloc((l * 5) + 1, GFP_KERNEL);
+	if (!buf)
+		return NULL;
+
+	for (i = 0; i < l; i++) {
+		ascii85_encode(src[i], buf + buf_itr);
+
+		if (src[i] == 0)
+			buf_itr += 1;
+		else
+			buf_itr += 5;
 	}
 
-	/* Skip printing the object if it is empty */
-	if (datalen == 0)
+	return buf;
+}
+
+static void adreno_show_object(struct drm_printer *p,
+		struct msm_gpu_state_bo *bo)
+{
+	if ((!bo->data && !bo->encoded) || !bo->size)
 		return;
 
-	l = ascii85_encode_len(datalen);
+	if (!bo->encoded) {
+		long datalen, i;
+		u32 *buf = bo->data;
+
+		/*
+		 * Only dump the non-zero part of the buffer - rarely will
+		 * any data completely fill the entire allocated size of
+		 * the buffer.
+		 */
+		for (datalen = 0, i = 0; i < (bo->size) >> 2; i++) {
+			if (buf[i])
+				datalen = ((i + 1) << 2);
+		}
+
+		bo->encoded = adreno_gpu_ascii85_encode(buf, datalen);
+
+		kvfree(bo->data);
+		bo->data = NULL;
+
+		if (!bo->encoded)
+			return;
+	}
 
 	drm_puts(p, "    data: !!ascii85 |\n");
 	drm_puts(p, "     ");
 
-	for (i = 0; i < l; i++)
-		drm_puts(p, ascii85_encode(ptr[i], out));
+	drm_puts(p, bo->encoded);
 
 	drm_puts(p, "\n");
 }
@@ -531,8 +571,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
 		drm_printf(p, "    wptr: %d\n", state->ring[i].wptr);
 		drm_printf(p, "    size: %d\n", MSM_GPU_RINGBUFFER_SZ);
 
-		adreno_show_object(p, state->ring[i].bo.data,
-			state->ring[i].bo.size);
+		adreno_show_object(p, &(state->ring[i].bo));
 	}
 
 	if (state->bos) {
@@ -543,8 +582,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
 				state->bos[i].iova);
 			drm_printf(p, "    size: %zd\n", state->bos[i].size);
 
-			adreno_show_object(p, state->bos[i].data,
-				state->bos[i].size);
+			adreno_show_object(p, &(state->bos[i]));
 		}
 	}
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index a3a6371..737bf45 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -191,6 +191,7 @@ struct msm_gpu_state_bo {
 	u64 iova;
 	size_t size;
 	void *data;
+	char *encoded;
 };
 
 struct msm_gpu_state {
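
For anyone who wants to try the encode-once idea outside the kernel, the
following is a minimal userspace sketch of the read path in this patch. It
is an illustration under stated assumptions, not the driver code: demo_bo,
demo_encode_dword, demo_encode and demo_show are hypothetical names, malloc
and free stand in for kvmalloc and kvfree, and the encoder is a plain
reimplementation rather than the kernel's ascii85 helper. The sketch writes
the 'z' shorthand and the trailing NUL into the buffer itself, so it does
not rely on what the helper does or does not store.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's struct msm_gpu_state_bo. */
struct demo_bo {
	size_t size;     /* captured size in bytes */
	uint32_t *data;  /* raw capture, released once it has been encoded */
	char *encoded;   /* cached NUL-terminated ascii85 text */
};

/*
 * Encode one dword: 'z' for zero, otherwise five base-85 digits offset
 * from '!'. Returns the number of bytes written to out.
 */
static size_t demo_encode_dword(uint32_t in, char *out)
{
	int i;

	if (in == 0) {
		out[0] = 'z';
		return 1;
	}

	for (i = 4; i >= 0; i--) {
		out[i] = (char)('!' + in % 85);
		in /= 85;
	}

	return 5;
}

/* Encode the buffer up to and including its last non-zero dword. */
static char *demo_encode(const uint32_t *src, size_t size)
{
	size_t i, dwords = 0, pos = 0;
	char *buf;

	for (i = 0; i < size / 4; i++) {
		if (src[i])
			dwords = i + 1;
	}

	/* Nothing but zeroes: treat the object as empty. */
	if (!dwords)
		return NULL;

	/* Worst case is 5 bytes per dword, plus 1 for the '\0'. */
	buf = malloc(dwords * 5 + 1);
	if (!buf)
		return NULL;

	for (i = 0; i < dwords; i++)
		pos += demo_encode_dword(src[i], buf + pos);
	buf[pos] = '\0';

	return buf;
}

/* Read path: encode on the first call, reuse the cached text afterwards. */
static const char *demo_show(struct demo_bo *bo)
{
	if ((!bo->data && !bo->encoded) || !bo->size)
		return NULL;

	if (!bo->encoded) {
		bo->encoded = demo_encode(bo->data, bo->size);

		/* The raw capture is no longer needed once it is encoded. */
		free(bo->data);
		bo->data = NULL;
	}

	return bo->encoded;
}
```

Calling demo_show() repeatedly on the same object returns the same cached
pointer, which models what a page-by-page reader such as cat would hit on
every read call after the first. The caller owns bo->encoded and frees it
when the state is destroyed, mirroring the kvfree() calls added to
adreno_gpu_state_destroy().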