From patchwork Thu Jan 21 16:51:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prathu Baronia
X-Patchwork-Id: 12037107
From: Prathu Baronia
To: linux-kernel@vger.kernel.org
Subject: [PATCH 1/1] mm: Optimizing hugepage zeroing in arm64
Date: Thu, 21 Jan 2021 22:21:51 +0530
Message-Id: <20210121165153.17828-2-prathu.baronia@oneplus.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210121165153.17828-1-prathu.baronia@oneplus.com>
References: <20210121165153.17828-1-prathu.baronia@oneplus.com>
Cc: Prathu Baronia, Catalin Marinas, Anshuman Khandual,
 chintan.pandya@oneplus.com, "glider@google.com", Andrey Konovalov,
 Andrew Morton, Vincenzo Frascino, Will Deacon,
 linux-arm-kernel@lists.infradead.org

In !HIGHMEM cases, especially on 64-bit architectures, we don't need a
temporary mapping of pages. Hence, k(map|unmap)_atomic() acts as nothing
more than multiple barrier() calls. For example, for a 2MB hugepage in
clear_huge_page() these are called 512 times, i.e. to map and unmap each
subpage, which means 2048 barrier() calls in total. This called for
optimization.
Simply getting the VADDR from the page does the job for us. We profiled
clear_huge_page() using ftrace and observed an improvement of 62% in the
mean time (349.5 us down to 131 us).

Setup:-
The data below was collected on Qualcomm's SM7250 SoC with THP enabled
(kernel v4.19.113), with only CPU-0 (Cortex-A55) and CPU-7 (Cortex-A76)
switched on and set to max frequency, and with DDR set to the perf
governor.

FTRACE Data:-

Base data:-
Number of iterations: 48
Mean of allocation time: 349.5 us
std deviation: 74.5 us

v1 data:-
Number of iterations: 48
Mean of allocation time: 131 us
std deviation: 32.7 us

The following simple userspace experiment, which allocates 100MB (BUF_SZ)
of pages and writes to each of them, gave us good insight: we observed an
improvement of 42% in the combined allocation and write timings.
-------------------------------------------------------------
Test code snippet
-------------------------------------------------------------
      clock_start();
      buf = malloc(BUF_SZ); /* Allocate 100 MB of memory */

      for (i = 0; i < BUF_SZ_PAGES; i++) {
              /* Touch one word per page to fault it in */
              *((int *)(buf + (i * PAGE_SIZE))) = 1;
      }
      clock_end();
-------------------------------------------------------------

Malloc test timings for 100MB anon allocation:-

Base data:-
Number of iterations: 100
Mean of allocation time: 31831 us
std deviation: 4286 us

v1 data:-
Number of iterations: 100
Mean of allocation time: 18193 us
std deviation: 4915 us

Reported-by: Chintan Pandya
Signed-off-by: Prathu Baronia
---
 arch/arm64/include/asm/page.h | 3 +++
 arch/arm64/mm/copypage.c      | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..8f9d005a11bb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -35,6 +35,9 @@ void copy_highpage(struct page *to, struct page *from);
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
+#define clear_user_highpage clear_user_highpage
+void clear_user_highpage(struct page *page, unsigned long vaddr);
+
 typedef struct page *pgtable_t;
 
 extern int pfn_valid(unsigned long);
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index b5447e53cd73..7f5943c6fc12 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -44,3 +44,11 @@ void copy_user_highpage(struct page *to, struct page *from,
 	flush_dcache_page(to);
 }
 EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+inline void clear_user_highpage(struct page *page, unsigned long vaddr)
+{
+	void *addr = page_address(page);
+
+	clear_user_page(addr, vaddr, page);
+}
+EXPORT_SYMBOL_GPL(clear_user_highpage);