From patchwork Fri May 18 03:03:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Huang Ying
X-Patchwork-Id: 10408107
From: "Huang, Ying"
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying, Andi Kleen, Jan Kara, Michal Hocko, Andrea Arcangeli, "Kirill A.
 Shutemov", Matthew Wilcox, Hugh Dickins, Minchan Kim, Shaohua Li, Christopher Lameter, Mike Kravetz
Subject: [PATCH -mm] mm, huge page: Copy to access sub-page last when copy huge page
Date: Fri, 18 May 2018 11:03:16 +0800
Message-Id: <20180518030316.31019-1-ying.huang@intel.com>
X-Mailer: git-send-email 2.16.1

From: Huang Ying

Huge pages help to reduce the TLB miss rate, but they have a larger
cache footprint, which can sometimes cause problems.  For example, when
copying a huge page on x86_64, the cache footprint is 4M (a 2M source
page plus a 2M destination page).  But a Xeon E5 v3 2699 CPU has 18
cores, 36 threads, and only 45M of LLC (last level cache).  That is, on
average, there is 2.5M of LLC for each core and 1.25M for each thread.

If cache pressure is heavy while the huge page is being copied, and we
copy the huge page from beginning to end, the beginning of the huge
page may already be evicted from the cache by the time we finish
copying the end.  And it is possible for the application to access the
beginning of the huge page right after the copy.

To help with this situation, this patch changes the order in which the
sub-pages of a huge page are copied.  In many situations we know the
address that the application will access after the copy, for example in
a page fault handler.  Instead of copying the huge page from beginning
to end, we copy the sub-pages farthest from the sub-page to be accessed
first, and copy the sub-page to be accessed last.  This makes the
sub-page to be accessed the most cache-hot, and the sub-pages around it
more cache-hot too.  If we cannot know the address the application will
access, the beginning of the huge page is assumed to be that address.
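
As an illustration of the ordering described above, here is a minimal
user-space sketch.  It is not the kernel code: copy_order() is a
hypothetical helper that only records the sub-page indices in the order
they would be copied, using the same first-half/second-half split and
left-right alternation as the patch.

```c
#include <assert.h>

/*
 * Hypothetical user-space helper (not kernel code): fill order[] with
 * sub-page indices in the order they would be copied, so that the
 * target sub-page comes last.  order[] must have room for nr_pages
 * entries.  Returns the number of entries written.
 */
int copy_order(int nr_pages, int target, int *order)
{
	int i, k = 0, base, l;

	assert(nr_pages > 0 && target >= 0 && target < nr_pages);

	if (2 * target <= nr_pages) {
		/* Target in first half: take the far tail first, back to front. */
		base = 0;
		l = target;
		for (i = nr_pages - 1; i >= 2 * target; i--)
			order[k++] = i;
	} else {
		/* Target in second half: take the far head first, front to back. */
		base = nr_pages - 2 * (nr_pages - target);
		l = nr_pages - target;
		for (i = 0; i < base; i++)
			order[k++] = i;
	}

	/* Alternate left, right, left, right, closing in on the target. */
	for (i = 0; i < l; i++) {
		order[k++] = base + i;
		order[k++] = base + 2 * l - 1 - i;
	}

	return k;
}
```

For example, with nr_pages = 8 and target = 2 the recorded order is
7 6 5 4 0 3 1 2, and with target = 6 it is 0 1 2 3 4 7 5 6.  In both
cases the target sub-page is written last.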

The patch is a generic optimization that should benefit a range of
workloads rather than one specific use case.  To demonstrate the
performance benefit, we tested it with vm-scalability running on
transparent huge pages.

With this patch, throughput increases ~16.6% in the vm-scalability
anon-cow-seq test case with 36 processes on a 2-socket Xeon E5 v3 2699
system (36 cores, 72 threads).  The test case sets
/sys/kernel/mm/transparent_hugepage/enabled to always, mmap()s a big
anonymous memory area and populates it, then forks 36 child processes,
each of which writes to the anonymous memory area from beginning to
end, triggering copy on write.  For each child process, the other child
processes can be seen as other workloads that generate heavy cache
pressure.  At the same time, the IPC (instructions per cycle) increased
from 0.63 to 0.78, and the time spent in user space was reduced by
~7.2%.

Signed-off-by: "Huang, Ying"
Cc: Andi Kleen
Cc: Jan Kara
Cc: Michal Hocko
Cc: Andrea Arcangeli
Cc: "Kirill A. Shutemov"
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Shaohua Li
Cc: Christopher Lameter
Cc: Mike Kravetz
---
 include/linux/mm.h |  3 ++-
 mm/huge_memory.c   |  3 ++-
 mm/memory.c        | 43 +++++++++++++++++++++++++++++++++++++++----
 3 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3fa3b1356c34..a5fae31988e6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2732,7 +2732,8 @@ extern void clear_huge_page(struct page *page,
 			    unsigned long addr_hint,
 			    unsigned int pages_per_huge_page);
 extern void copy_user_huge_page(struct page *dst, struct page *src,
-				unsigned long addr, struct vm_area_struct *vma,
+				unsigned long addr_hint,
+				struct vm_area_struct *vma,
 				unsigned int pages_per_huge_page);
 extern long copy_huge_page_from_user(struct page *dst_page,
 				const void __user *usr_src,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 323acdd14e6e..7e720e92fcd6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1331,7 +1331,8 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 	if (!page)
 		clear_huge_page(new_page, vmf->address, HPAGE_PMD_NR);
 	else
-		copy_user_huge_page(new_page, page, haddr, vma, HPAGE_PMD_NR);
+		copy_user_huge_page(new_page, page, vmf->address,
+				    vma, HPAGE_PMD_NR);
 	__SetPageUptodate(new_page);
 
 	mmun_start = haddr;
diff --git a/mm/memory.c b/mm/memory.c
index 14578158ed20..f8868c94d6ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4654,10 +4654,12 @@ static void copy_user_gigantic_page(struct page *dst, struct page *src,
 }
 
 void copy_user_huge_page(struct page *dst, struct page *src,
-			 unsigned long addr, struct vm_area_struct *vma,
+			 unsigned long addr_hint, struct vm_area_struct *vma,
 			 unsigned int pages_per_huge_page)
 {
-	int i;
+	int i, n, base, l;
+	unsigned long addr = addr_hint &
+		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
 
 	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
 		copy_user_gigantic_page(dst, src, addr, vma,
@@ -4665,10 +4667,43 @@ void copy_user_huge_page(struct page *dst, struct page *src,
 		return;
 	}
 
+	/* Copy sub-page to access last to keep its cache lines hot */
 	might_sleep();
-	for (i = 0; i < pages_per_huge_page; i++) {
+	n = (addr_hint - addr) / PAGE_SIZE;
+	if (2 * n <= pages_per_huge_page) {
+		/* If sub-page to access in first half of huge page */
+		base = 0;
+		l = n;
+		/* Copy sub-pages at the end of huge page */
+		for (i = pages_per_huge_page - 1; i >= 2 * n; i--) {
+			cond_resched();
+			copy_user_highpage(dst + i, src + i,
+					   addr + i * PAGE_SIZE, vma);
+		}
+	} else {
+		/* If sub-page to access in second half of huge page */
+		base = pages_per_huge_page - 2 * (pages_per_huge_page - n);
+		l = pages_per_huge_page - n;
+		/* Copy sub-pages at the begin of huge page */
+		for (i = 0; i < base; i++) {
+			cond_resched();
+			copy_user_highpage(dst + i, src + i,
+					   addr + i * PAGE_SIZE, vma);
+		}
+	}
+	/*
+	 * Copy remaining sub-pages in left-right-left-right pattern
+	 * towards the sub-page to access
+	 */
+	for (i = 0; i < l; i++) {
+		cond_resched();
+		copy_user_highpage(dst + base + i, src + base + i,
+				   addr + (base + i) * PAGE_SIZE, vma);
 		cond_resched();
-		copy_user_highpage(dst + i, src + i, addr + i*PAGE_SIZE, vma);
+		copy_user_highpage(dst + base + 2 * l - 1 - i,
+				   src + base + 2 * l - 1 - i,
+				   addr + (base + 2 * l - 1 - i) * PAGE_SIZE,
+				   vma);
 	}
 }
 
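
The address handling in copy_user_huge_page() above can be checked in
user space with a small sketch.  The helper names huge_page_base() and
target_subpage() are hypothetical, and PAGE_SHIFT is assumed to be 12
(4K base pages); the masking expression itself mirrors the one in the
patch and assumes pages_per_huge_page is a power of two.

```c
#include <assert.h>

#define PAGE_SHIFT 12			/* assumption: 4K base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical helper: start address of the huge page containing addr_hint. */
unsigned long huge_page_base(unsigned long addr_hint,
			     unsigned int pages_per_huge_page)
{
	/* The mask below is only valid for a power-of-two page count. */
	assert(pages_per_huge_page &&
	       !(pages_per_huge_page & (pages_per_huge_page - 1)));

	return addr_hint &
		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
}

/* Hypothetical helper: index of the sub-page that addr_hint falls into. */
unsigned int target_subpage(unsigned long addr_hint,
			    unsigned int pages_per_huge_page)
{
	unsigned long addr = huge_page_base(addr_hint, pages_per_huge_page);

	return (addr_hint - addr) / PAGE_SIZE;
}
```

With 512 sub-pages (a 2M huge page under the 4K assumption), an
addr_hint of 0x40203abc gives a base of 0x40200000 and a target
sub-page index of 3, which corresponds to n in the patch.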