From patchwork Sun Apr 12 09:08:00 2020
X-Patchwork-Submitter: Liang Li
X-Patchwork-Id: 11484485
From: liliangleo
Date: Sun, 12 Apr 2020 05:08:00 -0400
To: Alexander Duyck, Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrea Arcangeli, Dan Williams, Dave Hansen, David Hildenbrand, Michal Hocko, Andrew Morton, Alex Williamson
Subject: [RFC PATCH 1/4] mm: reduce the impact of the page reporting worker
Message-ID: <20200412090756.GA19574@open-light-1.localdomain>

When scanning the free list, 'page_reporting_cycle' may hold zone->lock for a long time when there are no unreported pages in the free list.
Setting PAGE_REPORTING_MIN_ORDER to a lower order will make this issue worse.

Two ways are used to reduce the impact:
1. Release the zone lock periodically.
2. Yield the CPU voluntarily when needed.

Signed-off-by: liliangleo
---
 mm/page_reporting.c | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index 3bbd471cfc81..3a7084e508e1 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -6,11 +6,14 @@
 #include
 #include
 #include
+#include

 #include "page_reporting.h"
 #include "internal.h"

 #define PAGE_REPORTING_DELAY (2 * HZ)
+#define MAX_SCAN_NUM 1024
+
 static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;

 enum {
@@ -115,7 +118,7 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
 	unsigned int page_len = PAGE_SIZE << order;
 	struct page *page, *next;
 	long budget;
-	int err = 0;
+	int err = 0, scan_cnt = 0;

 	/*
 	 * Perform early check, if free area is empty there is
@@ -145,8 +148,14 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,

 	/* loop through free list adding unreported pages to sg list */
 	list_for_each_entry_safe(page, next, list, lru) {
 		/* We are going to skip over the reported pages. */
-		if (PageReported(page))
+		if (PageReported(page)) {
+			if (++scan_cnt >= MAX_SCAN_NUM) {
+				err = scan_cnt;
+				break;
+			}
 			continue;
+		}
+
 		/*
 		 * If we fully consumed our budget then update our
@@ -219,6 +228,26 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
 	return err;
 }

+static int
+reporting_order_type(struct page_reporting_dev_info *prdev, struct zone *zone,
+		     unsigned int order, unsigned int mt,
+		     struct scatterlist *sgl, unsigned int *offset)
+{
+	int ret = 0;
+	unsigned long total = 0;
+
+	might_sleep();
+	do {
+		cond_resched();
+		ret = page_reporting_cycle(prdev, zone, order, mt,
+					   sgl, offset);
+		if (ret > 0)
+			total += ret;
+	} while (ret > 0 && total < zone->free_area[order].nr_free);
+
+	return ret;
+}
+
 static int
 page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 			    struct scatterlist *sgl, struct zone *zone)
@@ -245,7 +274,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 			if (is_migrate_isolate(mt))
 				continue;

-			err = page_reporting_cycle(prdev, zone, order, mt,
+			err = reporting_order_type(prdev, zone, order, mt,
 						   sgl, &offset);
 			if (err)
 				return err;
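The bail-out-and-retry pattern in this patch (cap how many already-reported pages one cycle may scan, then drop the lock and cond_resched() before re-entering) can be modeled in plain userspace C. This is only an illustrative sketch, not kernel code: the array stands in for the free list, the names are invented, and resuming at a saved index is a simplification of how the kernel re-enters the list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCAN_NUM 4 /* tiny cap for illustration; the patch uses 1024 */

/*
 * One bounded pass: scan entries starting at *pos, but give up (as the
 * patch gives up zone->lock) once MAX_SCAN_NUM reported entries have
 * been seen.  Returns 1 if the pass was cut short, 0 when it reached
 * the end of the list.
 */
static int scan_pass(const int *reported, size_t len, size_t *pos)
{
    int scan_cnt = 0;

    while (*pos < len) {
        if (reported[(*pos)++]) {
            if (++scan_cnt >= MAX_SCAN_NUM)
                return 1; /* the kernel version drops zone->lock here */
        }
    }
    return 0;
}

/* Caller loop mirroring reporting_order_type(): re-enter until done. */
static int count_passes(const int *reported, size_t len)
{
    size_t pos = 0;
    int passes = 0;
    int again;

    do {
        /* the kernel version calls cond_resched() between passes */
        again = scan_pass(reported, len, &pos);
        passes++;
    } while (again);

    return passes;
}
```

With a list of 10 fully reported entries and a cap of 4, the scan is split into three short passes instead of one long lock-holding walk.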
From: liliangleo
Date: Sun, 12 Apr 2020 05:08:56 -0400
To: Alexander Duyck, Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrea Arcangeli, Dan Williams, Dave Hansen, David Hildenbrand, Michal Hocko, Andrew Morton, Alex Williamson
Subject: [RFC PATCH 2/4] mm: Add batch size for free page reporting
Message-ID: <20200412090853.GA19578@open-light-1.localdomain>

Using the page order as the only threshold for page reporting is not flexible. Add a batch size as an additional threshold, so that reporting is triggered only when the amount of freed memory is larger than the batch size.
Cc: Alexander Duyck
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dan Williams
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Alex Williamson
Signed-off-by: liliangleo
---
 mm/page_reporting.c |  2 ++
 mm/page_reporting.h | 12 ++++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index 3a7084e508e1..dc7a22a4b752 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -14,6 +14,8 @@
 #define PAGE_REPORTING_DELAY (2 * HZ)
 #define MAX_SCAN_NUM 1024

+unsigned long page_report_batch_size __read_mostly = 4 * 1024 * 1024UL;
+
 static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;

 enum {
diff --git a/mm/page_reporting.h b/mm/page_reporting.h
index aa6d37f4dc22..f18c85ecdfe0 100644
--- a/mm/page_reporting.h
+++ b/mm/page_reporting.h
@@ -12,6 +12,8 @@

 #define PAGE_REPORTING_MIN_ORDER pageblock_order

+extern unsigned long page_report_batch_size;
+
 #ifdef CONFIG_PAGE_REPORTING
 DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
 void __page_reporting_notify(void);
@@ -33,6 +35,8 @@ static inline bool page_reported(struct page *page)
  */
 static inline void page_reporting_notify_free(unsigned int order)
 {
+	static long batch_size;
+
 	/* Called from hot path in __free_one_page() */
 	if (!static_branch_unlikely(&page_reporting_enabled))
 		return;
@@ -41,8 +45,12 @@ static inline void page_reporting_notify_free(unsigned int order)
 	if (order < PAGE_REPORTING_MIN_ORDER)
 		return;

-	/* This will add a few cycles, but should be called infrequently */
-	__page_reporting_notify();
+	batch_size += (1 << order) << PAGE_SHIFT;
+	if (batch_size >= page_report_batch_size) {
+		batch_size = 0;
+		/* This will add a few cycles, but should be called infrequently */
+		__page_reporting_notify();
+	}
 }
 #else /* CONFIG_PAGE_REPORTING */
 #define page_reported(_page) false
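The batching logic is a simple accumulator: add the freed block's size on every free, and notify only when the running total crosses the threshold. A userspace model of page_reporting_notify_free() as changed by this patch (4 KiB pages assumed; the names are illustrative, not kernel API):

```c
#include <assert.h>

#define PAGE_SHIFT 12 /* assume 4 KiB pages for the illustration */

/* Tunable threshold; the patch defaults to 4 MiB. */
static unsigned long batch_size_threshold = 4UL * 1024 * 1024;
static long batch_size; /* bytes freed since the last notification */

/*
 * Model of the patched notify path: accumulate the size of freed
 * high-order blocks and return 1 ("notify") only once the accumulated
 * total crosses the threshold, then reset the accumulator.
 */
static int notify_free(unsigned int order)
{
    batch_size += (1L << order) << PAGE_SHIFT;
    if (batch_size >= (long)batch_size_threshold) {
        batch_size = 0;
        return 1; /* the kernel calls __page_reporting_notify() here */
    }
    return 0;
}
```

An order-9 block is 2 MiB with 4 KiB pages, so with the default 4 MiB threshold every second order-9 free triggers a notification instead of every free.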
From: liliangleo
Date: Sun, 12 Apr 2020 05:09:22 -0400
To: Alexander Duyck, Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrea Arcangeli, Dan Williams, Dave Hansen, David Hildenbrand, Michal Hocko, Andrew Morton, Alex Williamson
Subject: [RFC PATCH 3/4] mm: add sysfs configuration for page reporting
Message-ID: <20200412090919.GA19580@open-light-1.localdomain>

This patch adds 'delay_millisecs', 'mini_order', and 'batch_size' entries under '/sys/kernel/mm/page_report/'.
Usage:

"delay_millisecs": time delay interval between a page being freed and the reporting worker starting to run.

"mini_order": only pages with an order equal to or greater than 'mini_order' will be reported.

"batch_size": wake up the worker only when the total size of free pages is greater than 'batch_size'.

Cc: Alexander Duyck
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dan Williams
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Alex Williamson
Signed-off-by: liliangleo
---
 mm/page_reporting.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/page_reporting.h |   4 +-
 2 files changed, 141 insertions(+), 7 deletions(-)

diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index dc7a22a4b752..cc6a42596560 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -7,15 +7,19 @@
 #include
 #include
 #include
+#include

 #include "page_reporting.h"
 #include "internal.h"

-#define PAGE_REPORTING_DELAY (2 * HZ)
 #define MAX_SCAN_NUM 1024

 unsigned long page_report_batch_size __read_mostly = 4 * 1024 * 1024UL;

+static unsigned long page_report_delay_millisecs __read_mostly = 2000;
+
+unsigned int page_report_mini_order __read_mostly = 8;
+
 static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;

 enum {
@@ -48,7 +52,8 @@ __page_reporting_request(struct page_reporting_dev_info *prdev)
	 * now we are limiting this to running no more than once every
	 * couple of seconds.
	 */
-	schedule_delayed_work(&prdev->work, PAGE_REPORTING_DELAY);
+	schedule_delayed_work(&prdev->work,
+			      msecs_to_jiffies(page_report_delay_millisecs));
 }

 /* notify prdev of free page reporting request */
@@ -260,7 +265,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,

	/* Generate minimum watermark to be able to guarantee progress */
	watermark = low_wmark_pages(zone) +
-		    (PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER);
+		    (PAGE_REPORTING_CAPACITY << page_report_mini_order);

	/*
	 * Cancel request if insufficient free memory or if we failed
@@ -270,7 +275,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
		return err;

	/* Process each free list starting from lowest order/mt */
-	for (order = PAGE_REPORTING_MIN_ORDER; order < MAX_ORDER; order++) {
+	for (order = page_report_mini_order; order < MAX_ORDER; order++) {
		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
			/* We do not pull pages from the isolate free list */
			if (is_migrate_isolate(mt))
@@ -337,7 +342,8 @@ static void page_reporting_process(struct work_struct *work)
	 */
	state = atomic_cmpxchg(&prdev->state, state, PAGE_REPORTING_IDLE);
	if (state == PAGE_REPORTING_REQUESTED)
-		schedule_delayed_work(&prdev->work, PAGE_REPORTING_DELAY);
+		schedule_delayed_work(&prdev->work,
+				      msecs_to_jiffies(page_report_delay_millisecs));
 }

 static DEFINE_MUTEX(page_reporting_mutex);
@@ -393,3 +399,131 @@ void page_reporting_unregister(struct page_reporting_dev_info *prdev)
	mutex_unlock(&page_reporting_mutex);
 }
 EXPORT_SYMBOL_GPL(page_reporting_unregister);
+
+static ssize_t batch_size_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%lu\n", page_report_batch_size);
+}
+
+static ssize_t batch_size_store(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
+{
+	unsigned long size;
+	int err;
+
+	err = kstrtoul(buf, 10, &size);
+	if (err || size >= UINT_MAX)
+		return -EINVAL;
+
+	page_report_batch_size = size;
+
+	return count;
+}
+
+static struct kobj_attribute batch_size_attr =
+	__ATTR(batch_size, 0644, batch_size_show, batch_size_store);
+
+static ssize_t delay_millisecs_show(struct kobject *kobj,
+				    struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%lu\n", page_report_delay_millisecs);
+}
+
+static ssize_t delay_millisecs_store(struct kobject *kobj,
+				     struct kobj_attribute *attr,
+				     const char *buf, size_t count)
+{
+	unsigned long msecs;
+	int err;
+
+	err = kstrtoul(buf, 10, &msecs);
+	if (err || msecs >= UINT_MAX)
+		return -EINVAL;
+
+	page_report_delay_millisecs = msecs;
+
+	return count;
+}
+
+static struct kobj_attribute wake_delay_millisecs_attr =
+	__ATTR(delay_millisecs, 0644, delay_millisecs_show,
+	       delay_millisecs_store);
+
+static ssize_t mini_order_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", page_report_mini_order);
+}
+
+static ssize_t mini_order_store(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
+{
+	unsigned int order;
+	int err;
+
+	err = kstrtouint(buf, 10, &order);
+	if (err || order >= MAX_ORDER)
+		return -EINVAL;
+
+	if (page_report_mini_order != order) {
+		mutex_lock(&page_reporting_mutex);
+		page_report_mini_order = order;
+		mutex_unlock(&page_reporting_mutex);
+	}
+
+	return count;
+}
+
+static struct kobj_attribute mini_order_attr =
+	__ATTR(mini_order, 0644, mini_order_show, mini_order_store);
+
+static struct attribute *page_report_attr[] = {
+	&mini_order_attr.attr,
+	&wake_delay_millisecs_attr.attr,
+	&batch_size_attr.attr,
+	NULL,
+};
+
+static struct attribute_group page_report_attr_group = {
+	.attrs = page_report_attr,
+};
+
+static int __init page_report_init_sysfs(struct kobject **page_report_kobj)
+{
+	int err;
+
+	*page_report_kobj = kobject_create_and_add("page_report", mm_kobj);
+	if (unlikely(!*page_report_kobj)) {
+		pr_err("page_report: failed to create page_report kobject\n");
+		return -ENOMEM;
+	}
+
+	err = sysfs_create_group(*page_report_kobj, &page_report_attr_group);
+	if (err) {
+		pr_err("page_report: failed to register page_report group\n");
+		goto delete_obj;
+	}
+
+	return 0;
+
+delete_obj:
+	kobject_put(*page_report_kobj);
+	return err;
+}
+
+static int __init page_report_init(void)
+{
+	int err;
+	struct kobject *page_report_kobj;
+
+	msecs_to_jiffies(page_report_delay_millisecs);
+	err = page_report_init_sysfs(&page_report_kobj);
+	if (err)
+		return err;
+
+	return 0;
+}
+subsys_initcall(page_report_init);
diff --git a/mm/page_reporting.h b/mm/page_reporting.h
index f18c85ecdfe0..5e52777c934d 100644
--- a/mm/page_reporting.h
+++ b/mm/page_reporting.h
@@ -10,7 +10,7 @@
 #include
 #include

-#define PAGE_REPORTING_MIN_ORDER pageblock_order
+extern unsigned int page_report_mini_order;

 extern unsigned long page_report_batch_size;

@@ -42,7 +42,7 @@ static inline void page_reporting_notify_free(unsigned int order)
		return;

	/* Determine if we have crossed reporting threshold */
-	if (order < PAGE_REPORTING_MIN_ORDER)
+	if (order < page_report_mini_order)
		return;

	batch_size += (1 << order) << PAGE_SHIFT;
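For a feel of what the 'mini_order' knob means in bytes: an order-n block covers (1 << n) pages, so with 4 KiB pages the patch's default mini_order of 8 corresponds to 1 MiB blocks. A tiny helper showing the arithmetic (userspace sketch; a 4 KiB page size is assumed, it varies by architecture):

```c
#include <assert.h>

#define PAGE_SHIFT 12 /* 4 KiB pages assumed for the illustration */

/* Bytes covered by one free block of the given order. */
static unsigned long order_to_bytes(unsigned int order)
{
    return (1UL << order) << PAGE_SHIFT;
}
```

So lowering 'mini_order' lets smaller blocks be reported (at the cost of longer free-list scans, which is what patch 1 of this series mitigates), while raising it restricts reporting to larger blocks.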
mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5B78E8E00CB; Sun, 12 Apr 2020 05:15:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 591778E0007; Sun, 12 Apr 2020 05:15:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 47E718E00CB; Sun, 12 Apr 2020 05:15:17 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0126.hostedemail.com [216.40.44.126]) by kanga.kvack.org (Postfix) with ESMTP id 2D5FB8E0007 for ; Sun, 12 Apr 2020 05:15:17 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id D79ED181AEF23 for ; Sun, 12 Apr 2020 09:15:16 +0000 (UTC) X-FDA: 76698644232.28.snake34_856002194d14d X-Spam-Summary: 2,0,0,3ea56f4295766bd7,d41d8cd98f00b204,liliang.opensource@gmail.com,,RULES_HIT:4:41:355:379:800:960:966:973:981:988:989:1260:1277:1312:1313:1314:1345:1381:1431:1437:1516:1518:1519:1593:1594:1595:1596:1605:1730:1747:1777:1792:1801:2194:2196:2198:2199:2200:2201:2393:2538:2559:2562:2638:2731:2892:3138:3139:3140:3141:3142:3369:3865:3866:3867:3868:3870:3871:3872:3874:4250:4321:4385:4605:5007:6120:6261:6653:6691:6737:7903:8957:9149:9413:10004:10394:11026:11473:11658:11914:12043:12048:12291:12294:12296:12297:12438:12517:12519:12555:12683:12895:12986:13160:13161:13229:13439:13895:13972:14096:14097:14687:21080:21444:21451:21524:21554:21627:21666:21740:21789:21795:21966:21990:30051:30054:30056:30064:30067:30070:30075,0,RBL:209.85.214.194:@gmail.com:.lbl8.mailshell.net-66.100.201.100 62.50.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: snake34_856002194d14d 
X-Filterd-Recvd-Size: 15772 Received: from mail-pl1-f194.google.com (mail-pl1-f194.google.com [209.85.214.194]) by imf44.hostedemail.com (Postfix) with ESMTP for ; Sun, 12 Apr 2020 09:15:16 +0000 (UTC) Received: by mail-pl1-f194.google.com with SMTP id x2so2333116plv.13 for ; Sun, 12 Apr 2020 02:15:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:date:to:subject:message-id:mail-followup-to:mime-version :content-disposition:user-agent; bh=w2t8ynRQtZD2Msdal+FOK6sOorbXy2D4HYAx3w0ig3U=; b=aNK0ACOhQllXZs0VCPl/RMrScHbzox0I19JvSeLQkHr7cpAsFZnk3sBTNvhRXQT0Pq 4HysbmsQIS2ozo2sw7IYekVT95jU1y+TgqISvlP9DVTbXt/SS0Ss9sSz7XsCje0E9jwP xYyxnboszWw+e6PoqCI63DdeQnN9N2YptazidMhuQEESF5CHTEkZArOdEm1ZMmeLgeuC Kv+2vlziBOv8Mxs+PlBL4cm6fzki8dKT2FVk2bW8cZL/sRDdXAY7lFteobPEKCnEqxDU 3qI+nZg+LS49bgfB8aRbNhcPrfUmHsoquLvxsXIN7bmiExdnBu4x4OemnG453UIcMwTy 02hQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:date:to:subject:message-id:mail-followup-to :mime-version:content-disposition:user-agent; bh=w2t8ynRQtZD2Msdal+FOK6sOorbXy2D4HYAx3w0ig3U=; b=DM2CBT60yw7scojYB1YKoVs6bqgHimXXqvr9B0U2P+yz+7NGDhp/IaFPOkCmgCl9qB H3KWgsqJ9WfZ/GhSSjlWNEf2CkjaY6NFIs9jYknHa5AHh47fAt/VF76vN8tDajpTzPzg y8MHVGhl66BOrhTuymqp7/srq14yIZLs0yQxwz3NVKCjg02NSIMdjPZca9WRrCbMae2A frILtYMcale2o8YIWPy/WTiBRuHg4zn8bHbZbtU1I79ejOERTKKs+KSJZdM1ar/XCMTz nF98WFEooFIdXU9zefZW1enzhSwqm3KOSrV/gRnKWmNqjgkblxEd+P59SKLGnp8cRrEY grKQ== X-Gm-Message-State: AGi0PubsNXw6+UeX80O2pE/X8w+cXNFZB8vbgTqJN8eIzin6YnX/XiAi dLdKTwY9VAkAs1Fbs6CPW4A= X-Google-Smtp-Source: APiQypLyQhpHzaamG8hC/k4msCztQyfU7Dz0cXUrQ8pk66EhcehcJxHjyyj3q+lZbH3P576DWP5BTw== X-Received: by 2002:a17:90a:fe18:: with SMTP id ck24mr16235456pjb.57.1586682915585; Sun, 12 Apr 2020 02:15:15 -0700 (PDT) Received: from open-light-1.localdomain ([66.98.113.28]) by smtp.gmail.com with ESMTPSA id y131sm5858792pfb.78.2020.04.12.02.15.14 (version=TLS1_2 
From: liliangleo
Date: Sun, 12 Apr 2020 05:09:49 -0400
To: Alexander Duyck, Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrea Arcangeli, Dan Williams, Dave Hansen, David Hildenbrand, Michal Hocko, Andrew Morton, Alex Williamson
Subject: [RFC PATCH 4/4] mm: Add PG_zero support
Message-ID: <20200412090945.GA19582@open-light-1.localdomain>

Zeroing out page content usually happens at allocation time. It is a
time-consuming operation that makes pin and mlock operations very slow,
especially for a large batch of memory.

This patch introduces a new feature that zeroes out free pages before
they are allocated, which helps speed up page allocation. The idea is
simple: zero out free pages while the system is not busy and mark them
with PG_zero. When a page is allocated and needs to be zero-filled,
check the flag in struct page; if PG_zero is set, the zeroing can be
skipped, saving CPU time and speeding up page allocation.

This series is based on the 'free page reporting' feature introduced by
Alexander Duyck.

We can benefit from this feature in the following cases:
1. User space mlocks a large chunk of memory
2. VFIO pins pages for DMA
3. Allocating transparent huge pages
4.
Speed up the page fault path

My original intention for adding this feature was to shorten VM creation
time when a VFIO device is attached; it works well and the VM creation
time is reduced noticeably.

Cc: Alexander Duyck
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dan Williams
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Alex Williamson
Signed-off-by: liliangleo
Reported-by: kernel test robot
---
 include/linux/highmem.h        |  31 +++++++-
 include/linux/page-flags.h     |  18 +++-
 include/trace/events/mmflags.h |   7 ++
 mm/Kconfig                     |  10 +++
 mm/Makefile                    |   1 +
 mm/huge_memory.c               |   3 +-
 mm/page_alloc.c                |   2 +
 mm/zero_page.c                 | 151 +++++++++++++++++++++++++++++++++++++++
 mm/zero_page.h                 |  13 ++++
 9 files changed, 231 insertions(+), 5 deletions(-)
 create mode 100644 mm/zero_page.c
 create mode 100644 mm/zero_page.h

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index ea5cdbd8c2c3..0308837adc19 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -157,7 +157,13 @@ do {						\
 #ifndef clear_user_highpage
 static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 {
-	void *addr = kmap_atomic(page);
+	void *addr;
+
+#ifdef CONFIG_ZERO_PAGE
+	if (TestClearPageZero(page))
+		return;
+#endif
+	addr = kmap_atomic(page);
 	clear_user_page(addr, vaddr, page);
 	kunmap_atomic(addr);
 }
@@ -208,9 +214,30 @@ alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
 	return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
 }
 
+#ifdef CONFIG_ZERO_PAGE
+static inline void __clear_highpage(struct page *page)
+{
+	void *kaddr;
+
+	if (PageZero(page))
+		return;
+
+	kaddr = kmap_atomic(page);
+	clear_page(kaddr);
+	SetPageZero(page);
+	kunmap_atomic(kaddr);
+}
+#endif
+
 static inline void clear_highpage(struct page *page)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr;
+
+#ifdef CONFIG_ZERO_PAGE
+	if (TestClearPageZero(page))
+		return;
+#endif
+	kaddr = kmap_atomic(page);
 	clear_page(kaddr);
 	kunmap_atomic(kaddr);
 }
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 222f6f7b2bb3..ace247c5d3ec 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -136,6 +136,10 @@ enum pageflags {
 	PG_young,
 	PG_idle,
 #endif
+#ifdef CONFIG_ZERO_PAGE
+	PG_zero,
+#endif
+
 	__NR_PAGEFLAGS,
 
 	/* Filesystems */
@@ -447,6 +451,16 @@ PAGEFLAG(Idle, idle, PF_ANY)
  */
 __PAGEFLAG(Reported, reported, PF_NO_COMPOUND)
 
+#ifdef CONFIG_ZERO_PAGE
+PAGEFLAG(Zero, zero, PF_ANY)
+TESTSCFLAG(Zero, zero, PF_ANY)
+#define __PG_ZERO (1UL << PG_zero)
+#else
+PAGEFLAG_FALSE(Zero)
+#define __PG_ZERO 0
+#endif
+
+
 /*
  * On an anonymous page mapped into a user virtual memory area,
  * page->mapping points to its anon_vma, not to a struct address_space;
@@ -843,7 +857,7 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
 	 1UL << PG_private	| 1UL << PG_private_2	|	\
 	 1UL << PG_writeback	| 1UL << PG_reserved	|	\
 	 1UL << PG_slab		| 1UL << PG_active	|	\
-	 1UL << PG_unevictable	| __PG_MLOCKED)
+	 1UL << PG_unevictable	| __PG_MLOCKED | __PG_ZERO)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
@@ -854,7 +868,7 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
  * alloc-free cycle to prevent from reusing the page.
  */
 #define PAGE_FLAGS_CHECK_AT_PREP	\
-	(((1UL << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
+	(((1UL << NR_PAGEFLAGS) - 1) & ~(__PG_HWPOISON | __PG_ZERO))
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1UL << PG_private | 1UL << PG_private_2)
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 5fb752034386..7be4153bed2c 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -73,6 +73,12 @@
 #define IF_HAVE_PG_HWPOISON(flag,string)
 #endif
 
+#ifdef CONFIG_ZERO_PAGE
+#define IF_HAVE_PG_ZERO(flag,string) ,{1UL << flag, string}
+#else
+#define IF_HAVE_PG_ZERO(flag,string)
+#endif
+
 #if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT)
 #define IF_HAVE_PG_IDLE(flag,string) ,{1UL << flag, string}
 #else
@@ -104,6 +110,7 @@
 IF_HAVE_PG_MLOCK(PG_mlocked,		"mlocked"	) \
 IF_HAVE_PG_UNCACHED(PG_uncached,	"uncached"	) \
 IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	) \
+IF_HAVE_PG_ZERO(PG_zero,		"zero"		) \
 IF_HAVE_PG_IDLE(PG_young,		"young"		) \
 IF_HAVE_PG_IDLE(PG_idle,		"idle"		)
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..3806bdbff4c9 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -252,6 +252,16 @@ config PAGE_REPORTING
 	  those pages to another entity, such as a hypervisor, so that the
 	  memory can be freed within the host for other uses.
+#
+# support for zero free page
+config ZERO_PAGE
+	bool "Zero free page"
+	def_bool y
+	depends on PAGE_REPORTING
+	help
+	  Zero page allows zero out free pages in freelist based on free
+	  page reporting
+
 #
 # support for page migration
 #
diff --git a/mm/Makefile b/mm/Makefile
index fccd3756b25f..ee23147a623f 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -112,3 +112,4 @@ obj-$(CONFIG_MEMFD_CREATE) += memfd.o
 obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o
 obj-$(CONFIG_PTDUMP_CORE) += ptdump.o
 obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
+obj-$(CONFIG_ZERO_PAGE) += zero_page.o
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6ecd1045113b..a28707aea3c5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2542,7 +2542,8 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_workingset) |
 			 (1L << PG_locked) |
 			 (1L << PG_unevictable) |
-			 (1L << PG_dirty)));
+			 (1L << PG_dirty) |
+			 __PG_ZERO));
 
 	/* ->mapping in first tail page is compound_mapcount */
 	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 69827d4fa052..3e9601d0b944 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -75,6 +75,7 @@
 #include "internal.h"
 #include "shuffle.h"
 #include "page_reporting.h"
+#include "zero_page.h"
 
 /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
 static DEFINE_MUTEX(pcp_batch_high_lock);
@@ -1179,6 +1180,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 
 	trace_mm_page_free(page, order);
 
+	clear_zero_page_flag(page, order);
 	/*
 	 * Check tail pages before head page information is cleared to
 	 * avoid checking PageCompound for order-0 pages.
diff --git a/mm/zero_page.c b/mm/zero_page.c
new file mode 100644
index 000000000000..f3b3d58f0ef2
--- /dev/null
+++ b/mm/zero_page.c
@@ -0,0 +1,151 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2020 Didi chuxing.
+ *
+ * Authors: Liang Li
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include "internal.h"
+#include "zero_page.h"
+
+#define ZERO_PAGE_STOP	0
+#define ZERO_PAGE_RUN	1
+
+static unsigned long zeropage_enable __read_mostly;
+static DEFINE_MUTEX(kzeropaged_mutex);
+static struct page_reporting_dev_info zero_page_dev_info;
+
+void clear_zero_page_flag(struct page *page, int order)
+{
+	int i;
+
+	for (i = 0; i < (1 << order); i++)
+		ClearPageZero(page + i);
+}
+
+static int zero_free_pages(struct page_reporting_dev_info *pr_dev_info,
+			   struct scatterlist *sgl, unsigned int nents)
+{
+	struct scatterlist *sg = sgl;
+
+	might_sleep();
+	do {
+		struct page *page = sg_page(sg);
+		unsigned int order = get_order(sg->length);
+		int i;
+
+		VM_BUG_ON(PageBuddy(page) || page_order(page));
+
+		for (i = 0; i < (1 << order); i++) {
+			cond_resched();
+			__clear_highpage(page + i);
+		}
+	} while ((sg = sg_next(sg)));
+
+	return 0;
+}
+
+static int start_kzeropaged(void)
+{
+	int err = 0;
+
+	if (zeropage_enable) {
+		zero_page_dev_info.report = zero_free_pages;
+		err = page_reporting_register(&zero_page_dev_info);
+		pr_info("Zero page enabled\n");
+	} else {
+		page_reporting_unregister(&zero_page_dev_info);
+		pr_info("Zero page disabled\n");
+	}
+
+	return err;
+}
+
+static ssize_t enabled_show(struct kobject *kobj,
+			    struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%lu\n", zeropage_enable);
+}
+
+static ssize_t enabled_store(struct kobject *kobj,
+			     struct kobj_attribute *attr,
+			     const char *buf, size_t count)
+{
+	ssize_t ret = 0;
+	unsigned long flags;
+	int err;
+
+	err = kstrtoul(buf, 10, &flags);
+	if (err || flags > UINT_MAX)
+		return -EINVAL;
+	if (flags > ZERO_PAGE_RUN)
+		return -EINVAL;
+
+	if (zeropage_enable != flags) {
+		mutex_lock(&kzeropaged_mutex);
+		zeropage_enable = flags;
+		ret = start_kzeropaged();
+		mutex_unlock(&kzeropaged_mutex);
+	}
+
+	return count;
+}
+
+static struct kobj_attribute enabled_attr =
+	__ATTR(enabled, 0644, enabled_show, enabled_store);
+
+static struct attribute *zeropage_attr[] = {
+	&enabled_attr.attr,
+	NULL,
+};
+
+static struct attribute_group zeropage_attr_group = {
+	.attrs = zeropage_attr,
+};
+
+static int __init zeropage_init_sysfs(struct kobject **zeropage_kobj)
+{
+	int err;
+
+	*zeropage_kobj = kobject_create_and_add("zero_page", mm_kobj);
+	if (unlikely(!*zeropage_kobj)) {
+		pr_err("zeropage: failed to create zeropage kobject\n");
+		return -ENOMEM;
+	}
+
+	err = sysfs_create_group(*zeropage_kobj, &zeropage_attr_group);
+	if (err) {
+		pr_err("zeropage: failed to register zeropage group\n");
+		goto delete_obj;
+	}
+
+	return 0;
+
+delete_obj:
+	kobject_put(*zeropage_kobj);
+	return err;
+}
+
+static int __init zeropage_init(void)
+{
+	int err;
+	struct kobject *zeropage_kobj;
+
+	err = zeropage_init_sysfs(&zeropage_kobj);
+	if (err)
+		return err;
+
+	start_kzeropaged();
+
+	return 0;
+}
+subsys_initcall(zeropage_init);
diff --git a/mm/zero_page.h b/mm/zero_page.h
new file mode 100644
index 000000000000..bfa3c9fe94d3
--- /dev/null
+++ b/mm/zero_page.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_ZERO_PAGE_H
+#define _LINUX_ZERO_PAGE_H
+
+#ifdef CONFIG_ZERO_PAGE
+extern void clear_zero_page_flag(struct page *page, int order);
+#else
+static inline void clear_zero_page_flag(struct page *page, int order)
+{
+}
+#endif
+#endif /* _LINUX_ZERO_PAGE_H */