From patchwork Wed Apr 1 22:57:23 2020
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 11469685
From: Pavel Tatashin
To: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	mhocko@suse.com, linux-mm@kvack.org, dan.j.williams@intel.com,
	shile.zhang@linux.alibaba.com, daniel.m.jordan@oracle.com,
	pasha.tatashin@soleen.com, ktkhai@virtuozzo.com, david@redhat.com,
	jmorris@namei.org, sashal@kernel.org, vbabka@suse.cz
Subject: [PATCH v2 2/2] mm: initialize deferred pages with interrupts enabled
Date: Wed, 1 Apr 2020 18:57:23 -0400
Message-Id: <20200401225723.14164-3-pasha.tatashin@soleen.com>
In-Reply-To: <20200401225723.14164-1-pasha.tatashin@soleen.com>
References: <20200401225723.14164-1-pasha.tatashin@soleen.com>

Initializing struct pages is a long task, and keeping interrupts
disabled for the duration of this operation introduces a number of
problems.

1. jiffies are not updated for a long period of time, and thus
   incorrect time is reported.
   See proposed solution and discussion here:
   lkml/20200311123848.118638-1-shile.zhang@linux.alibaba.com

2. It prevents further improving deferred page initialization by
   allowing intra-node multi-threading.

We are keeping interrupts disabled to solve a rather theoretical problem
that was never observed in the real world (see 3a2d7fa8a3d5). Let's keep
interrupts enabled. In case we ever encounter a scenario where an
interrupt thread wants to allocate a large amount of memory this early
in boot, we can deal with that by growing the zone (see
deferred_grow_zone()) by the needed amount before starting the
deferred_init_memmap() threads.

Before:
[    1.232459] node 0 initialised, 12058412 pages in 1ms

After:
[    1.632580] node 0 initialised, 12051227 pages in 436ms

Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Cc: stable@vger.kernel.org # 4.17+
Reported-by: Shile Zhang
Signed-off-by: Pavel Tatashin
Reviewed-by: Daniel Jordan
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
---
 include/linux/mmzone.h |  2 ++
 mm/page_alloc.c        | 22 ++++++++--------------
 2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 462f6873905a..c5bdf55da034 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -721,6 +721,8 @@ typedef struct pglist_data {
 	/*
 	 * Must be held any time you expect node_start_pfn,
 	 * node_present_pages, node_spanned_pages or nr_zones to stay constant.
+	 * Also synchronizes pgdat->first_deferred_pfn during deferred page
+	 * init.
 	 *
 	 * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
 	 * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e8ff6a176164..68669d3a5a66 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1790,6 +1790,13 @@ static int __init deferred_init_memmap(void *data)
 	BUG_ON(pgdat->first_deferred_pfn > pgdat_end_pfn(pgdat));
 	pgdat->first_deferred_pfn = ULONG_MAX;

+	/*
+	 * Once we unlock here, the zone cannot be grown anymore, thus if an
+	 * interrupt thread must allocate this early in boot, zone must be
+	 * pre-grown prior to start of deferred page initialization.
+	 */
+	pgdat_resize_unlock(pgdat, &flags);
+
 	/* Only the highest zone is deferred so find it */
 	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 		zone = pgdat->node_zones + zid;
@@ -1809,11 +1816,9 @@ static int __init deferred_init_memmap(void *data)
 	 */
 	while (spfn < epfn) {
 		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
-		touch_nmi_watchdog();
+		cond_resched();
 	}
 zone_empty:
-	pgdat_resize_unlock(pgdat, &flags);
-
 	/* Sanity check that the next zone really is unpopulated */
 	WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));

@@ -1855,17 +1860,6 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
 	pgdat_resize_lock(pgdat, &flags);

-	/*
-	 * If deferred pages have been initialized while we were waiting for
-	 * the lock, return true, as the zone was grown. The caller will retry
-	 * this zone. We won't return to this function since the caller also
-	 * has this static branch.
-	 */
-	if (!static_branch_unlikely(&deferred_pages)) {
-		pgdat_resize_unlock(pgdat, &flags);
-		return true;
-	}
-
 	/*
 	 * If someone grew this zone while we were waiting for spinlock, return
 	 * true, as there might be enough pages already.