From patchwork Fri Apr 3 13:35:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 11472685
From: Pavel Tatashin
To: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, mhocko@suse.com,
 linux-mm@kvack.org, dan.j.williams@intel.com,
 shile.zhang@linux.alibaba.com, daniel.m.jordan@oracle.com,
 pasha.tatashin@soleen.com, ktkhai@virtuozzo.com, david@redhat.com,
 jmorris@namei.org, sashal@kernel.org, vbabka@suse.cz
Subject: [PATCH v3 2/3] mm: initialize deferred pages with interrupts enabled
Date: Fri, 3 Apr 2020 09:35:48 -0400
Message-Id: <20200403133549.14338-3-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200403133549.14338-1-pasha.tatashin@soleen.com>
References: <20200403133549.14338-1-pasha.tatashin@soleen.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Initializing struct pages is a long task, and keeping interrupts disabled
for the duration of this operation introduces a number of problems.

1. jiffies are not updated for a long period of time, and thus incorrect
   time is reported.
   See proposed solution and discussion here:
   lkml/20200311123848.118638-1-shile.zhang@linux.alibaba.com

2. It prevents further improving deferred page initialization by allowing
   intra-node multi-threading.

We are keeping interrupts disabled to solve a rather theoretical problem
that was never observed in the real world (see 3a2d7fa8a3d5).

Let's keep interrupts enabled. In case we ever encounter a scenario where
an interrupt thread wants to allocate a large amount of memory this early
in boot, we can deal with that by growing the zone (see
deferred_grow_zone()) by the needed amount before starting the
deferred_init_memmap() threads.

Before:
[ 1.232459] node 0 initialised, 12058412 pages in 1ms

After:
[ 1.632580] node 0 initialised, 12051227 pages in 436ms

Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
Cc: stable@vger.kernel.org # 4.17+
Reported-by: Shile Zhang
Signed-off-by: Pavel Tatashin
Reviewed-by: Daniel Jordan
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
---
 include/linux/mmzone.h |  2 ++
 mm/page_alloc.c        | 20 +++++++-------------
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 462f6873905a..c5bdf55da034 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -721,6 +721,8 @@ typedef struct pglist_data {
 	/*
 	 * Must be held any time you expect node_start_pfn,
 	 * node_present_pages, node_spanned_pages or nr_zones to stay constant.
+	 * Also synchronizes pgdat->first_deferred_pfn during deferred page
+	 * init.
 	 *
 	 * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
 	 * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e8ff6a176164..4a60f2427eb0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1790,6 +1790,13 @@ static int __init deferred_init_memmap(void *data)
 	BUG_ON(pgdat->first_deferred_pfn > pgdat_end_pfn(pgdat));
 	pgdat->first_deferred_pfn = ULONG_MAX;
 
+	/*
+	 * Once we unlock here, the zone cannot be grown anymore, thus if an
+	 * interrupt thread must allocate this early in boot, zone must be
+	 * pre-grown prior to start of deferred page initialization.
+	 */
+	pgdat_resize_unlock(pgdat, &flags);
+
 	/* Only the highest zone is deferred so find it */
 	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 		zone = pgdat->node_zones + zid;
@@ -1812,8 +1819,6 @@ static int __init deferred_init_memmap(void *data)
 		touch_nmi_watchdog();
 	}
 zone_empty:
-	pgdat_resize_unlock(pgdat, &flags);
-
 	/* Sanity check that the next zone really is unpopulated */
 	WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));
 
@@ -1855,17 +1860,6 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
 
 	pgdat_resize_lock(pgdat, &flags);
 
-	/*
-	 * If deferred pages have been initialized while we were waiting for
-	 * the lock, return true, as the zone was grown. The caller will retry
-	 * this zone. We won't return to this function since the caller also
-	 * has this static branch.
-	 */
-	if (!static_branch_unlikely(&deferred_pages)) {
-		pgdat_resize_unlock(pgdat, &flags);
-		return true;
-	}
-
 	/*
 	 * If someone grew this zone while we were waiting for spinlock, return
 	 * true, as there might be enough pages already.
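
Note for readers: the locking change above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code. struct node_model, model_lock(), model_unlock(), model_grow_zone() and
model_init_memmap() are hypothetical names invented for this illustration, a
C11 atomic_flag stands in for the node_size_lock spinlock, and the return
values are simplified compared to the real deferred_grow_zone(). The point it
tries to show is that the lock is held only long enough to publish
first_deferred_pfn = ULONG_MAX, so the long initialization loop runs unlocked
(in the kernel, with interrupts enabled) and later grow requests are refused.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pg_data_t fields that matter here. */
struct node_model {
	atomic_flag resize_lock;          /* models node_size_lock */
	unsigned long first_deferred_pfn; /* ULONG_MAX: growing is over */
	unsigned long end_pfn;            /* models pgdat_end_pfn() */
	unsigned long pages_initialised;
};

static void model_lock(struct node_model *n)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&n->resize_lock,
						 memory_order_acquire))
		;
}

static void model_unlock(struct node_model *n)
{
	atomic_flag_clear_explicit(&n->resize_lock, memory_order_release);
}

/* Simplified model of deferred_grow_zone(): true only if pages were added. */
static bool model_grow_zone(struct node_model *n, unsigned long nr_pages)
{
	bool grown = false;

	model_lock(n);
	if (n->first_deferred_pfn != ULONG_MAX) {
		unsigned long left = n->end_pfn - n->first_deferred_pfn;
		unsigned long take = nr_pages < left ? nr_pages : left;

		n->first_deferred_pfn += take;
		n->pages_initialised += take;
		grown = take > 0;
	}
	model_unlock(n);
	return grown;
}

/* Simplified model of deferred_init_memmap() after this patch. */
static unsigned long model_init_memmap(struct node_model *n)
{
	unsigned long start, pfn;

	model_lock(n);
	start = n->first_deferred_pfn;
	if (start == ULONG_MAX) {
		model_unlock(n);
		return 0;
	}
	assert(start <= n->end_pfn);       /* plays the role of BUG_ON() */
	n->first_deferred_pfn = ULONG_MAX; /* from now on, no more growing */
	model_unlock(n);                   /* drop the lock before the long loop */

	/* The long-running part executes without the lock held. */
	for (pfn = start; pfn < n->end_pfn; pfn++)
		n->pages_initialised++;

	return n->end_pfn - start;
}
```

In this model a grow request issued before model_init_memmap() succeeds, while
one issued afterwards returns false, which mirrors the requirement stated in
the new comment that the zone must be pre-grown before deferred
initialization starts.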