From patchwork Mon Aug 10 16:10:33 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charan Teja Kalla
X-Patchwork-Id: 11707637
From: Charan Teja Reddy
To:
 akpm@linux-foundation.org, mhocko@suse.com, vbabka@suse.cz, david@redhat.com, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, vinmenon@codeaurora.org, Charan Teja Reddy
Subject: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
Date: Mon, 10 Aug 2020 21:40:33 +0530
Message-Id: <1597075833-16736-1-git-send-email-charante@codeaurora.org>
X-Mailer: git-send-email 1.9.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

The following race is observed when memory blocks of the movable zone
are repeatedly onlined and offlined, with a delay between two
successive online operations.

P1                                      P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to their
defaults, i.e., pcp->high = 0 and
pcp->batch = 1.

                                        Allocate pages from the
                                        movable zone.

Try to online the second memory
block in the movable zone; it has
entered online_pages() but has not
yet called zone_pcp_update().

                                        This process is on its exit
                                        path, so it tries to release
                                        its order-0 pages to the pcp
                                        lists through
                                        free_unref_page_commit().
                                        As pcp->high = 0 and
                                        pcp->count = 1, it proceeds to
                                        call free_pcppages_bulk().

Update the pcp values; the new
values are, say, pcp->high = 378
and pcp->batch = 63.

                                        Read the pcp's batch value with
                                        READ_ONCE() and pass it to
                                        free_pcppages_bulk(). The values
                                        passed are batch = 63 and
                                        count = 1.

                                        Since the number of pages on the
                                        pcp lists is less than ->batch,
                                        it gets stuck in the
                                        while (list_empty(list)) loop
                                        with interrupts disabled,
                                        hanging the core.

Avoid this by ensuring that free_pcppages_bulk() is called with the
proper count of pcp list pages.

The race is fairly easy to reproduce without [1], because the pcp
values are not updated when the first memory block is onlined. That
leaves P2 a large enough window between its alloc+free and the pcp
struct update done when the second memory block is onlined.

With [1], the race still exists but the window is much narrower,
because the pcp struct values are updated as soon as the first memory
block is onlined.

[1]: https://patchwork.kernel.org/patch/11696389/

Signed-off-by: Charan Teja Reddy
---
 mm/page_alloc.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4896e6..25e7e12 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3106,6 +3106,7 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
 	struct zone *zone = page_zone(page);
 	struct per_cpu_pages *pcp;
 	int migratetype;
+	int high;
 
 	migratetype = get_pcppage_migratetype(page);
 	__count_vm_event(PGFREE);
@@ -3128,8 +3129,19 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
 	pcp = &this_cpu_ptr(zone->pageset)->pcp;
 	list_add(&page->lru, &pcp->lists[migratetype]);
 	pcp->count++;
-	if (pcp->count >= pcp->high) {
-		unsigned long batch = READ_ONCE(pcp->batch);
+	high = READ_ONCE(pcp->high);
+	if (pcp->count >= high) {
+		int batch;
+
+		batch = READ_ONCE(pcp->batch);
+		/*
+		 * For non-default pcp struct values, high is always
+		 * greater than the batch. If high < batch then pass
+		 * proper count to free the pcp's list pages.
+		 */
+		if (unlikely(high < batch))
+			batch = min(pcp->count, batch);
+
 		free_pcppages_bulk(zone, batch, pcp);
 	}
 }
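
For illustration only (not part of the patch): a minimal user-space sketch of the batch selection before and after this change, using the values from the race description (stale high = 0, new batch = 63, count = 1). The function names and min_int() are hypothetical stand-ins; the kernel code operates on struct per_cpu_pages and uses its own min() macro.

```c
#include <assert.h>

/* Hypothetical helper for this sketch; the kernel uses its min() macro. */
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Pre-patch selection: once count >= high, the full batch is requested
 * even when fewer pages are on the pcp lists.
 * Returns the number of pages that would be passed to the bulk free,
 * or 0 when no drain is triggered.
 */
static int drain_request_unpatched(int count, int high, int batch)
{
	assert(count >= 0 && high >= 0 && batch >= 1);
	if (count < high)
		return 0;
	return batch;
}

/*
 * Patched selection: when high < batch (a stale high paired with a
 * freshly updated batch), the request is clamped to the pages present.
 */
static int drain_request_patched(int count, int high, int batch)
{
	assert(count >= 0 && high >= 0 && batch >= 1);
	if (count < high)
		return 0;
	if (high < batch)
		batch = min_int(count, batch);
	return batch;
}
```

With the racy values, the unpatched helper asks for 63 pages while only one is on the lists, which corresponds to the case where the bulk free loop never finds enough pages; the patched helper asks for exactly one. With consistent values (high = 378, batch = 63) both helpers return the same result.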