From patchwork Tue Apr 5 19:40:31 2022
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 12801832
From: Omar Sandoval
To: linux-mm@kvack.org, kexec@lists.infradead.org, Andrew Morton
Cc: Uladzislau Rezki, Christoph Hellwig, Cliff Wickman, x86@kernel.org,
 kernel-team@fb.com
Subject: [PATCH] mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore
Date: Tue, 5 Apr 2022 12:40:31 -0700
Message-Id: <75014514645de97f2d9e087aa3df0880ea311b77.1649187356.git.osandov@fb.com>
X-Mailer: git-send-email 2.35.1

From: Omar Sandoval

Commit 3ee48b6af49c ("mm, x86: Saving vmcore with non-lazy freeing of
vmas") introduced set_iounmap_nonlazy(), which sets vmap_lazy_nr to
lazy_max_pages() + 1, ensuring that any subsequent vunmap() immediately
purges the vmap areas instead of doing it lazily.

Commit 690467c81b1a ("mm/vmalloc: Move draining areas out of caller
context") moved the purging from the vunmap() caller to a worker thread.
Unfortunately, set_iounmap_nonlazy() can cause the worker thread to spin
(possibly forever). For example, consider the following scenario:

1. Thread reads from /proc/vmcore. This eventually calls
   __copy_oldmem_page() -> set_iounmap_nonlazy(), which sets
   vmap_lazy_nr to lazy_max_pages() + 1.
2. Then it calls free_vmap_area_noflush() (via iounmap()), which adds 2
   pages (one page plus the guard page) to the purge list and
   vmap_lazy_nr. vmap_lazy_nr is now lazy_max_pages() + 3, so the
   drain_vmap_work is scheduled.
3. Thread returns from the kernel and is scheduled out.
4. Worker thread is scheduled in and calls drain_vmap_area_work(). It
   frees the 2 pages on the purge list. vmap_lazy_nr is now
   lazy_max_pages() + 1.
5. This is still over the threshold, so it tries to purge areas again,
   but doesn't find anything.
6. Repeat step 5 forever (see the simplified worker-loop sketch below).

If the system is running with only one CPU (which is typical for kdump)
and preemption is disabled, then this will never make forward progress:
there aren't any more pages to purge, so it hangs. If there is more than
one CPU or preemption is enabled, then the worker thread will spin
forever in the background. (Note that if there were already pages to be
purged at the time that set_iounmap_nonlazy() was called, this bug is
avoided.)

This can be reproduced with anything that reads from /proc/vmcore
multiple times. E.g., vmcore-dmesg /proc/vmcore.

A simple way to "fix" this would be to make set_iounmap_nonlazy() set
vmap_lazy_nr to lazy_max_pages() instead of lazy_max_pages() + 1. But I
think it would be better to get rid of this hack of clobbering
vmap_lazy_nr altogether. Instead, this fix makes __copy_oldmem_page()
explicitly drain the vmap areas itself.
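To make steps 4-6 concrete, here is a simplified sketch of the worker's
retry loop (paraphrased from drain_vmap_area_work() in mm/vmalloc.c as
of commit 690467c81b1a, not a verbatim copy):

static void drain_vmap_area_work(struct work_struct *work)
{
	unsigned long nr_lazy;

	do {
		/* Free everything currently on the purge list. */
		mutex_lock(&vmap_purge_lock);
		__purge_vmap_area_lazy(ULONG_MAX, 0);
		mutex_unlock(&vmap_purge_lock);

		/*
		 * Recheck whether further work is required. After
		 * set_iounmap_nonlazy(), vmap_lazy_nr rests at
		 * lazy_max_pages() + 1 even once the purge list is
		 * empty, so this condition never becomes false.
		 */
		nr_lazy = atomic_long_read(&vmap_lazy_nr);
	} while (nr_lazy > lazy_max_pages());
}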
Signed-off-by: Omar Sandoval
Reviewed-by: Uladzislau Rezki (Sony)
---
 arch/x86/include/asm/io.h       |  2 +-
 arch/x86/kernel/crash_dump_64.c |  2 +-
 mm/vmalloc.c                    | 21 ++++++++++-----------
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index f6d91ecb8026..da466352f27c 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -210,7 +210,7 @@ void __iomem *ioremap(resource_size_t offset, unsigned long size);
 extern void iounmap(volatile void __iomem *addr);
 #define iounmap iounmap
 
-extern void set_iounmap_nonlazy(void);
+void iounmap_purge_vmap_area(void);
 
 #ifdef __KERNEL__
 
diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index a7f617a3981d..075dd36c502d 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -37,8 +37,8 @@ static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
 	} else
 		memcpy(buf, vaddr + offset, csize);
 
-	set_iounmap_nonlazy();
 	iounmap((void __iomem *)vaddr);
+	iounmap_purge_vmap_area();
 	return csize;
 }
 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e163372d3967..48084d742688 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1671,17 +1671,6 @@ static DEFINE_MUTEX(vmap_purge_lock);
 /* for per-CPU blocks */
 static void purge_fragmented_blocks_allcpus(void);
 
-#ifdef CONFIG_X86_64
-/*
- * called before a call to iounmap() if the caller wants vm_area_struct's
- * immediately freed.
- */
-void set_iounmap_nonlazy(void)
-{
-	atomic_long_set(&vmap_lazy_nr, lazy_max_pages()+1);
-}
-#endif /* CONFIG_X86_64 */
-
 /*
  * Purges all lazily-freed vmap areas.
  */
@@ -1753,6 +1742,16 @@ static void purge_vmap_area_lazy(void)
 	mutex_unlock(&vmap_purge_lock);
 }
 
+#ifdef CONFIG_X86_64
+/* Called after iounmap() to immediately free vm_area_struct's. */
+void iounmap_purge_vmap_area(void)
+{
+	mutex_lock(&vmap_purge_lock);
+	__purge_vmap_area_lazy(ULONG_MAX, 0);
+	mutex_unlock(&vmap_purge_lock);
+}
+#endif /* CONFIG_X86_64 */
+
 static void drain_vmap_area_work(struct work_struct *work)
 {
 	unsigned long nr_lazy;
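For completeness, a minimal userspace reproducer in the spirit of the
vmcore-dmesg example above. This is an illustrative sketch, not part of
the patch: it assumes a kdump kernel where /proc/vmcore exists, and any
tool that walks the dump more than once behaves the same.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	int pass, fd;

	/*
	 * Read the dump more than once; per the commit message, that is
	 * enough to hit a set_iounmap_nonlazy() call with an empty
	 * purge list and leave drain_vmap_work spinning.
	 */
	for (pass = 0; pass < 2; pass++) {
		fd = open("/proc/vmcore", O_RDONLY);
		if (fd < 0) {
			perror("open /proc/vmcore");
			return EXIT_FAILURE;
		}
		/*
		 * Each page copied out of the old kernel goes through
		 * ioremap()/iounmap() in __copy_oldmem_page().
		 */
		while (read(fd, buf, sizeof(buf)) > 0)
			;
		close(fd);
	}
	return EXIT_SUCCESS;
}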