From patchwork Mon May 29 06:22:40 2023
From: Hugh Dickins <hughd@google.com>
Date: Sun, 28 May 2023 23:22:40 -0700 (PDT)
To: Andrew Morton
Cc: Mike Kravetz, Mike Rapoport, "Kirill A. Shutemov", Matthew Wilcox,
 David Hildenbrand, Suren Baghdasaryan, Qi Zheng, Yang Shi, Mel Gorman,
 Peter Xu, Peter Zijlstra, Will Deacon, Yu Zhao, Alistair Popple,
 Ralph Campbell, Ira Weiny, Steven Price, SeongJae Park, Naoya Horiguchi,
 Christophe Leroy, Zack Rusin, Jason Gunthorpe, Axel Rasmussen,
 Anshuman Khandual, Pasha Tatashin, Miaohe Lin, Minchan Kim,
 Christoph Hellwig, Song Liu, Thomas Hellstrom, Russell King,
 "David S. Miller", Michael Ellerman, "Aneesh Kumar K.V", Heiko Carstens,
 Christian Borntraeger, Claudio Imbrenda, Alexander Gordeev, Jann Horn,
 linux-arm-kernel@lists.infradead.org, sparclinux@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()
In-Reply-To: <35e983f5-7ed3-b310-d949-9ae8b130cdab@google.com>
Message-ID: <6dd63b39-e71f-2e8b-7e0-83e02f3bcb39@google.com>
References: <35e983f5-7ed3-b310-d949-9ae8b130cdab@google.com>
Add s390-specific pte_free_defer(), to call pte_free() via call_rcu().
pte_free_defer() will be called inside khugepaged's retract_page_tables()
loop, where allocating extra memory cannot be relied upon. This precedes
the generic version to avoid build breakage from incompatible pgtable_t.

This version is more complicated than others: because page_table_free()
needs to know which fragment is being freed, and which mm to link it to.

page_table_free()'s fragment handling is clever, but I could too easily
break it: what's done here in pte_free_defer() and pte_free_now() might
be better integrated with page_table_free()'s cleverness, but not by me!

By the time that page_table_free() gets called via RCU, it's conceivable
that mm would already have been freed: so mmgrab() in pte_free_defer()
and mmdrop() in pte_free_now(). No, that is not a good context to call
mmdrop() from, so make mmdrop_async() public and use that.
Signed-off-by: Hugh Dickins
Reviewed-by: Gerald Schaefer
---
 arch/s390/include/asm/pgalloc.h |  4 ++++
 arch/s390/mm/pgalloc.c          | 34 +++++++++++++++++++++++++++++++++
 include/linux/mm_types.h        |  2 +-
 include/linux/sched/mm.h        |  1 +
 kernel/fork.c                   |  2 +-
 5 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 17eb618f1348..89a9d5ef94f8 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -143,6 +143,10 @@ static inline void pmd_populate(struct mm_struct *mm,
 #define pte_free_kernel(mm, pte) page_table_free(mm, (unsigned long *) pte)
 #define pte_free(mm, pte) page_table_free(mm, (unsigned long *) pte)
 
+/* arch use pte_free_defer() implementation in arch/s390/mm/pgalloc.c */
+#define pte_free_defer pte_free_defer
+void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable);
+
 void vmem_map_init(void);
 void *vmem_crst_alloc(unsigned long val);
 pte_t *vmem_pte_alloc(void);
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index 66ab68db9842..0129de9addfd 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -346,6 +346,40 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
 	__free_page(page);
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void pte_free_now(struct rcu_head *head)
+{
+	struct page *page;
+	unsigned long mm_bit;
+	struct mm_struct *mm;
+	unsigned long *table;
+
+	page = container_of(head, struct page, rcu_head);
+	table = (unsigned long *)page_to_virt(page);
+	mm_bit = (unsigned long)page->pt_mm;
+	/* 4K page has only two 2K fragments, but alignment allows eight */
+	mm = (struct mm_struct *)(mm_bit & ~7);
+	table += PTRS_PER_PTE * (mm_bit & 7);
+	page_table_free(mm, table);
+	mmdrop_async(mm);
+}
+
+void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
+{
+	struct page *page;
+	unsigned long mm_bit;
+
+	mmgrab(mm);
+	page = virt_to_page(pgtable);
+	/* Which 2K page table fragment of a 4K page? */
+	mm_bit = ((unsigned long)pgtable & ~PAGE_MASK) /
+		 (PTRS_PER_PTE * sizeof(pte_t));
+	mm_bit += (unsigned long)mm;
+	page->pt_mm = (struct mm_struct *)mm_bit;
+	call_rcu(&page->rcu_head, pte_free_now);
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 void page_table_free_rcu(struct mmu_gather *tlb, unsigned long *table,
 			 unsigned long vmaddr)
 {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 306a3d1a0fa6..1667a1bdb8a8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -146,7 +146,7 @@ struct page {
 			pgtable_t pmd_huge_pte; /* protected by page->ptl */
 			unsigned long _pt_pad_2;	/* mapping */
 			union {
-				struct mm_struct *pt_mm; /* x86 pgds only */
+				struct mm_struct *pt_mm; /* x86 pgd, s390 */
 				atomic_t pt_frag_refcount; /* powerpc */
 			};
 #if ALLOC_SPLIT_PTLOCKS
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 8d89c8c4fac1..a9043d1a0d55 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -41,6 +41,7 @@ static inline void smp_mb__after_mmgrab(void)
 	smp_mb__after_atomic();
 }
 
+extern void mmdrop_async(struct mm_struct *mm);
 extern void __mmdrop(struct mm_struct *mm);
 
 static inline void mmdrop(struct mm_struct *mm)
diff --git a/kernel/fork.c b/kernel/fork.c
index ed4e01daccaa..fa4486b65c56 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -942,7 +942,7 @@ static void mmdrop_async_fn(struct work_struct *work)
 	__mmdrop(mm);
 }
 
-static void mmdrop_async(struct mm_struct *mm)
+void mmdrop_async(struct mm_struct *mm)
 {
 	if (unlikely(atomic_dec_and_test(&mm->mm_count))) {
 		INIT_WORK(&mm->async_put_work, mmdrop_async_fn);