From patchwork Wed Nov 20 13:12:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 13881188 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8BD61D63938 for ; Wed, 20 Nov 2024 13:12:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-ID:Date:Subject:Cc :To:From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=GEVAwRmaNxTKHkAAugpeREwwaI0nqhCIoDOdpuIlIDw=; b=aAVIDwfhn/McPA kTc7FuqY3zdCVllhF2EZ2npvYDuQnTMFTEBFgXXNb2r2DEO8CDjIPvwSWQgvrbhB11aj656Nz8bwL NbAuq+NTtnO7Cda/kxBBPgpxq/HtIkxqrNa4r681L3X6ysa/UfuhkCn7nYmnNk2ch0hW2TNRYfTJt HgBqrMbkNOGfZIVy2fHJ3wrYWP2E0yXALFwugkqqBvTyGIoLaMdUrK+HvXBUl6XORwpS3rJHtZR6/ Y/OlVeZCQ3+73HezyC9sgIX9gQDPUnQl9Bk5fhjycTqS+vR+Kn1FJqQnE6FsMU723xY+35g9YKVMf zpqDvSSbYxIe36eNxtBQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tDkVC-0000000FO3M-1dC0; Wed, 20 Nov 2024 13:12:14 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tDkV9-0000000FO2e-1q1u for linux-riscv@lists.infradead.org; Wed, 20 Nov 2024 13:12:12 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 66B385C4C65; Wed, 20 Nov 2024 13:11:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BB08CC4CECD; Wed, 20 Nov 2024 13:12:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1732108330; bh=k+5dEYh0Wx0gYDwBov9cnbkAU6CMVjt/gVpf33OC/N8=; h=From:To:Cc:Subject:Date:From; b=jGZhu9Dwi+lGk9ZsjijKZtKKmfps855+KQDnqkWJagaRykvm2OPjoa3N7MXkOMafs TE+QWilPmnA3oHJchvWXEJw1WePCqb4iEZzbli7EjPOJcdSJGkR7Pl9HO21Q2LWteL eAwdFkF+dvkwEOJ2xB+30+nVCbEYibc1HqmsSgV7/nIoBve5Db+l6ZIACcoiViKHcT cccKAxYjyGLTsGykPjnZ4z1n8KFZXypqjklh5YxiP+OM3MzFg21y6pHacUYDc1qedp H6ayz5UPejtY03knpY4awE4AgRy3FByrk7wJdPB4tmtz0/3GEmcJTYBrmhvSGk1mN+ a9rDfMwo0nRnw== From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: Alexandre Ghiti , Albert Ou , David Hildenbrand , Palmer Dabbelt , Paul Walmsley , linux-riscv@lists.infradead.org, Oscar Salvador Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Andrew Bresticker , linux-kernel@vger.kernel.org, linux-mm@kvack.org, virtualization@lists.linux-foundation.org Subject: [PATCH fixes] riscv: mm: Do not call pmd dtor on vmemmap page table teardown Date: Wed, 20 Nov 2024 14:12:02 +0100 Message-ID: <20241120131203.1859787-1-bjorn@kernel.org> X-Mailer: git-send-email 2.45.2 MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20241120_051211_564744_2523F36B X-CRM114-Status: GOOD ( 14.95 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Björn Töpel The vmemmap's, which is used for RV64 with SPARSEMEM_VMEMMAP, page tables are populated using pmd (page middle directory) hugetables. However, the pmd allocation is not using the generic mechanism used by the VMA code (e.g. pmd_alloc()), or the RISC-V specific create_pgd_mapping()/alloc_pmd_late(). Instead, the vmemmap page table code allocates a page, and calls vmemmap_set_pmd(). This results in that the pmd ctor is *not* called, nor would it make sense to do so. Now, when tearing down a vmemmap page table pmd, the cleanup code would unconditionally, and incorrectly call the pmd dtor, which results in a crash (best case). This issue was found when running the HMM selftests: | tools/testing/selftests/mm# ./test_hmm.sh smoke | ... # when unloading the test_hmm.ko module | page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10915b | flags: 0x1000000000000000(node=0|zone=1) | raw: 1000000000000000 0000000000000000 dead000000000122 0000000000000000 | raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 | page dumped because: VM_BUG_ON_PAGE(ptdesc->pmd_huge_pte) | ------------[ cut here ]------------ | kernel BUG at include/linux/mm.h:3080! | Kernel BUG [#1] | Modules linked in: test_hmm(-) sch_fq_codel fuse drm drm_panel_orientation_quirks backlight dm_mod | CPU: 1 UID: 0 PID: 514 Comm: modprobe Tainted: G W 6.12.0-00982-gf2a4f1682d07 #2 | Tainted: [W]=WARN | Hardware name: riscv-virtio qemu/qemu, BIOS 2024.10 10/01/2024 | epc : remove_pgd_mapping+0xbec/0x1070 | ra : remove_pgd_mapping+0xbec/0x1070 | epc : ffffffff80010a68 ra : ffffffff80010a68 sp : ff20000000a73940 | gp : ffffffff827b2d88 tp : ff6000008785da40 t0 : ffffffff80fbce04 | t1 : 0720072007200720 t2 : 706d756420656761 s0 : ff20000000a73a50 | s1 : ff6000008915cff8 a0 : 0000000000000039 a1 : 0000000000000008 | a2 : ff600003fff0de20 a3 : 0000000000000000 a4 : 0000000000000000 | a5 : 0000000000000000 a6 : c0000000ffffefff a7 : ffffffff824469b8 | s2 : ff1c0000022456c0 s3 : ff1ffffffdbfffff s4 : ff6000008915c000 | s5 : ff6000008915c000 s6 : ff6000008915c000 s7 : ff1ffffffdc00000 | s8 : 0000000000000001 s9 : ff1ffffffdc00000 s10: ffffffff819a31f0 | s11: ffffffffffffffff t3 : ffffffff8000c950 t4 : ff60000080244f00 | t5 : ff60000080244000 t6 : ff20000000a73708 | status: 0000000200000120 badaddr: ffffffff80010a68 cause: 0000000000000003 | [] remove_pgd_mapping+0xbec/0x1070 | [] vmemmap_free+0x14/0x1e | [] section_deactivate+0x220/0x452 | [] sparse_remove_section+0x4a/0x58 | [] __remove_pages+0x7e/0xba | [] memunmap_pages+0x2bc/0x3fe | [] dmirror_device_remove_chunks+0x2ea/0x518 [test_hmm] | [] hmm_dmirror_exit+0x3e/0x1018 [test_hmm] | [] __riscv_sys_delete_module+0x15a/0x2a6 | [] do_trap_ecall_u+0x1f2/0x266 | [] _new_vmalloc_restore_context_a0+0xc6/0xd2 | Code: bf51 7597 0184 8593 76a5 854a 4097 0029 80e7 2c00 (9002) 7597 | ---[ end trace 0000000000000000 ]--- | Kernel panic - not syncing: Fatal exception in interrupt Add a check to avoid calling the pmd dtor, if the calling context is vmemmap_free(). Fixes: c75a74f4ba19 ("riscv: mm: Add memory hotplugging support") Signed-off-by: Björn Töpel --- arch/riscv/mm/init.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) base-commit: 57f7c7dc78cd09622b12920d92b40c1ce11b234e diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 0e8c20adcd98..fc53ce748c80 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1566,7 +1566,7 @@ static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd) pmd_clear(pmd); } -static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud) +static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud, bool is_vmemmap) { struct page *page = pud_page(*pud); struct ptdesc *ptdesc = page_ptdesc(page); @@ -1579,7 +1579,8 @@ static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud) return; } - pagetable_pmd_dtor(ptdesc); + if (!is_vmemmap) + pagetable_pmd_dtor(ptdesc); if (PageReserved(page)) free_reserved_page(page); else @@ -1703,7 +1704,7 @@ static void __meminit remove_pud_mapping(pud_t *pud_base, unsigned long addr, un remove_pmd_mapping(pmd_base, addr, next, is_vmemmap, altmap); if (pgtable_l4_enabled) - free_pmd_table(pmd_base, pudp); + free_pmd_table(pmd_base, pudp, is_vmemmap); } }