From patchwork Fri Jun 18 14:09:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jisheng Zhang X-Patchwork-Id: 12331543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79D91C48BDF for ; Fri, 18 Jun 2021 14:15:35 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3E421610C7 for ; Fri, 18 Jun 2021 14:15:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3E421610C7 Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=mail.ustc.edu.cn Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-ID:Subject:Cc:To: From:Date:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=5XQbGOco/pP7j039zwn1UIm2FYDCDbYdSUiIjv9RMj0=; b=Gv2wGX3cv5UIjr eNTMjEoVtuO/w+MOlT68i3t4iWbJbfH3FSQgUJ1gpU9sW4qAHOM7h9u6x+EtJxCFxZrFbZOWAw0Gv lQSFFf+1hrdumKhzjKRNrBwqCzEeDhfkKWJEPL5y9ysYRkL+7GKiFHKMoEuEZ+yd8spJYf417Lpd4 k98Mu9yW+zYKtVQutzP7xmhws5BnqS4kjE3+Uhne73D43J3CnbtCX6pBgLQ6F6DyI+cB9ngpbJAjy 81KI+/havyhKVMNHZT1wVD39YZ0F8wrK/E9bTQU/Tt4Pg4EeMOcWkuLjh2lO5aBigdMHjDCkVGvMT yI6zeAs7/dYCRfkLAgrg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1luFHD-00ERha-KZ; Fri, 18 Jun 2021 14:15:19 +0000 Received: from email6.ustc.edu.cn ([2001:da8:d800::8] helo=ustc.edu.cn) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1luFHA-00ERgk-Bo for linux-riscv@lists.infradead.org; Fri, 18 Jun 2021 14:15:18 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mail.ustc.edu.cn; s=dkim; h=Received:Date:From:To:Cc:Subject: Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding; bh=R5rGWfYjWzannEPfCFi17k2rcSwUm8SqL8TNYdn/BEM=; b=s7whCFqb7FLUu kEGCCMwhOpiWfcIUIGwYidH0mGDW/FoKMrc96cyok59W4hsU7T1htQ2r76jFyOIP YFS1jr5YtCfjFGlghncCdqGReB6SBs5x8MbYy8IdjmGq/O9kSdzQcRJF1uN2hk30 8QhvQx85uOKcI2gJXwcjOSKtGEGAYA= Received: from xhacker (unknown [101.86.20.15]) by newmailweb.ustc.edu.cn (Coremail) with SMTP id LkAmygAHk6BXqsxgBAv_AA--.19232S2; Fri, 18 Jun 2021 22:14:48 +0800 (CST) Date: Fri, 18 Jun 2021 22:09:13 +0800 From: Jisheng Zhang To: Jonathan Corbet , Paul Walmsley , Palmer Dabbelt , Albert Ou , Andrey Ryabinin , Alexander Potapenko , Andrey Konovalov , Dmitry Vyukov , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Alexandre Ghiti , Andreas Schwab Cc: linux-doc@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, netdev@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH v3] riscv: Ensure BPF_JIT_REGION_START aligned with PMD size Message-ID: <20210618220913.6fde1957@xhacker> MIME-Version: 1.0 X-CM-TRANSID: LkAmygAHk6BXqsxgBAv_AA--.19232S2 X-Coremail-Antispam: 1UD129KBjvJXoW3AFy3GrWDWF1xCw4UAr4fKrg_yoW7Zw1xpr 45Jr1xGrW8JryUXw18Ary5Cr1UA3WUC3W3JrnxJr15XFyUGF1UAr1UtFW3Xr1DXF4rJ3W7 tr1DGrWUtr1UAw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUkGb7Iv0xC_Kw4lb4IE77IF4wAFF20E14v26ryj6rWUM7CY07I2 0VC2zVCF04k26cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28lY4IEw2IIxxk0rw A2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xII jxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwV C2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC 0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr 1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcVAKI48JM4IIrI8v6xkF7I0E8cxan2IY04v7 MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr 0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVW8ZVWrXwCIc40Y0x0E wIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JV WxJwCI42IY6xAIw20EY4v20xvaj40_Wr1j6rW3Jr1lIxAIcVC2z280aVAFwI0_Jr0_Gr1l IxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x07b0NVkUUU UU= X-CM-SenderInfo: xmv2xttqjtqzxdloh3xvwfhvlgxou0/ X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210618_071517_247450_EF4D5D25 X-CRM114-Status: GOOD ( 15.23 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Jisheng Zhang Andreas reported commit fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") breaks booting with one kind of defconfig, I reproduced a kernel panic with the defconfig: [ 0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220 [ 0.139159] Oops [#1] [ 0.139303] Modules linked in: [ 0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1 [ 0.139934] Hardware name: riscv-virtio,qemu (DT) [ 0.140193] epc : __memset+0xc4/0xfc [ 0.140416] ra : skb_flow_dissector_init+0x1e/0x82 [ 0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0 [ 0.140878] gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158 [ 0.141156] t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0 [ 0.141424] s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000 [ 0.141654] a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064 [ 0.141893] a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff [ 0.142126] s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088 [ 0.142353] s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438 [ 0.142584] s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac [ 0.142810] s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000 [ 0.143042] t5 : 0000000000000155 t6 : 00000000000003ff [ 0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f [ 0.143560] [] __memset+0xc4/0xfc [ 0.143859] [] init_default_flow_dissectors+0x22/0x60 [ 0.144092] [] do_one_initcall+0x3e/0x168 [ 0.144278] [] kernel_init_freeable+0x1c8/0x224 [ 0.144479] [] kernel_init+0x12/0x110 [ 0.144658] [] ret_from_exception+0x0/0xc [ 0.145124] ---[ end trace f1e9643daa46d591 ]--- After some investigation, I think I found the root cause: commit 2bfc6cd81bd ("move kernel mapping outside of linear mapping") moves BPF JIT region after the kernel: | #define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end) The &_end is unlikely aligned with PMD size, so the front bpf jit region sits with part of kernel .data section in one PMD size mapping. But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is called to make the first bpf jit prog ROX, we will make part of kernel .data section RO too, so when we write to, for example memset the .data section, MMU will trigger a store page fault. To fix the issue, we need to ensure the BPF JIT region is PMD size aligned. This patch acchieve this goal by restoring the BPF JIT region to original position, I.E the 128MB before kernel .text section. The modification to kasan_init.c is inspired by Alexandre. Fixes: fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") Reported-by: Andreas Schwab Signed-off-by: Jisheng Zhang --- Since v2: - Split the local vars rename modification into another patch per Alexandre suggestion - Add Fixes tag Since v1: - Fix early boot hang when kasan is enabled - Update Documentation/riscv/vm-layout.rst Documentation/riscv/vm-layout.rst | 4 ++-- arch/riscv/include/asm/pgtable.h | 5 ++--- arch/riscv/mm/kasan_init.c | 2 +- 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/Documentation/riscv/vm-layout.rst b/Documentation/riscv/vm-layout.rst index 329d32098af4..b7f98930d38d 100644 --- a/Documentation/riscv/vm-layout.rst +++ b/Documentation/riscv/vm-layout.rst @@ -58,6 +58,6 @@ RISC-V Linux Kernel SV39 | ____________________________________________________________|____________________________________________________________ | | | | - ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules - ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel, BPF + ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules, BPF + ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel __________________|____________|__________________|_________|____________________________________________________________ diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 9469f464e71a..380cd3a7e548 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -30,9 +30,8 @@ #define BPF_JIT_REGION_SIZE (SZ_128M) #ifdef CONFIG_64BIT -/* KASLR should leave at least 128MB for BPF after the kernel */ -#define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end) -#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE) +#define BPF_JIT_REGION_START (BPF_JIT_REGION_END - BPF_JIT_REGION_SIZE) +#define BPF_JIT_REGION_END (MODULES_END) #else #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c index 9daacae93e33..55c113345460 100644 --- a/arch/riscv/mm/kasan_init.c +++ b/arch/riscv/mm/kasan_init.c @@ -201,7 +201,7 @@ void __init kasan_init(void) /* Populate kernel, BPF, modules mapping */ kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR), - kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END)); + kasan_mem_to_shadow((const void *)MODULES_VADDR + SZ_2G)); for (i = 0; i < PTRS_PER_PTE; i++) set_pte(&kasan_early_shadow_pte[i],