From patchwork Tue Aug 7 10:24:31 2018
X-Patchwork-Submitter: Joerg Roedel
X-Patchwork-Id: 10558493
From: Joerg Roedel
To: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Linus Torvalds, Andy Lutomirski, Dave Hansen, Josh Poimboeuf,
    Juergen Gross, Peter Zijlstra, Borislav Petkov, Jiri Kosina,
    Boris Ostrovsky, Brian Gerst, David Laight, Denys Vlasenko,
    Eduardo Valentin, Greg KH, Will Deacon, aliguori@amazon.com,
    daniel.gruss@iaik.tugraz.at, hughd@google.com, keescook@google.com,
    Andrea Arcangeli, Waiman Long, Pavel Machek, "David H. Gutteridge",
    jroedel@suse.de, joro@8bytes.org
Subject: [PATCH 3/3] x86/mm/pti: Clone kernel-image on PTE level for 32 bit
Date: Tue, 7 Aug 2018 12:24:31 +0200
Message-Id: <1533637471-30953-4-git-send-email-joro@8bytes.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1533637471-30953-1-git-send-email-joro@8bytes.org>
References: <1533637471-30953-1-git-send-email-joro@8bytes.org>

From: Joerg Roedel

On 32 bit the kernel sections are not huge-page aligned.
When we clone them on PMD-level we inevitably map some areas to
user-space that are normal kernel memory and may contain secrets.
To prevent that, we need to clone the kernel-image on PTE-level
for 32 bit.

Also make the page-table cloning code more general so that it can
handle PMD- and PTE-level cloning. This can be generalized further
in the future to also handle clones on the P4D-level.

Signed-off-by: Joerg Roedel
---
 arch/x86/mm/pti.c | 140 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 99 insertions(+), 41 deletions(-)

diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 5164c98..1dc5c68 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -54,6 +54,16 @@
 #define __GFP_NOTRACK	0
 #endif
 
+/*
+ * Define the page-table levels we clone for user-space on 32
+ * and 64 bit.
+ */
+#ifdef CONFIG_X86_64
+#define	PTI_LEVEL_KERNEL_IMAGE	PTI_CLONE_PMD
+#else
+#define	PTI_LEVEL_KERNEL_IMAGE	PTI_CLONE_PTE
+#endif
+
 static void __init pti_print_if_insecure(const char *reason)
 {
 	if (boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
@@ -228,7 +238,6 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
 	return pmd_offset(pud, address);
 }
 
-#ifdef CONFIG_X86_VSYSCALL_EMULATION
 /*
  * Walk the shadow copy of the page tables (optionally) trying to allocate
  * page table pages on the way down. Does not support large pages.
@@ -270,6 +279,7 @@ static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address)
 	return pte;
 }
 
+#ifdef CONFIG_X86_VSYSCALL_EMULATION
 static void __init pti_setup_vsyscall(void)
 {
 	pte_t *pte, *target_pte;
@@ -290,8 +300,14 @@ static void __init pti_setup_vsyscall(void)
 static void __init pti_setup_vsyscall(void) { }
 #endif
 
+enum pti_clone_level {
+	PTI_CLONE_PMD,
+	PTI_CLONE_PTE,
+};
+
 static void
-pti_clone_pmds(unsigned long start, unsigned long end)
+pti_clone_pgtable(unsigned long start, unsigned long end,
+		  enum pti_clone_level level)
 {
 	unsigned long addr;
 
@@ -299,7 +315,8 @@ pti_clone_pmds(unsigned long start, unsigned long end)
 	 * Clone the populated PMDs which cover start to end. These PMD areas
 	 * can have holes.
 	 */
-	for (addr = start; addr < end; addr += PMD_SIZE) {
+	for (addr = start; addr < end;) {
+		pte_t *pte, *target_pte;
 		pmd_t *pmd, *target_pmd;
 		pgd_t *pgd;
 		p4d_t *p4d;
@@ -315,44 +332,84 @@ pti_clone_pmds(unsigned long start, unsigned long end)
 		p4d = p4d_offset(pgd, addr);
 		if (WARN_ON(p4d_none(*p4d)))
 			return;
+
 		pud = pud_offset(p4d, addr);
-		if (pud_none(*pud))
+		if (pud_none(*pud)) {
+			addr += PUD_SIZE;
 			continue;
+		}
+
 		pmd = pmd_offset(pud, addr);
-		if (pmd_none(*pmd))
+		if (pmd_none(*pmd)) {
+			addr += PMD_SIZE;
 			continue;
+		}
 
-		target_pmd = pti_user_pagetable_walk_pmd(addr);
-		if (WARN_ON(!target_pmd))
-			return;
-
-		/*
-		 * Only clone present PMDs. This ensures only setting
-		 * _PAGE_GLOBAL on present PMDs. This should only be
-		 * called on well-known addresses anyway, so a non-
-		 * present PMD would be a surprise.
-		 */
-		if (WARN_ON(!(pmd_flags(*pmd) & _PAGE_PRESENT)))
-			return;
-
-		/*
-		 * Setting 'target_pmd' below creates a mapping in both
-		 * the user and kernel page tables. It is effectively
-		 * global, so set it as global in both copies. Note:
-		 * the X86_FEATURE_PGE check is not _required_ because
-		 * the CPU ignores _PAGE_GLOBAL when PGE is not
-		 * supported. The check keeps consistentency with
-		 * code that only set this bit when supported.
-		 */
-		if (boot_cpu_has(X86_FEATURE_PGE))
-			*pmd = pmd_set_flags(*pmd, _PAGE_GLOBAL);
-
-		/*
-		 * Copy the PMD. That is, the kernelmode and usermode
-		 * tables will share the last-level page tables of this
-		 * address range
-		 */
-		*target_pmd = *pmd;
+		if (pmd_large(*pmd) || level == PTI_CLONE_PMD) {
+			target_pmd = pti_user_pagetable_walk_pmd(addr);
+			if (WARN_ON(!target_pmd))
+				return;
+
+			/*
+			 * Only clone present PMDs. This ensures only setting
+			 * _PAGE_GLOBAL on present PMDs. This should only be
+			 * called on well-known addresses anyway, so a non-
+			 * present PMD would be a surprise.
+			 */
+			if (WARN_ON(!(pmd_flags(*pmd) & _PAGE_PRESENT)))
+				return;
+
+			/*
+			 * Setting 'target_pmd' below creates a mapping in both
+			 * the user and kernel page tables. It is effectively
+			 * global, so set it as global in both copies. Note:
+			 * the X86_FEATURE_PGE check is not _required_ because
+			 * the CPU ignores _PAGE_GLOBAL when PGE is not
+			 * supported. The check keeps consistentency with
+			 * code that only set this bit when supported.
+			 */
+			if (boot_cpu_has(X86_FEATURE_PGE))
+				*pmd = pmd_set_flags(*pmd, _PAGE_GLOBAL);
+
+			/*
+			 * Copy the PMD. That is, the kernelmode and usermode
+			 * tables will share the last-level page tables of this
+			 * address range
+			 */
+			*target_pmd = *pmd;
+
+			addr += PMD_SIZE;
+
+		} else if (level == PTI_CLONE_PTE) {
+
+			/* Walk the page-table down to the pte level */
+			pte = pte_offset_kernel(pmd, addr);
+			if (pte_none(*pte)) {
+				addr += PAGE_SIZE;
+				continue;
+			}
+
+			/* Only clone present PTEs */
+			if (WARN_ON(!(pte_flags(*pte) & _PAGE_PRESENT)))
+				return;
+
+			/* Allocate PTE in the user page-table */
+			target_pte = pti_user_pagetable_walk_pte(addr);
+			if (WARN_ON(!target_pte))
+				return;
+
+			/* Set GLOBAL bit in both PTEs */
+			if (boot_cpu_has(X86_FEATURE_PGE))
+				*pte = pte_set_flags(*pte, _PAGE_GLOBAL);
+
+			/* Clone the PTE */
+			*target_pte = *pte;
+
+			addr += PAGE_SIZE;
+
+		} else {
+			BUG();
+		}
 	}
 }
 
@@ -398,7 +455,7 @@ static void __init pti_clone_user_shared(void)
 	start = CPU_ENTRY_AREA_BASE;
 	end = start + (PAGE_SIZE * CPU_ENTRY_AREA_PAGES);
 
-	pti_clone_pmds(start, end);
+	pti_clone_pgtable(start, end, PTI_CLONE_PMD);
 }
 #endif /* CONFIG_X86_64 */
 
@@ -417,8 +474,9 @@ static void __init pti_setup_espfix64(void)
  */
 static void pti_clone_entry_text(void)
 {
-	pti_clone_pmds((unsigned long) __entry_text_start,
-		       (unsigned long) __irqentry_text_end);
+	pti_clone_pgtable((unsigned long) __entry_text_start,
+			  (unsigned long) __irqentry_text_end,
+			  PTI_CLONE_PMD);
 }
 
 /*
@@ -500,10 +558,10 @@ static void pti_clone_kernel_text(void)
 	 * pti_set_kernel_image_nonglobal() did to clear the
 	 * global bit.
 	 */
-	pti_clone_pmds(start, end_clone);
+	pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE);
 
 	/*
-	 * pti_clone_pmds() will set the global bit in any PMDs
+	 * pti_clone_pgtable() will set the global bit in any PMDs
 	 * that it clones, but we also need to get any PTEs in
 	 * the last level for areas that are not huge-page-aligned.
 	 */
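
Not part of the patch itself, but the over-mapping the commit message
describes is easy to quantify. The following stand-alone user-space
sketch (hypothetical section addresses; 4 KiB pages and 4 MiB PMDs, as
on 32-bit non-PAE x86) compares how many bytes a PMD-level clone of a
non-huge-page-aligned range maps versus a PTE-level clone:

/*
 * Illustrative sketch only -- not kernel code. The section boundaries
 * below are made-up example values.
 */
#include <stdio.h>

#define EX_PAGE_SIZE	(4UL * 1024)		/* PTE granularity: 4 KiB */
#define EX_PMD_SIZE	(4UL * 1024 * 1024)	/* PMD granularity: 4 MiB */

/* Bytes mapped when [start, end) is cloned at the given granularity. */
static unsigned long mapped_bytes(unsigned long start, unsigned long end,
				  unsigned long granule)
{
	unsigned long first = start & ~(granule - 1);		/* round down */
	unsigned long last = (end + granule - 1) & ~(granule - 1); /* round up */

	return last - first;
}

int main(void)
{
	/* Hypothetical, non-huge-page-aligned kernel section. */
	unsigned long start = 0xc1002340UL;
	unsigned long end   = 0xc1008f00UL;

	unsigned long pte = mapped_bytes(start, end, EX_PAGE_SIZE);
	unsigned long pmd = mapped_bytes(start, end, EX_PMD_SIZE);

	printf("PTE-level clone maps %lu KiB\n", pte / 1024);
	printf("PMD-level clone maps %lu KiB (%lu KiB of unrelated kernel memory)\n",
	       pmd / 1024, (pmd - pte) / 1024);

	return 0;
}

For these example addresses the PTE-level clone maps 28 KiB, while the
PMD-level clone maps the full 4 MiB area around the section, exposing
roughly 4 MiB of unrelated kernel memory to user-space. That exposure
is what the PTE-level cloning for 32 bit in this patch avoids.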