From patchwork Thu Jan 10 21:09:33 2019
X-Patchwork-Submitter: Khalid Aziz
X-Patchwork-Id: 10756897
From: Khalid Aziz
Subject: [RFC PATCH v7 01/16] mm: add MAP_HUGETLB support to vm_mmap
Date: Thu, 10 Jan 2019 14:09:33 -0700
Message-Id: <5b692498fde91fe22181cdb6ed20f058f598fb43.1547153058.git.khalid.aziz@oracle.com>

From: Tycho Andersen

vm_mmap is exported, which means kernel
modules can use it. In particular, for testing XPFO support, we want to use it with the MAP_HUGETLB flag, so let's support it via vm_mmap. Signed-off-by: Tycho Andersen Tested-by: Marco Benatto Signed-off-by: Khalid Aziz --- include/linux/mm.h | 2 ++ mm/mmap.c | 19 +------------------ mm/util.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 35 insertions(+), 18 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 5411de93a363..30bddc7b3c75 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2361,6 +2361,8 @@ struct vm_unmapped_area_info { extern unsigned long unmapped_area(struct vm_unmapped_area_info *info); extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); +struct file *map_hugetlb_setup(unsigned long *len, unsigned long flags); + /* * Search for an unmapped address range. * diff --git a/mm/mmap.c b/mm/mmap.c index 6c04292e16a7..c668d7d27c2b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1582,24 +1582,7 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file))) goto out_fput; } else if (flags & MAP_HUGETLB) { - struct user_struct *user = NULL; - struct hstate *hs; - - hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); - if (!hs) - return -EINVAL; - - len = ALIGN(len, huge_page_size(hs)); - /* - * VM_NORESERVE is used because the reservations will be - * taken when vm_ops->mmap() is called - * A dummy user value is used because we are not locking - * memory so no accounting is necessary - */ - file = hugetlb_file_setup(HUGETLB_ANON_FILE, len, - VM_NORESERVE, - &user, HUGETLB_ANONHUGE_INODE, - (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); + file = map_hugetlb_setup(&len, flags); if (IS_ERR(file)) return PTR_ERR(file); } diff --git a/mm/util.c b/mm/util.c index 8bf08b5b5760..536c14cf88ba 100644 --- a/mm/util.c +++ b/mm/util.c @@ -357,6 +357,29 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, return ret; } +struct file *map_hugetlb_setup(unsigned long *len, unsigned long flags) +{ + struct user_struct *user = NULL; + struct hstate *hs; + + hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); + if (!hs) + return ERR_PTR(-EINVAL); + + *len = ALIGN(*len, huge_page_size(hs)); + + /* + * VM_NORESERVE is used because the reservations will be + * taken when vm_ops->mmap() is called + * A dummy user value is used because we are not locking + * memory so no accounting is necessary + */ + return hugetlb_file_setup(HUGETLB_ANON_FILE, *len, + VM_NORESERVE, + &user, HUGETLB_ANONHUGE_INODE, + (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); +} + unsigned long vm_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flag, unsigned long offset) @@ -366,6 +389,15 @@ unsigned long vm_mmap(struct file *file, unsigned long addr, if (unlikely(offset_in_page(offset))) return -EINVAL; + if (flag & MAP_HUGETLB) { + if (file) + return -EINVAL; + + file = map_hugetlb_setup(&len, flag); + if (IS_ERR(file)) + return PTR_ERR(file); + } + return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT); } EXPORT_SYMBOL(vm_mmap); From patchwork Thu Jan 10 21:09:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756901 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D2E113B5 
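As a usage sketch for the vm_mmap()/MAP_HUGETLB change in patch 01 above: this is illustrative only and not part of the series; xpfo_test_map_huge() is a hypothetical test-module helper, while vm_mmap(), vm_munmap() and the MAP_*/PROT_* flags are the existing kernel interfaces the patch builds on.

/*
 * Minimal sketch: map and unmap an anonymous huge-page region from a
 * kernel module (e.g. an LKDTM-style XPFO test), which the MAP_HUGETLB
 * handling added to vm_mmap() above makes possible.
 */
#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/err.h>

static int xpfo_test_map_huge(unsigned long len)
{
	unsigned long addr;

	/* No file is passed; vm_mmap() now sets up the hugetlb file itself. */
	addr = vm_mmap(NULL, 0, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, 0);
	if (IS_ERR_VALUE(addr))
		return (int)addr;	/* negative errno encoded in the value */

	/* ... exercise the mapping from process context, then tear it down. */
	return vm_munmap(addr, len);
}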
From: Khalid Aziz
Subject: [RFC PATCH v7 02/16] x86: always set IF before oopsing from page fault
Date: Thu, 10 Jan 2019 14:09:34 -0700
Message-Id: <46b0f1a61dabf6440d461c063a32573b96f3a5ce.1547153058.git.khalid.aziz@oracle.com>

From: Tycho Andersen

Oopsing might kill the task, via rewind_stack_do_exit() at the bottom, and that might sleep:

Aug 23 19:30:27 xpfo kernel: [ 38.302714] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33
Aug 23 19:30:27 xpfo kernel: [ 38.303837] in_atomic(): 0, irqs_disabled(): 1, pid: 1970, name: lkdtm_xpfo_test
Aug 23 19:30:27 xpfo kernel: [ 38.304758] CPU: 3 PID:
1970 Comm: lkdtm_xpfo_test Tainted: G D 4.13.0-rc5+ #228 Aug 23 19:30:27 xpfo kernel: [ 38.305813] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Aug 23 19:30:27 xpfo kernel: [ 38.306926] Call Trace: Aug 23 19:30:27 xpfo kernel: [ 38.307243] dump_stack+0x63/0x8b Aug 23 19:30:27 xpfo kernel: [ 38.307665] ___might_sleep+0xec/0x110 Aug 23 19:30:27 xpfo kernel: [ 38.308139] __might_sleep+0x45/0x80 Aug 23 19:30:27 xpfo kernel: [ 38.308593] exit_signals+0x21/0x1c0 Aug 23 19:30:27 xpfo kernel: [ 38.309046] ? blocking_notifier_call_chain+0x11/0x20 Aug 23 19:30:27 xpfo kernel: [ 38.309677] do_exit+0x98/0xbf0 Aug 23 19:30:27 xpfo kernel: [ 38.310078] ? smp_reader+0x27/0x40 [lkdtm] Aug 23 19:30:27 xpfo kernel: [ 38.310604] ? kthread+0x10f/0x150 Aug 23 19:30:27 xpfo kernel: [ 38.311045] ? read_user_with_flags+0x60/0x60 [lkdtm] Aug 23 19:30:27 xpfo kernel: [ 38.311680] rewind_stack_do_exit+0x17/0x20 To be safe, let's just always enable irqs. The particular case I'm hitting is: Aug 23 19:30:27 xpfo kernel: [ 38.278615] __bad_area_nosemaphore+0x1a9/0x1d0 Aug 23 19:30:27 xpfo kernel: [ 38.278617] bad_area_nosemaphore+0xf/0x20 Aug 23 19:30:27 xpfo kernel: [ 38.278618] __do_page_fault+0xd1/0x540 Aug 23 19:30:27 xpfo kernel: [ 38.278620] ? irq_work_queue+0x9b/0xb0 Aug 23 19:30:27 xpfo kernel: [ 38.278623] ? wake_up_klogd+0x36/0x40 Aug 23 19:30:27 xpfo kernel: [ 38.278624] trace_do_page_fault+0x3c/0xf0 Aug 23 19:30:27 xpfo kernel: [ 38.278625] do_async_page_fault+0x14/0x60 Aug 23 19:30:27 xpfo kernel: [ 38.278627] async_page_fault+0x28/0x30 When a fault is in kernel space which has been triggered by XPFO. Signed-off-by: Tycho Andersen CC: x86@kernel.org Signed-off-by: Khalid Aziz --- arch/x86/mm/fault.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 71d4b9d4d43f..ba51652fbd33 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -748,6 +748,12 @@ no_context(struct pt_regs *regs, unsigned long error_code, /* Executive summary in case the body of the oops scrolled away */ printk(KERN_DEFAULT "CR2: %016lx\n", address); + /* + * We're about to oops, which might kill the task. Make sure we're + * allowed to sleep. 
+ */ + flags |= X86_EFLAGS_IF; + oops_end(flags, regs, sig); } From patchwork Thu Jan 10 21:09:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756909 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 26B8B13B5 for ; Thu, 10 Jan 2019 21:11:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1555D29BB3 for ; Thu, 10 Jan 2019 21:11:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 07D4929BF0; Thu, 10 Jan 2019 21:11:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D4A3029BEF for ; Thu, 10 Jan 2019 21:11:09 +0000 (UTC) Received: (qmail 9860 invoked by uid 550); 10 Jan 2019 21:10:55 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9745 invoked from network); 10 Jan 2019 21:10:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=2WjKXUXlduh+4gXlgg+9YzVZNqCynpwL4YEjEIxTm8I=; b=iLvlkH6Stvvcg+Ctof8TpYoApkwhIj3uuBQcE0dlH8wWt/hv7+KdWl5qyHFZgSwIXI9h adxe3VO6BMtzYxP6JYXFBOHQbZBl/zF8kpTid0XqfUM+1VMxkjQmu1F9bw8u48LFAtTn RCcHQ9SJeFpsWGlaIOfHCw0t6K+82WlTWVoRIXlQj1faSaDt4J28JMU2xKGBPUgopdF5 vAQU3apAZBbI0Dv/KS7NHPueMDX1iFqDx2EUSa4Bt2LjR6vWHkfMnm9ukN3+mzxcKspE tvG+xYzUSvNDPORfv78EUKih8YpBsGG9H3QRDrDi2gxeLzW3jJMRvSUJp8wnMsMXq8Pi lQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Tycho Andersen , Marco Benatto , Khalid Aziz Subject: [RFC PATCH v7 03/16] mm, x86: Add support for eXclusive Page Frame Ownership (XPFO) Date: Thu, 10 Jan 2019 14:09:35 -0700 Message-Id: <231b09ba6bbcccc82ba001177c9d5ebcc8a4a11c.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 
engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger This patch adds support for XPFO which protects against 'ret2dir' kernel attacks. The basic idea is to enforce exclusive ownership of page frames by either the kernel or userspace, unless explicitly requested by the kernel. Whenever a page destined for userspace is allocated, it is unmapped from physmap (the kernel's page table). When such a page is reclaimed from userspace, it is mapped back to physmap. Additional fields in the page_ext struct are used for XPFO housekeeping, specifically: - two flags to distinguish user vs. kernel pages and to tag unmapped pages. - a reference counter to balance kmap/kunmap operations. - a lock to serialize access to the XPFO fields. This patch is based on the work of Vasileios P. Kemerlis et al. who published their work in this paper: http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf v6: * use flush_tlb_kernel_range() instead of __flush_tlb_one, so we flush the tlb entry on all CPUs when unmapping it in kunmap * handle lookup_page_ext()/lookup_xpfo() returning NULL * drop lots of BUG()s in favor of WARN() * don't disable irqs in xpfo_kmap/xpfo_kunmap, export __split_large_page so we can do our own alloc_pages(GFP_ATOMIC) to pass it CC: x86@kernel.org Suggested-by: Vasileios P. Kemerlis Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Marco Benatto [jsteckli@amazon.de: rebased from v4.13 to v4.19] Signed-off-by: Julian Stecklina Signed-off-by: Khalid Aziz --- .../admin-guide/kernel-parameters.txt | 2 + arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 26 ++ arch/x86/mm/Makefile | 2 + arch/x86/mm/pageattr.c | 23 +- arch/x86/mm/xpfo.c | 114 +++++++++ include/linux/highmem.h | 15 +- include/linux/xpfo.h | 47 ++++ mm/Makefile | 1 + mm/page_alloc.c | 2 + mm/page_ext.c | 4 + mm/xpfo.c | 222 ++++++++++++++++++ security/Kconfig | 19 ++ 13 files changed, 456 insertions(+), 22 deletions(-) create mode 100644 arch/x86/mm/xpfo.c create mode 100644 include/linux/xpfo.h create mode 100644 mm/xpfo.c diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index aefd358a5ca3..c4c62599f216 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2982,6 +2982,8 @@ nox2apic [X86-64,APIC] Do not enable x2APIC mode. + noxpfo [X86-64] Disable XPFO when CONFIG_XPFO is on. + cpu0_hotplug [X86] Turn on CPU0 hotplug feature when CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off. Some features depend on CPU0. 
Known dependencies are: diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8689e794a43c..d69d8cc6e57e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -207,6 +207,7 @@ config X86 select USER_STACKTRACE_SUPPORT select VIRT_TO_BUS select X86_FEATURE_NAMES if PROC_FS + select ARCH_SUPPORTS_XPFO if X86_64 config INSTRUCTION_DECODER def_bool y diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 40616e805292..ad2d1792939d 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1437,6 +1437,32 @@ static inline bool arch_has_pfn_modify_check(void) return boot_cpu_has_bug(X86_BUG_L1TF); } +/* + * The current flushing context - we pass it instead of 5 arguments: + */ +struct cpa_data { + unsigned long *vaddr; + pgd_t *pgd; + pgprot_t mask_set; + pgprot_t mask_clr; + unsigned long numpages; + int flags; + unsigned long pfn; + unsigned force_split : 1, + force_static_prot : 1; + int curpage; + struct page **pages; +}; + + +int +should_split_large_page(pte_t *kpte, unsigned long address, + struct cpa_data *cpa); +extern spinlock_t cpa_lock; +int +__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address, + struct page *base); + #include #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 4b101dd6e52f..93b0fdaf4a99 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -53,3 +53,5 @@ obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index a1bcde35db4c..84002442ab61 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -26,23 +26,6 @@ #include #include -/* - * The current flushing context - we pass it instead of 5 arguments: - */ -struct cpa_data { - unsigned long *vaddr; - pgd_t *pgd; - pgprot_t mask_set; - pgprot_t mask_clr; - unsigned long numpages; - int flags; - unsigned long pfn; - unsigned force_split : 1, - force_static_prot : 1; - int curpage; - struct page **pages; -}; - enum cpa_warn { CPA_CONFLICT, CPA_PROTECT, @@ -57,7 +40,7 @@ static const int cpa_warn_level = CPA_PROTECT; * entries change the page attribute in parallel to some other cpu * splitting a large page entry along with changing the attribute. */ -static DEFINE_SPINLOCK(cpa_lock); +DEFINE_SPINLOCK(cpa_lock); #define CPA_FLUSHTLB 1 #define CPA_ARRAY 2 @@ -869,7 +852,7 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address, return 0; } -static int should_split_large_page(pte_t *kpte, unsigned long address, +int should_split_large_page(pte_t *kpte, unsigned long address, struct cpa_data *cpa) { int do_split; @@ -919,7 +902,7 @@ static void split_set_pte(struct cpa_data *cpa, pte_t *pte, unsigned long pfn, set_pte(pte, pfn_pte(pfn, ref_prot)); } -static int +int __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address, struct page *base) { diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c new file mode 100644 index 000000000000..d1f04ea533cd --- /dev/null +++ b/arch/x86/mm/xpfo.c @@ -0,0 +1,114 @@ +/* + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. 
Kemerlis + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#include + +#include + +extern spinlock_t cpa_lock; + +/* Update a single kernel page table entry */ +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot) +{ + unsigned int level; + pgprot_t msk_clr; + pte_t *pte = lookup_address((unsigned long)kaddr, &level); + + if (unlikely(!pte)) { + WARN(1, "xpfo: invalid address %p\n", kaddr); + return; + } + + switch (level) { + case PG_LEVEL_4K: + set_pte_atomic(pte, pfn_pte(page_to_pfn(page), canon_pgprot(prot))); + break; + case PG_LEVEL_2M: + case PG_LEVEL_1G: { + struct cpa_data cpa = { }; + int do_split; + + if (level == PG_LEVEL_2M) + msk_clr = pmd_pgprot(*(pmd_t*)pte); + else + msk_clr = pud_pgprot(*(pud_t*)pte); + + cpa.vaddr = kaddr; + cpa.pages = &page; + cpa.mask_set = prot; + cpa.mask_clr = msk_clr; + cpa.numpages = 1; + cpa.flags = 0; + cpa.curpage = 0; + cpa.force_split = 0; + + + do_split = should_split_large_page(pte, (unsigned long)kaddr, + &cpa); + if (do_split) { + struct page *base; + + base = alloc_pages(GFP_ATOMIC, 0); + if (!base) { + WARN(1, "xpfo: failed to split large page\n"); + break; + } + + if (!debug_pagealloc_enabled()) + spin_lock(&cpa_lock); + if (__split_large_page(&cpa, pte, (unsigned long)kaddr, base) < 0) + WARN(1, "xpfo: failed to split large page\n"); + if (!debug_pagealloc_enabled()) + spin_unlock(&cpa_lock); + } + + break; + } + case PG_LEVEL_512G: + /* fallthrough, splitting infrastructure doesn't + * support 512G pages. */ + default: + WARN(1, "xpfo: unsupported page level %x\n", level); + } + +} + +inline void xpfo_flush_kernel_tlb(struct page *page, int order) +{ + int level; + unsigned long size, kaddr; + + kaddr = (unsigned long)page_address(page); + + if (unlikely(!lookup_address(kaddr, &level))) { + WARN(1, "xpfo: invalid address to flush %lx %d\n", kaddr, level); + return; + } + + switch (level) { + case PG_LEVEL_4K: + size = PAGE_SIZE; + break; + case PG_LEVEL_2M: + size = PMD_SIZE; + break; + case PG_LEVEL_1G: + size = PUD_SIZE; + break; + default: + WARN(1, "xpfo: unsupported page level %x\n", level); + return; + } + + flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); +} diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 0690679832d4..1fdae929e38b 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -8,6 +8,7 @@ #include #include #include +#include #include @@ -56,24 +57,34 @@ static inline struct page *kmap_to_page(void *addr) #ifndef ARCH_HAS_KMAP static inline void *kmap(struct page *page) { + void *kaddr; + might_sleep(); - return page_address(page); + kaddr = page_address(page); + xpfo_kmap(kaddr, page); + return kaddr; } static inline void kunmap(struct page *page) { + xpfo_kunmap(page_address(page), page); } static inline void *kmap_atomic(struct page *page) { + void *kaddr; + preempt_disable(); pagefault_disable(); - return page_address(page); + kaddr = page_address(page); + xpfo_kmap(kaddr, page); + return kaddr; } #define kmap_atomic_prot(page, prot) kmap_atomic(page) static inline void __kunmap_atomic(void *addr) { + xpfo_kunmap(addr, virt_to_page(addr)); pagefault_enable(); preempt_enable(); } diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h new file mode 100644 index 000000000000..a39259ce0174 --- /dev/null +++ b/include/linux/xpfo.h @@ -0,0 +1,47 @@ +/* + * Copyright (C) 2017 Docker, Inc. 
+ * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * Tycho Andersen + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#ifndef _LINUX_XPFO_H +#define _LINUX_XPFO_H + +#include +#include + +struct page; + +#ifdef CONFIG_XPFO + +extern struct page_ext_operations page_xpfo_ops; + +void set_kpte(void *kaddr, struct page *page, pgprot_t prot); +void xpfo_dma_map_unmap_area(bool map, const void *addr, size_t size, + enum dma_data_direction dir); +void xpfo_flush_kernel_tlb(struct page *page, int order); + +void xpfo_kmap(void *kaddr, struct page *page); +void xpfo_kunmap(void *kaddr, struct page *page); +void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp); +void xpfo_free_pages(struct page *page, int order); + +#else /* !CONFIG_XPFO */ + +static inline void xpfo_kmap(void *kaddr, struct page *page) { } +static inline void xpfo_kunmap(void *kaddr, struct page *page) { } +static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } +static inline void xpfo_free_pages(struct page *page, int order) { } + +#endif /* CONFIG_XPFO */ + +#endif /* _LINUX_XPFO_H */ diff --git a/mm/Makefile b/mm/Makefile index d210cc9d6f80..e99e1e6ae5ae 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -99,3 +99,4 @@ obj-$(CONFIG_HARDENED_USERCOPY) += usercopy.o obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o obj-$(CONFIG_HMM) += hmm.o obj-$(CONFIG_MEMFD_CREATE) += memfd.o +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e95b5b7c9c3d..08e277790b5f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1038,6 +1038,7 @@ static __always_inline bool free_pages_prepare(struct page *page, kernel_poison_pages(page, 1 << order, 0); kernel_map_pages(page, 1 << order, 0); kasan_free_pages(page, order); + xpfo_free_pages(page, order); return true; } @@ -1915,6 +1916,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order, kernel_map_pages(page, 1 << order, 1); kernel_poison_pages(page, 1 << order, 1); kasan_alloc_pages(page, order); + xpfo_alloc_pages(page, order, gfp_flags); set_page_owner(page, order, gfp_flags); } diff --git a/mm/page_ext.c b/mm/page_ext.c index ae44f7adbe07..38e5013dcb9a 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -8,6 +8,7 @@ #include #include #include +#include /* * struct page extension @@ -68,6 +69,9 @@ static struct page_ext_operations *page_ext_ops[] = { #if defined(CONFIG_IDLE_PAGE_TRACKING) && !defined(CONFIG_64BIT) &page_idle_ops, #endif +#ifdef CONFIG_XPFO + &page_xpfo_ops, +#endif }; static unsigned long total_usage; diff --git a/mm/xpfo.c b/mm/xpfo.c new file mode 100644 index 000000000000..bff24afcaa2e --- /dev/null +++ b/mm/xpfo.c @@ -0,0 +1,222 @@ +/* + * Copyright (C) 2017 Docker, Inc. + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * Tycho Andersen + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. 
+ */ + +#include +#include +#include +#include + +#include + +/* XPFO page state flags */ +enum xpfo_flags { + XPFO_PAGE_USER, /* Page is allocated to user-space */ + XPFO_PAGE_UNMAPPED, /* Page is unmapped from the linear map */ +}; + +/* Per-page XPFO house-keeping data */ +struct xpfo { + unsigned long flags; /* Page state */ + bool inited; /* Map counter and lock initialized */ + atomic_t mapcount; /* Counter for balancing map/unmap requests */ + spinlock_t maplock; /* Lock to serialize map/unmap requests */ +}; + +DEFINE_STATIC_KEY_FALSE(xpfo_inited); + +static bool xpfo_disabled __initdata; + +static int __init noxpfo_param(char *str) +{ + xpfo_disabled = true; + + return 0; +} + +early_param("noxpfo", noxpfo_param); + +static bool __init need_xpfo(void) +{ + if (xpfo_disabled) { + printk(KERN_INFO "XPFO disabled\n"); + return false; + } + + return true; +} + +static void init_xpfo(void) +{ + printk(KERN_INFO "XPFO enabled\n"); + static_branch_enable(&xpfo_inited); +} + +struct page_ext_operations page_xpfo_ops = { + .size = sizeof(struct xpfo), + .need = need_xpfo, + .init = init_xpfo, +}; + +static inline struct xpfo *lookup_xpfo(struct page *page) +{ + struct page_ext *page_ext = lookup_page_ext(page); + + if (unlikely(!page_ext)) { + WARN(1, "xpfo: failed to get page ext"); + return NULL; + } + + return (void *)page_ext + page_xpfo_ops.offset; +} + +void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) +{ + int i, flush_tlb = 0; + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + for (i = 0; i < (1 << order); i++) { + xpfo = lookup_xpfo(page + i); + if (!xpfo) + continue; + + WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), + "xpfo: unmapped page being allocated\n"); + + /* Initialize the map lock and map counter */ + if (unlikely(!xpfo->inited)) { + spin_lock_init(&xpfo->maplock); + atomic_set(&xpfo->mapcount, 0); + xpfo->inited = true; + } + WARN(atomic_read(&xpfo->mapcount), + "xpfo: already mapped page being allocated\n"); + + if ((gfp & GFP_HIGHUSER) == GFP_HIGHUSER) { + /* + * Tag the page as a user page and flush the TLB if it + * was previously allocated to the kernel. + */ + if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) + flush_tlb = 1; + } else { + /* Tag the page as a non-user (kernel) page */ + clear_bit(XPFO_PAGE_USER, &xpfo->flags); + } + } + + if (flush_tlb) + xpfo_flush_kernel_tlb(page, order); +} + +void xpfo_free_pages(struct page *page, int order) +{ + int i; + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + for (i = 0; i < (1 << order); i++) { + xpfo = lookup_xpfo(page + i); + if (!xpfo || unlikely(!xpfo->inited)) { + /* + * The page was allocated before page_ext was + * initialized, so it is a kernel page. + */ + continue; + } + + /* + * Map the page back into the kernel if it was previously + * allocated to user space. + */ + if (test_and_clear_bit(XPFO_PAGE_USER, &xpfo->flags)) { + clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + set_kpte(page_address(page + i), page + i, + PAGE_KERNEL); + } + } +} + +void xpfo_kmap(void *kaddr, struct page *page) +{ + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + xpfo = lookup_xpfo(page); + + /* + * The page was allocated before page_ext was initialized (which means + * it's a kernel page) or it's allocated to the kernel, so nothing to + * do. 
+ */ + if (!xpfo || unlikely(!xpfo->inited) || + !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + return; + + spin_lock(&xpfo->maplock); + + /* + * The page was previously allocated to user space, so map it back + * into the kernel. No TLB flush required. + */ + if ((atomic_inc_return(&xpfo->mapcount) == 1) && + test_and_clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags)) + set_kpte(kaddr, page, PAGE_KERNEL); + + spin_unlock(&xpfo->maplock); +} +EXPORT_SYMBOL(xpfo_kmap); + +void xpfo_kunmap(void *kaddr, struct page *page) +{ + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + xpfo = lookup_xpfo(page); + + /* + * The page was allocated before page_ext was initialized (which means + * it's a kernel page) or it's allocated to the kernel, so nothing to + * do. + */ + if (!xpfo || unlikely(!xpfo->inited) || + !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + return; + + spin_lock(&xpfo->maplock); + + /* + * The page is to be allocated back to user space, so unmap it from the + * kernel, flush the TLB and tag it as a user page. + */ + if (atomic_dec_return(&xpfo->mapcount) == 0) { + WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), + "xpfo: unmapping already unmapped page\n"); + set_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + set_kpte(kaddr, page, __pgprot(0)); + xpfo_flush_kernel_tlb(page, 0); + } + + spin_unlock(&xpfo->maplock); +} +EXPORT_SYMBOL(xpfo_kunmap); diff --git a/security/Kconfig b/security/Kconfig index d9aa521b5206..8d0e4e303551 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -6,6 +6,25 @@ menu "Security options" source security/keys/Kconfig +config ARCH_SUPPORTS_XPFO + bool + +config XPFO + bool "Enable eXclusive Page Frame Ownership (XPFO)" + default n + depends on ARCH_SUPPORTS_XPFO + select PAGE_EXTENSION + help + This option offers protection against 'ret2dir' kernel attacks. + When enabled, every time a page frame is allocated to user space, it + is unmapped from the direct mapped RAM region in kernel space + (physmap). Similarly, when a page frame is freed/reclaimed, it is + mapped back to physmap. + + There is a slight performance impact when this option is enabled. + + If in doubt, say "N". 
+ config SECURITY_DMESG_RESTRICT bool "Restrict unprivileged access to the kernel syslog" default n From patchwork Thu Jan 10 21:09:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756921 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D957513B5 for ; Thu, 10 Jan 2019 21:11:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C9A3129BF6 for ; Thu, 10 Jan 2019 21:11:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BC40129C04; Thu, 10 Jan 2019 21:11:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D9B4D29BF6 for ; Thu, 10 Jan 2019 21:11:31 +0000 (UTC) Received: (qmail 10154 invoked by uid 550); 10 Jan 2019 21:10:59 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9994 invoked from network); 10 Jan 2019 21:10:57 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=65bJHm+xMDUqKLWlJszEVdy7uVRxf98qoGoZGFoK+HQ=; b=QOSqm8Zv7K4bMlhX/bbVjgLCNRTn/vGcgXRjPQDdCQ2rFWjC1jE5LPhpr0gENIIpwgu4 sz7a/UkbO23yDajdO+GXR034Mh5eIOQRKEWxMBO/MAWFMXDnD44/asxTyfBHnplQOl0/ xM03rfATk55d2MsnRnzeazoyMfZVlBG4lqi/QcaOe1G7JZs0hWiPGt4/F0RYcVJehdPd uFq1W6pfDY2BTLS1yXN5jW84mKURQytSbef5qWfTdzHCQRpyD5+DDzP6AlTYUpdvzBB5 J6YveHf8bzwo5ZZ7E9OElu46qdWTWUcuaaJLnoelH74Dj/mMvWz23RVaHcEkpd6Hf45P yg== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tycho Andersen , Khalid Aziz Subject: [RFC PATCH v7 04/16] swiotlb: Map the buffer if it was unmapped by XPFO Date: Thu, 10 Jan 2019 14:09:36 -0700 Message-Id: <98f9b9be522d694d5a52640dd1dfbdd14ca6f8e5.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=868 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 
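To make the semantics introduced by patch 03 above concrete, here is a minimal sketch (illustrative only, not part of the series; xpfo_touch_user_page() is a hypothetical helper) of how kernel code is expected to touch a page that XPFO may have removed from the direct map. It relies only on the kmap()/kunmap() hooks the patch adds.

/*
 * With XPFO enabled, a page allocated for user space (GFP_HIGHUSER) may have
 * no direct-map (physmap) entry, so kernel code must bracket any access with
 * kmap()/kunmap(); the hooks added in patch 03 restore and remove the
 * mapping transparently.
 */
#include <linux/highmem.h>
#include <linux/string.h>

static void xpfo_touch_user_page(struct page *page)	/* hypothetical */
{
	void *kaddr;

	/* xpfo_kmap() maps the frame back into physmap if it was unmapped. */
	kaddr = kmap(page);
	memset(kaddr, 0, PAGE_SIZE);
	/*
	 * xpfo_kunmap() removes the mapping again (for user pages, once the
	 * last outstanding kmap is gone) and flushes the kernel TLB entry.
	 */
	kunmap(page);
}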
engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger v6: * guard against lookup_xpfo() returning NULL CC: Konrad Rzeszutek Wilk Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz Reviewed-by: Konrad Rzeszutek Wilk --- include/linux/xpfo.h | 4 ++++ kernel/dma/swiotlb.c | 3 ++- mm/xpfo.c | 15 +++++++++++++++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index a39259ce0174..e38b823f44e3 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -35,6 +35,8 @@ void xpfo_kunmap(void *kaddr, struct page *page); void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp); void xpfo_free_pages(struct page *page, int order); +bool xpfo_page_is_unmapped(struct page *page); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -42,6 +44,8 @@ static inline void xpfo_kunmap(void *kaddr, struct page *page) { } static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } static inline void xpfo_free_pages(struct page *page, int order) { } +static inline bool xpfo_page_is_unmapped(struct page *page) { return false; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 045930e32c0e..820a54b57491 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -396,8 +396,9 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr, { unsigned long pfn = PFN_DOWN(orig_addr); unsigned char *vaddr = phys_to_virt(tlb_addr); + struct page *page = pfn_to_page(pfn); - if (PageHighMem(pfn_to_page(pfn))) { + if (PageHighMem(page) || xpfo_page_is_unmapped(page)) { /* The buffer does not have a mapping. 
Map it in and copy */
		unsigned int offset = orig_addr & ~PAGE_MASK;
		char *buffer;

diff --git a/mm/xpfo.c b/mm/xpfo.c
index bff24afcaa2e..cdbcbac582d5 100644
--- a/mm/xpfo.c
+++ b/mm/xpfo.c
@@ -220,3 +220,18 @@ void xpfo_kunmap(void *kaddr, struct page *page)
 	spin_unlock(&xpfo->maplock);
 }
 EXPORT_SYMBOL(xpfo_kunmap);
+
+bool xpfo_page_is_unmapped(struct page *page)
+{
+	struct xpfo *xpfo;
+
+	if (!static_branch_unlikely(&xpfo_inited))
+		return false;
+
+	xpfo = lookup_xpfo(page);
+	if (unlikely(!xpfo) || unlikely(!xpfo->inited))
+		return false;
+
+	return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags);
+}
+EXPORT_SYMBOL(xpfo_page_is_unmapped);

From patchwork Thu Jan 10 21:09:37 2019
X-Patchwork-Submitter: Khalid Aziz
X-Patchwork-Id: 10756927
From: Khalid Aziz
Subject: [RFC PATCH v7 05/16] arm64/mm:
Add support for XPFO Date: Thu, 10 Jan 2019 14:09:37 -0700 Message-Id: <89f03091af87f5ab27bd6cafb032236d5bd81d65.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger Enable support for eXclusive Page Frame Ownership (XPFO) for arm64 and provide a hook for updating a single kernel page table entry (which is required by the generic XPFO code). v6: use flush_tlb_kernel_range() instead of __flush_tlb_one() CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz --- arch/arm64/Kconfig | 1 + arch/arm64/mm/Makefile | 2 ++ arch/arm64/mm/xpfo.c | 58 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 61 insertions(+) create mode 100644 arch/arm64/mm/xpfo.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index ea2ab0330e3a..f0a9c0007d23 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -171,6 +171,7 @@ config ARM64 select SWIOTLB select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK + select ARCH_SUPPORTS_XPFO help ARM 64-bit (AArch64) Linux support. diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index 849c1df3d214..cca3808d9776 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -12,3 +12,5 @@ KASAN_SANITIZE_physaddr.o += n obj-$(CONFIG_KASAN) += kasan_init.o KASAN_SANITIZE_kasan_init.o := n + +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/arch/arm64/mm/xpfo.c b/arch/arm64/mm/xpfo.c new file mode 100644 index 000000000000..678e2be848eb --- /dev/null +++ b/arch/arm64/mm/xpfo.c @@ -0,0 +1,58 @@ +/* + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#include +#include + +#include + +/* + * Lookup the page table entry for a virtual address and return a pointer to + * the entry. Based on x86 tree. 
+ */ +static pte_t *lookup_address(unsigned long addr) +{ + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + + pgd = pgd_offset_k(addr); + if (pgd_none(*pgd)) + return NULL; + + pud = pud_offset(pgd, addr); + if (pud_none(*pud)) + return NULL; + + pmd = pmd_offset(pud, addr); + if (pmd_none(*pmd)) + return NULL; + + return pte_offset_kernel(pmd, addr); +} + +/* Update a single kernel page table entry */ +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot) +{ + pte_t *pte = lookup_address((unsigned long)kaddr); + + set_pte(pte, pfn_pte(page_to_pfn(page), prot)); +} + +inline void xpfo_flush_kernel_tlb(struct page *page, int order) +{ + unsigned long kaddr = (unsigned long)page_address(page); + unsigned long size = PAGE_SIZE; + + flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); +} From patchwork Thu Jan 10 21:09:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756915 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 71D9113B5 for ; Thu, 10 Jan 2019 21:11:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 617A129B9C for ; Thu, 10 Jan 2019 21:11:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 558F229BF7; Thu, 10 Jan 2019 21:11:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 6E35E29B9C for ; Thu, 10 Jan 2019 21:11:21 +0000 (UTC) Received: (qmail 10057 invoked by uid 550); 10 Jan 2019 21:10:58 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9950 invoked from network); 10 Jan 2019 21:10:56 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=igejgCvAio6u1Ussld9AuQC3Ul6s8euovTImu0aEjD4=; b=1RiH7Ayttgl67HmBUpj9Zdch6LCLPksWDKHZCY0c3O4g6K6f5lB+hFkquaMWY/CeqG2c WA3MN5u8IJiO8Q9EgRsR9weHv6aPXsg7lHvl6yVYGrFoAElAl/Qtxo/16weY5EyApWJw ZcUENEhCipmYwEhouhHCKcEms6inTWl5F8KMuKzhwQTrjTOoEAagowAB7i5Bu3g6vsTO g6DsIP3fE09F4sXWS4Bh+elRfQKiEMVnQyVSwOrGHEEUPs0EeBDb01Zp18Asj2eNwi1g Wr2MJ/rH5TiF9nQr2d44cNyQBMa8MVknWh6/vaIjGPPQMqxLcxhzgvfyKlQJRmStxkM5 KA== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, 
kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Khalid Aziz Subject: [RFC PATCH v7 06/16] xpfo: add primitives for mapping underlying memory Date: Thu, 10 Jan 2019 14:09:38 -0700 Message-Id: <5deed7a1eb65fc6c66acb8a00d46d63e7f0fd22f.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=972 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen In some cases (on arm64 DMA and data cache flushes) we may have unmapped the underlying pages needed for something via XPFO. Here are some primitives useful for ensuring the underlying memory is mapped/unmapped in the face of xpfo. Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz --- include/linux/xpfo.h | 22 ++++++++++++++++++++++ mm/xpfo.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index e38b823f44e3..2682a00ebbcb 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -37,6 +37,15 @@ void xpfo_free_pages(struct page *page, int order); bool xpfo_page_is_unmapped(struct page *page); +#define XPFO_NUM_PAGES(addr, size) \ + (PFN_UP((unsigned long) (addr) + (size)) - \ + PFN_DOWN((unsigned long) (addr))) + +void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len); +void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, + size_t mapping_len); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -46,6 +55,19 @@ static inline void xpfo_free_pages(struct page *page, int order) { } static inline bool xpfo_page_is_unmapped(struct page *page) { return false; } +#define XPFO_NUM_PAGES(addr, size) 0 + +static inline void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ +} + +static inline void xpfo_temp_unmap(const void *addr, size_t size, + void **mapping, size_t mapping_len) +{ +} + + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/xpfo.c b/mm/xpfo.c index cdbcbac582d5..f79075bf7d65 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -13,6 +13,7 @@ * the Free Software Foundation. 
*/ +#include #include #include #include @@ -235,3 +236,32 @@ bool xpfo_page_is_unmapped(struct page *page) return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); } EXPORT_SYMBOL(xpfo_page_is_unmapped); + +void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ + struct page *page = virt_to_page(addr); + int i, num_pages = mapping_len / sizeof(mapping[0]); + + memset(mapping, 0, mapping_len); + + for (i = 0; i < num_pages; i++) { + if (page_to_virt(page + i) >= addr + size) + break; + + if (xpfo_page_is_unmapped(page + i)) + mapping[i] = kmap_atomic(page + i); + } +} +EXPORT_SYMBOL(xpfo_temp_map); + +void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ + int i, num_pages = mapping_len / sizeof(mapping[0]); + + for (i = 0; i < num_pages; i++) + if (mapping[i]) + kunmap_atomic(mapping[i]); +} +EXPORT_SYMBOL(xpfo_temp_unmap); From patchwork Thu Jan 10 21:09:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756937 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0EEAA159A for ; Thu, 10 Jan 2019 21:12:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0035329BE7 for ; Thu, 10 Jan 2019 21:12:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E89FE29CDE; Thu, 10 Jan 2019 21:12:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 05E4629BE7 for ; Thu, 10 Jan 2019 21:12:17 +0000 (UTC) Received: (qmail 11982 invoked by uid 550); 10 Jan 2019 21:11:08 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 11829 invoked from network); 10 Jan 2019 21:11:06 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=7ze7ihNQNB91qQvcMs0AZsSJuj8aoWf893BJ5dPG2Tc=; b=zQ6tEafkYELketM9PQGIBhEggEtIFIzXE2AsxM/vjfAsKwijBbkklOA/wPl76KxUn5Er MVdeIHP/9T8pCzW9SZoZghIsrZOZNKKAoGszZc9Qr6DGVzSJOzz838a2YocF87jIomut IgACh2Uy1NML3PSo4anLaJ1jkIZukutMnbnYxOgiQzp74Yun3c8MmQMlRDscX5LdOkuD yx2o0t/4gZJlb9IkGWObtSKw/0X8YYv96FPadoHq+H/eQc4qy/uxbQrk9SCnvYbKScqz NibBrnvTaUIwc9u7cdJBxSwRG8l7RjGH1NC22eVgRnz0OLUbT7rrDDNQzRc5gx2BuKNd Wg== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, 
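The intended calling pattern for xpfo_temp_map()/xpfo_temp_unmap() is sketched below; it is illustrative only and mirrors exactly what patch 07 does for the arm64 dcache flush. The function name access_possibly_unmapped() is hypothetical; XPFO_NUM_PAGES(), xpfo_temp_map() and xpfo_temp_unmap() are the primitives added above, and the on-stack mapping array (one kmap_atomic() cookie per spanned page) follows the same variable-length-array style used in patch 07.

#include <linux/xpfo.h>

static void access_possibly_unmapped(void *addr, size_t len)	/* hypothetical */
{
	unsigned long num_pages = XPFO_NUM_PAGES(addr, len);
	void *mapping[num_pages];

	xpfo_temp_map(addr, len, mapping, sizeof(mapping[0]) * num_pages);
	/* ... addr[0..len) is now mapped even if XPFO had unmapped it ... */
	xpfo_temp_unmap(addr, len, mapping, sizeof(mapping[0]) * num_pages);
}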
kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Tycho Andersen , Khalid Aziz Subject: [RFC PATCH v7 07/16] arm64/mm, xpfo: temporarily map dcache regions Date: Thu, 10 Jan 2019 14:09:39 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=682 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger If the page is unmapped by XPFO, a data cache flush results in a fatal page fault, so let's temporarily map the region, flush the cache, and then unmap it. v6: actually flush in the face of xpfo, and temporarily map the underlying memory so it can be flushed correctly CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz --- arch/arm64/mm/flush.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c index 30695a868107..f12f26b60319 100644 --- a/arch/arm64/mm/flush.c +++ b/arch/arm64/mm/flush.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -28,9 +29,15 @@ void sync_icache_aliases(void *kaddr, unsigned long len) { unsigned long addr = (unsigned long)kaddr; + unsigned long num_pages = XPFO_NUM_PAGES(addr, len); + void *mapping[num_pages]; if (icache_is_aliasing()) { + xpfo_temp_map(kaddr, len, mapping, + sizeof(mapping[0]) * num_pages); __clean_dcache_area_pou(kaddr, len); + xpfo_temp_unmap(kaddr, len, mapping, + sizeof(mapping[0]) * num_pages); __flush_icache_all(); } else { flush_icache_range(addr, addr + len); From patchwork Thu Jan 10 21:09:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756931 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 20524159A for ; Thu, 10 Jan 2019 21:11:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1001A29BFC for ; Thu, 10 Jan 2019 21:11:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0369529C4C; Thu, 10 Jan 2019 21:11:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 1DBB929C31 for ; Thu, 10 Jan 2019 21:11:52 +0000 (UTC) Received: (qmail 11781 invoked by uid 550); 10 Jan 2019 21:11:05 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 11653 invoked from network); 10 Jan 2019 
21:11:04 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=bZUUZyFkbCMprgrasNXmseU+a3iCWM1WQZPOxRn+auM=; b=vcWL4IWbZwKYO2SXRab+NCyTHIBLpjZHIceUOOe25YQx4ENkJt0k1exSOtes/0a8vZYN 8i8kYPNeIinu/be7AoDq3HqHVwO7DnGY2cyZp0ailH1N0x+XrMtBb6i8etUEhucD3NNB 4PvYZRQCTABPDIyAK6SVpi1ngX5xHoDgc0Y4tHvnJh4wSJNVoj5aw++N1nU7Ka5qe57H vA4iQ9ck7Wjr57REHXLQeKTC2IMnxdS5kX7gGddyuDRwGwj5HuNWqkLxMu0IsYZzwYcD B5360GALRkAG7PSTM225aQeQ3tE5LIYp/o2E8IgOVDAFyYUGXTBSzk29Mhwe50XVADq3 gQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Khalid Aziz Subject: [RFC PATCH v7 08/16] arm64/mm: disable section/contiguous mappings if XPFO is enabled Date: Thu, 10 Jan 2019 14:09:40 -0700 Message-Id: <3dfdd42afe1749d4f82816f967532643de3a5024.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen XPFO doesn't support section/contiguous mappings yet, so let's disable it if XPFO is turned on. Thanks to Laura Abbot for the simplification from v5, and Mark Rutland for pointing out we need NO_CONT_MAPPINGS too. 
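The restriction follows from how XPFO manipulates the direct map. set_kpte(), introduced earlier in this series, changes the protection of one 4K page at a time, for example when a page is handed out to user space (illustrative call only, not part of this patch):

	/* XPFO removes a single 4K page from the kernel direct map */
	set_kpte(page_address(page), page, __pgprot(0));

With 2M section or contiguous-hint mappings such a page has no PTE of its own, so the mapping would have to be split at run time; rather than implement splitting, map_mem() below simply forces page-granular mappings whenever xpfo_enabled() is true.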
CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz --- arch/arm64/mm/mmu.c | 2 +- include/linux/xpfo.h | 4 ++++ mm/xpfo.c | 6 ++++++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index d1d6601b385d..f4dd27073006 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -451,7 +451,7 @@ static void __init map_mem(pgd_t *pgdp) struct memblock_region *reg; int flags = 0; - if (debug_pagealloc_enabled()) + if (debug_pagealloc_enabled() || xpfo_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; /* diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 2682a00ebbcb..0c26836a24e1 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -46,6 +46,8 @@ void xpfo_temp_map(const void *addr, size_t size, void **mapping, void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, size_t mapping_len); +bool xpfo_enabled(void); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -68,6 +70,8 @@ static inline void xpfo_temp_unmap(const void *addr, size_t size, } +static inline bool xpfo_enabled(void) { return false; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/xpfo.c b/mm/xpfo.c index f79075bf7d65..25fba05d01bd 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -70,6 +70,12 @@ struct page_ext_operations page_xpfo_ops = { .init = init_xpfo, }; +bool __init xpfo_enabled(void) +{ + return !xpfo_disabled; +} +EXPORT_SYMBOL(xpfo_enabled); + static inline struct xpfo *lookup_xpfo(struct page *page) { struct page_ext *page_ext = lookup_page_ext(page); From patchwork Thu Jan 10 21:09:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756935 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 030A1159A for ; Thu, 10 Jan 2019 21:12:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E866129C31 for ; Thu, 10 Jan 2019 21:12:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E636129C58; Thu, 10 Jan 2019 21:12:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 0510629C31 for ; Thu, 10 Jan 2019 21:12:04 +0000 (UTC) Received: (qmail 11897 invoked by uid 550); 10 Jan 2019 21:11:08 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 11827 invoked from network); 10 Jan 2019 21:11:06 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=jM+XBJTvfFY9YOS7nm1GskqZaJnifsuOk+60I+M12Vk=; b=C3B2yu91Lrtl2z5lJo84if6WdFpkmA5XRuSlyQpmPar3mlc/WZUiZLxo0YSosTAHg1oF 
IbzRl+nE8n2cCq0uiPwuTVkSNI4tCdFPvJHcHpSFbnrgZuBqF8FKISs338kVazs3DHxy Y74mURefBkUqEK5hk8x9gNLCu74Glj42Oq2bTWxQrViRo/1eeY83SmFf/JgULoW4gl7C wCFLiUELZT2rnL5BqakUUdf2j58dPCc+UZ8gylgvVnG55E4GNzVfJg3YIT8R7+fsfYFW x9yd2KGgdif0YRDrUmz2D3G+h051c6TTrynk9ds3tZWEpy6tXhkLXfAX6hJeRoyhoorN cw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, Khalid Aziz Subject: [RFC PATCH v7 09/16] mm: add a user_virt_to_phys symbol Date: Thu, 10 Jan 2019 14:09:41 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen We need someting like this for testing XPFO. Since it's architecture specific, putting it in the test code is slightly awkward, so let's make it an arch-specific symbol and export it for use in LKDTM. v6: * add a definition of user_virt_to_phys in the !CONFIG_XPFO case CC: linux-arm-kernel@lists.infradead.org CC: x86@kernel.org Signed-off-by: Tycho Andersen Tested-by: Marco Benatto Signed-off-by: Khalid Aziz --- arch/x86/mm/xpfo.c | 57 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/xpfo.h | 8 +++++++ 2 files changed, 65 insertions(+) diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c index d1f04ea533cd..bcdb2f2089d2 100644 --- a/arch/x86/mm/xpfo.c +++ b/arch/x86/mm/xpfo.c @@ -112,3 +112,60 @@ inline void xpfo_flush_kernel_tlb(struct page *page, int order) flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); } + +/* Convert a user space virtual address to a physical address. 
+ * Shamelessly copied from slow_virt_to_phys() and lookup_address() in + * arch/x86/mm/pageattr.c + */ +phys_addr_t user_virt_to_phys(unsigned long addr) +{ + phys_addr_t phys_addr; + unsigned long offset; + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_offset(current->mm, addr); + if (pgd_none(*pgd)) + return 0; + + p4d = p4d_offset(pgd, addr); + if (p4d_none(*p4d)) + return 0; + + if (p4d_large(*p4d) || !p4d_present(*p4d)) { + phys_addr = (unsigned long)p4d_pfn(*p4d) << PAGE_SHIFT; + offset = addr & ~P4D_MASK; + goto out; + } + + pud = pud_offset(p4d, addr); + if (pud_none(*pud)) + return 0; + + if (pud_large(*pud) || !pud_present(*pud)) { + phys_addr = (unsigned long)pud_pfn(*pud) << PAGE_SHIFT; + offset = addr & ~PUD_MASK; + goto out; + } + + pmd = pmd_offset(pud, addr); + if (pmd_none(*pmd)) + return 0; + + if (pmd_large(*pmd) || !pmd_present(*pmd)) { + phys_addr = (unsigned long)pmd_pfn(*pmd) << PAGE_SHIFT; + offset = addr & ~PMD_MASK; + goto out; + } + + pte = pte_offset_kernel(pmd, addr); + phys_addr = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT; + offset = addr & ~PAGE_MASK; + +out: + return (phys_addr_t)(phys_addr | offset); +} +EXPORT_SYMBOL(user_virt_to_phys); diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 0c26836a24e1..d4b38ab8a633 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -23,6 +23,10 @@ struct page; #ifdef CONFIG_XPFO +#include + +#include + extern struct page_ext_operations page_xpfo_ops; void set_kpte(void *kaddr, struct page *page, pgprot_t prot); @@ -48,6 +52,8 @@ void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, bool xpfo_enabled(void); +phys_addr_t user_virt_to_phys(unsigned long addr); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -72,6 +78,8 @@ static inline void xpfo_temp_unmap(const void *addr, size_t size, static inline bool xpfo_enabled(void) { return false; } +static inline phys_addr_t user_virt_to_phys(unsigned long addr) { return 0; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ From patchwork Thu Jan 10 21:09:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756949 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 50375159A for ; Thu, 10 Jan 2019 21:13:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3EA8E29B97 for ; Thu, 10 Jan 2019 21:13:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2F68229BC9; Thu, 10 Jan 2019 21:13:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D0B2029B97 for ; Thu, 10 Jan 2019 21:13:28 +0000 (UTC) Received: (qmail 20444 invoked by uid 550); 10 Jan 2019 21:12:05 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com 
Received: (qmail 20375 invoked from network); 10 Jan 2019 21:12:04 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=NocXrXPIffk1aqDXhGYf67qsB90O98ClX1aVCbIYhrY=; b=xrOa/nnt17Q2+ds8DXVS1ojp5oi6drEZiIIjbf4GQ7whu67hOMsZ5LyGBZXz2MeDUaqZ wJBLHCOouVVYMswLJYK2gxJB6YoBu0tXmP8YPnY3nZURPC9VRlRk/dBLxTUxQ+uEesBf ciFXZBYPaDJErP7UgMR77UkRkTBcEKzu3OAHGibVKIPEpFFwJ/FZDydKvY39/fDzVsdu TlKvrcomVejswSAwgwV1ztNKhy5oN9XqwRH0FanipNeNZm6RWhtWrh4rVDuBaf+g6EbS 5ZtAyLB1qUylvO5lm4b4U2XlTG96zS2kmzfiAATTDZgfbxs7aXWuVMavRhsAI/PSsvMu ZQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tycho Andersen , Khalid Aziz Subject: [RFC PATCH v7 10/16] lkdtm: Add test for XPFO Date: Thu, 10 Jan 2019 14:09:42 -0700 Message-Id: <5a8bb25b1f2209ea40160c6cabf2bc850800e3ad.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger This test simply reads from userspace memory via the kernel's linear map. v6: * drop an #ifdef, just let the test fail if XPFO is not supported * add XPFO_SMP test to try and test the case when one CPU does an xpfo unmap of an address, that it can't be used accidentally by other CPUs. 
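Once CONFIG_LKDTM is enabled, the new crash types can be exercised through LKDTM's usual debugfs interface, for example "echo XPFO_READ_USER > /sys/kernel/debug/provoke-crash/DIRECT" (path assumes debugfs is mounted at /sys/kernel/debug); with XPFO active the read is expected to fault instead of returning the test pattern.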
Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Tested-by: Marco Benatto [jsteckli@amazon.de: rebased from v4.13 to v4.19] Signed-off-by: Julian Stecklina Signed-off-by: Khalid Aziz --- drivers/misc/lkdtm/Makefile | 1 + drivers/misc/lkdtm/core.c | 3 + drivers/misc/lkdtm/lkdtm.h | 5 + drivers/misc/lkdtm/xpfo.c | 194 ++++++++++++++++++++++++++++++++++++ 4 files changed, 203 insertions(+) create mode 100644 drivers/misc/lkdtm/xpfo.c diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile index 951c984de61a..97c6b7818cce 100644 --- a/drivers/misc/lkdtm/Makefile +++ b/drivers/misc/lkdtm/Makefile @@ -9,6 +9,7 @@ lkdtm-$(CONFIG_LKDTM) += refcount.o lkdtm-$(CONFIG_LKDTM) += rodata_objcopy.o lkdtm-$(CONFIG_LKDTM) += usercopy.o lkdtm-$(CONFIG_LKDTM) += stackleak.o +lkdtm-$(CONFIG_LKDTM) += xpfo.o KASAN_SANITIZE_stackleak.o := n KCOV_INSTRUMENT_rodata.o := n diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c index 2837dc77478e..25f4ab4ebf50 100644 --- a/drivers/misc/lkdtm/core.c +++ b/drivers/misc/lkdtm/core.c @@ -185,6 +185,9 @@ static const struct crashtype crashtypes[] = { CRASHTYPE(USERCOPY_KERNEL), CRASHTYPE(USERCOPY_KERNEL_DS), CRASHTYPE(STACKLEAK_ERASING), + CRASHTYPE(XPFO_READ_USER), + CRASHTYPE(XPFO_READ_USER_HUGE), + CRASHTYPE(XPFO_SMP), }; diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index 3c6fd327e166..6b31ff0c7f8f 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -87,4 +87,9 @@ void lkdtm_USERCOPY_KERNEL_DS(void); /* lkdtm_stackleak.c */ void lkdtm_STACKLEAK_ERASING(void); +/* lkdtm_xpfo.c */ +void lkdtm_XPFO_READ_USER(void); +void lkdtm_XPFO_READ_USER_HUGE(void); +void lkdtm_XPFO_SMP(void); + #endif diff --git a/drivers/misc/lkdtm/xpfo.c b/drivers/misc/lkdtm/xpfo.c new file mode 100644 index 000000000000..d903063bdd0b --- /dev/null +++ b/drivers/misc/lkdtm/xpfo.c @@ -0,0 +1,194 @@ +/* + * This is for all the tests related to XPFO (eXclusive Page Frame Ownership). 
+ */ + +#include "lkdtm.h" + +#include +#include +#include +#include +#include + +#include +#include + +#define XPFO_DATA 0xdeadbeef + +static unsigned long do_map(unsigned long flags) +{ + unsigned long user_addr, user_data = XPFO_DATA; + + user_addr = vm_mmap(NULL, 0, PAGE_SIZE, + PROT_READ | PROT_WRITE | PROT_EXEC, + flags, 0); + if (user_addr >= TASK_SIZE) { + pr_warn("Failed to allocate user memory\n"); + return 0; + } + + if (copy_to_user((void __user *)user_addr, &user_data, + sizeof(user_data))) { + pr_warn("copy_to_user failed\n"); + goto free_user; + } + + return user_addr; + +free_user: + vm_munmap(user_addr, PAGE_SIZE); + return 0; +} + +static unsigned long *user_to_kernel(unsigned long user_addr) +{ + phys_addr_t phys_addr; + void *virt_addr; + + phys_addr = user_virt_to_phys(user_addr); + if (!phys_addr) { + pr_warn("Failed to get physical address of user memory\n"); + return NULL; + } + + virt_addr = phys_to_virt(phys_addr); + if (phys_addr != virt_to_phys(virt_addr)) { + pr_warn("Physical address of user memory seems incorrect\n"); + return NULL; + } + + return virt_addr; +} + +static void read_map(unsigned long *virt_addr) +{ + pr_info("Attempting bad read from kernel address %p\n", virt_addr); + if (*(unsigned long *)virt_addr == XPFO_DATA) + pr_err("FAIL: Bad read succeeded?!\n"); + else + pr_err("FAIL: Bad read didn't fail but data is incorrect?!\n"); +} + +static void read_user_with_flags(unsigned long flags) +{ + unsigned long user_addr, *kernel; + + user_addr = do_map(flags); + if (!user_addr) { + pr_err("FAIL: map failed\n"); + return; + } + + kernel = user_to_kernel(user_addr); + if (!kernel) { + pr_err("FAIL: user to kernel conversion failed\n"); + goto free_user; + } + + read_map(kernel); + +free_user: + vm_munmap(user_addr, PAGE_SIZE); +} + +/* Read from userspace via the kernel's linear map. */ +void lkdtm_XPFO_READ_USER(void) +{ + read_user_with_flags(MAP_PRIVATE | MAP_ANONYMOUS); +} + +void lkdtm_XPFO_READ_USER_HUGE(void) +{ + read_user_with_flags(MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB); +} + +struct smp_arg { + unsigned long *virt_addr; + unsigned int cpu; +}; + +static int smp_reader(void *parg) +{ + struct smp_arg *arg = parg; + unsigned long *virt_addr; + + if (arg->cpu != smp_processor_id()) { + pr_err("FAIL: scheduled on wrong CPU?\n"); + return 0; + } + + virt_addr = smp_cond_load_acquire(&arg->virt_addr, VAL != NULL); + read_map(virt_addr); + + return 0; +} + +#ifdef CONFIG_X86 +#define XPFO_SMP_KILLED SIGKILL +#elif CONFIG_ARM64 +#define XPFO_SMP_KILLED SIGSEGV +#else +#error unsupported arch +#endif + +/* The idea here is to read from the kernel's map on a different thread than + * did the mapping (and thus the TLB flushing), to make sure that the page + * faults on other cores too. + */ +void lkdtm_XPFO_SMP(void) +{ + unsigned long user_addr, *virt_addr; + struct task_struct *thread; + int ret; + struct smp_arg arg; + + if (num_online_cpus() < 2) { + pr_err("not enough to do a multi cpu test\n"); + return; + } + + arg.virt_addr = NULL; + arg.cpu = (smp_processor_id() + 1) % num_online_cpus(); + thread = kthread_create(smp_reader, &arg, "lkdtm_xpfo_test"); + if (IS_ERR(thread)) { + pr_err("couldn't create kthread? 
%ld\n", PTR_ERR(thread)); + return; + } + + kthread_bind(thread, arg.cpu); + get_task_struct(thread); + wake_up_process(thread); + + user_addr = do_map(MAP_PRIVATE | MAP_ANONYMOUS); + if (!user_addr) + goto kill_thread; + + virt_addr = user_to_kernel(user_addr); + if (!virt_addr) { + /* + * let's store something that will fail, so we can unblock the + * thread + */ + smp_store_release(&arg.virt_addr, &arg); + goto free_user; + } + + smp_store_release(&arg.virt_addr, virt_addr); + + /* there must be a better way to do this. */ + while (1) { + if (thread->exit_state) + break; + msleep_interruptible(100); + } + +free_user: + if (user_addr) + vm_munmap(user_addr, PAGE_SIZE); + +kill_thread: + ret = kthread_stop(thread); + if (ret != XPFO_SMP_KILLED) + pr_err("FAIL: thread wasn't killed: %d\n", ret); + put_task_struct(thread); +} From patchwork Thu Jan 10 21:09:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756951 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D302113B5 for ; Thu, 10 Jan 2019 21:13:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF9BC29B97 for ; Thu, 10 Jan 2019 21:13:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B1D1929BC9; Thu, 10 Jan 2019 21:13:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id AA5B529B97 for ; Thu, 10 Jan 2019 21:13:40 +0000 (UTC) Received: (qmail 30707 invoked by uid 550); 10 Jan 2019 21:13:21 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 30637 invoked from network); 10 Jan 2019 21:13:20 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=/IFEDzhdI0x1IZkLHUorzjqe46w6d+OUBpCZ00v5/kI=; b=JNUIA0ZiLYSbjyCwAVK6/QyBmq8FL72dUceoNqR3YqbxV3KpfLrJ4/o4shwgV/QQkpbT i+JrwBLXO2mzCBJzNUIf2RhgKPu2jg5XHRABDtU+LXQqrUZ6aGpK2NvIydlwKaJppCdP j+wdmrhR7TDmOqvKgoKXVAWk8vUVI1gtKEbYePfC6zugyaJRhLJtLxHH13kyG9h5AdLz qJSxuJEXRAMb7+qsXPG8fmHl/+VvjCI0MtbqDIbPMADcIhXDIUSs7Ix3Av5qePZBrkqB 7GL2jlWfvCsYjC3WOY/uiO9bWQO8zEc6KoJZ/k+sNM/QF+DyHlfh3pKl8iF6Z2h4GVQH Fw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, 
kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, "Vasileios P . Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse , Khalid Aziz Subject: [RFC PATCH v7 11/16] mm, x86: omit TLB flushing by default for XPFO page table modifications Date: Thu, 10 Jan 2019 14:09:43 -0700 Message-Id: <4e51a5d4409b54116968b8c0501f6d82c4eb9cb5.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina XPFO carries a large performance overhead. In my tests, I saw >40% overhead for compiling a Linux kernel with XPFO enabled. The frequent TLB flushes that XPFO performs are the root cause of much of this overhead. TLB flushing is required for full paranoia mode where we don't want TLB entries of physmap pages to stick around potentially indefinitely. In reality, though, these TLB entries are going to be evicted pretty rapidly even without explicit flushing. That means omitting TLB flushes only marginally lowers the security benefits of XPFO. For kernel compile, omitting TLB flushes pushes the overhead below 3%. Change the default in XPFO to not flush TLBs unless the user explicitly requests to do so using a kernel parameter. Signed-off-by: Julian Stecklina Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse Signed-off-by: Khalid Aziz --- mm/xpfo.c | 37 +++++++++++++++++++++++++++++-------- 1 file changed, 29 insertions(+), 8 deletions(-) diff --git a/mm/xpfo.c b/mm/xpfo.c index 25fba05d01bd..e80374b0c78e 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -36,6 +36,7 @@ struct xpfo { }; DEFINE_STATIC_KEY_FALSE(xpfo_inited); +DEFINE_STATIC_KEY_FALSE(xpfo_do_tlb_flush); static bool xpfo_disabled __initdata; @@ -46,7 +47,15 @@ static int __init noxpfo_param(char *str) return 0; } +static int __init xpfotlbflush_param(char *str) +{ + static_branch_enable(&xpfo_do_tlb_flush); + + return 0; +} + early_param("noxpfo", noxpfo_param); +early_param("xpfotlbflush", xpfotlbflush_param); static bool __init need_xpfo(void) { @@ -76,6 +85,13 @@ bool __init xpfo_enabled(void) } EXPORT_SYMBOL(xpfo_enabled); + +static void xpfo_cond_flush_kernel_tlb(struct page *page, int order) +{ + if (static_branch_unlikely(&xpfo_do_tlb_flush)) + xpfo_flush_kernel_tlb(page, order); +} + static inline struct xpfo *lookup_xpfo(struct page *page) { struct page_ext *page_ext = lookup_page_ext(page); @@ -114,12 +130,17 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) "xpfo: already mapped page being allocated\n"); if ((gfp & GFP_HIGHUSER) == GFP_HIGHUSER) { - /* - * Tag the page as a user page and flush the TLB if it - * was previously allocated to the kernel. - */ - if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) - flush_tlb = 1; + if (static_branch_unlikely(&xpfo_do_tlb_flush)) { + /* + * Tag the page as a user page and flush the TLB if it + * was previously allocated to the kernel. 
+ */ + if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) + flush_tlb = 1; + } else { + set_bit(XPFO_PAGE_USER, &xpfo->flags); + } + } else { /* Tag the page as a non-user (kernel) page */ clear_bit(XPFO_PAGE_USER, &xpfo->flags); @@ -127,7 +148,7 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) } if (flush_tlb) - xpfo_flush_kernel_tlb(page, order); + xpfo_cond_flush_kernel_tlb(page, order); } void xpfo_free_pages(struct page *page, int order) @@ -221,7 +242,7 @@ void xpfo_kunmap(void *kaddr, struct page *page) "xpfo: unmapping already unmapped page\n"); set_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); set_kpte(kaddr, page, __pgprot(0)); - xpfo_flush_kernel_tlb(page, 0); + xpfo_cond_flush_kernel_tlb(page, 0); } spin_unlock(&xpfo->maplock); From patchwork Thu Jan 10 21:09:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756947 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5602313B5 for ; Thu, 10 Jan 2019 21:13:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 43C5A29B97 for ; Thu, 10 Jan 2019 21:13:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 375EF29BE7; Thu, 10 Jan 2019 21:13:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 6076729B97 for ; Thu, 10 Jan 2019 21:13:15 +0000 (UTC) Received: (qmail 15776 invoked by uid 550); 10 Jan 2019 21:11:29 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 15684 invoked from network); 10 Jan 2019 21:11:28 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=J/DtCpfjG0WDrU4iITiI5mf9/LuYasBuhGA4btT2oWM=; b=vpLe06kOkVg2gqfoWhoHvjK64OjM12NZYrBUbj/dR1MgwKTmXbt3O9ouWtwqiCvDXHvb Kg8LTVu/wRcASK1HItrw7ZYUc71MQZkXfn+4jbCsOOb4AljbtXCIZYXt9gdMdxLQOfrq t8uYPJW0mfHNtBxF7eYCzTG/InDFmKA1Q/Z1nhCcoNcrLVJqs5FE7yDUhNxvcd/cFXUm HwLGj0OKJ7rYlS3KVY3nnrT4euVP3dqQYGlnVmB7h1eilYgE+G/a+9WZCLvbd0RioBOo eoIGtsds6WipicyeAJqxSjVzOnnPao1XcZBORvOXUULELMGUSDGHi91lPzuXyjXSEwbg dA== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, 
x86@kernel.org, "Vasileios P . Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse , Khalid Aziz Subject: [RFC PATCH v7 12/16] xpfo, mm: remove dependency on CONFIG_PAGE_EXTENSION Date: Thu, 10 Jan 2019 14:09:44 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina Instead of using the page extension debug feature, encode all information, we need for XPFO in struct page. This allows to get rid of some checks in the hot paths and there are also no pages anymore that are allocated before XPFO is enabled. Also make debugging aids configurable for maximum performance. Signed-off-by: Julian Stecklina Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse Signed-off-by: Khalid Aziz --- include/linux/mm_types.h | 8 ++ include/linux/page-flags.h | 13 +++ include/linux/xpfo.h | 3 +- include/trace/events/mmflags.h | 10 +- mm/page_alloc.c | 3 +- mm/page_ext.c | 4 - mm/xpfo.c | 162 ++++++++------------------------- security/Kconfig | 12 ++- 8 files changed, 81 insertions(+), 134 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2c471a2c43fa..d17d33f36a01 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -204,6 +204,14 @@ struct page { #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS int _last_cpupid; #endif + +#ifdef CONFIG_XPFO + /* Counts the number of times this page has been kmapped. 
*/ + atomic_t xpfo_mapcount; + + /* Serialize kmap/kunmap of this page */ + spinlock_t xpfo_lock; +#endif } _struct_page_alignment; /* diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 50ce1bddaf56..a532063f27b5 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -101,6 +101,10 @@ enum pageflags { #if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT) PG_young, PG_idle, +#endif +#ifdef CONFIG_XPFO + PG_xpfo_user, /* Page is allocated to user-space */ + PG_xpfo_unmapped, /* Page is unmapped from the linear map */ #endif __NR_PAGEFLAGS, @@ -398,6 +402,15 @@ TESTCLEARFLAG(Young, young, PF_ANY) PAGEFLAG(Idle, idle, PF_ANY) #endif +#ifdef CONFIG_XPFO +PAGEFLAG(XpfoUser, xpfo_user, PF_ANY) +TESTCLEARFLAG(XpfoUser, xpfo_user, PF_ANY) +TESTSETFLAG(XpfoUser, xpfo_user, PF_ANY) +PAGEFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +TESTCLEARFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +TESTSETFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +#endif + /* * On an anonymous page mapped into a user virtual memory area, * page->mapping points to its anon_vma, not to a struct address_space; diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index d4b38ab8a633..ea5188882f49 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -27,7 +27,7 @@ struct page; #include -extern struct page_ext_operations page_xpfo_ops; +void xpfo_init_single_page(struct page *page); void set_kpte(void *kaddr, struct page *page, pgprot_t prot); void xpfo_dma_map_unmap_area(bool map, const void *addr, size_t size, @@ -56,6 +56,7 @@ phys_addr_t user_virt_to_phys(unsigned long addr); #else /* !CONFIG_XPFO */ +static inline void xpfo_init_single_page(struct page *page) { } static inline void xpfo_kmap(void *kaddr, struct page *page) { } static inline void xpfo_kunmap(void *kaddr, struct page *page) { } static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index a1675d43777e..6bb000bb366f 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -79,6 +79,12 @@ #define IF_HAVE_PG_IDLE(flag,string) #endif +#ifdef CONFIG_XPFO +#define IF_HAVE_PG_XPFO(flag,string) ,{1UL << flag, string} +#else +#define IF_HAVE_PG_XPFO(flag,string) +#endif + #define __def_pageflag_names \ {1UL << PG_locked, "locked" }, \ {1UL << PG_waiters, "waiters" }, \ @@ -105,7 +111,9 @@ IF_HAVE_PG_MLOCK(PG_mlocked, "mlocked" ) \ IF_HAVE_PG_UNCACHED(PG_uncached, "uncached" ) \ IF_HAVE_PG_HWPOISON(PG_hwpoison, "hwpoison" ) \ IF_HAVE_PG_IDLE(PG_young, "young" ) \ -IF_HAVE_PG_IDLE(PG_idle, "idle" ) +IF_HAVE_PG_IDLE(PG_idle, "idle" ) \ +IF_HAVE_PG_XPFO(PG_xpfo_user, "xpfo_user" ) \ +IF_HAVE_PG_XPFO(PG_xpfo_unmapped, "xpfo_unmapped" ) \ #define show_page_flags(flags) \ (flags) ? 
__print_flags(flags, "|", \ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 08e277790b5f..d00382b20001 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1024,6 +1024,7 @@ static __always_inline bool free_pages_prepare(struct page *page, if (bad) return false; + xpfo_free_pages(page, order); page_cpupid_reset_last(page); page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; reset_page_owner(page, order); @@ -1038,7 +1039,6 @@ static __always_inline bool free_pages_prepare(struct page *page, kernel_poison_pages(page, 1 << order, 0); kernel_map_pages(page, 1 << order, 0); kasan_free_pages(page, order); - xpfo_free_pages(page, order); return true; } @@ -1191,6 +1191,7 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn, if (!is_highmem_idx(zone)) set_page_address(page, __va(pfn << PAGE_SHIFT)); #endif + xpfo_init_single_page(page); } #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT diff --git a/mm/page_ext.c b/mm/page_ext.c index 38e5013dcb9a..ae44f7adbe07 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -8,7 +8,6 @@ #include #include #include -#include /* * struct page extension @@ -69,9 +68,6 @@ static struct page_ext_operations *page_ext_ops[] = { #if defined(CONFIG_IDLE_PAGE_TRACKING) && !defined(CONFIG_64BIT) &page_idle_ops, #endif -#ifdef CONFIG_XPFO - &page_xpfo_ops, -#endif }; static unsigned long total_usage; diff --git a/mm/xpfo.c b/mm/xpfo.c index e80374b0c78e..cbfeafc2f10f 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -16,33 +16,16 @@ #include #include #include -#include #include #include -/* XPFO page state flags */ -enum xpfo_flags { - XPFO_PAGE_USER, /* Page is allocated to user-space */ - XPFO_PAGE_UNMAPPED, /* Page is unmapped from the linear map */ -}; - -/* Per-page XPFO house-keeping data */ -struct xpfo { - unsigned long flags; /* Page state */ - bool inited; /* Map counter and lock initialized */ - atomic_t mapcount; /* Counter for balancing map/unmap requests */ - spinlock_t maplock; /* Lock to serialize map/unmap requests */ -}; - -DEFINE_STATIC_KEY_FALSE(xpfo_inited); +DEFINE_STATIC_KEY_TRUE(xpfo_inited); DEFINE_STATIC_KEY_FALSE(xpfo_do_tlb_flush); -static bool xpfo_disabled __initdata; - static int __init noxpfo_param(char *str) { - xpfo_disabled = true; + static_branch_disable(&xpfo_inited); return 0; } @@ -57,34 +40,13 @@ static int __init xpfotlbflush_param(char *str) early_param("noxpfo", noxpfo_param); early_param("xpfotlbflush", xpfotlbflush_param); -static bool __init need_xpfo(void) -{ - if (xpfo_disabled) { - printk(KERN_INFO "XPFO disabled\n"); - return false; - } - - return true; -} - -static void init_xpfo(void) -{ - printk(KERN_INFO "XPFO enabled\n"); - static_branch_enable(&xpfo_inited); -} - -struct page_ext_operations page_xpfo_ops = { - .size = sizeof(struct xpfo), - .need = need_xpfo, - .init = init_xpfo, -}; - bool __init xpfo_enabled(void) { - return !xpfo_disabled; + if (!static_branch_unlikely(&xpfo_inited)) + return false; + else + return true; } -EXPORT_SYMBOL(xpfo_enabled); - static void xpfo_cond_flush_kernel_tlb(struct page *page, int order) { @@ -92,58 +54,40 @@ static void xpfo_cond_flush_kernel_tlb(struct page *page, int order) xpfo_flush_kernel_tlb(page, order); } -static inline struct xpfo *lookup_xpfo(struct page *page) +void __meminit xpfo_init_single_page(struct page *page) { - struct page_ext *page_ext = lookup_page_ext(page); - - if (unlikely(!page_ext)) { - WARN(1, "xpfo: failed to get page ext"); - return NULL; - } - - return (void *)page_ext + page_xpfo_ops.offset; + spin_lock_init(&page->xpfo_lock); } void 
xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { int i, flush_tlb = 0; - struct xpfo *xpfo; if (!static_branch_unlikely(&xpfo_inited)) return; for (i = 0; i < (1 << order); i++) { - xpfo = lookup_xpfo(page + i); - if (!xpfo) - continue; - - WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), - "xpfo: unmapped page being allocated\n"); - - /* Initialize the map lock and map counter */ - if (unlikely(!xpfo->inited)) { - spin_lock_init(&xpfo->maplock); - atomic_set(&xpfo->mapcount, 0); - xpfo->inited = true; - } - WARN(atomic_read(&xpfo->mapcount), - "xpfo: already mapped page being allocated\n"); - +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(PageXpfoUser(page + i)); + BUG_ON(PageXpfoUnmapped(page + i)); + BUG_ON(spin_is_locked(&(page + i)->xpfo_lock)); + BUG_ON(atomic_read(&(page + i)->xpfo_mapcount)); +#endif if ((gfp & GFP_HIGHUSER) == GFP_HIGHUSER) { if (static_branch_unlikely(&xpfo_do_tlb_flush)) { /* * Tag the page as a user page and flush the TLB if it * was previously allocated to the kernel. */ - if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!TestSetPageXpfoUser(page + i)) flush_tlb = 1; } else { - set_bit(XPFO_PAGE_USER, &xpfo->flags); + SetPageXpfoUser(page + i); } } else { /* Tag the page as a non-user (kernel) page */ - clear_bit(XPFO_PAGE_USER, &xpfo->flags); + ClearPageXpfoUser(page + i); } } @@ -154,27 +98,21 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) void xpfo_free_pages(struct page *page, int order) { int i; - struct xpfo *xpfo; if (!static_branch_unlikely(&xpfo_inited)) return; for (i = 0; i < (1 << order); i++) { - xpfo = lookup_xpfo(page + i); - if (!xpfo || unlikely(!xpfo->inited)) { - /* - * The page was allocated before page_ext was - * initialized, so it is a kernel page. - */ - continue; - } +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(atomic_read(&(page + i)->xpfo_mapcount)); +#endif /* * Map the page back into the kernel if it was previously * allocated to user space. */ - if (test_and_clear_bit(XPFO_PAGE_USER, &xpfo->flags)) { - clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + if (TestClearPageXpfoUser(page + i)) { + ClearPageXpfoUnmapped(page + i); set_kpte(page_address(page + i), page + i, PAGE_KERNEL); } @@ -183,84 +121,56 @@ void xpfo_free_pages(struct page *page, int order) void xpfo_kmap(void *kaddr, struct page *page) { - struct xpfo *xpfo; - if (!static_branch_unlikely(&xpfo_inited)) return; - xpfo = lookup_xpfo(page); - - /* - * The page was allocated before page_ext was initialized (which means - * it's a kernel page) or it's allocated to the kernel, so nothing to - * do. - */ - if (!xpfo || unlikely(!xpfo->inited) || - !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!PageXpfoUser(page)) return; - spin_lock(&xpfo->maplock); + spin_lock(&page->xpfo_lock); /* * The page was previously allocated to user space, so map it back * into the kernel. No TLB flush required. */ - if ((atomic_inc_return(&xpfo->mapcount) == 1) && - test_and_clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags)) + if ((atomic_inc_return(&page->xpfo_mapcount) == 1) && + TestClearPageXpfoUnmapped(page)) set_kpte(kaddr, page, PAGE_KERNEL); - spin_unlock(&xpfo->maplock); + spin_unlock(&page->xpfo_lock); } EXPORT_SYMBOL(xpfo_kmap); void xpfo_kunmap(void *kaddr, struct page *page) { - struct xpfo *xpfo; - if (!static_branch_unlikely(&xpfo_inited)) return; - xpfo = lookup_xpfo(page); - - /* - * The page was allocated before page_ext was initialized (which means - * it's a kernel page) or it's allocated to the kernel, so nothing to - * do. 
- */ - if (!xpfo || unlikely(!xpfo->inited) || - !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!PageXpfoUser(page)) return; - spin_lock(&xpfo->maplock); + spin_lock(&page->xpfo_lock); /* * The page is to be allocated back to user space, so unmap it from the * kernel, flush the TLB and tag it as a user page. */ - if (atomic_dec_return(&xpfo->mapcount) == 0) { - WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), - "xpfo: unmapping already unmapped page\n"); - set_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + if (atomic_dec_return(&page->xpfo_mapcount) == 0) { +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(PageXpfoUnmapped(page)); +#endif + SetPageXpfoUnmapped(page); set_kpte(kaddr, page, __pgprot(0)); xpfo_cond_flush_kernel_tlb(page, 0); } - spin_unlock(&xpfo->maplock); + spin_unlock(&page->xpfo_lock); } EXPORT_SYMBOL(xpfo_kunmap); bool xpfo_page_is_unmapped(struct page *page) { - struct xpfo *xpfo; - - if (!static_branch_unlikely(&xpfo_inited)) - return false; - - xpfo = lookup_xpfo(page); - if (unlikely(!xpfo) && !xpfo->inited) - return false; - - return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + return PageXpfoUnmapped(page); } EXPORT_SYMBOL(xpfo_page_is_unmapped); diff --git a/security/Kconfig b/security/Kconfig index 8d0e4e303551..c7c581bac963 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -13,7 +13,6 @@ config XPFO bool "Enable eXclusive Page Frame Ownership (XPFO)" default n depends on ARCH_SUPPORTS_XPFO - select PAGE_EXTENSION help This option offers protection against 'ret2dir' kernel attacks. When enabled, every time a page frame is allocated to user space, it @@ -25,6 +24,17 @@ config XPFO If in doubt, say "N". +config XPFO_DEBUG + bool "Enable debugging of XPFO" + default n + depends on XPFO + help + Enables additional checking of XPFO data structures that help find + bugs in the XPFO implementation. This option comes with a slight + performance cost. + + If in doubt, say "N". 
+ config SECURITY_DMESG_RESTRICT bool "Restrict unprivileged access to the kernel syslog" default n From patchwork Thu Jan 10 21:09:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756945 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC335159A for ; Thu, 10 Jan 2019 21:13:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A9EB329BCC for ; Thu, 10 Jan 2019 21:13:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 99FAE29BE7; Thu, 10 Jan 2019 21:13:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id A46C129BCC for ; Thu, 10 Jan 2019 21:13:03 +0000 (UTC) Received: (qmail 15454 invoked by uid 550); 10 Jan 2019 21:11:26 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 15395 invoked from network); 10 Jan 2019 21:11:24 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=UbctwoSk8tN3IcjUiRawhNiiTGCYFxkfE1vdJR34/p4=; b=pXlQwsE6eT5Gwd80itwTVz5RN+Ix5BpvxJmBfde1uZygV+Rslofe7x724kOLvkrhuoDm VaufXFJc0x/bhg4csb86xgTa5oRsgGXNUwpk6jBaSYVCOK08HIMLKrRQFFXQuF/ljQLa c9M5nGEttCW3mctUWC2vYB1gn4wLmkHIUFEx/Aw1I+7qiQ3knEVUexcfa2bolqzEyvV3 LmtzvRDDdYXn9ZOKH/gctHRPShqvLWBf/rOrAA8BIVQSKsMuP44bA/KoMe+nCz6XDCkg Fj6Mqgg33kI6zR9S0fSK8spVh3Dy3zDN5Jmnk4/mwptg9ykfT++lnvDkaEVHNpTf9Xm8 jw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, "Vasileios P . 
Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse , Khalid Aziz Subject: [RFC PATCH v7 13/16] xpfo, mm: optimize spinlock usage in xpfo_kunmap Date: Thu, 10 Jan 2019 14:09:45 -0700 Message-Id: <95b6fa40ce6c7afb4a9e58f8d747d86aa7a94177.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=822 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina Only the xpfo_kunmap call that needs to actually unmap the page needs to be serialized. We need to be careful to handle the case, where after the atomic decrement of the mapcount, a xpfo_kmap increased the mapcount again. In this case, we can safely skip modifying the page table. Model-checked with up to 4 concurrent callers with Spin. Signed-off-by: Julian Stecklina Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse Signed-off-by: Khalid Aziz --- mm/xpfo.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/mm/xpfo.c b/mm/xpfo.c index cbfeafc2f10f..dbf20efb0499 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -149,22 +149,24 @@ void xpfo_kunmap(void *kaddr, struct page *page) if (!PageXpfoUser(page)) return; - spin_lock(&page->xpfo_lock); - /* * The page is to be allocated back to user space, so unmap it from the * kernel, flush the TLB and tag it as a user page. */ if (atomic_dec_return(&page->xpfo_mapcount) == 0) { -#ifdef CONFIG_XPFO_DEBUG - BUG_ON(PageXpfoUnmapped(page)); -#endif - SetPageXpfoUnmapped(page); - set_kpte(kaddr, page, __pgprot(0)); - xpfo_cond_flush_kernel_tlb(page, 0); - } + spin_lock(&page->xpfo_lock); - spin_unlock(&page->xpfo_lock); + /* + * In the case, where we raced with kmap after the + * atomic_dec_return, we must not nuke the mapping. 
+ */ + if (atomic_read(&page->xpfo_mapcount) == 0) { + SetPageXpfoUnmapped(page); + set_kpte(kaddr, page, __pgprot(0)); + xpfo_cond_flush_kernel_tlb(page, 0); + } + spin_unlock(&page->xpfo_lock); + } } EXPORT_SYMBOL(xpfo_kunmap); From patchwork Thu Jan 10 21:09:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756943 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E94BC159A for ; Thu, 10 Jan 2019 21:12:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7FF429BCC for ; Thu, 10 Jan 2019 21:12:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CBB3629BE7; Thu, 10 Jan 2019 21:12:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 93F0329BCC for ; Thu, 10 Jan 2019 21:12:51 +0000 (UTC) Received: (qmail 14285 invoked by uid 550); 10 Jan 2019 21:11:23 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 14167 invoked from network); 10 Jan 2019 21:11:22 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=Hk38+nWLDl3nLi7YTCJ6Yd0tzUi3GaMmOjk//H1EH/4=; b=uhYxXSO/+zuWC4vaHaHuMFTavxikTbqRP/9BSmynD6o6Qfs9s/b9ru9AGepw0APjML8L t/zcQ5raQgd2hxnEuOIeQ3tDxSbyqTKVT6FNZFFa7ljeaVBa3CQAqtwsFUa9oJgUHMEN J1/5tStu0S/5soyx7I4gstQJ0JFuHXHoL4NIlQduCj46WdjUo+HRUfM3GlbCBZPjpZCP meRR+n3PdKItdFWjds7Pzwo05wntxRAKFMVuyxYxPHw2u2pUYySmPVsmEqcZwrgyZXTm Br+2+S7rp8VHwZgDUt9kxwdqx34UbClUTG1UlvtiwhHrxvZFpc1GovmwTm8BchLw/gP8 qQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, "Vasileios P . 
Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse , Khalid Aziz Subject: [RFC PATCH v7 14/16] EXPERIMENTAL: xpfo, mm: optimize spin lock usage in xpfo_kmap Date: Thu, 10 Jan 2019 14:09:46 -0700 Message-Id: <7e8e17f519ae87a91fc6cbb57b8b27094c96305c.1547153058.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1901100164 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina We can reduce spin lock usage in xpfo_kmap to the 0->1 transition of the mapcount. This means that xpfo_kmap() can now race and that we get spurious page faults. The page fault handler helps the system make forward progress by fixing the page table instead of allowing repeated page faults until the right xpfo_kmap went through. Model-checked with up to 4 concurrent callers with Spin. Signed-off-by: Julian Stecklina Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse Signed-off-by: Khalid Aziz --- arch/x86/mm/fault.c | 4 ++++ include/linux/xpfo.h | 4 ++++ mm/xpfo.c | 50 +++++++++++++++++++++++++++++++++++++------- 3 files changed, 51 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index ba51652fbd33..207081dcd572 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -18,6 +18,7 @@ #include /* faulthandler_disabled() */ #include /* efi_recover_from_page_fault()*/ #include +#include #include /* boot_cpu_has, ... */ #include /* dotraplinkage, ... 
*/ @@ -1218,6 +1219,9 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code, if (kprobes_fault(regs)) return; + if (xpfo_spurious_fault(address)) + return; + /* * Note, despite being a "bad area", there are quite a few * acceptable reasons to get here, such as erratum fixups diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index ea5188882f49..58dd243637d2 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -54,6 +54,8 @@ bool xpfo_enabled(void); phys_addr_t user_virt_to_phys(unsigned long addr); +bool xpfo_spurious_fault(unsigned long addr); + #else /* !CONFIG_XPFO */ static inline void xpfo_init_single_page(struct page *page) { } @@ -81,6 +83,8 @@ static inline bool xpfo_enabled(void) { return false; } static inline phys_addr_t user_virt_to_phys(unsigned long addr) { return 0; } +static inline bool xpfo_spurious_fault(unsigned long addr) { return false; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/xpfo.c b/mm/xpfo.c index dbf20efb0499..85079377c91d 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -119,6 +119,16 @@ void xpfo_free_pages(struct page *page, int order) } } +static void xpfo_do_map(void *kaddr, struct page *page) +{ + spin_lock(&page->xpfo_lock); + if (PageXpfoUnmapped(page)) { + set_kpte(kaddr, page, PAGE_KERNEL); + ClearPageXpfoUnmapped(page); + } + spin_unlock(&page->xpfo_lock); +} + void xpfo_kmap(void *kaddr, struct page *page) { if (!static_branch_unlikely(&xpfo_inited)) @@ -127,17 +137,12 @@ void xpfo_kmap(void *kaddr, struct page *page) if (!PageXpfoUser(page)) return; - spin_lock(&page->xpfo_lock); - /* * The page was previously allocated to user space, so map it back * into the kernel. No TLB flush required. */ - if ((atomic_inc_return(&page->xpfo_mapcount) == 1) && - TestClearPageXpfoUnmapped(page)) - set_kpte(kaddr, page, PAGE_KERNEL); - - spin_unlock(&page->xpfo_lock); + if (atomic_inc_return(&page->xpfo_mapcount) == 1) + xpfo_do_map(kaddr, page); } EXPORT_SYMBOL(xpfo_kmap); @@ -204,3 +209,34 @@ void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, kunmap_atomic(mapping[i]); } EXPORT_SYMBOL(xpfo_temp_unmap); + +bool xpfo_spurious_fault(unsigned long addr) +{ + struct page *page; + bool spurious; + int mapcount; + + if (!static_branch_unlikely(&xpfo_inited)) + return false; + + /* XXX Is this sufficient to guard against calling virt_to_page() on a + * virtual address that has no corresponding struct page? */ + if (!virt_addr_valid(addr)) + return false; + + page = virt_to_page(addr); + mapcount = atomic_read(&page->xpfo_mapcount); + spurious = PageXpfoUser(page) && mapcount; + + /* Guarantee forward progress in case xpfo_kmap() raced. 
*/ + if (spurious && PageXpfoUnmapped(page)) { + xpfo_do_map((void *)(addr & PAGE_MASK), page); + } + + if (unlikely(!spurious)) + printk("XPFO non-spurious fault %lx user=%d unmapped=%d mapcount=%d\n", + addr, PageXpfoUser(page), PageXpfoUnmapped(page), + mapcount); + + return spurious; +} From patchwork Thu Jan 10 21:09:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10756939 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5DA913B5 for ; Thu, 10 Jan 2019 21:12:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B1E4029BA0 for ; Thu, 10 Jan 2019 21:12:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A330F29BE7; Thu, 10 Jan 2019 21:12:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D615229BA0 for ; Thu, 10 Jan 2019 21:12:28 +0000 (UTC) Received: (qmail 13485 invoked by uid 550); 10 Jan 2019 21:11:13 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 13367 invoked from network); 10 Jan 2019 21:11:12 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=un0wieK7yWFodUoPf/TbQJCMffDoPehxHgZRzLFUW2o=; b=chPszpAiJAbkg+zkadcOBGkMBvHUigW7P3J5U9kXh/4WR4WaTdIqSIJKmzOC/NYFDsyM MrFViq/HYuHukuzNxvfAtgv5S+929Mdu9AdazByjvd+OPJMDXYmo8iFtHbKf7joRxxl/ QVImhgrAo9iB2tmKxcqYcyE2n9GTsgn+pHOsq5kbq0i5OjnVPX2uEIwCk/3vgDm7ck2L gdt2Jf6KBycEqmPAG9bKtRrQvpN6O6xgeHZxhQyKMbbVoWzO9mVK92OnrDy3zK7hhBVl 4ErVypdkvqQImbKrKUjb7pLwtF16Ii9FhyFfMXHHcbV30+4FHsmkdvwuL+W2nUMEn+cL gQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, konrad.wilk@oracle.com Cc: Khalid Aziz , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v7 15/16] xpfo, mm: Fix hang when booting with "xpfotlbflush" Date: Thu, 10 Jan 2019 14:09:47 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9132 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 
From patchwork Thu Jan 10 21:09:47 2019
X-Patchwork-Submitter: Khalid Aziz
X-Patchwork-Id: 10756939
From: Khalid Aziz
To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com,
 torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com,
 konrad.wilk@oracle.com
Cc: Khalid Aziz , deepa.srinivasan@oracle.com, chris.hyser@oracle.com,
 tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com,
 jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com,
 joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com,
 john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com,
 hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v7 15/16] xpfo, mm: Fix hang when booting with "xpfotlbflush"
Date: Thu, 10 Jan 2019 14:09:47 -0700

The kernel hangs when booted with the "xpfotlbflush" option. This is
caused by xpfo_kunmap() flushing the TLB while holding the xpfo lock,
starving other tasks waiting for the lock. This patch moves the TLB
flush outside of the code that holds the xpfo lock.

Signed-off-by: Khalid Aziz
---
 mm/xpfo.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/xpfo.c b/mm/xpfo.c
index 85079377c91d..79ffdba6af69 100644
--- a/mm/xpfo.c
+++ b/mm/xpfo.c
@@ -148,6 +148,8 @@ EXPORT_SYMBOL(xpfo_kmap);
 
 void xpfo_kunmap(void *kaddr, struct page *page)
 {
+	bool flush_tlb = false;
+
 	if (!static_branch_unlikely(&xpfo_inited))
 		return;
 
@@ -168,10 +170,13 @@ void xpfo_kunmap(void *kaddr, struct page *page)
 		if (atomic_read(&page->xpfo_mapcount) == 0) {
 			SetPageXpfoUnmapped(page);
 			set_kpte(kaddr, page, __pgprot(0));
-			xpfo_cond_flush_kernel_tlb(page, 0);
+			flush_tlb = true;
 		}
 		spin_unlock(&page->xpfo_lock);
 	}
+
+	if (flush_tlb)
+		xpfo_cond_flush_kernel_tlb(page, 0);
 }
 EXPORT_SYMBOL(xpfo_kunmap);
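Reduced to a userspace sketch, the fix follows the common pattern of recording the work while holding the lock and performing it only after the unlock. The names below are hypothetical; slow_tlb_flush() merely stands in for xpfo_cond_flush_kernel_tlb() and a mutex stands in for the xpfo spinlock.

/* Build: cc -std=c11 -pthread -o sketch sketch.c  (hypothetical model) */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t xpfo_lock = PTHREAD_MUTEX_INITIALIZER;
static int mapcount = 1;
static bool unmapped;

static void slow_tlb_flush(void)
{
	/* imagine a cross-CPU TLB shootdown here */
	puts("flushing after the lock was released");
}

static void model_kunmap(void)
{
	bool flush_tlb = false;

	pthread_mutex_lock(&xpfo_lock);
	if (--mapcount == 0) {
		unmapped = true;	/* SetPageXpfoUnmapped() + set_kpte(0) */
		flush_tlb = true;	/* remember the flush, do not do it yet */
	}
	pthread_mutex_unlock(&xpfo_lock);

	if (flush_tlb)
		slow_tlb_flush();	/* waiters on xpfo_lock no longer starve */
}

int main(void)
{
	model_kunmap();
	return 0;
}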
From patchwork Thu Jan 10 21:09:48 2019
X-Patchwork-Submitter: Khalid Aziz
X-Patchwork-Id: 10756941
From: Khalid Aziz
To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com,
 torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com,
 konrad.wilk@oracle.com
Cc: Khalid Aziz , deepa.srinivasan@oracle.com, chris.hyser@oracle.com,
 tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com,
 jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com,
 joao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com,
 john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com,
 hch@lst.de, steven.sistare@oracle.com, kernel-hardening@lists.openwall.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v7 16/16] xpfo, mm: Defer TLB flushes for non-current CPUs (x86 only)
Date: Thu, 10 Jan 2019 14:09:48 -0700

XPFO flushes kernel space TLB entries for pages that are now mapped in
userspace, not only on the current CPU but on all other CPUs as well. If
the number of TLB entries to flush exceeds tlb_single_page_flush_ceiling,
this results in the entire TLB being flushed on all CPUs. A malicious
userspace app can exploit the dual mapping of a physical page caused by
the physmap only on the CPU it is running on, so there is no good reason
to incur the very high cost of a TLB flush on CPUs that may never run the
malicious app or do not hold any TLB entries for it. The cost of a full
TLB flush goes up dramatically on machines with high core counts.

This patch flushes the relevant TLB entries for the current process, or
the entire TLB (depending upon the number of entries), on the current CPU
only, and posts a pending TLB flush on all other CPUs when a page is
unmapped from kernel space and mapped into userspace. The pending TLB
flush is posted for each task separately, and the TLB is flushed on a CPU
when a task that has a pending flush posted for that CPU is scheduled
onto it. This patch does two things: (1) it potentially aggregates
multiple TLB flushes into one, and (2) it avoids TLB flushes on CPUs that
never run the task that caused the flush. The impact is very significant,
especially on machines with large core counts. To illustrate this, the
kernel was compiled with "make -j" on two classes of machines - a server
with a high core count and a large amount of memory, and a desktop-class
machine with more modest specs. System times for the vanilla 4.20 kernel,
4.20 with the XPFO patches before applying this patch, and after applying
this patch are below:

Hardware: 96-core Intel Xeon Platinum 8160 CPU @ 2.10GHz, 768 GB RAM
make -j60 all:
  4.20                          915.183s
  4.19+XPFO                   24129.354s   26.366x
  4.19+XPFO+Deferred flush     1216.987s    1.330x

Hardware: 4-core Intel Core i5-3550 CPU @ 3.30GHz, 8 GB RAM
make -j4 all:
  4.20                          607.671s
  4.19+XPFO                    1588.646s    2.614x
  4.19+XPFO+Deferred flush      794.473s    1.307x

This patch could use more optimization. For instance, it posts a pending
full TLB flush for other CPUs even when the number of TLB entries being
flushed does not exceed tlb_single_page_flush_ceiling. Batching more TLB
entry flushes, as was suggested for an earlier version of these patches,
can help reduce such cases. The same approach should be implemented for
other architectures as well once finalized.
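A rough userspace model of the bookkeeping described above may help. The names and the fixed 8-CPU bitmask below are hypothetical; the real patch uses a cpumask_t in task_struct, sets it from xpfo_flush_tlb_kernel_range(), and checks it in the switch_mm_irqs_off() path.

/* Build: cc -std=c11 -o model model.c  (hypothetical model, not the patch) */
#include <stdint.h>
#include <stdio.h>

#define NR_FAKE_CPUS 8

struct fake_task {
	uint32_t pending_xpfo_flush;	/* one bit per CPU */
};

/* A page is unmapped from the kernel while the task runs on @cpu. */
static void post_deferred_flush(struct fake_task *t, int cpu)
{
	printf("cpu%d: flush TLB range now\n", cpu);	/* local flush only */
	/* every other CPU gets a pending full flush instead of an IPI */
	t->pending_xpfo_flush = ((1u << NR_FAKE_CPUS) - 1) & ~(1u << cpu);
}

/* Task @t is being scheduled onto @cpu (the context-switch hook). */
static void flush_if_pending(struct fake_task *t, int cpu)
{
	if (t->pending_xpfo_flush & (1u << cpu)) {
		t->pending_xpfo_flush &= ~(1u << cpu);
		printf("cpu%d: deferred full TLB flush\n", cpu);
	} else {
		printf("cpu%d: nothing pending\n", cpu);
	}
}

int main(void)
{
	struct fake_task t = { 0 };

	post_deferred_flush(&t, 0);	/* unmap happens on CPU 0        */
	flush_if_pending(&t, 3);	/* task later migrates to CPU 3  */
	flush_if_pending(&t, 3);	/* second switch: already flushed */
	return 0;
}

The design trade-off is that CPUs which never run the task never pay for the flush, at the price of a full (rather than ranged) flush on the CPUs that eventually do.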
Signed-off-by: Khalid Aziz
---
 arch/x86/include/asm/tlbflush.h |  1 +
 arch/x86/mm/tlb.c               | 27 +++++++++++++++++++++++++++
 arch/x86/mm/xpfo.c              |  2 +-
 include/linux/sched.h           |  9 +++++++++
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index f4204bf377fc..92d23629d01d 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -561,6 +561,7 @@ extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+extern void xpfo_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
 {
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 03b6b4c2238d..b04a501c850b 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -319,6 +319,15 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		__flush_tlb_all();
 	}
 #endif
+
+	/* If there is a pending TLB flush for this CPU due to XPFO
+	 * flush, do it now.
+	 */
+	if (tsk && cpumask_test_and_clear_cpu(cpu, &tsk->pending_xpfo_flush)) {
+		count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
+		__flush_tlb_all();
+	}
+
 	this_cpu_write(cpu_tlbstate.is_lazy, false);
 
 	/*
@@ -801,6 +810,24 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 	}
 }
 
+void xpfo_flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+
+	/* Balance as user space task's flush, a bit conservative */
+	if (end == TLB_FLUSH_ALL ||
+	    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
+		do_flush_tlb_all(NULL);
+	} else {
+		struct flush_tlb_info info;
+
+		info.start = start;
+		info.end = end;
+		do_kernel_range_flush(&info);
+	}
+	cpumask_setall(&current->pending_xpfo_flush);
+	cpumask_clear_cpu(smp_processor_id(), &current->pending_xpfo_flush);
+}
+
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
 	struct flush_tlb_info info = {
diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c
index bcdb2f2089d2..5aa17cb2c813 100644
--- a/arch/x86/mm/xpfo.c
+++ b/arch/x86/mm/xpfo.c
@@ -110,7 +110,7 @@ inline void xpfo_flush_kernel_tlb(struct page *page, int order)
 		return;
 	}
 
-	flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size);
+	xpfo_flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size);
 }
 
 /* Convert a user space virtual address to a physical address.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 291a9bd5b97f..ba298be3b5a1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1206,6 +1206,15 @@ struct task_struct {
 	unsigned long			prev_lowest_stack;
 #endif
 
+	/*
+	 * When a full TLB flush is needed to flush stale TLB entries
+	 * for pages that have been mapped into userspace and unmapped
+	 * from kernel space, this TLB flush will be delayed until the
+	 * task is scheduled on that CPU. Keep track of CPUs with
+	 * pending full TLB flush forced by xpfo.
+	 */
+	cpumask_t			pending_xpfo_flush;
+
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.