From patchwork Thu Feb 14 00:01:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811423 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CAC8513A4 for ; Thu, 14 Feb 2019 00:05:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B66092DCD3 for ; Thu, 14 Feb 2019 00:05:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A56DE2DD6F; Thu, 14 Feb 2019 00:05:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id AF3942DCD3 for ; Thu, 14 Feb 2019 00:05:02 +0000 (UTC) Received: (qmail 26169 invoked by uid 550); 14 Feb 2019 00:04:28 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 26106 invoked from network); 14 Feb 2019 00:04:27 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=BnvBthvqyUR+BETbLnaqcFAixVQhrRUG4KWFDYKw33s=; b=mcr+aLBjXAXSptqDtlz0VxxfWE1bq8ULxWcJo1ATX2NiCgH2AWLADjbr/HqYkpk9odGK ChvFnz1ZxMVxKCJ0jfXdJW5TYWVBwj+Ty3/i6pxWwCEiACxkvV13/+T2YmPTX1CBTwJG UBQhh1BtAVY8KyQ7Wk/NDQM5Umuvxs3VEJF2d3Xv4chhe+jqnXx8PoRI61KUxuMIHH1j hzaCrG4RkCU1W8fxes/FLLMZ7uQSL15sc1ZI1d0I09k3vAa0Q8LpfMdnyskkbMB7yMLc eVkiHdh8fE/ZM3EnxXIAPz7Bn0aTkW7mrmcxMY3krFlvZUEFc5Jyqgv8meeyiWbyIGX9 Lw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 01/14] mm: add MAP_HUGETLB support to vm_mmap Date: Wed, 13 Feb 2019 17:01:24 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen vm_mmap is exported, which means kernel modules can use it. In particular, for testing XPFO support, we want to use it with the MAP_HUGETLB flag, so let's support it via vm_mmap. Signed-off-by: Tycho Andersen Tested-by: Marco Benatto Tested-by: Khalid Aziz --- include/linux/mm.h | 2 ++ mm/mmap.c | 19 +------------------ mm/util.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 35 insertions(+), 18 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 5411de93a363..30bddc7b3c75 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2361,6 +2361,8 @@ struct vm_unmapped_area_info { extern unsigned long unmapped_area(struct vm_unmapped_area_info *info); extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info); +struct file *map_hugetlb_setup(unsigned long *len, unsigned long flags); + /* * Search for an unmapped address range. * diff --git a/mm/mmap.c b/mm/mmap.c index 6c04292e16a7..c668d7d27c2b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1582,24 +1582,7 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file))) goto out_fput; } else if (flags & MAP_HUGETLB) { - struct user_struct *user = NULL; - struct hstate *hs; - - hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); - if (!hs) - return -EINVAL; - - len = ALIGN(len, huge_page_size(hs)); - /* - * VM_NORESERVE is used because the reservations will be - * taken when vm_ops->mmap() is called - * A dummy user value is used because we are not locking - * memory so no accounting is necessary - */ - file = hugetlb_file_setup(HUGETLB_ANON_FILE, len, - VM_NORESERVE, - &user, HUGETLB_ANONHUGE_INODE, - (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); + file = map_hugetlb_setup(&len, flags); if (IS_ERR(file)) return PTR_ERR(file); } diff --git a/mm/util.c b/mm/util.c index 8bf08b5b5760..536c14cf88ba 100644 --- a/mm/util.c +++ b/mm/util.c @@ -357,6 +357,29 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, return ret; } +struct file *map_hugetlb_setup(unsigned long *len, unsigned long flags) +{ + struct user_struct *user = NULL; + struct hstate *hs; + + hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); + if (!hs) + return ERR_PTR(-EINVAL); + + *len = ALIGN(*len, huge_page_size(hs)); + + /* + * VM_NORESERVE is used because the reservations will be + * taken when vm_ops->mmap() is called + * A dummy user value is used because we are not locking + * memory so no accounting is necessary + */ + return hugetlb_file_setup(HUGETLB_ANON_FILE, *len, + VM_NORESERVE, + &user, HUGETLB_ANONHUGE_INODE, + (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK); +} + unsigned long vm_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flag, unsigned long offset) @@ -366,6 +389,15 @@ unsigned long vm_mmap(struct file *file, unsigned long addr, if (unlikely(offset_in_page(offset))) return -EINVAL; + if (flag & MAP_HUGETLB) { + if (file) + return -EINVAL; + + file = map_hugetlb_setup(&len, flag); + if (IS_ERR(file)) + return PTR_ERR(file); + } + return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT); } EXPORT_SYMBOL(vm_mmap); From patchwork Thu Feb 14 00:01:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811403 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 36F716C2 for ; Thu, 14 Feb 2019 00:03:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26EAA28C19 for ; Thu, 14 Feb 2019 00:03:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1A3BA28DD6; Thu, 14 Feb 2019 00:03:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 3570228C19 for ; Thu, 14 Feb 2019 00:03:54 +0000 (UTC) Received: (qmail 15880 invoked by uid 550); 14 Feb 2019 00:03:11 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 15780 invoked from network); 14 Feb 2019 00:03:10 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=BVvtye26KZIs7muho5hLhXsAoOpu90yJN7XQD4ZHF6s=; b=fZ7BPquZinNkNfviJmu2aj5OWxvfaEW6ze9/DO2c6uAP65Cc4e4Cl3T1X2A+uS3/Cnnj FXw2tAJbgFiECJbQ4WvubRBn8pxAj4qhxLwEmJiaDQgis7w2FmPwW2j42M7DueNqyBGy EyJX2fW7sQZHaipQdaHrDl4Ch27fK6OjK5xHCrNaGpwnUD2AjFaKdF5KlkO1RjMlgz7E wO+hNUWeMh8kKPGfqr2rZVzyWdP8lbCxRvKmyeYtHEahIc8wLWsnl9jqQNQFXSxMWNlu 3RxQ4F/6tOJWAzRs4vywbeqWiVdLGC3PRvx47Ai+oOoMdEY1LwrLa/ajJI34d9cE7gWR uA== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 02/14] x86: always set IF before oopsing from page fault Date: Wed, 13 Feb 2019 17:01:25 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=885 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen Oopsing might kill the task, via rewind_stack_do_exit() at the bottom, and that might sleep: Aug 23 19:30:27 xpfo kernel: [ 38.302714] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33 Aug 23 19:30:27 xpfo kernel: [ 38.303837] in_atomic(): 0, irqs_disabled(): 1, pid: 1970, name: lkdtm_xpfo_test Aug 23 19:30:27 xpfo kernel: [ 38.304758] CPU: 3 PID: 1970 Comm: lkdtm_xpfo_test Tainted: G D 4.13.0-rc5+ #228 Aug 23 19:30:27 xpfo kernel: [ 38.305813] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Aug 23 19:30:27 xpfo kernel: [ 38.306926] Call Trace: Aug 23 19:30:27 xpfo kernel: [ 38.307243] dump_stack+0x63/0x8b Aug 23 19:30:27 xpfo kernel: [ 38.307665] ___might_sleep+0xec/0x110 Aug 23 19:30:27 xpfo kernel: [ 38.308139] __might_sleep+0x45/0x80 Aug 23 19:30:27 xpfo kernel: [ 38.308593] exit_signals+0x21/0x1c0 Aug 23 19:30:27 xpfo kernel: [ 38.309046] ? blocking_notifier_call_chain+0x11/0x20 Aug 23 19:30:27 xpfo kernel: [ 38.309677] do_exit+0x98/0xbf0 Aug 23 19:30:27 xpfo kernel: [ 38.310078] ? smp_reader+0x27/0x40 [lkdtm] Aug 23 19:30:27 xpfo kernel: [ 38.310604] ? kthread+0x10f/0x150 Aug 23 19:30:27 xpfo kernel: [ 38.311045] ? read_user_with_flags+0x60/0x60 [lkdtm] Aug 23 19:30:27 xpfo kernel: [ 38.311680] rewind_stack_do_exit+0x17/0x20 To be safe, let's just always enable irqs. The particular case I'm hitting is: Aug 23 19:30:27 xpfo kernel: [ 38.278615] __bad_area_nosemaphore+0x1a9/0x1d0 Aug 23 19:30:27 xpfo kernel: [ 38.278617] bad_area_nosemaphore+0xf/0x20 Aug 23 19:30:27 xpfo kernel: [ 38.278618] __do_page_fault+0xd1/0x540 Aug 23 19:30:27 xpfo kernel: [ 38.278620] ? irq_work_queue+0x9b/0xb0 Aug 23 19:30:27 xpfo kernel: [ 38.278623] ? wake_up_klogd+0x36/0x40 Aug 23 19:30:27 xpfo kernel: [ 38.278624] trace_do_page_fault+0x3c/0xf0 Aug 23 19:30:27 xpfo kernel: [ 38.278625] do_async_page_fault+0x14/0x60 Aug 23 19:30:27 xpfo kernel: [ 38.278627] async_page_fault+0x28/0x30 When a fault is in kernel space which has been triggered by XPFO. Signed-off-by: Tycho Andersen CC: x86@kernel.org Tested-by: Khalid Aziz --- arch/x86/mm/fault.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 71d4b9d4d43f..ba51652fbd33 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -748,6 +748,12 @@ no_context(struct pt_regs *regs, unsigned long error_code, /* Executive summary in case the body of the oops scrolled away */ printk(KERN_DEFAULT "CR2: %016lx\n", address); + /* + * We're about to oops, which might kill the task. Make sure we're + * allowed to sleep. + */ + flags |= X86_EFLAGS_IF; + oops_end(flags, regs, sig); } From patchwork Thu Feb 14 00:01:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811399 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C701C746 for ; Thu, 14 Feb 2019 00:03:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B67692D923 for ; Thu, 14 Feb 2019 00:03:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A9C412DBB6; Thu, 14 Feb 2019 00:03:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 675BA2D9EE for ; Thu, 14 Feb 2019 00:03:42 +0000 (UTC) Received: (qmail 15424 invoked by uid 550); 14 Feb 2019 00:03:06 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 14299 invoked from network); 14 Feb 2019 00:03:04 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=tGIuo+6XNLWxTeQwFkrLNJKDwNp+81usp+K2xaNMZbQ=; b=EV0/KezfENVHmr/AMNMxy2Nd61uEG5+pcnOu5DPfYMTva1/IdPx+bbb8siANlxD0fdgY sZukNc6EGxpttdwgr8JPyw6hQH4zlHCuBvnmuyXP71eeBk2S8t7slIaSOrF+P6M5wC5y 4Jv2X3fB+dQq9pFM9aC2bB4beiAYPs0WbvJHQz9UfOi+aD489bz7N77ZxE4TQOG/CrX2 05lcdk23um8iMP0q5FcGMFpG+98u25VFNMJtTzpAfxlaCt8/lCpSyfYuK/e5sd3YmYYB EWzKyo9eiVRG1uOsHGyawgrR7H9TeOXmHDB7Hd3VozbXXXOXFmJTQwVd1e+JN7Gs6ldm 1A== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Tycho Andersen , Marco Benatto Subject: [RFC PATCH v8 03/14] mm, x86: Add support for eXclusive Page Frame Ownership (XPFO) Date: Wed, 13 Feb 2019 17:01:26 -0700 Message-Id: <8275de2a7e6b72d19b1cd2ec5d71a42c2c7dd6c5.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger This patch adds support for XPFO which protects against 'ret2dir' kernel attacks. The basic idea is to enforce exclusive ownership of page frames by either the kernel or userspace, unless explicitly requested by the kernel. Whenever a page destined for userspace is allocated, it is unmapped from physmap (the kernel's page table). When such a page is reclaimed from userspace, it is mapped back to physmap. Additional fields in the page_ext struct are used for XPFO housekeeping, specifically: - two flags to distinguish user vs. kernel pages and to tag unmapped pages. - a reference counter to balance kmap/kunmap operations. - a lock to serialize access to the XPFO fields. This patch is based on the work of Vasileios P. Kemerlis et al. who published their work in this paper: http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf v6: * use flush_tlb_kernel_range() instead of __flush_tlb_one, so we flush the tlb entry on all CPUs when unmapping it in kunmap * handle lookup_page_ext()/lookup_xpfo() returning NULL * drop lots of BUG()s in favor of WARN() * don't disable irqs in xpfo_kmap/xpfo_kunmap, export __split_large_page so we can do our own alloc_pages(GFP_ATOMIC) to pass it CC: x86@kernel.org Suggested-by: Vasileios P. Kemerlis Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Marco Benatto [jsteckli@amazon.de: rebased from v4.13 to v4.19] Signed-off-by: Julian Stecklina Reviewed-by: Khalid Aziz --- .../admin-guide/kernel-parameters.txt | 2 + arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 26 ++ arch/x86/mm/Makefile | 2 + arch/x86/mm/pageattr.c | 23 +- arch/x86/mm/xpfo.c | 119 ++++++++++ include/linux/highmem.h | 15 +- include/linux/xpfo.h | 48 ++++ mm/Makefile | 1 + mm/page_alloc.c | 2 + mm/page_ext.c | 4 + mm/xpfo.c | 223 ++++++++++++++++++ security/Kconfig | 19 ++ 13 files changed, 463 insertions(+), 22 deletions(-) create mode 100644 arch/x86/mm/xpfo.c create mode 100644 include/linux/xpfo.h create mode 100644 mm/xpfo.c diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index aefd358a5ca3..c4c62599f216 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2982,6 +2982,8 @@ nox2apic [X86-64,APIC] Do not enable x2APIC mode. + noxpfo [X86-64] Disable XPFO when CONFIG_XPFO is on. + cpu0_hotplug [X86] Turn on CPU0 hotplug feature when CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off. Some features depend on CPU0. Known dependencies are: diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8689e794a43c..d69d8cc6e57e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -207,6 +207,7 @@ config X86 select USER_STACKTRACE_SUPPORT select VIRT_TO_BUS select X86_FEATURE_NAMES if PROC_FS + select ARCH_SUPPORTS_XPFO if X86_64 config INSTRUCTION_DECODER def_bool y diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 40616e805292..f6eeb75c8a21 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1437,6 +1437,32 @@ static inline bool arch_has_pfn_modify_check(void) return boot_cpu_has_bug(X86_BUG_L1TF); } +/* + * The current flushing context - we pass it instead of 5 arguments: + */ +struct cpa_data { + unsigned long *vaddr; + pgd_t *pgd; + pgprot_t mask_set; + pgprot_t mask_clr; + unsigned long numpages; + int flags; + unsigned long pfn; + unsigned force_split : 1, + force_static_prot : 1; + int curpage; + struct page **pages; +}; + + +int +should_split_large_page(pte_t *kpte, unsigned long address, + struct cpa_data *cpa); +extern spinlock_t cpa_lock; +int +__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address, + struct page *base); + #include #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 4b101dd6e52f..93b0fdaf4a99 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -53,3 +53,5 @@ obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index a1bcde35db4c..84002442ab61 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -26,23 +26,6 @@ #include #include -/* - * The current flushing context - we pass it instead of 5 arguments: - */ -struct cpa_data { - unsigned long *vaddr; - pgd_t *pgd; - pgprot_t mask_set; - pgprot_t mask_clr; - unsigned long numpages; - int flags; - unsigned long pfn; - unsigned force_split : 1, - force_static_prot : 1; - int curpage; - struct page **pages; -}; - enum cpa_warn { CPA_CONFLICT, CPA_PROTECT, @@ -57,7 +40,7 @@ static const int cpa_warn_level = CPA_PROTECT; * entries change the page attribute in parallel to some other cpu * splitting a large page entry along with changing the attribute. */ -static DEFINE_SPINLOCK(cpa_lock); +DEFINE_SPINLOCK(cpa_lock); #define CPA_FLUSHTLB 1 #define CPA_ARRAY 2 @@ -869,7 +852,7 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address, return 0; } -static int should_split_large_page(pte_t *kpte, unsigned long address, +int should_split_large_page(pte_t *kpte, unsigned long address, struct cpa_data *cpa) { int do_split; @@ -919,7 +902,7 @@ static void split_set_pte(struct cpa_data *cpa, pte_t *pte, unsigned long pfn, set_pte(pte, pfn_pte(pfn, ref_prot)); } -static int +int __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address, struct page *base) { diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c new file mode 100644 index 000000000000..6c7502993351 --- /dev/null +++ b/arch/x86/mm/xpfo.c @@ -0,0 +1,119 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#include + +#include + +extern spinlock_t cpa_lock; + +/* Update a single kernel page table entry */ +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot) +{ + unsigned int level; + pgprot_t msk_clr; + pte_t *pte = lookup_address((unsigned long)kaddr, &level); + + if (unlikely(!pte)) { + WARN(1, "xpfo: invalid address %p\n", kaddr); + return; + } + + switch (level) { + case PG_LEVEL_4K: + set_pte_atomic(pte, pfn_pte(page_to_pfn(page), + canon_pgprot(prot))); + break; + case PG_LEVEL_2M: + case PG_LEVEL_1G: { + struct cpa_data cpa = { }; + int do_split; + + if (level == PG_LEVEL_2M) + msk_clr = pmd_pgprot(*(pmd_t *)pte); + else + msk_clr = pud_pgprot(*(pud_t *)pte); + + cpa.vaddr = kaddr; + cpa.pages = &page; + cpa.mask_set = prot; + cpa.mask_clr = msk_clr; + cpa.numpages = 1; + cpa.flags = 0; + cpa.curpage = 0; + cpa.force_split = 0; + + + do_split = should_split_large_page(pte, (unsigned long)kaddr, + &cpa); + if (do_split) { + struct page *base; + + base = alloc_pages(GFP_ATOMIC, 0); + if (!base) { + WARN(1, "xpfo: failed to split large page\n"); + break; + } + + if (!debug_pagealloc_enabled()) + spin_lock(&cpa_lock); + if (__split_large_page(&cpa, pte, + (unsigned long)kaddr, base) < 0) + WARN(1, "xpfo: failed to split large page\n"); + if (!debug_pagealloc_enabled()) + spin_unlock(&cpa_lock); + } + + break; + } + case PG_LEVEL_512G: + /* fallthrough, splitting infrastructure doesn't + * support 512G pages. + */ + default: + WARN(1, "xpfo: unsupported page level %x\n", level); + } + +} + +inline void xpfo_flush_kernel_tlb(struct page *page, int order) +{ + int level; + unsigned long size, kaddr; + + kaddr = (unsigned long)page_address(page); + + if (unlikely(!lookup_address(kaddr, &level))) { + WARN(1, "xpfo: invalid address to flush %lx %d\n", kaddr, + level); + return; + } + + switch (level) { + case PG_LEVEL_4K: + size = PAGE_SIZE; + break; + case PG_LEVEL_2M: + size = PMD_SIZE; + break; + case PG_LEVEL_1G: + size = PUD_SIZE; + break; + default: + WARN(1, "xpfo: unsupported page level %x\n", level); + return; + } + + flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); +} diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 0690679832d4..1fdae929e38b 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -8,6 +8,7 @@ #include #include #include +#include #include @@ -56,24 +57,34 @@ static inline struct page *kmap_to_page(void *addr) #ifndef ARCH_HAS_KMAP static inline void *kmap(struct page *page) { + void *kaddr; + might_sleep(); - return page_address(page); + kaddr = page_address(page); + xpfo_kmap(kaddr, page); + return kaddr; } static inline void kunmap(struct page *page) { + xpfo_kunmap(page_address(page), page); } static inline void *kmap_atomic(struct page *page) { + void *kaddr; + preempt_disable(); pagefault_disable(); - return page_address(page); + kaddr = page_address(page); + xpfo_kmap(kaddr, page); + return kaddr; } #define kmap_atomic_prot(page, prot) kmap_atomic(page) static inline void __kunmap_atomic(void *addr) { + xpfo_kunmap(addr, virt_to_page(addr)); pagefault_enable(); preempt_enable(); } diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h new file mode 100644 index 000000000000..b15234745fb4 --- /dev/null +++ b/include/linux/xpfo.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2017 Docker, Inc. + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * Tycho Andersen + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#ifndef _LINUX_XPFO_H +#define _LINUX_XPFO_H + +#include +#include + +struct page; + +#ifdef CONFIG_XPFO + +extern struct page_ext_operations page_xpfo_ops; + +void set_kpte(void *kaddr, struct page *page, pgprot_t prot); +void xpfo_dma_map_unmap_area(bool map, const void *addr, size_t size, + enum dma_data_direction dir); +void xpfo_flush_kernel_tlb(struct page *page, int order); + +void xpfo_kmap(void *kaddr, struct page *page); +void xpfo_kunmap(void *kaddr, struct page *page); +void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp); +void xpfo_free_pages(struct page *page, int order); + +#else /* !CONFIG_XPFO */ + +static inline void xpfo_kmap(void *kaddr, struct page *page) { } +static inline void xpfo_kunmap(void *kaddr, struct page *page) { } +static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } +static inline void xpfo_free_pages(struct page *page, int order) { } + +#endif /* CONFIG_XPFO */ + +#endif /* _LINUX_XPFO_H */ diff --git a/mm/Makefile b/mm/Makefile index d210cc9d6f80..e99e1e6ae5ae 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -99,3 +99,4 @@ obj-$(CONFIG_HARDENED_USERCOPY) += usercopy.o obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o obj-$(CONFIG_HMM) += hmm.o obj-$(CONFIG_MEMFD_CREATE) += memfd.o +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e95b5b7c9c3d..08e277790b5f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1038,6 +1038,7 @@ static __always_inline bool free_pages_prepare(struct page *page, kernel_poison_pages(page, 1 << order, 0); kernel_map_pages(page, 1 << order, 0); kasan_free_pages(page, order); + xpfo_free_pages(page, order); return true; } @@ -1915,6 +1916,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order, kernel_map_pages(page, 1 << order, 1); kernel_poison_pages(page, 1 << order, 1); kasan_alloc_pages(page, order); + xpfo_alloc_pages(page, order, gfp_flags); set_page_owner(page, order, gfp_flags); } diff --git a/mm/page_ext.c b/mm/page_ext.c index ae44f7adbe07..38e5013dcb9a 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -8,6 +8,7 @@ #include #include #include +#include /* * struct page extension @@ -68,6 +69,9 @@ static struct page_ext_operations *page_ext_ops[] = { #if defined(CONFIG_IDLE_PAGE_TRACKING) && !defined(CONFIG_64BIT) &page_idle_ops, #endif +#ifdef CONFIG_XPFO + &page_xpfo_ops, +#endif }; static unsigned long total_usage; diff --git a/mm/xpfo.c b/mm/xpfo.c new file mode 100644 index 000000000000..24b33d3c20cb --- /dev/null +++ b/mm/xpfo.c @@ -0,0 +1,223 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2017 Docker, Inc. + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * Tycho Andersen + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#include +#include +#include +#include + +#include + +/* XPFO page state flags */ +enum xpfo_flags { + XPFO_PAGE_USER, /* Page is allocated to user-space */ + XPFO_PAGE_UNMAPPED, /* Page is unmapped from the linear map */ +}; + +/* Per-page XPFO house-keeping data */ +struct xpfo { + unsigned long flags; /* Page state */ + bool inited; /* Map counter and lock initialized */ + atomic_t mapcount; /* Counter for balancing map/unmap requests */ + spinlock_t maplock; /* Lock to serialize map/unmap requests */ +}; + +DEFINE_STATIC_KEY_FALSE(xpfo_inited); + +static bool xpfo_disabled __initdata; + +static int __init noxpfo_param(char *str) +{ + xpfo_disabled = true; + + return 0; +} + +early_param("noxpfo", noxpfo_param); + +static bool __init need_xpfo(void) +{ + if (xpfo_disabled) { + pr_info("XPFO disabled\n"); + return false; + } + + return true; +} + +static void init_xpfo(void) +{ + pr_info("XPFO enabled\n"); + static_branch_enable(&xpfo_inited); +} + +struct page_ext_operations page_xpfo_ops = { + .size = sizeof(struct xpfo), + .need = need_xpfo, + .init = init_xpfo, +}; + +static inline struct xpfo *lookup_xpfo(struct page *page) +{ + struct page_ext *page_ext = lookup_page_ext(page); + + if (unlikely(!page_ext)) { + WARN(1, "xpfo: failed to get page ext"); + return NULL; + } + + return (void *)page_ext + page_xpfo_ops.offset; +} + +void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) +{ + int i, flush_tlb = 0; + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + for (i = 0; i < (1 << order); i++) { + xpfo = lookup_xpfo(page + i); + if (!xpfo) + continue; + + WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), + "xpfo: unmapped page being allocated\n"); + + /* Initialize the map lock and map counter */ + if (unlikely(!xpfo->inited)) { + spin_lock_init(&xpfo->maplock); + atomic_set(&xpfo->mapcount, 0); + xpfo->inited = true; + } + WARN(atomic_read(&xpfo->mapcount), + "xpfo: already mapped page being allocated\n"); + + if ((gfp & GFP_HIGHUSER) == GFP_HIGHUSER) { + /* + * Tag the page as a user page and flush the TLB if it + * was previously allocated to the kernel. + */ + if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) + flush_tlb = 1; + } else { + /* Tag the page as a non-user (kernel) page */ + clear_bit(XPFO_PAGE_USER, &xpfo->flags); + } + } + + if (flush_tlb) + xpfo_flush_kernel_tlb(page, order); +} + +void xpfo_free_pages(struct page *page, int order) +{ + int i; + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + for (i = 0; i < (1 << order); i++) { + xpfo = lookup_xpfo(page + i); + if (!xpfo || unlikely(!xpfo->inited)) { + /* + * The page was allocated before page_ext was + * initialized, so it is a kernel page. + */ + continue; + } + + /* + * Map the page back into the kernel if it was previously + * allocated to user space. + */ + if (test_and_clear_bit(XPFO_PAGE_USER, &xpfo->flags)) { + clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + set_kpte(page_address(page + i), page + i, + PAGE_KERNEL); + } + } +} + +void xpfo_kmap(void *kaddr, struct page *page) +{ + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + xpfo = lookup_xpfo(page); + + /* + * The page was allocated before page_ext was initialized (which means + * it's a kernel page) or it's allocated to the kernel, so nothing to + * do. + */ + if (!xpfo || unlikely(!xpfo->inited) || + !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + return; + + spin_lock(&xpfo->maplock); + + /* + * The page was previously allocated to user space, so map it back + * into the kernel. No TLB flush required. + */ + if ((atomic_inc_return(&xpfo->mapcount) == 1) && + test_and_clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags)) + set_kpte(kaddr, page, PAGE_KERNEL); + + spin_unlock(&xpfo->maplock); +} +EXPORT_SYMBOL(xpfo_kmap); + +void xpfo_kunmap(void *kaddr, struct page *page) +{ + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return; + + xpfo = lookup_xpfo(page); + + /* + * The page was allocated before page_ext was initialized (which means + * it's a kernel page) or it's allocated to the kernel, so nothing to + * do. + */ + if (!xpfo || unlikely(!xpfo->inited) || + !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + return; + + spin_lock(&xpfo->maplock); + + /* + * The page is to be allocated back to user space, so unmap it from the + * kernel, flush the TLB and tag it as a user page. + */ + if (atomic_dec_return(&xpfo->mapcount) == 0) { + WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), + "xpfo: unmapping already unmapped page\n"); + set_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + set_kpte(kaddr, page, __pgprot(0)); + xpfo_flush_kernel_tlb(page, 0); + } + + spin_unlock(&xpfo->maplock); +} +EXPORT_SYMBOL(xpfo_kunmap); diff --git a/security/Kconfig b/security/Kconfig index d9aa521b5206..8d0e4e303551 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -6,6 +6,25 @@ menu "Security options" source security/keys/Kconfig +config ARCH_SUPPORTS_XPFO + bool + +config XPFO + bool "Enable eXclusive Page Frame Ownership (XPFO)" + default n + depends on ARCH_SUPPORTS_XPFO + select PAGE_EXTENSION + help + This option offers protection against 'ret2dir' kernel attacks. + When enabled, every time a page frame is allocated to user space, it + is unmapped from the direct mapped RAM region in kernel space + (physmap). Similarly, when a page frame is freed/reclaimed, it is + mapped back to physmap. + + There is a slight performance impact when this option is enabled. + + If in doubt, say "N". + config SECURITY_DMESG_RESTRICT bool "Restrict unprivileged access to the kernel syslog" default n From patchwork Thu Feb 14 00:01:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811435 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0532713B4 for ; Thu, 14 Feb 2019 00:05:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E9DC62DCD3 for ; Thu, 14 Feb 2019 00:05:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DD7EF2DD6F; Thu, 14 Feb 2019 00:05:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 0C7262DCD3 for ; Thu, 14 Feb 2019 00:05:41 +0000 (UTC) Received: (qmail 30618 invoked by uid 550); 14 Feb 2019 00:05:12 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 30538 invoked from network); 14 Feb 2019 00:05:11 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=MZqi6ournax3KQmG8JyBwhw5kF/1WOYT8qIrSyRL3+o=; b=5c4qlL7951pylpPk/LYjHCqBaQh+OE07SNEncySbv9VpXjItDk191jRy8R2qAS/Si7wk 5azy3SAjL1y5ZU+ayiqvpGeYyYPej1QFoowA8i7hY6LnuqWwGRdVeXP9R8NT7EnzXbG6 VmTdJcDS8tsN+qApiiulm6FgWZxe8R1C9yy4gLDgnBMg2ET8uYg5u30QJc9jnecDxXLZ Ws9u2vhvaPb5ONbVL/qJNVAlo/vUy0Ks8iSwt4n7n8NmKRBZtbPL12yeMBe53Z6nM1Be It7bqaU0WEZ+yaRGwGM5y3xEn/ALsicHwZUxBoSgDXDN78jBld75C6noZszpY4wmUtbI 7A== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Tycho Andersen Subject: [RFC PATCH v8 04/14] swiotlb: Map the buffer if it was unmapped by XPFO Date: Wed, 13 Feb 2019 17:01:27 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=845 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger v6: * guard against lookup_xpfo() returning NULL CC: Konrad Rzeszutek Wilk Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Reviewed-by: Khalid Aziz Reviewed-by: Konrad Rzeszutek Wilk --- include/linux/xpfo.h | 4 ++++ kernel/dma/swiotlb.c | 3 ++- mm/xpfo.c | 15 +++++++++++++++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index b15234745fb4..cba37ffb09b1 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -36,6 +36,8 @@ void xpfo_kunmap(void *kaddr, struct page *page); void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp); void xpfo_free_pages(struct page *page, int order); +bool xpfo_page_is_unmapped(struct page *page); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -43,6 +45,8 @@ static inline void xpfo_kunmap(void *kaddr, struct page *page) { } static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } static inline void xpfo_free_pages(struct page *page, int order) { } +static inline bool xpfo_page_is_unmapped(struct page *page) { return false; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 045930e32c0e..820a54b57491 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -396,8 +396,9 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr, { unsigned long pfn = PFN_DOWN(orig_addr); unsigned char *vaddr = phys_to_virt(tlb_addr); + struct page *page = pfn_to_page(pfn); - if (PageHighMem(pfn_to_page(pfn))) { + if (PageHighMem(page) || xpfo_page_is_unmapped(page)) { /* The buffer does not have a mapping. Map it in and copy */ unsigned int offset = orig_addr & ~PAGE_MASK; char *buffer; diff --git a/mm/xpfo.c b/mm/xpfo.c index 24b33d3c20cb..67884736bebe 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -221,3 +221,18 @@ void xpfo_kunmap(void *kaddr, struct page *page) spin_unlock(&xpfo->maplock); } EXPORT_SYMBOL(xpfo_kunmap); + +bool xpfo_page_is_unmapped(struct page *page) +{ + struct xpfo *xpfo; + + if (!static_branch_unlikely(&xpfo_inited)) + return false; + + xpfo = lookup_xpfo(page); + if (unlikely(!xpfo) && !xpfo->inited) + return false; + + return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); +} +EXPORT_SYMBOL(xpfo_page_is_unmapped); From patchwork Thu Feb 14 00:01:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811411 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1BC2113B4 for ; Thu, 14 Feb 2019 00:04:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0BEA12904E for ; Thu, 14 Feb 2019 00:04:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F3C292A52A; Thu, 14 Feb 2019 00:04:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 101AD2904E for ; Thu, 14 Feb 2019 00:04:19 +0000 (UTC) Received: (qmail 19989 invoked by uid 550); 14 Feb 2019 00:03:33 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 18406 invoked from network); 14 Feb 2019 00:03:26 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=iAUhMQrV5BcM9lvrtcT+KJw+jAn1o1A4vhSmYo6KfrI=; b=VjSTr/AqEcZZcnT253Enimw080YefgzZ6lczl9mWxonAMufYwkCcF3dwQch6ueHmg70i dJpR3VkpqqCo5MrMJg08cXPCQwRMg0ItEUkV50un7KCD9qy/0d+vUhhjmxELv5vDsXg3 Ec8x2cdR9pfsnnCSj5tyMVOUNuafN0UEGEnMrvyKFP6U+tH7zegPGoK2cHWcMcllXHpR 1PT/VRty4pVVnHKlT/GtTFe+lLr6OSdMuZAJLbsu2cPVgpX0UlpaWQ5YfTTR1EgTiKBX JspamAuYU1IEU+REu+zfOfbB5ugsNlyZSl+JykVJwE4cXEWuAqhvosNaLi5M380H3gN0 KA== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Tycho Andersen , Khalid Aziz Subject: [RFC PATCH v8 05/14] arm64/mm: Add support for XPFO Date: Wed, 13 Feb 2019 17:01:28 -0700 Message-Id: <4a7b373b438d5b93999349c648189bd5ba8f9198.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger Enable support for eXclusive Page Frame Ownership (XPFO) for arm64 and provide a hook for updating a single kernel page table entry (which is required by the generic XPFO code). CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Signed-off-by: Khalid Aziz --- v6: - use flush_tlb_kernel_range() instead of __flush_tlb_one() v8: - Add check for NULL pte in set_kpte() arch/arm64/Kconfig | 1 + arch/arm64/mm/Makefile | 2 ++ arch/arm64/mm/xpfo.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+) create mode 100644 arch/arm64/mm/xpfo.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index ea2ab0330e3a..f0a9c0007d23 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -171,6 +171,7 @@ config ARM64 select SWIOTLB select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK + select ARCH_SUPPORTS_XPFO help ARM 64-bit (AArch64) Linux support. diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index 849c1df3d214..cca3808d9776 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -12,3 +12,5 @@ KASAN_SANITIZE_physaddr.o += n obj-$(CONFIG_KASAN) += kasan_init.o KASAN_SANITIZE_kasan_init.o := n + +obj-$(CONFIG_XPFO) += xpfo.o diff --git a/arch/arm64/mm/xpfo.c b/arch/arm64/mm/xpfo.c new file mode 100644 index 000000000000..1f790f7746ad --- /dev/null +++ b/arch/arm64/mm/xpfo.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2017 Hewlett Packard Enterprise Development, L.P. + * Copyright (C) 2016 Brown University. All rights reserved. + * + * Authors: + * Juerg Haefliger + * Vasileios P. Kemerlis + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + */ + +#include +#include + +#include + +/* + * Lookup the page table entry for a virtual address and return a pointer to + * the entry. Based on x86 tree. + */ +static pte_t *lookup_address(unsigned long addr) +{ + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + + pgd = pgd_offset_k(addr); + if (pgd_none(*pgd)) + return NULL; + + pud = pud_offset(pgd, addr); + if (pud_none(*pud)) + return NULL; + + pmd = pmd_offset(pud, addr); + if (pmd_none(*pmd)) + return NULL; + + return pte_offset_kernel(pmd, addr); +} + +/* Update a single kernel page table entry */ +inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot) +{ + pte_t *pte = lookup_address((unsigned long)kaddr); + + if (unlikely(!pte)) { + WARN(1, "xpfo: invalid address %p\n", kaddr); + return; + } + + set_pte(pte, pfn_pte(page_to_pfn(page), prot)); +} + +inline void xpfo_flush_kernel_tlb(struct page *page, int order) +{ + unsigned long kaddr = (unsigned long)page_address(page); + unsigned long size = PAGE_SIZE; + + flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); +} From patchwork Thu Feb 14 00:01:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811383 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BF2506C2 for ; Thu, 14 Feb 2019 00:03:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF4F62D9EE for ; Thu, 14 Feb 2019 00:03:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2E112DE30; Thu, 14 Feb 2019 00:03:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id C398B2D9EE for ; Thu, 14 Feb 2019 00:03:15 +0000 (UTC) Received: (qmail 14311 invoked by uid 550); 14 Feb 2019 00:03:04 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 14231 invoked from network); 14 Feb 2019 00:03:02 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=pnPAMgXPoRW3Y+XGQjlBeFoy0dFQyjStoDUpBHTUk8o=; b=P8MhcmBsqowTapSCWseyQXycsL6swLz4umwP9SBw6+ulLYPgQxLYPAEZ/o9AhHRJONze oYae3uML40WaOv1a39/aVVlZlsxyHs9WIr9DlGXDRTDq832AMS1IhnxZR/SlXqxcRSe8 xVPdOvmpX2ulrj52wlQOEOi4pN4dGsXZqeE8pt03NqDXmySkTKp6Hnrv+rZewUQNaso7 ele2U7dCemABJxM/iTeQAVNSeCxzWjRiyAQZB4GRIUvXED192Fn5YsV8J3mHml/c/vsR 1A5jr/53DI9WF3/DXkm1vroHhlCMQMdkWRgRlraDEqbRGW8QDSSClGDzYcI1zO+9cv9T WQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 06/14] xpfo: add primitives for mapping underlying memory Date: Wed, 13 Feb 2019 17:01:29 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=956 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen In some cases (on arm64 DMA and data cache flushes) we may have unmapped the underlying pages needed for something via XPFO. Here are some primitives useful for ensuring the underlying memory is mapped/unmapped in the face of xpfo. Signed-off-by: Tycho Andersen Reviewed-by: Khalid Aziz --- include/linux/xpfo.h | 22 ++++++++++++++++++++++ mm/xpfo.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index cba37ffb09b1..1ae05756344d 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -38,6 +38,15 @@ void xpfo_free_pages(struct page *page, int order); bool xpfo_page_is_unmapped(struct page *page); +#define XPFO_NUM_PAGES(addr, size) \ + (PFN_UP((unsigned long) (addr) + (size)) - \ + PFN_DOWN((unsigned long) (addr))) + +void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len); +void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, + size_t mapping_len); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -47,6 +56,19 @@ static inline void xpfo_free_pages(struct page *page, int order) { } static inline bool xpfo_page_is_unmapped(struct page *page) { return false; } +#define XPFO_NUM_PAGES(addr, size) 0 + +static inline void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ +} + +static inline void xpfo_temp_unmap(const void *addr, size_t size, + void **mapping, size_t mapping_len) +{ +} + + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/xpfo.c b/mm/xpfo.c index 67884736bebe..92ca6d1baf06 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -14,6 +14,7 @@ * the Free Software Foundation. */ +#include #include #include #include @@ -236,3 +237,32 @@ bool xpfo_page_is_unmapped(struct page *page) return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); } EXPORT_SYMBOL(xpfo_page_is_unmapped); + +void xpfo_temp_map(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ + struct page *page = virt_to_page(addr); + int i, num_pages = mapping_len / sizeof(mapping[0]); + + memset(mapping, 0, mapping_len); + + for (i = 0; i < num_pages; i++) { + if (page_to_virt(page + i) >= addr + size) + break; + + if (xpfo_page_is_unmapped(page + i)) + mapping[i] = kmap_atomic(page + i); + } +} +EXPORT_SYMBOL(xpfo_temp_map); + +void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, + size_t mapping_len) +{ + int i, num_pages = mapping_len / sizeof(mapping[0]); + + for (i = 0; i < num_pages; i++) + if (mapping[i]) + kunmap_atomic(mapping[i]); +} +EXPORT_SYMBOL(xpfo_temp_unmap); From patchwork Thu Feb 14 00:01:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811387 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1FDF2746 for ; Thu, 14 Feb 2019 00:03:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0EE992D923 for ; Thu, 14 Feb 2019 00:03:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 029222DBB6; Thu, 14 Feb 2019 00:03:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 2E1BE2D923 for ; Thu, 14 Feb 2019 00:03:21 +0000 (UTC) Received: (qmail 14277 invoked by uid 550); 14 Feb 2019 00:03:03 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 14230 invoked from network); 14 Feb 2019 00:03:01 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=fEQ1MfpMn+jTbXDbToNZqpIPh0CEmp2s2jsy7mZAXXk=; b=clQp5FLlv6dBK7nLgBvO7z0XrNUeRlDkeSjrUlTYTSJbuNRLfOKl0R1BT5sDq21QFYX8 pSz8w5lOqXzP3bUCNpXzkHLjNtdbGgLEsSSt8Eua3YWbfcZEEcQkzBiGSUl4yar1bvHp q8fCHDNP/V5TBUNkjrEFGj4gq1sOoXWLJD0C9kdSdRCeEl+8pLBspnYCQCdJcqUzk5J3 M9AXZQEMVtI8t9Hvrz9GDsGdYXEpQKo1Pp1JwtMth2sBZOvx6EWbBElqk6cR1/ggkzPa cINSZoMWJK2sX83UednZi/f2IJhDrqSJBRYcN3DKtOMSwLUXe+sR+ATtY0MHyLaEGXWX Aw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Tycho Andersen Subject: [RFC PATCH v8 07/14] arm64/mm, xpfo: temporarily map dcache regions Date: Wed, 13 Feb 2019 17:01:30 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=559 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger If the page is unmapped by XPFO, a data cache flush results in a fatal page fault, so let's temporarily map the region, flush the cache, and then unmap it. v6: actually flush in the face of xpfo, and temporarily map the underlying memory so it can be flushed correctly CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen --- arch/arm64/mm/flush.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c index 30695a868107..fad09aafd9d5 100644 --- a/arch/arm64/mm/flush.c +++ b/arch/arm64/mm/flush.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -28,9 +29,15 @@ void sync_icache_aliases(void *kaddr, unsigned long len) { unsigned long addr = (unsigned long)kaddr; + unsigned long num_pages = XPFO_NUM_PAGES(addr, len); + void *mapping[num_pages]; if (icache_is_aliasing()) { + xpfo_temp_map(kaddr, len, mapping, + sizeof(mapping[0]) * num_pages); __clean_dcache_area_pou(kaddr, len); + xpfo_temp_unmap(kaddr, len, mapping, + sizeof(mapping[0]) * num_pages); __flush_icache_all(); } else { flush_icache_range(addr, addr + len); From patchwork Thu Feb 14 00:01:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811391 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C17816C2 for ; Thu, 14 Feb 2019 00:03:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B22FC2D923 for ; Thu, 14 Feb 2019 00:03:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A517B2DBB6; Thu, 14 Feb 2019 00:03:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id C6C262D923 for ; Thu, 14 Feb 2019 00:03:29 +0000 (UTC) Received: (qmail 15371 invoked by uid 550); 14 Feb 2019 00:03:05 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 14239 invoked from network); 14 Feb 2019 00:03:02 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=B56xy1SykMvAYC6aqGnZB/bYPtFwWukcKOnvet3ifEU=; b=uIx25zRg4V1mI7KE5rlBz/NOl4jvjuexvGmPkyZvqcfrmno71cCT5E+9JbWTSHC/QKat mwHwGgsn3lyUx7sgJV9UNpCRjTU6py7eGgk7lKgOpsDhpFjeerfPM2DbaBuKJ9UZcAKz DsSL/ZBoCBxdhR3YuaIpkJ3CIAyPWAuQv3sF7iGkSsqlEtLlb7Zeh/RNvN5hZ7YrOLsh sHsS6j1a4GAWGwWpXc9lQ+6XKaFyP9pfD/3U4ji5t9w3llUxgied6gcwwuUnMAESawHr XKG5wRxJo2PHsFrtNsUaPYHBCbF2txoevXeZNyWcby8O60DPcnZ1QRqSHrFDNLBVEldH TQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 08/14] arm64/mm: disable section/contiguous mappings if XPFO is enabled Date: Wed, 13 Feb 2019 17:01:31 -0700 Message-Id: <0b9624b6c1fe5a31d73a6390e063d551bfebc321.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen XPFO doesn't support section/contiguous mappings yet, so let's disable it if XPFO is turned on. Thanks to Laura Abbot for the simplification from v5, and Mark Rutland for pointing out we need NO_CONT_MAPPINGS too. CC: linux-arm-kernel@lists.infradead.org Signed-off-by: Tycho Andersen Reviewed-by: Khalid Aziz --- arch/arm64/mm/mmu.c | 2 +- include/linux/xpfo.h | 4 ++++ mm/xpfo.c | 6 ++++++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index d1d6601b385d..f4dd27073006 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -451,7 +451,7 @@ static void __init map_mem(pgd_t *pgdp) struct memblock_region *reg; int flags = 0; - if (debug_pagealloc_enabled()) + if (debug_pagealloc_enabled() || xpfo_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; /* diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 1ae05756344d..8b029918a958 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -47,6 +47,8 @@ void xpfo_temp_map(const void *addr, size_t size, void **mapping, void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, size_t mapping_len); +bool xpfo_enabled(void); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -69,6 +71,8 @@ static inline void xpfo_temp_unmap(const void *addr, size_t size, } +static inline bool xpfo_enabled(void) { return false; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/xpfo.c b/mm/xpfo.c index 92ca6d1baf06..150784ae0f08 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -71,6 +71,12 @@ struct page_ext_operations page_xpfo_ops = { .init = init_xpfo, }; +bool __init xpfo_enabled(void) +{ + return !xpfo_disabled; +} +EXPORT_SYMBOL(xpfo_enabled); + static inline struct xpfo *lookup_xpfo(struct page *page) { struct page_ext *page_ext = lookup_page_ext(page); From patchwork Thu Feb 14 00:01:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811415 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 906AB13A4 for ; Thu, 14 Feb 2019 00:04:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8191C2DD5B for ; Thu, 14 Feb 2019 00:04:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 748B52DD6F; Thu, 14 Feb 2019 00:04:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 8C9B92DD5B for ; Thu, 14 Feb 2019 00:04:34 +0000 (UTC) Received: (qmail 20015 invoked by uid 550); 14 Feb 2019 00:03:34 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 19855 invoked from network); 14 Feb 2019 00:03:31 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=TQlIsuP1ag8N8cWtIZarXg5qzwt4o6eHPtANA79RWaM=; b=2PAnanOtADcxei1cy8BlTb/qU5ZZ7i0wqibFXPa9bnInPG/bvReURBWdV7oqlHHKB04P eJe2e67nu+luxcdvC3UBRDtvbK4t0fpRO4v+csDCFP/QjO2HMEGx/ANYI0/HgjyJOXBQ A+HhQpvU1Zm+os48RbnhGhjXf13nrybGW9Q+s+JqmKbNdQ6NQ9nWd7faph98x/U5WFsQ GP3HR7Iu2pYrMuqLFfE3A+sddb/2CPIRd3PP80otI32ENy8xTiH9Z0CLXt9neXDuvRjw swjC9f3y4oeNAID085G4bmwvjmiQzHs2QY3PeVUXmX6gdeHyJcTl0BfB9Kn9iJyjhaP4 ZQ== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Tycho Andersen , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Khalid Aziz Subject: [RFC PATCH v8 09/14] mm: add a user_virt_to_phys symbol Date: Wed, 13 Feb 2019 17:01:32 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Tycho Andersen We need someting like this for testing XPFO. Since it's architecture specific, putting it in the test code is slightly awkward, so let's make it an arch-specific symbol and export it for use in LKDTM. CC: linux-arm-kernel@lists.infradead.org CC: x86@kernel.org Signed-off-by: Tycho Andersen Tested-by: Marco Benatto Signed-off-by: Khalid Aziz --- v6: * add a definition of user_virt_to_phys in the !CONFIG_XPFO case v7: * make user_virt_to_phys a GPL symbol arch/x86/mm/xpfo.c | 57 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/xpfo.h | 8 +++++++ 2 files changed, 65 insertions(+) diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c index 6c7502993351..e13b99019c47 100644 --- a/arch/x86/mm/xpfo.c +++ b/arch/x86/mm/xpfo.c @@ -117,3 +117,60 @@ inline void xpfo_flush_kernel_tlb(struct page *page, int order) flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); } + +/* Convert a user space virtual address to a physical address. + * Shamelessly copied from slow_virt_to_phys() and lookup_address() in + * arch/x86/mm/pageattr.c + */ +phys_addr_t user_virt_to_phys(unsigned long addr) +{ + phys_addr_t phys_addr; + unsigned long offset; + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_offset(current->mm, addr); + if (pgd_none(*pgd)) + return 0; + + p4d = p4d_offset(pgd, addr); + if (p4d_none(*p4d)) + return 0; + + if (p4d_large(*p4d) || !p4d_present(*p4d)) { + phys_addr = (unsigned long)p4d_pfn(*p4d) << PAGE_SHIFT; + offset = addr & ~P4D_MASK; + goto out; + } + + pud = pud_offset(p4d, addr); + if (pud_none(*pud)) + return 0; + + if (pud_large(*pud) || !pud_present(*pud)) { + phys_addr = (unsigned long)pud_pfn(*pud) << PAGE_SHIFT; + offset = addr & ~PUD_MASK; + goto out; + } + + pmd = pmd_offset(pud, addr); + if (pmd_none(*pmd)) + return 0; + + if (pmd_large(*pmd) || !pmd_present(*pmd)) { + phys_addr = (unsigned long)pmd_pfn(*pmd) << PAGE_SHIFT; + offset = addr & ~PMD_MASK; + goto out; + } + + pte = pte_offset_kernel(pmd, addr); + phys_addr = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT; + offset = addr & ~PAGE_MASK; + +out: + return (phys_addr_t)(phys_addr | offset); +} +EXPORT_SYMBOL_GPL(user_virt_to_phys); diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 8b029918a958..117869991d5b 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -24,6 +24,10 @@ struct page; #ifdef CONFIG_XPFO +#include + +#include + extern struct page_ext_operations page_xpfo_ops; void set_kpte(void *kaddr, struct page *page, pgprot_t prot); @@ -49,6 +53,8 @@ void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, bool xpfo_enabled(void); +phys_addr_t user_virt_to_phys(unsigned long addr); + #else /* !CONFIG_XPFO */ static inline void xpfo_kmap(void *kaddr, struct page *page) { } @@ -73,6 +79,8 @@ static inline void xpfo_temp_unmap(const void *addr, size_t size, static inline bool xpfo_enabled(void) { return false; } +static inline phys_addr_t user_virt_to_phys(unsigned long addr) { return 0; } + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ From patchwork Thu Feb 14 00:01:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811405 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B4466C2 for ; Thu, 14 Feb 2019 00:04:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6ADD8296D5 for ; Thu, 14 Feb 2019 00:04:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5E61F2B2B8; Thu, 14 Feb 2019 00:04:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id EA68A296D5 for ; Thu, 14 Feb 2019 00:04:06 +0000 (UTC) Received: (qmail 16000 invoked by uid 550); 14 Feb 2019 00:03:12 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 15915 invoked from network); 14 Feb 2019 00:03:11 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=1tfiao/L4nkY1/BQ61imQ3duc4RawqVwLZqH4dWuQjI=; b=D+VWajgNZnnbCBmJd6hRmQKSxbcltQMbhcmtC6zmq5GAiZd2odC4bM6E90rcz6FYx3g3 p7y7UNpH69a0nk3zZf710qRnzwfsm4El++cnOGfwCXpBM+jvTdVH6Sf1YTW1mX20YwrC xhahrxss9iSQvxnblHxengOYGhfFgrNit8QWBepNbxggligImNFYyOIvecSsajODeFCA QYzSeuNjvM9sbPmvcKLqraQPt2jiXMulJIq2+crPTnfNMyeJj3D2nJzTpOUsVOy0xI7m BqAYotYlxCLKc5WX7QxIgpOoO3+kFRcKTeD7C15b1QdehKVErxfhN/MX2ZEDG3Kpwb4g ZA== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Juerg Haefliger , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Tycho Andersen Subject: [RFC PATCH v8 10/14] lkdtm: Add test for XPFO Date: Wed, 13 Feb 2019 17:01:33 -0700 Message-Id: <1af3c61568d36cad4a2b2fece978336620701b86.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Juerg Haefliger This test simply reads from userspace memory via the kernel's linear map. v6: * drop an #ifdef, just let the test fail if XPFO is not supported * add XPFO_SMP test to try and test the case when one CPU does an xpfo unmap of an address, that it can't be used accidentally by other CPUs. Signed-off-by: Juerg Haefliger Signed-off-by: Tycho Andersen Tested-by: Marco Benatto [jsteckli@amazon.de: rebased from v4.13 to v4.19] Signed-off-by: Julian Stecklina Tested-by: Khalid Aziz --- drivers/misc/lkdtm/Makefile | 1 + drivers/misc/lkdtm/core.c | 3 + drivers/misc/lkdtm/lkdtm.h | 5 + drivers/misc/lkdtm/xpfo.c | 194 ++++++++++++++++++++++++++++++++++++ 4 files changed, 203 insertions(+) create mode 100644 drivers/misc/lkdtm/xpfo.c diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile index 951c984de61a..97c6b7818cce 100644 --- a/drivers/misc/lkdtm/Makefile +++ b/drivers/misc/lkdtm/Makefile @@ -9,6 +9,7 @@ lkdtm-$(CONFIG_LKDTM) += refcount.o lkdtm-$(CONFIG_LKDTM) += rodata_objcopy.o lkdtm-$(CONFIG_LKDTM) += usercopy.o lkdtm-$(CONFIG_LKDTM) += stackleak.o +lkdtm-$(CONFIG_LKDTM) += xpfo.o KASAN_SANITIZE_stackleak.o := n KCOV_INSTRUMENT_rodata.o := n diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c index 2837dc77478e..25f4ab4ebf50 100644 --- a/drivers/misc/lkdtm/core.c +++ b/drivers/misc/lkdtm/core.c @@ -185,6 +185,9 @@ static const struct crashtype crashtypes[] = { CRASHTYPE(USERCOPY_KERNEL), CRASHTYPE(USERCOPY_KERNEL_DS), CRASHTYPE(STACKLEAK_ERASING), + CRASHTYPE(XPFO_READ_USER), + CRASHTYPE(XPFO_READ_USER_HUGE), + CRASHTYPE(XPFO_SMP), }; diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index 3c6fd327e166..6b31ff0c7f8f 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -87,4 +87,9 @@ void lkdtm_USERCOPY_KERNEL_DS(void); /* lkdtm_stackleak.c */ void lkdtm_STACKLEAK_ERASING(void); +/* lkdtm_xpfo.c */ +void lkdtm_XPFO_READ_USER(void); +void lkdtm_XPFO_READ_USER_HUGE(void); +void lkdtm_XPFO_SMP(void); + #endif diff --git a/drivers/misc/lkdtm/xpfo.c b/drivers/misc/lkdtm/xpfo.c new file mode 100644 index 000000000000..d903063bdd0b --- /dev/null +++ b/drivers/misc/lkdtm/xpfo.c @@ -0,0 +1,194 @@ +/* + * This is for all the tests related to XPFO (eXclusive Page Frame Ownership). + */ + +#include "lkdtm.h" + +#include +#include +#include +#include +#include + +#include +#include + +#define XPFO_DATA 0xdeadbeef + +static unsigned long do_map(unsigned long flags) +{ + unsigned long user_addr, user_data = XPFO_DATA; + + user_addr = vm_mmap(NULL, 0, PAGE_SIZE, + PROT_READ | PROT_WRITE | PROT_EXEC, + flags, 0); + if (user_addr >= TASK_SIZE) { + pr_warn("Failed to allocate user memory\n"); + return 0; + } + + if (copy_to_user((void __user *)user_addr, &user_data, + sizeof(user_data))) { + pr_warn("copy_to_user failed\n"); + goto free_user; + } + + return user_addr; + +free_user: + vm_munmap(user_addr, PAGE_SIZE); + return 0; +} + +static unsigned long *user_to_kernel(unsigned long user_addr) +{ + phys_addr_t phys_addr; + void *virt_addr; + + phys_addr = user_virt_to_phys(user_addr); + if (!phys_addr) { + pr_warn("Failed to get physical address of user memory\n"); + return NULL; + } + + virt_addr = phys_to_virt(phys_addr); + if (phys_addr != virt_to_phys(virt_addr)) { + pr_warn("Physical address of user memory seems incorrect\n"); + return NULL; + } + + return virt_addr; +} + +static void read_map(unsigned long *virt_addr) +{ + pr_info("Attempting bad read from kernel address %p\n", virt_addr); + if (*(unsigned long *)virt_addr == XPFO_DATA) + pr_err("FAIL: Bad read succeeded?!\n"); + else + pr_err("FAIL: Bad read didn't fail but data is incorrect?!\n"); +} + +static void read_user_with_flags(unsigned long flags) +{ + unsigned long user_addr, *kernel; + + user_addr = do_map(flags); + if (!user_addr) { + pr_err("FAIL: map failed\n"); + return; + } + + kernel = user_to_kernel(user_addr); + if (!kernel) { + pr_err("FAIL: user to kernel conversion failed\n"); + goto free_user; + } + + read_map(kernel); + +free_user: + vm_munmap(user_addr, PAGE_SIZE); +} + +/* Read from userspace via the kernel's linear map. */ +void lkdtm_XPFO_READ_USER(void) +{ + read_user_with_flags(MAP_PRIVATE | MAP_ANONYMOUS); +} + +void lkdtm_XPFO_READ_USER_HUGE(void) +{ + read_user_with_flags(MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB); +} + +struct smp_arg { + unsigned long *virt_addr; + unsigned int cpu; +}; + +static int smp_reader(void *parg) +{ + struct smp_arg *arg = parg; + unsigned long *virt_addr; + + if (arg->cpu != smp_processor_id()) { + pr_err("FAIL: scheduled on wrong CPU?\n"); + return 0; + } + + virt_addr = smp_cond_load_acquire(&arg->virt_addr, VAL != NULL); + read_map(virt_addr); + + return 0; +} + +#ifdef CONFIG_X86 +#define XPFO_SMP_KILLED SIGKILL +#elif CONFIG_ARM64 +#define XPFO_SMP_KILLED SIGSEGV +#else +#error unsupported arch +#endif + +/* The idea here is to read from the kernel's map on a different thread than + * did the mapping (and thus the TLB flushing), to make sure that the page + * faults on other cores too. + */ +void lkdtm_XPFO_SMP(void) +{ + unsigned long user_addr, *virt_addr; + struct task_struct *thread; + int ret; + struct smp_arg arg; + + if (num_online_cpus() < 2) { + pr_err("not enough to do a multi cpu test\n"); + return; + } + + arg.virt_addr = NULL; + arg.cpu = (smp_processor_id() + 1) % num_online_cpus(); + thread = kthread_create(smp_reader, &arg, "lkdtm_xpfo_test"); + if (IS_ERR(thread)) { + pr_err("couldn't create kthread? %ld\n", PTR_ERR(thread)); + return; + } + + kthread_bind(thread, arg.cpu); + get_task_struct(thread); + wake_up_process(thread); + + user_addr = do_map(MAP_PRIVATE | MAP_ANONYMOUS); + if (!user_addr) + goto kill_thread; + + virt_addr = user_to_kernel(user_addr); + if (!virt_addr) { + /* + * let's store something that will fail, so we can unblock the + * thread + */ + smp_store_release(&arg.virt_addr, &arg); + goto free_user; + } + + smp_store_release(&arg.virt_addr, virt_addr); + + /* there must be a better way to do this. */ + while (1) { + if (thread->exit_state) + break; + msleep_interruptible(100); + } + +free_user: + if (user_addr) + vm_munmap(user_addr, PAGE_SIZE); + +kill_thread: + ret = kthread_stop(thread); + if (ret != XPFO_SMP_KILLED) + pr_err("FAIL: thread wasn't killed: %d\n", ret); + put_task_struct(thread); +} From patchwork Thu Feb 14 00:01:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811439 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD7EA13A4 for ; Thu, 14 Feb 2019 00:06:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CCC9228C19 for ; Thu, 14 Feb 2019 00:06:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C045E28DD6; Thu, 14 Feb 2019 00:06:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 0633028DAB for ; Thu, 14 Feb 2019 00:06:14 +0000 (UTC) Received: (qmail 3539 invoked by uid 550); 14 Feb 2019 00:06:13 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 3517 invoked from network); 14 Feb 2019 00:06:12 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=TYTmjWk1Ud9xFO4O2CDjqPHQ3NzvMn4w1OlUR3qol7g=; b=xwYy2aXo7t5xJC9+w32YfLL7cjgc3iIpBAE0V7wYjGFAQ06Se7GHUmGKKd3a8t4ViTPM hgsMfdwTc6d6rvLVimRpcx4hxWiqZKD+nUkx1x33WRzAvkWQYxjujKv9YZ4u+vG2ZrRK TTxa/WMoUVhUwZz7hR+pi68WsOnVaDmCXKCF/3R5yqBQTLOj2LJn4XmBTIh74W10CKhs UxA1QSEodqXxzu7zAkdK1fGxw33J2M/yvMyFF+CSQBOMgasBZ7a33Dqq6g4Aspx/2pF5 Rsiuq3kqdpq0p0SsA48rdZvbLBUg/DgjD6G/t8Bot0UQ0KJBENA5cO5yi+dmHCm/oK5p 4g== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, "Vasileios P . Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse Subject: [RFC PATCH v8 11/14] xpfo, mm: remove dependency on CONFIG_PAGE_EXTENSION Date: Wed, 13 Feb 2019 17:01:34 -0700 Message-Id: <465161d96733a9bda8dfbae9d2028912ee383faa.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina Instead of using the page extension debug feature, encode all information, we need for XPFO in struct page. This allows to get rid of some checks in the hot paths and there are also no pages anymore that are allocated before XPFO is enabled. Also make debugging aids configurable for maximum performance. Signed-off-by: Julian Stecklina Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse Reviewed-by: Khalid Aziz --- include/linux/mm_types.h | 8 ++ include/linux/page-flags.h | 13 +++ include/linux/xpfo.h | 3 +- include/trace/events/mmflags.h | 10 ++- mm/page_alloc.c | 3 +- mm/page_ext.c | 4 - mm/xpfo.c | 159 ++++++++------------------------- security/Kconfig | 12 ++- 8 files changed, 80 insertions(+), 132 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2c471a2c43fa..d17d33f36a01 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -204,6 +204,14 @@ struct page { #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS int _last_cpupid; #endif + +#ifdef CONFIG_XPFO + /* Counts the number of times this page has been kmapped. */ + atomic_t xpfo_mapcount; + + /* Serialize kmap/kunmap of this page */ + spinlock_t xpfo_lock; +#endif } _struct_page_alignment; /* diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 50ce1bddaf56..a532063f27b5 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -101,6 +101,10 @@ enum pageflags { #if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT) PG_young, PG_idle, +#endif +#ifdef CONFIG_XPFO + PG_xpfo_user, /* Page is allocated to user-space */ + PG_xpfo_unmapped, /* Page is unmapped from the linear map */ #endif __NR_PAGEFLAGS, @@ -398,6 +402,15 @@ TESTCLEARFLAG(Young, young, PF_ANY) PAGEFLAG(Idle, idle, PF_ANY) #endif +#ifdef CONFIG_XPFO +PAGEFLAG(XpfoUser, xpfo_user, PF_ANY) +TESTCLEARFLAG(XpfoUser, xpfo_user, PF_ANY) +TESTSETFLAG(XpfoUser, xpfo_user, PF_ANY) +PAGEFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +TESTCLEARFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +TESTSETFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +#endif + /* * On an anonymous page mapped into a user virtual memory area, * page->mapping points to its anon_vma, not to a struct address_space; diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 117869991d5b..1dd590ff1a1f 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -28,7 +28,7 @@ struct page; #include -extern struct page_ext_operations page_xpfo_ops; +void xpfo_init_single_page(struct page *page); void set_kpte(void *kaddr, struct page *page, pgprot_t prot); void xpfo_dma_map_unmap_area(bool map, const void *addr, size_t size, @@ -57,6 +57,7 @@ phys_addr_t user_virt_to_phys(unsigned long addr); #else /* !CONFIG_XPFO */ +static inline void xpfo_init_single_page(struct page *page) { } static inline void xpfo_kmap(void *kaddr, struct page *page) { } static inline void xpfo_kunmap(void *kaddr, struct page *page) { } static inline void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { } diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index a1675d43777e..6bb000bb366f 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -79,6 +79,12 @@ #define IF_HAVE_PG_IDLE(flag,string) #endif +#ifdef CONFIG_XPFO +#define IF_HAVE_PG_XPFO(flag,string) ,{1UL << flag, string} +#else +#define IF_HAVE_PG_XPFO(flag,string) +#endif + #define __def_pageflag_names \ {1UL << PG_locked, "locked" }, \ {1UL << PG_waiters, "waiters" }, \ @@ -105,7 +111,9 @@ IF_HAVE_PG_MLOCK(PG_mlocked, "mlocked" ) \ IF_HAVE_PG_UNCACHED(PG_uncached, "uncached" ) \ IF_HAVE_PG_HWPOISON(PG_hwpoison, "hwpoison" ) \ IF_HAVE_PG_IDLE(PG_young, "young" ) \ -IF_HAVE_PG_IDLE(PG_idle, "idle" ) +IF_HAVE_PG_IDLE(PG_idle, "idle" ) \ +IF_HAVE_PG_XPFO(PG_xpfo_user, "xpfo_user" ) \ +IF_HAVE_PG_XPFO(PG_xpfo_unmapped, "xpfo_unmapped" ) \ #define show_page_flags(flags) \ (flags) ? __print_flags(flags, "|", \ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 08e277790b5f..d00382b20001 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1024,6 +1024,7 @@ static __always_inline bool free_pages_prepare(struct page *page, if (bad) return false; + xpfo_free_pages(page, order); page_cpupid_reset_last(page); page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; reset_page_owner(page, order); @@ -1038,7 +1039,6 @@ static __always_inline bool free_pages_prepare(struct page *page, kernel_poison_pages(page, 1 << order, 0); kernel_map_pages(page, 1 << order, 0); kasan_free_pages(page, order); - xpfo_free_pages(page, order); return true; } @@ -1191,6 +1191,7 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn, if (!is_highmem_idx(zone)) set_page_address(page, __va(pfn << PAGE_SHIFT)); #endif + xpfo_init_single_page(page); } #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT diff --git a/mm/page_ext.c b/mm/page_ext.c index 38e5013dcb9a..ae44f7adbe07 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -8,7 +8,6 @@ #include #include #include -#include /* * struct page extension @@ -69,9 +68,6 @@ static struct page_ext_operations *page_ext_ops[] = { #if defined(CONFIG_IDLE_PAGE_TRACKING) && !defined(CONFIG_64BIT) &page_idle_ops, #endif -#ifdef CONFIG_XPFO - &page_xpfo_ops, -#endif }; static unsigned long total_usage; diff --git a/mm/xpfo.c b/mm/xpfo.c index 150784ae0f08..dc03c423c52f 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -17,113 +17,58 @@ #include #include #include -#include #include #include -/* XPFO page state flags */ -enum xpfo_flags { - XPFO_PAGE_USER, /* Page is allocated to user-space */ - XPFO_PAGE_UNMAPPED, /* Page is unmapped from the linear map */ -}; - -/* Per-page XPFO house-keeping data */ -struct xpfo { - unsigned long flags; /* Page state */ - bool inited; /* Map counter and lock initialized */ - atomic_t mapcount; /* Counter for balancing map/unmap requests */ - spinlock_t maplock; /* Lock to serialize map/unmap requests */ -}; - -DEFINE_STATIC_KEY_FALSE(xpfo_inited); - -static bool xpfo_disabled __initdata; +DEFINE_STATIC_KEY_TRUE(xpfo_inited); static int __init noxpfo_param(char *str) { - xpfo_disabled = true; + static_branch_disable(&xpfo_inited); return 0; } early_param("noxpfo", noxpfo_param); -static bool __init need_xpfo(void) -{ - if (xpfo_disabled) { - pr_info("XPFO disabled\n"); - return false; - } - - return true; -} - -static void init_xpfo(void) -{ - pr_info("XPFO enabled\n"); - static_branch_enable(&xpfo_inited); -} - -struct page_ext_operations page_xpfo_ops = { - .size = sizeof(struct xpfo), - .need = need_xpfo, - .init = init_xpfo, -}; - bool __init xpfo_enabled(void) { - return !xpfo_disabled; + if (!static_branch_unlikely(&xpfo_inited)) + return false; + else + return true; } -EXPORT_SYMBOL(xpfo_enabled); -static inline struct xpfo *lookup_xpfo(struct page *page) +void __meminit xpfo_init_single_page(struct page *page) { - struct page_ext *page_ext = lookup_page_ext(page); - - if (unlikely(!page_ext)) { - WARN(1, "xpfo: failed to get page ext"); - return NULL; - } - - return (void *)page_ext + page_xpfo_ops.offset; + spin_lock_init(&page->xpfo_lock); } void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { int i, flush_tlb = 0; - struct xpfo *xpfo; if (!static_branch_unlikely(&xpfo_inited)) return; for (i = 0; i < (1 << order); i++) { - xpfo = lookup_xpfo(page + i); - if (!xpfo) - continue; - - WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), - "xpfo: unmapped page being allocated\n"); - - /* Initialize the map lock and map counter */ - if (unlikely(!xpfo->inited)) { - spin_lock_init(&xpfo->maplock); - atomic_set(&xpfo->mapcount, 0); - xpfo->inited = true; - } - WARN(atomic_read(&xpfo->mapcount), - "xpfo: already mapped page being allocated\n"); - +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(PageXpfoUser(page + i)); + BUG_ON(PageXpfoUnmapped(page + i)); + BUG_ON(spin_is_locked(&(page + i)->xpfo_lock)); + BUG_ON(atomic_read(&(page + i)->xpfo_mapcount)); +#endif if ((gfp & GFP_HIGHUSER) == GFP_HIGHUSER) { /* * Tag the page as a user page and flush the TLB if it * was previously allocated to the kernel. */ - if (!test_and_set_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!TestSetPageXpfoUser(page + i)) flush_tlb = 1; } else { /* Tag the page as a non-user (kernel) page */ - clear_bit(XPFO_PAGE_USER, &xpfo->flags); + ClearPageXpfoUser(page + i); } } @@ -134,27 +79,21 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) void xpfo_free_pages(struct page *page, int order) { int i; - struct xpfo *xpfo; if (!static_branch_unlikely(&xpfo_inited)) return; for (i = 0; i < (1 << order); i++) { - xpfo = lookup_xpfo(page + i); - if (!xpfo || unlikely(!xpfo->inited)) { - /* - * The page was allocated before page_ext was - * initialized, so it is a kernel page. - */ - continue; - } +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(atomic_read(&(page + i)->xpfo_mapcount)); +#endif /* * Map the page back into the kernel if it was previously * allocated to user space. */ - if (test_and_clear_bit(XPFO_PAGE_USER, &xpfo->flags)) { - clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + if (TestClearPageXpfoUser(page + i)) { + ClearPageXpfoUnmapped(page + i); set_kpte(page_address(page + i), page + i, PAGE_KERNEL); } @@ -163,84 +102,56 @@ void xpfo_free_pages(struct page *page, int order) void xpfo_kmap(void *kaddr, struct page *page) { - struct xpfo *xpfo; - if (!static_branch_unlikely(&xpfo_inited)) return; - xpfo = lookup_xpfo(page); - - /* - * The page was allocated before page_ext was initialized (which means - * it's a kernel page) or it's allocated to the kernel, so nothing to - * do. - */ - if (!xpfo || unlikely(!xpfo->inited) || - !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!PageXpfoUser(page)) return; - spin_lock(&xpfo->maplock); + spin_lock(&page->xpfo_lock); /* * The page was previously allocated to user space, so map it back * into the kernel. No TLB flush required. */ - if ((atomic_inc_return(&xpfo->mapcount) == 1) && - test_and_clear_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags)) + if ((atomic_inc_return(&page->xpfo_mapcount) == 1) && + TestClearPageXpfoUnmapped(page)) set_kpte(kaddr, page, PAGE_KERNEL); - spin_unlock(&xpfo->maplock); + spin_unlock(&page->xpfo_lock); } EXPORT_SYMBOL(xpfo_kmap); void xpfo_kunmap(void *kaddr, struct page *page) { - struct xpfo *xpfo; - if (!static_branch_unlikely(&xpfo_inited)) return; - xpfo = lookup_xpfo(page); - - /* - * The page was allocated before page_ext was initialized (which means - * it's a kernel page) or it's allocated to the kernel, so nothing to - * do. - */ - if (!xpfo || unlikely(!xpfo->inited) || - !test_bit(XPFO_PAGE_USER, &xpfo->flags)) + if (!PageXpfoUser(page)) return; - spin_lock(&xpfo->maplock); + spin_lock(&page->xpfo_lock); /* * The page is to be allocated back to user space, so unmap it from the * kernel, flush the TLB and tag it as a user page. */ - if (atomic_dec_return(&xpfo->mapcount) == 0) { - WARN(test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags), - "xpfo: unmapping already unmapped page\n"); - set_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + if (atomic_dec_return(&page->xpfo_mapcount) == 0) { +#ifdef CONFIG_XPFO_DEBUG + BUG_ON(PageXpfoUnmapped(page)); +#endif + SetPageXpfoUnmapped(page); set_kpte(kaddr, page, __pgprot(0)); xpfo_flush_kernel_tlb(page, 0); } - spin_unlock(&xpfo->maplock); + spin_unlock(&page->xpfo_lock); } EXPORT_SYMBOL(xpfo_kunmap); bool xpfo_page_is_unmapped(struct page *page) { - struct xpfo *xpfo; - - if (!static_branch_unlikely(&xpfo_inited)) - return false; - - xpfo = lookup_xpfo(page); - if (unlikely(!xpfo) && !xpfo->inited) - return false; - - return test_bit(XPFO_PAGE_UNMAPPED, &xpfo->flags); + return PageXpfoUnmapped(page); } EXPORT_SYMBOL(xpfo_page_is_unmapped); diff --git a/security/Kconfig b/security/Kconfig index 8d0e4e303551..c7c581bac963 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -13,7 +13,6 @@ config XPFO bool "Enable eXclusive Page Frame Ownership (XPFO)" default n depends on ARCH_SUPPORTS_XPFO - select PAGE_EXTENSION help This option offers protection against 'ret2dir' kernel attacks. When enabled, every time a page frame is allocated to user space, it @@ -25,6 +24,17 @@ config XPFO If in doubt, say "N". +config XPFO_DEBUG + bool "Enable debugging of XPFO" + default n + depends on XPFO + help + Enables additional checking of XPFO data structures that help find + bugs in the XPFO implementation. This option comes with a slight + performance cost. + + If in doubt, say "N". + config SECURITY_DMESG_RESTRICT bool "Restrict unprivileged access to the kernel syslog" default n From patchwork Thu Feb 14 00:01:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811431 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A488113A4 for ; Thu, 14 Feb 2019 00:05:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 94FAD2DCD3 for ; Thu, 14 Feb 2019 00:05:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 880C12DD6F; Thu, 14 Feb 2019 00:05:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 9DD802DCD3 for ; Thu, 14 Feb 2019 00:05:29 +0000 (UTC) Received: (qmail 30368 invoked by uid 550); 14 Feb 2019 00:05:08 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 30286 invoked from network); 14 Feb 2019 00:05:07 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=sphrv7BG0r4M1etaGHVdl0ES/afYZFzSiF3kwKx/qUc=; b=rq32CuBk5X2+Xev+zFjtWbTbukjVYgp052VQoUtKNOTE4Cda9Cy4Byz7X7TyDaf3TQ2I 6Xq6h/PXDcwuH55t84HlJx++IE29KeYzxGnud9iAcqceAw++4jace2g3nP5q9+tVQuDP JKmeME1MBZxG+r3eijrKz+I+/dFivoWQvUM6OVeVIggnFDVOUMogq4PMOpJhjY4t+4Ty T/Rv6uXJl+x2rhUlCA5dHNltlK7OXjBzMdts964clZe9v6/2TNndXTWUZGdIWr0yX2HO Q8vLkyWbeiF65glX8gZob//mF1Popwb6C6Z2TLKwVkfbusnbRd8Lf2e8RxtLEG/+nZ1T cg== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Khalid Aziz , "Vasileios P . Kemerlis" , Juerg Haefliger , Tycho Andersen , Marco Benatto , David Woodhouse Subject: [RFC PATCH v8 12/14] xpfo, mm: optimize spinlock usage in xpfo_kunmap Date: Wed, 13 Feb 2019 17:01:35 -0700 Message-Id: X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina Only the xpfo_kunmap call that needs to actually unmap the page needs to be serialized. We need to be careful to handle the case, where after the atomic decrement of the mapcount, a xpfo_kmap increased the mapcount again. In this case, we can safely skip modifying the page table. Model-checked with up to 4 concurrent callers with Spin. Signed-off-by: Julian Stecklina Signed-off-by: Khalid Aziz Cc: x86@kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Vasileios P. Kemerlis Cc: Juerg Haefliger Cc: Tycho Andersen Cc: Marco Benatto Cc: David Woodhouse --- mm/xpfo.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/mm/xpfo.c b/mm/xpfo.c index dc03c423c52f..5157cbebce4b 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -124,28 +124,35 @@ EXPORT_SYMBOL(xpfo_kmap); void xpfo_kunmap(void *kaddr, struct page *page) { + bool flush_tlb = false; + if (!static_branch_unlikely(&xpfo_inited)) return; if (!PageXpfoUser(page)) return; - spin_lock(&page->xpfo_lock); - /* * The page is to be allocated back to user space, so unmap it from the * kernel, flush the TLB and tag it as a user page. */ if (atomic_dec_return(&page->xpfo_mapcount) == 0) { -#ifdef CONFIG_XPFO_DEBUG - BUG_ON(PageXpfoUnmapped(page)); -#endif - SetPageXpfoUnmapped(page); - set_kpte(kaddr, page, __pgprot(0)); - xpfo_flush_kernel_tlb(page, 0); + spin_lock(&page->xpfo_lock); + + /* + * In the case, where we raced with kmap after the + * atomic_dec_return, we must not nuke the mapping. + */ + if (atomic_read(&page->xpfo_mapcount) == 0) { + SetPageXpfoUnmapped(page); + set_kpte(kaddr, page, __pgprot(0)); + flush_tlb = true; + } + spin_unlock(&page->xpfo_lock); } - spin_unlock(&page->xpfo_lock); + if (flush_tlb) + xpfo_flush_kernel_tlb(page, 0); } EXPORT_SYMBOL(xpfo_kunmap); From patchwork Thu Feb 14 00:01:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811419 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B00813A4 for ; Thu, 14 Feb 2019 00:04:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 69DE92DCD3 for ; Thu, 14 Feb 2019 00:04:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5D26D2DD6F; Thu, 14 Feb 2019 00:04:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 0D1D62DD5B for ; Thu, 14 Feb 2019 00:04:47 +0000 (UTC) Received: (qmail 20055 invoked by uid 550); 14 Feb 2019 00:03:35 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 18402 invoked from network); 14 Feb 2019 00:03:26 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=lijI2dkmHOz8IBbMljD0kgh/GfNNHRB/Ww5ZCLiDpC8=; b=Eq0NRf9CvVjMRyqL/pJwd5gMUZxni+bHvlB5+zwob5e+xiwZT4ra0AyBxJ92DivdFLna 97cYTFa2qe9QQ6wYiXHornwjF1eTBScwRD03REBGbSCxrGe1sHulQ2gDVu4xX3mxM/36 3SvBa2fAgAWdU0U1FEda7/JATU3IVFLn4IveEsG+K9h5CqxCBFKvEGRJGGdkfDkYamdp PsnFndFcwiVrKpuBYeXNhI81MuZO4NIfCeGp/zunHR7sbKkLh0NDGuO1bP7hoZRAZp8q tczxkn156C41JsFvXIo8bWbOeUspWxI2rXu9qB8VJta8M6C6Ic/459RKF/2UsrhhQ7Ps Gw== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Khalid Aziz , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 13/14] xpfo, mm: Defer TLB flushes for non-current CPUs (x86 only) Date: Wed, 13 Feb 2019 17:01:36 -0700 Message-Id: <98134cb73e911b2f0b59ffb76243a7777963d218.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP XPFO flushes kernel space TLB entries for pages that are now mapped in userspace on not only the current CPU but also all other CPUs synchronously. Processes on each core allocating pages causes a flood of IPI messages to all other cores to flush TLB entries. Many of these messages are to flush the entire TLB on the core if the number of entries being flushed from local core exceeds tlb_single_page_flush_ceiling. The cost of TLB flush caused by unmapping pages from physmap goes up dramatically on machines with high core count. This patch flushes relevant TLB entries for current process or entire TLB depending upon number of entries for the current CPU and posts a pending TLB flush on all other CPUs when a page is unmapped from kernel space and mapped in userspace. Each core checks the pending TLB flush flag for itself on every context switch, flushes its TLB if the flag is set and clears it. This patch potentially aggregates multiple TLB flushes into one. This has very significant impact especially on machines with large core counts. To illustrate this, kernel was compiled with -j on two classes of machines - a server with high core count and large amount of memory, and a desktop class machine with more modest specs. System time from "make -j" from vanilla 4.20 kernel, 4.20 with XPFO patches before applying this patch and after applying this patch are below: Hardware: 96-core Intel Xeon Platinum 8160 CPU @ 2.10GHz, 768 GB RAM make -j60 all 4.20 950.966s 4.20+XPFO 25073.169s 26.366x 4.20+XPFO+Deferred flush 1372.874s 1.44x Hardware: 4-core Intel Core i5-3550 CPU @ 3.30GHz, 8G RAM make -j4 all 4.20 607.671s 4.20+XPFO 1588.646s 2.614x 4.20+XPFO+Deferred flush 803.989s 1.32x This patch could use more optimization. Batching more TLB entry flushes, as was suggested for earlier version of these patches, can help reduce these cases. This same code should be implemented for other architectures as well once finalized. Signed-off-by: Khalid Aziz --- arch/x86/include/asm/tlbflush.h | 1 + arch/x86/mm/tlb.c | 38 +++++++++++++++++++++++++++++++++ arch/x86/mm/xpfo.c | 2 +- 3 files changed, 40 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index f4204bf377fc..92d23629d01d 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -561,6 +561,7 @@ extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, unsigned long end, unsigned int stride_shift, bool freed_tables); extern void flush_tlb_kernel_range(unsigned long start, unsigned long end); +extern void xpfo_flush_tlb_kernel_range(unsigned long start, unsigned long end); static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a) { diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 03b6b4c2238d..c907b643eecb 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -35,6 +35,15 @@ */ #define LAST_USER_MM_IBPB 0x1UL +/* + * When a full TLB flush is needed to flush stale TLB entries + * for pages that have been mapped into userspace and unmapped + * from kernel space, this TLB flush will be delayed until the + * task is scheduled on that CPU. Keep track of CPUs with + * pending full TLB flush forced by xpfo. + */ +static cpumask_t pending_xpfo_flush; + /* * We get here when we do something requiring a TLB invalidation * but could not go invalidate all of the contexts. We do the @@ -319,6 +328,15 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, __flush_tlb_all(); } #endif + + /* If there is a pending TLB flush for this CPU due to XPFO + * flush, do it now. + */ + if (cpumask_test_and_clear_cpu(cpu, &pending_xpfo_flush)) { + count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED); + __flush_tlb_all(); + } + this_cpu_write(cpu_tlbstate.is_lazy, false); /* @@ -801,6 +819,26 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end) } } +void xpfo_flush_tlb_kernel_range(unsigned long start, unsigned long end) +{ + struct cpumask tmp_mask; + + /* Balance as user space task's flush, a bit conservative */ + if (end == TLB_FLUSH_ALL || + (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) { + do_flush_tlb_all(NULL); + } else { + struct flush_tlb_info info; + + info.start = start; + info.end = end; + do_kernel_range_flush(&info); + } + cpumask_setall(&tmp_mask); + cpumask_clear_cpu(smp_processor_id(), &tmp_mask); + cpumask_or(&pending_xpfo_flush, &pending_xpfo_flush, &tmp_mask); +} + void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) { struct flush_tlb_info info = { diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c index e13b99019c47..d3833532bfdc 100644 --- a/arch/x86/mm/xpfo.c +++ b/arch/x86/mm/xpfo.c @@ -115,7 +115,7 @@ inline void xpfo_flush_kernel_tlb(struct page *page, int order) return; } - flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); + xpfo_flush_tlb_kernel_range(kaddr, kaddr + (1 << order) * size); } /* Convert a user space virtual address to a physical address. From patchwork Thu Feb 14 00:01:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Khalid Aziz X-Patchwork-Id: 10811427 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D233F13B4 for ; Thu, 14 Feb 2019 00:05:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C15352DCD3 for ; Thu, 14 Feb 2019 00:05:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B4EA42DD6F; Thu, 14 Feb 2019 00:05:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 791602DCD3 for ; Thu, 14 Feb 2019 00:05:16 +0000 (UTC) Received: (qmail 29996 invoked by uid 550); 14 Feb 2019 00:05:02 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 29928 invoked from network); 14 Feb 2019 00:05:01 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : in-reply-to : references; s=corp-2018-07-02; bh=Ov6dF12doUxji6ePINUtaYIJxqhm2/VTB1zpiOjT1tA=; b=HWgMzsoVLWuxYs5bJDKNiffhWGRMwUJsNHgz0cMMCPscT4fdXMdRnfNdp9sKENEJvwft TXF3uWQrqHXeKNtriV1PxiSLggLWtrKHDZbqDPKfC2vSDmyGgCvykt4L33owldrq6md1 TQQBpA6dSqxhEIVIk5w1iCvxCAnQfbKcOuFvv8SErUpmXO8Gu4GIlFIIKLHDLf3O5o4+ ufPr2YjYEzK5U4hHeZSfwBGrufuBkFk0aiuvWEs187yLF1EZ2nbUxudhQYjrx/FwpdC3 jHTyqjS/1miNz9E4XceIQ4rlIkCOYGq8Vpq/8qTkxjsEyQH8SgC1/wxD+ys2W18dm4Sm 2A== From: Khalid Aziz To: juergh@gmail.com, tycho@tycho.ws, jsteckli@amazon.de, ak@linux.intel.com, torvalds@linux-foundation.org, liran.alon@oracle.com, keescook@google.com, akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com, will.deacon@arm.com, jmorris@namei.org, konrad.wilk@oracle.com Cc: Khalid Aziz , deepa.srinivasan@oracle.com, chris.hyser@oracle.com, tyhicks@canonical.com, dwmw@amazon.co.uk, andrew.cooper3@citrix.com, jcm@redhat.com, boris.ostrovsky@oracle.com, kanth.ghatraju@oracle.com, oao.m.martins@oracle.com, jmattson@google.com, pradeep.vincent@oracle.com, john.haxby@oracle.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, hch@lst.de, steven.sistare@oracle.com, labbott@redhat.com, luto@kernel.org, dave.hansen@intel.com, peterz@infradead.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v8 14/14] xpfo, mm: Optimize XPFO TLB flushes by batching them together Date: Wed, 13 Feb 2019 17:01:37 -0700 Message-Id: <6a92971cd9b360ec1b0ae75887f33f67774d681a.1550088114.git.khalid.aziz@oracle.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: References: In-Reply-To: References: X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9166 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902130157 X-Virus-Scanned: ClamAV using ClamSMTP When XPFO forces a TLB flush on all cores, the performance impact is very significant. Batching as many of these TLB updates as possible can help lower this impact. When a userspace allocates a page, kernel tries to get that page from the per-cpu free list. This free list is replenished in bulk when it runs low. Free list is being replenished for future allocation to userspace is a good opportunity to update TLB entries in batch and reduce the impact of multiple TLB flushes later. This patch adds new tags for the page so a page can be marked as available for userspace allocation and unmapped from kernel address space. All such pages are removed from kernel address space in bulk at the time they are added to per-cpu free list. This patch when combined with deferred TLB flushes improves performance further. Using the same benchmark as before of building kernel in parallel, here are the system times on two differently sized systems: Hardware: 96-core Intel Xeon Platinum 8160 CPU @ 2.10GHz, 768 GB RAM make -j60 all 4.20 950.966s 4.20+XPFO 25073.169s 26.366x 4.20+XPFO+Deferred flush 1372.874s 1.44x 4.20+XPFO+Deferred flush+Batch update 1255.021s 1.32x Hardware: 4-core Intel Core i5-3550 CPU @ 3.30GHz, 8G RAM make -j4 all 4.20 607.671s 4.20+XPFO 1588.646s 2.614x 4.20+XPFO+Deferred flush 803.989s 1.32x 4.20+XPFO+Deferred flush+Batch update 795.728s 1.31x Signed-off-by: Khalid Aziz Signed-off-by: Tycho Andersen --- arch/x86/mm/xpfo.c | 5 +++++ include/linux/page-flags.h | 5 ++++- include/linux/xpfo.h | 8 ++++++++ mm/page_alloc.c | 4 ++++ mm/xpfo.c | 35 +++++++++++++++++++++++++++++++++-- 5 files changed, 54 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/xpfo.c b/arch/x86/mm/xpfo.c index d3833532bfdc..fb06bb3cb718 100644 --- a/arch/x86/mm/xpfo.c +++ b/arch/x86/mm/xpfo.c @@ -87,6 +87,11 @@ inline void set_kpte(void *kaddr, struct page *page, pgprot_t prot) } +void xpfo_flush_tlb_all(void) +{ + xpfo_flush_tlb_kernel_range(0, TLB_FLUSH_ALL); +} + inline void xpfo_flush_kernel_tlb(struct page *page, int order) { int level; diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index a532063f27b5..fdf7e14cbc96 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -406,9 +406,11 @@ PAGEFLAG(Idle, idle, PF_ANY) PAGEFLAG(XpfoUser, xpfo_user, PF_ANY) TESTCLEARFLAG(XpfoUser, xpfo_user, PF_ANY) TESTSETFLAG(XpfoUser, xpfo_user, PF_ANY) +#define __PG_XPFO_USER (1UL << PG_xpfo_user) PAGEFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) TESTCLEARFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) TESTSETFLAG(XpfoUnmapped, xpfo_unmapped, PF_ANY) +#define __PG_XPFO_UNMAPPED (1UL << PG_xpfo_unmapped) #endif /* @@ -787,7 +789,8 @@ static inline void ClearPageSlabPfmemalloc(struct page *page) * alloc-free cycle to prevent from reusing the page. */ #define PAGE_FLAGS_CHECK_AT_PREP \ - (((1UL << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON) + (((1UL << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON & ~__PG_XPFO_USER & \ + ~__PG_XPFO_UNMAPPED) #define PAGE_FLAGS_PRIVATE \ (1UL << PG_private | 1UL << PG_private_2) diff --git a/include/linux/xpfo.h b/include/linux/xpfo.h index 1dd590ff1a1f..c4f6c99e7380 100644 --- a/include/linux/xpfo.h +++ b/include/linux/xpfo.h @@ -34,6 +34,7 @@ void set_kpte(void *kaddr, struct page *page, pgprot_t prot); void xpfo_dma_map_unmap_area(bool map, const void *addr, size_t size, enum dma_data_direction dir); void xpfo_flush_kernel_tlb(struct page *page, int order); +void xpfo_flush_tlb_all(void); void xpfo_kmap(void *kaddr, struct page *page); void xpfo_kunmap(void *kaddr, struct page *page); @@ -55,6 +56,8 @@ bool xpfo_enabled(void); phys_addr_t user_virt_to_phys(unsigned long addr); +bool xpfo_pcp_refill(struct page *page, enum migratetype migratetype, + int order); #else /* !CONFIG_XPFO */ static inline void xpfo_init_single_page(struct page *page) { } @@ -82,6 +85,11 @@ static inline bool xpfo_enabled(void) { return false; } static inline phys_addr_t user_virt_to_phys(unsigned long addr) { return 0; } +static inline bool xpfo_pcp_refill(struct page *page, + enum migratetype migratetype, int order) +{ +} + #endif /* CONFIG_XPFO */ #endif /* _LINUX_XPFO_H */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d00382b20001..5702b6fa435c 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2478,6 +2478,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, int migratetype) { int i, alloced = 0; + bool flush_tlb = false; spin_lock(&zone->lock); for (i = 0; i < count; ++i) { @@ -2503,6 +2504,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, if (is_migrate_cma(get_pcppage_migratetype(page))) __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, -(1 << order)); + flush_tlb |= xpfo_pcp_refill(page, migratetype, order); } /* @@ -2513,6 +2515,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, */ __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order)); spin_unlock(&zone->lock); + if (flush_tlb) + xpfo_flush_tlb_all(); return alloced; } diff --git a/mm/xpfo.c b/mm/xpfo.c index 5157cbebce4b..7f78d00df002 100644 --- a/mm/xpfo.c +++ b/mm/xpfo.c @@ -47,7 +47,8 @@ void __meminit xpfo_init_single_page(struct page *page) void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) { - int i, flush_tlb = 0; + int i; + bool flush_tlb = false; if (!static_branch_unlikely(&xpfo_inited)) return; @@ -65,7 +66,7 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) * was previously allocated to the kernel. */ if (!TestSetPageXpfoUser(page + i)) - flush_tlb = 1; + flush_tlb = true; } else { /* Tag the page as a non-user (kernel) page */ ClearPageXpfoUser(page + i); @@ -74,6 +75,8 @@ void xpfo_alloc_pages(struct page *page, int order, gfp_t gfp) if (flush_tlb) xpfo_flush_kernel_tlb(page, order); + + return; } void xpfo_free_pages(struct page *page, int order) @@ -190,3 +193,31 @@ void xpfo_temp_unmap(const void *addr, size_t size, void **mapping, kunmap_atomic(mapping[i]); } EXPORT_SYMBOL(xpfo_temp_unmap); + +bool xpfo_pcp_refill(struct page *page, enum migratetype migratetype, + int order) +{ + int i; + bool flush_tlb = false; + + if (!static_branch_unlikely(&xpfo_inited)) + return false; + + for (i = 0; i < 1 << order; i++) { + if (migratetype == MIGRATE_MOVABLE) { + /* GPF_HIGHUSER ** + * Tag the page as a user page, mark it as unmapped + * in kernel space and flush the TLB if it was + * previously allocated to the kernel. + */ + if (!TestSetPageXpfoUnmapped(page + i)) + flush_tlb = true; + SetPageXpfoUser(page + i); + } else { + /* Tag the page as a non-user (kernel) page */ + ClearPageXpfoUser(page + i); + } + } + + return(flush_tlb); +}