From patchwork Mon Jul 27 17:11:23 2020
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 11687283
From: Anthony Yznaga
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org
Cc: mhocko@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 x86@kernel.org, hpa@zytor.com, viro@zeniv.linux.org.uk,
 akpm@linux-foundation.org, arnd@arndb.de, ebiederm@xmission.com,
 keescook@chromium.org, gerg@linux-m68k.org, ktkhai@virtuozzo.com,
 christian.brauner@ubuntu.com, peterz@infradead.org, esyr@redhat.com,
 jgg@ziepe.ca, christian@kellner.me, areber@redhat.com, cyphar@cyphar.com,
 steven.sistare@oracle.com
Subject: [RFC PATCH 1/5] elf: reintroduce using MAP_FIXED_NOREPLACE for elf
 executable mappings
Date: Mon, 27 Jul 2020 10:11:23 -0700
Message-Id: <1595869887-23307-2-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>

Commit b212921b13bd ("elf: don't use MAP_FIXED_NOREPLACE for elf
executable mappings") reverted to using MAP_FIXED to map elf load
segments because it was found that the load segments in some binaries
overlap and can cause MAP_FIXED_NOREPLACE to fail.

The original intent of MAP_FIXED_NOREPLACE was to prevent the silent
clobbering of an existing mapping (e.g. the stack) by the elf image. To
achieve this, expand on the logic used when loading ET_DYN binaries, which
calculates a total size for the image when the first segment is mapped,
maps the entire image, and then unmaps the remainder before the remaining
segments are mapped.

Apply this to ET_EXEC binaries as well as to ET_DYN binaries, as is done
now, and for both ET_EXEC and ET_DYN+INTERP use MAP_FIXED_NOREPLACE for
the initial total size mapping and MAP_FIXED for the remaining mappings.
For ET_DYN without INTERP, continue to map at a system-selected address in
the mmap region.

Signed-off-by: Anthony Yznaga
---
 fs/binfmt_elf.c | 112 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 64 insertions(+), 48 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 9fe3b51c116a..6445a6dbdb1d 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1046,58 +1046,25 @@ static int load_elf_binary(struct linux_binprm *bprm)
 
 		vaddr = elf_ppnt->p_vaddr;
 		/*
-		 * If we are loading ET_EXEC or we have already performed
-		 * the ET_DYN load_addr calculations, proceed normally.
+		 * Map remaining segments with MAP_FIXED once the first
+		 * total size mapping has been done.
 		 */
-		if (elf_ex->e_type == ET_EXEC || load_addr_set) {
+		if (load_addr_set) {
 			elf_flags |= MAP_FIXED;
-		} else if (elf_ex->e_type == ET_DYN) {
-			/*
-			 * This logic is run once for the first LOAD Program
-			 * Header for ET_DYN binaries to calculate the
-			 * randomization (load_bias) for all the LOAD
-			 * Program Headers, and to calculate the entire
-			 * size of the ELF mapping (total_size). (Note that
-			 * load_addr_set is set to true later once the
-			 * initial mapping is performed.)
-			 *
-			 * There are effectively two types of ET_DYN
-			 * binaries: programs (i.e. PIE: ET_DYN with INTERP)
-			 * and loaders (ET_DYN without INTERP, since they
-			 * _are_ the ELF interpreter). The loaders must
-			 * be loaded away from programs since the program
-			 * may otherwise collide with the loader (especially
-			 * for ET_EXEC which does not have a randomized
-			 * position). For example to handle invocations of
-			 * "./ld.so someprog" to test out a new version of
-			 * the loader, the subsequent program that the
-			 * loader loads must avoid the loader itself, so
-			 * they cannot share the same load range. Sufficient
-			 * room for the brk must be allocated with the
-			 * loader as well, since brk must be available with
-			 * the loader.
-			 *
-			 * Therefore, programs are loaded offset from
-			 * ELF_ET_DYN_BASE and loaders are loaded into the
-			 * independently randomized mmap region (0 load_bias
-			 * without MAP_FIXED).
-			 */
-			if (interpreter) {
-				load_bias = ELF_ET_DYN_BASE;
-				if (current->flags & PF_RANDOMIZE)
-					load_bias += arch_mmap_rnd();
-				elf_flags |= MAP_FIXED;
-			} else
-				load_bias = 0;
-
+		} else {
 			/*
-			 * Since load_bias is used for all subsequent loading
-			 * calculations, we must lower it by the first vaddr
-			 * so that the remaining calculations based on the
-			 * ELF vaddrs will be correctly offset. The result
-			 * is then page aligned.
+			 * To ensure loading does not continue if an ELF
+			 * LOAD segment overlaps an existing mapping (e.g.
+			 * the stack), for the first LOAD Program Header
+			 * calculate the entire size of the ELF mapping
+			 * and map it with MAP_FIXED_NOREPLACE. On success,
+			 * the remainder will be unmapped and subsequent
+			 * LOAD segments mapped with MAP_FIXED rather than
+			 * MAP_FIXED_NOREPLACE because some binaries may
+			 * have overlapping segments that would cause the
+			 * mmap to fail.
 			 */
-			load_bias = ELF_PAGESTART(load_bias - vaddr);
+			elf_flags |= MAP_FIXED_NOREPLACE;
 
 			total_size = total_mapping_size(elf_phdata,
 							elf_ex->e_phnum);
@@ -1105,6 +1072,55 @@ static int load_elf_binary(struct linux_binprm *bprm)
 				retval = -EINVAL;
 				goto out_free_dentry;
 			}
+
+			if (elf_ex->e_type == ET_DYN) {
+				/*
+				 * This logic is run once for the first LOAD
+				 * Program Header for ET_DYN binaries to
+				 * calculate the randomization (load_bias) for
+				 * all the LOAD Program Headers.
+				 *
+				 * There are effectively two types of ET_DYN
+				 * binaries: programs (i.e. PIE: ET_DYN with
+				 * INTERP) and loaders (ET_DYN without INTERP,
+				 * since they _are_ the ELF interpreter). The
+				 * loaders must be loaded away from programs
+				 * since the program may otherwise collide with
+				 * the loader (especially for ET_EXEC which does
+				 * not have a randomized position). For example
+				 * to handle invocations of "./ld.so someprog"
+				 * to test out a new version of the loader, the
+				 * subsequent program that the loader loads must
+				 * avoid the loader itself, so they cannot share
+				 * the same load range. Sufficient room for the
+				 * brk must be allocated with the loader as
+				 * well, since brk must be available with the
+				 * loader.
+				 *
+				 * Therefore, programs are loaded offset from
+				 * ELF_ET_DYN_BASE and loaders are loaded into
+				 * the independently randomized mmap region
+				 * (0 load_bias without MAP_FIXED*).
+				 */
+				if (interpreter) {
+					load_bias = ELF_ET_DYN_BASE;
+					if (current->flags & PF_RANDOMIZE)
+						load_bias += arch_mmap_rnd();
+				} else {
+					load_bias = 0;
+					elf_flags &= ~MAP_FIXED_NOREPLACE;
+				}
+
+				/*
+				 * Since load_bias is used for all subsequent
+				 * loading calculations, we must lower it by
+				 * the first vaddr so that the remaining
+				 * calculations based on the ELF vaddrs will
+				 * be correctly offset. The result is then
+				 * page aligned.
+				 */
+				load_bias = ELF_PAGESTART(load_bias - vaddr);
+			}
 		}
 
 		error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,

From patchwork Mon Jul 27 17:11:24 2020
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 11687277
From: Anthony Yznaga
Subject: [RFC PATCH 2/5] mm: do not assume only the stack vma exists in
 setup_arg_pages()
Date: Mon, 27 Jul 2020 10:11:24 -0700
Message-Id: <1595869887-23307-3-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>

In preparation for allowing vmas to be preserved across exec, do not
assume that there is no prev vma to pass to mprotect_fixup() in
setup_arg_pages().

Also, setup_arg_pages() expands the initial stack of a process by 128k or
to the stack size limit, whichever is smaller. expand_stack() assumes
there is no vma between the vma passed to it and the address to expand
to, so check before calling it.
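
For illustration only, the find_vma() check can be modelled in userspace.
This is a minimal sketch under stated assumptions, not kernel code:
struct range, model_find_vma() and stack_can_grow_to() are hypothetical
names, the ranges sit in a sorted array instead of an mm, and only a
stack that grows down is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a vma: covers [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Model of find_vma(): return the first range whose end lies above
 * addr, or NULL if there is none. The array must be sorted by address
 * and must not contain overlapping ranges.
 */
static const struct range *model_find_vma(const struct range *r, size_t n,
					  unsigned long addr)
{
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (addr < r[i].end)
			return &r[i];
	return NULL;
}

/*
 * Return 1 if a downward-growing stack may be extended to new_start
 * without passing over another range, 0 otherwise. This mirrors the
 * "vma != find_vma(mm, stack_base)" test: if the lookup at the target
 * address lands on anything but the stack itself, something is in the
 * way.
 */
static int stack_can_grow_to(const struct range *r, size_t n,
			     const struct range *stack,
			     unsigned long new_start)
{
	return model_find_vma(r, n, new_start) == stack;
}
```

With ranges at 0x1000-0x2000, 0x5000-0x6000 and a stack at
0x9000-0xa000, growing the stack down to 0x7000 is accepted, while
growing it to 0x5800 or 0x3000 is refused because the middle range would
be run over.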

Signed-off-by: Anthony Yznaga
---
 fs/exec.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/exec.c b/fs/exec.c
index e6e8a9a70327..262112e5f9f8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -720,7 +720,7 @@ int setup_arg_pages(struct linux_binprm *bprm,
 	unsigned long stack_shift;
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = bprm->vma;
-	struct vm_area_struct *prev = NULL;
+	struct vm_area_struct *prev = vma->vm_prev;
 	unsigned long vm_flags;
 	unsigned long stack_base;
 	unsigned long stack_size;
@@ -819,6 +819,10 @@ int setup_arg_pages(struct linux_binprm *bprm,
 	else
 		stack_base = vma->vm_start - stack_expand;
 #endif
+	if (vma != find_vma(mm, stack_base)) {
+		ret = -EFAULT;
+		goto out_unlock;
+	}
 	current->mm->start_stack = bprm->p;
 	ret = expand_stack(vma, stack_base);
 	if (ret)

From patchwork Mon Jul 27 17:11:25 2020
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 11687303
From: Anthony Yznaga
Subject: [RFC PATCH 3/5] mm: introduce VM_EXEC_KEEP
Date: Mon, 27 Jul 2020 10:11:25 -0700
Message-Id: <1595869887-23307-4-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>

A vma with the VM_EXEC_KEEP flag is preserved across exec. The flag is
supported for anonymous vmas only. For safety, overlap with fixed address
VMAs created in the new mm during exec (e.g. the stack and elf load
segments) is not permitted and will cause the exec to fail.
(We are studying how to guarantee there are no conflicts. Comments
welcome.)
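
As a rough illustration of the no-overlap rule, here is a small userspace
sketch. It is not the kernel implementation: struct addr_range,
ranges_overlap() and first_conflict() are hypothetical names, and the new
mm is modelled as a plain array of half-open address ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a mapping covering [start, end). */
struct addr_range {
	unsigned long start;
	unsigned long end;
};

/* Two half-open ranges overlap if each one starts before the other ends. */
static int ranges_overlap(struct addr_range a, struct addr_range b)
{
	return a.start < b.end && b.start < a.end;
}

/*
 * Return the index of the first mapping in the new address space that
 * conflicts with a range marked for preservation, or -1 if the range
 * can be carried over. In the patch a conflict makes the exec fail.
 */
static long first_conflict(const struct addr_range *new_mm, size_t n,
			   struct addr_range kept)
{
	size_t i;

	assert(new_mm != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (ranges_overlap(new_mm[i], kept))
			return (long)i;
	return -1;
}
```

Ranges that merely touch (one ends where the other starts) do not count
as overlapping in this model, which matches the half-open convention used
for vm_start and vm_end.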

Signed-off-by: Steve Sistare
Signed-off-by: Anthony Yznaga
---
 arch/x86/Kconfig   |  1 +
 fs/exec.c          | 20 ++++++++++++++++++++
 include/linux/mm.h |  5 +++++
 kernel/fork.c      |  2 +-
 mm/mmap.c          | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 883da0abf779..fc36eb2f45c0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -30,6 +30,7 @@ config X86_64
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select SWIOTLB
+	select ARCH_USES_HIGH_VMA_FLAGS
 
 config FORCE_DYNAMIC_FTRACE
 	def_bool y
diff --git a/fs/exec.c b/fs/exec.c
index 262112e5f9f8..1de09c4eef00 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1069,6 +1069,20 @@ ssize_t read_code(struct file *file, unsigned long addr, loff_t pos, size_t len)
 EXPORT_SYMBOL(read_code);
 #endif
 
+static int vma_dup_some(struct mm_struct *old_mm, struct mm_struct *new_mm)
+{
+	struct vm_area_struct *vma;
+	int ret;
+
+	for (vma = old_mm->mmap; vma; vma = vma->vm_next)
+		if (vma->vm_flags & VM_EXEC_KEEP) {
+			ret = vma_dup(vma, new_mm);
+			if (ret)
+				return ret;
+		}
+	return 0;
+}
+
 /*
  * Maps the mm_struct mm into the current task struct.
  * On success, this function returns with the mutex
@@ -1104,6 +1118,12 @@ static int exec_mmap(struct mm_struct *mm)
 			mutex_unlock(&tsk->signal->exec_update_mutex);
 			return -EINTR;
 		}
+		ret = vma_dup_some(old_mm, mm);
+		if (ret) {
+			mmap_read_unlock(old_mm);
+			mutex_unlock(&tsk->signal->exec_update_mutex);
+			return ret;
+		}
 	}
 
 	task_lock(tsk);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index dc7b87310c10..1c538ba77f33 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -295,11 +295,15 @@ int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *,
 #define VM_HIGH_ARCH_BIT_2	34	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_3	35	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_4	36	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_5	37	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_0	BIT(VM_HIGH_ARCH_BIT_0)
 #define VM_HIGH_ARCH_1	BIT(VM_HIGH_ARCH_BIT_1)
 #define VM_HIGH_ARCH_2	BIT(VM_HIGH_ARCH_BIT_2)
 #define VM_HIGH_ARCH_3	BIT(VM_HIGH_ARCH_BIT_3)
 #define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_EXEC_KEEP	BIT(VM_HIGH_ARCH_BIT_5)	/* preserve VMA across exec */
+#else
+#define VM_EXEC_KEEP	VM_NONE
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -2534,6 +2538,7 @@ extern struct vm_area_struct *copy_vma(struct vm_area_struct **,
 	unsigned long addr, unsigned long len, pgoff_t pgoff,
 	bool *need_rmap_locks);
 extern void exit_mmap(struct mm_struct *);
+extern int vma_dup(struct vm_area_struct *vma, struct mm_struct *mm);
 
 static inline int check_data_rlimit(unsigned long rlim,
 				    unsigned long new,
diff --git a/kernel/fork.c b/kernel/fork.c
index efc5493203ae..15ead613714f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -564,7 +564,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			tmp->anon_vma = NULL;
 		} else if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
-		tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
+		tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT | VM_EXEC_KEEP);
 		file = tmp->vm_file;
 		if (file) {
 			struct inode *inode = file_inode(file);
diff --git a/mm/mmap.c b/mm/mmap.c
index 59a4682ebf3f..be2ff53743c3 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3279,6 +3279,53 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	return NULL;
 }
 
+int vma_dup(struct vm_area_struct *old_vma, struct mm_struct *mm)
+{
+	unsigned long npages;
+	struct mm_struct *old_mm = old_vma->vm_mm;
+	struct vm_area_struct *vma;
+	int ret = -ENOMEM;
+
+	if (WARN_ON(old_vma->vm_file || old_vma->vm_ops))
+		return -EINVAL;
+
+	vma = find_vma(mm, old_vma->vm_start);
+	if (vma && vma->vm_start < old_vma->vm_end)
+		return -EEXIST;
+
+	npages = vma_pages(old_vma);
+	mm->total_vm += npages;
+
+	vma = vm_area_dup(old_vma);
+	if (!vma)
+		goto fail_nomem;
+
+	ret = vma_dup_policy(old_vma, vma);
+	if (ret)
+		goto fail_nomem_policy;
+
+	vma->vm_mm = mm;
+	ret = anon_vma_fork(vma, old_vma);
+	if (ret)
+		goto fail_nomem_anon_vma_fork;
+
+	vma->vm_flags &= ~(VM_LOCKED|VM_UFFD_MISSING|VM_UFFD_WP|VM_EXEC_KEEP);
+	vma->vm_next = vma->vm_prev = NULL;
+	vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+	if (is_vm_hugetlb_page(vma))
+		reset_vma_resv_huge_pages(vma);
+	__insert_vm_struct(mm, vma);
+	ret = copy_page_range(mm, old_mm, old_vma);
+	return ret;
+
+fail_nomem_anon_vma_fork:
+	mpol_put(vma_policy(vma));
+fail_nomem_policy:
+	vm_area_free(vma);
+fail_nomem:
+	return -ENOMEM;
+}
+
 /*
  * Return true if the calling process may expand its vm space by the passed
  * number of pages

From patchwork Mon Jul 27 17:11:26 2020
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 11687279
From: Anthony Yznaga
Subject: [RFC PATCH 4/5] exec, elf: require opt-in for accepting preserved mem
Date: Mon, 27 Jul 2020 10:11:26 -0700
Message-Id: <1595869887-23307-5-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>

Don't copy preserved VMAs to the binary being exec'd unless the binary
has a "preserved-mem-ok" ELF note.
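
To make the opt-in concrete, the following userspace sketch walks a
buffer of ELF notes and looks for the marker. It is an illustration
under assumptions, not the code in this patch: the note type value and
the name string are the ones this patch checks for, but struct note_hdr,
align4() and has_preserved_mem_ok() are hypothetical names, the bounds
checks are my own, and the name length is matched exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_TYPE_PRESERVED_MEM_OK	0x07c1feedu
#define NOTE_NAME_PRESERVED_MEM_OK	"preserved-mem-ok"

/* Same layout as an ELF note header: three 32-bit words. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Note name and descriptor are each padded to a 4-byte boundary. */
static size_t align4(size_t v)
{
	return (v + 3) & ~(size_t)3;
}

/*
 * Scan buf[0..len) as a sequence of ELF notes.
 * Returns 1 if the opt-in note is present, 0 if it is absent and
 * -1 if the buffer is malformed.
 */
static int has_preserved_mem_ok(const unsigned char *buf, size_t len)
{
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		struct note_hdr h;
		size_t name_len, desc_len;

		if (len - off < sizeof(h))
			return -1;
		memcpy(&h, buf + off, sizeof(h));
		off += sizeof(h);

		/* Check the raw sizes first so align4() cannot wrap. */
		if (h.n_namesz > len - off || h.n_descsz > len - off)
			return -1;
		name_len = align4(h.n_namesz);
		desc_len = align4(h.n_descsz);
		if (name_len > len - off || desc_len > len - off - name_len)
			return -1;

		/* The name size includes the terminating NUL. */
		if (h.n_type == NOTE_TYPE_PRESERVED_MEM_OK &&
		    h.n_namesz == sizeof(NOTE_NAME_PRESERVED_MEM_OK) &&
		    memcmp(buf + off, NOTE_NAME_PRESERVED_MEM_OK,
			   h.n_namesz) == 0)
			return 1;

		off += name_len + desc_len;
	}
	return 0;
}
```

The sketch copies each header out with memcpy() rather than casting the
buffer pointer, so it does not depend on the buffer being aligned.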
Signed-off-by: Anthony Yznaga --- fs/binfmt_elf.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++ fs/exec.c | 17 +++++----- include/linux/binfmts.h | 7 ++++- 3 files changed, 100 insertions(+), 8 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 6445a6dbdb1d..46248b7b0a75 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -683,6 +683,81 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex, return error; } +#define NOTES_SZ SZ_1K +#define PRESERVED_MEM_OK_STRING "preserved-mem-ok" +#define SZ_PRESERVED_MEM_OK_STRING sizeof(PRESERVED_MEM_OK_STRING) + +static int parse_elf_note(struct linux_binprm *bprm, const char *data, size_t *off, size_t datasz) +{ + const struct elf_note *nhdr; + const char *name; + size_t o; + + o = *off; + datasz -= o; + + if (datasz < sizeof(*nhdr)) + return -ENOEXEC; + + nhdr = (const struct elf_note *)(data + o); + o += sizeof(*nhdr); + datasz -= sizeof(*nhdr); + + /* + * Currently only the preserved-mem-ok elf note is of interest. 
+ */ + if (nhdr->n_type != 0x07c1feed) + goto next; + + if (nhdr->n_namesz > SZ_PRESERVED_MEM_OK_STRING) + return -ENOEXEC; + + name = data + o; + if (datasz < SZ_PRESERVED_MEM_OK_STRING || + strncmp(name, PRESERVED_MEM_OK_STRING, SZ_PRESERVED_MEM_OK_STRING)) + return -ENOEXEC; + + bprm->accepts_preserved_mem = 1; + +next: + o += roundup(nhdr->n_namesz, 4) + roundup(nhdr->n_descsz, 4); + *off = o; + + return 0; +} + +static int parse_elf_notes(struct linux_binprm *bprm, struct elf_phdr *phdr) +{ + char *notes; + size_t notes_sz; + size_t off = 0; + int ret; + + if (!phdr) + return 0; + + notes_sz = phdr->p_filesz; + if ((notes_sz > NOTES_SZ) || (notes_sz < sizeof(struct elf_note))) + return -ENOEXEC; + + notes = kvmalloc(notes_sz, GFP_KERNEL); + if (!notes) + return -ENOMEM; + + ret = elf_read(bprm->file, notes, notes_sz, phdr->p_offset); + if (ret < 0) + goto out; + + while (off < notes_sz) { + ret = parse_elf_note(bprm, notes, &off, notes_sz); + if (ret) + break; + } +out: + kvfree(notes); + return ret; +} + /* * These are the functions used to load ELF style executables and shared * libraries. There is no binary dependent code anywhere else. @@ -801,6 +876,7 @@ static int load_elf_binary(struct linux_binprm *bprm) unsigned long error; struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL; struct elf_phdr *elf_property_phdata = NULL; + struct elf_phdr *elf_notes_phdata = NULL; unsigned long elf_bss, elf_brk; int bss_prot = 0; int retval, i; @@ -909,6 +985,10 @@ static int load_elf_binary(struct linux_binprm *bprm) executable_stack = EXSTACK_DISABLE_X; break; + case PT_NOTE: + elf_notes_phdata = elf_ppnt; + break; + case PT_LOPROC ... 
PT_HIPROC: retval = arch_elf_pt_proc(elf_ex, elf_ppnt, bprm->file, false, @@ -970,6 +1050,10 @@ static int load_elf_binary(struct linux_binprm *bprm) if (retval) goto out_free_dentry; + retval = parse_elf_notes(bprm, elf_notes_phdata); + if (retval) + goto out_free_dentry; + /* Flush all traces of the currently running executable */ retval = begin_new_exec(bprm); if (retval) diff --git a/fs/exec.c b/fs/exec.c index 1de09c4eef00..b2b046fec1f8 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1088,10 +1088,11 @@ static int vma_dup_some(struct mm_struct *old_mm, struct mm_struct *new_mm) * On success, this function returns with the mutex * exec_update_mutex locked. */ -static int exec_mmap(struct mm_struct *mm) +static int exec_mmap(struct linux_binprm *bprm) { struct task_struct *tsk; struct mm_struct *old_mm, *active_mm; + struct mm_struct *mm = bprm->mm; int ret; /* Notify parent that we're no longer interested in the old VM */ @@ -1118,11 +1119,13 @@ static int exec_mmap(struct mm_struct *mm) mutex_unlock(&tsk->signal->exec_update_mutex); return -EINTR; } - ret = vma_dup_some(old_mm, mm); - if (ret) { - mmap_read_unlock(old_mm); - mutex_unlock(&tsk->signal->exec_update_mutex); - return ret; + if (bprm->accepts_preserved_mem) { + ret = vma_dup_some(old_mm, mm); + if (ret) { + mmap_read_unlock(old_mm); + mutex_unlock(&tsk->signal->exec_update_mutex); + return ret; + } } } @@ -1386,7 +1389,7 @@ int begin_new_exec(struct linux_binprm * bprm) * Release all of the old mmap stuff */ acct_arg_size(bprm, 0); - retval = exec_mmap(bprm->mm); + retval = exec_mmap(bprm); if (retval) goto out; diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index 4a20b7517dd0..6a66589454c8 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -41,7 +41,12 @@ struct linux_binprm { * Set when errors can no longer be returned to the * original userspace. 
 		 */
-		point_of_no_return:1;
+		point_of_no_return:1,
+		/*
+		 * Set if the binary being exec'd will accept memory marked
+		 * for preservation by the outgoing process.
+		 */
+		accepts_preserved_mem:1;
 #ifdef __alpha__
 	unsigned int taso:1;
 #endif

From patchwork Mon Jul 27 17:11:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 11687335
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B020C138A
 for ; Mon, 27 Jul 2020 17:08:03 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 97E0720775
 for ; Mon, 27 Jul 2020 17:08:03 +0000 (UTC)
Authentication-Results: mail.kernel.org;
 dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="l1LOF0dm"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1730203AbgG0RH7 (ORCPT ); Mon, 27 Jul 2020 13:07:59 -0400
Received: from userp2130.oracle.com ([156.151.31.86]:59392 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1729403AbgG0RH7 (ORCPT );
 Mon, 27 Jul 2020 13:07:59 -0400
Received: from pps.filterd (userp2130.oracle.com [127.0.0.1])
 by userp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 06RGgkN5077073;
 Mon, 27 Jul 2020 17:02:16 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com;
 h=from : to : cc : subject : date : message-id : in-reply-to : references;
 s=corp-2020-01-29; bh=EH2bcmGiKig5NQUr25P5VYLKdWqRTYRE1Orwb4KgXik=;
 b=l1LOF0dmlbi1ncVDq7RWlEvNJYr2LbPMPmGo3aXniYWPZhLlMSHR0iWNp/2cDgmG6dbR
 RsmBzBxwcJM3BxRVkITqX1WvldXoqwBEQ03KjtD+yDZgD1mbk32LJV5cwSQ381iZh3Y9
 BCtLvGVqEyXedSXoHlknOTfyCHrbE6mWURmbGcIEEHGXvrGUEJ5/qtMwH8azy5BTtzxL
 krFCtP1n2bTqG81bcnd5Xz1b8ajPJlHCouPTuK7bHtdIFfZN519587R0e4/RkkwN7uww
 /QNbyjqx8Ocr6uD0+RLlm6G5JCez12sO4h0aSAWN1OV1n+iLEpHqky6Wc6yI5u2WSKKf
 jg==
Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79])
 by userp2130.oracle.com with ESMTP id 32hu1j2rvc-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);
 Mon, 27 Jul 2020 17:02:16 +0000
Received: from pps.filterd (userp3020.oracle.com [127.0.0.1])
 by userp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 06RGgViM055487;
 Mon, 27 Jul 2020 17:02:16 GMT
Received: from pps.reinject (localhost [127.0.0.1])
 by userp3020.oracle.com with ESMTP id 32hu5r9fda-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);
 Mon, 27 Jul 2020 17:02:16 +0000
Received: from userp3020.oracle.com (userp3020.oracle.com [127.0.0.1])
 by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 06RGuWGn111604;
 Mon, 27 Jul 2020 17:02:15 GMT
Received: from ca-qasparc-x86-2.us.oracle.com (ca-qasparc-x86-2.us.oracle.com [10.147.24.103])
 by userp3020.oracle.com with ESMTP id 32hu5r9f7r-6;
 Mon, 27 Jul 2020 17:02:15 +0000
From: Anthony Yznaga
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org
Cc: mhocko@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 x86@kernel.org, hpa@zytor.com, viro@zeniv.linux.org.uk,
 akpm@linux-foundation.org, arnd@arndb.de, ebiederm@xmission.com,
 keescook@chromium.org, gerg@linux-m68k.org, ktkhai@virtuozzo.com,
 christian.brauner@ubuntu.com, peterz@infradead.org, esyr@redhat.com,
 jgg@ziepe.ca, christian@kellner.me, areber@redhat.com, cyphar@cyphar.com,
 steven.sistare@oracle.com
Subject: [RFC PATCH 5/5] mm: introduce MADV_DOEXEC
Date: Mon, 27 Jul 2020 10:11:27 -0700
Message-Id: <1595869887-23307-6-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
X-Proofpoint-Virus-Version: vendor=nai engine=6000
 definitions=9695 signatures=668679
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0
 clxscore=1015 malwarescore=0 spamscore=0 suspectscore=0 bulkscore=0
 priorityscore=1501 phishscore=0 mlxlogscore=999 lowpriorityscore=0
 impostorscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2006250000 definitions=main-2007270116
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

madvise MADV_DOEXEC preserves a memory range across exec. It is
initially supported only for non-executable, non-stack, anonymous memory.
MADV_DONTEXEC reverts the effect of a previous MADV_DOEXEC call and
undoes the preservation of the range. After a successful exec call,
the behavior of all ranges reverts to MADV_DONTEXEC.

Signed-off-by: Steve Sistare
Signed-off-by: Anthony Yznaga
---
 include/uapi/asm-generic/mman-common.h |  3 +++
 mm/madvise.c                           | 25 +++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index f94f65d429be..7c5f616b28f7 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -72,6 +72,9 @@
 #define MADV_COLD	20		/* deactivate these pages */
 #define MADV_PAGEOUT	21		/* reclaim these pages */
 
+#define MADV_DOEXEC	22		/* do inherit across exec */
+#define MADV_DONTEXEC	23		/* don't inherit across exec */
+
 /* compatibility flags */
 #define MAP_FILE	0
 
diff --git a/mm/madvise.c b/mm/madvise.c
index dd1d43cf026d..b447fa748649 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -103,6 +103,26 @@ static long madvise_behavior(struct vm_area_struct *vma,
 	case MADV_KEEPONFORK:
 		new_flags &= ~VM_WIPEONFORK;
 		break;
+	case MADV_DOEXEC:
+		/*
+		 * MADV_DOEXEC is only supported on private, non-executable,
+		 * non-stack anonymous memory and if the VM_EXEC_KEEP flag
+		 * is available.
+		 */
+		if (!VM_EXEC_KEEP || vma->vm_file || vma->vm_flags & (VM_EXEC|VM_SHARED|VM_STACK)) {
+			error = -EINVAL;
+			goto out;
+		}
+		new_flags = (new_flags & ~VM_MAYEXEC) | VM_EXEC_KEEP;
+		break;
+	case MADV_DONTEXEC:
+		if (!VM_EXEC_KEEP) {
+			error = -EINVAL;
+			goto out;
+		}
+		if (new_flags & VM_EXEC_KEEP)
+			new_flags = (new_flags & ~VM_EXEC_KEEP) | VM_MAYEXEC;
+		break;
 	case MADV_DONTDUMP:
 		new_flags |= VM_DONTDUMP;
 		break;
@@ -983,6 +1003,8 @@ static int madvise_inject_error(int behavior,
 	case MADV_SOFT_OFFLINE:
 	case MADV_HWPOISON:
 #endif
+	case MADV_DOEXEC:
+	case MADV_DONTEXEC:
 		return true;
 
 	default:
@@ -1037,6 +1059,9 @@ static int madvise_inject_error(int behavior,
  *  MADV_DONTDUMP - the application wants to prevent pages in the given range
  *		from being included in its core dump.
  *  MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump.
+ *  MADV_DOEXEC - On exec, preserve and duplicate this area in the new process
+ *		  if the new process allows it.
+ *  MADV_DONTEXEC - Undo the effect of MADV_DOEXEC.
  *
  * return values:
  *  zero    - success