From patchwork Thu Aug 27 11:49:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740583 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D26CA14F6 for ; Thu, 27 Aug 2020 11:49:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9F2EC22BF3 for ; Thu, 27 Aug 2020 11:49:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="ZeJkuoAc" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9F2EC22BF3 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E875C8E000F; Thu, 27 Aug 2020 07:49:45 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E122B8E0006; Thu, 27 Aug 2020 07:49:45 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C15158E000F; Thu, 27 Aug 2020 07:49:45 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0031.hostedemail.com [216.40.44.31]) by kanga.kvack.org (Postfix) with ESMTP id 9B60B8E0006 for ; Thu, 27 Aug 2020 07:49:45 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 627F72461 for ; Thu, 27 Aug 2020 11:49:45 +0000 (UTC) X-FDA: 77196179130.23.month28_5a0893f2706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id 3273D37604 for ; Thu, 27 Aug 2020 11:49:45 +0000 (UTC) X-Spam-Summary: 1,0,0,8db81f950eb0c33f,d41d8cd98f00b204,3151hxwukcdixobbvuccuzs.qcazwbil-aayjoqy.cfu@flex--jannh.bounces.google.com,,RULES_HIT:41:69:152:355:379:541:800:960:966:973:981:988:989:1260:1277:1313:1314:1345:1359:1437:1516:1518:1535:1544:1593:1594:1711:1730:1747:1777:1792:2196:2199:2393:2559:2562:2693:2898:2899:3138:3139:3140:3141:3142:3152:3355:3865:3866:3867:3868:3870:3871:3872:3874:4118:4250:4321:4385:4605:5007:6261:6653:7875:7903:7974:8660:9592:9969:10004:11026:11473:11658:11914:12043:12291:12296:12297:12438:12555:12683:12895:12986:13148:13230:13255:14096:14097:14181:14394:14659:14721:21080:21444:21627:21795:21939:21990:30003:30012:30051:30054:30070,0,RBL:209.85.219.202:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04yff913ojom7xrqw4qho3zgbb1djycajtgnhbk1et9oqi3fx69o4cidohh1qb3.qhzg55yuhbcwhanpd8fetck9kb1oiy17czxrtfofawddm8xfw5jrb94k4q3a4rt.g-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,Domai nCache:0 X-HE-Tag: month28_5a0893f2706c X-Filterd-Recvd-Size: 7370 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf19.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:49:44 +0000 (UTC) Received: by mail-yb1-f202.google.com with SMTP id e1so7052537ybk.14 for ; Thu, 27 Aug 2020 04:49:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=W92EWp10cYvF5yPv3rgiIaePnIXh3IWgRNGaRIqgcR4=; b=ZeJkuoAcd8dJY7+fCilJDkM8V8M3SmcWQgU+So846yMM9f6jM9riCbwbPfREaF0dY5 azjiHNKp8O7adw/N9CKVBPXtpdMW0qDR/FRcGjNgwiA7Xg4YAVSd/KrxNo/DMShAhT2R gvt3WhIEAlZTrCYDYGy9B3J/DwPtkqHRIFxRzdItRltKvJZQ6pF8lrkxptpCzwuOFdYe UxEuUIK+0+Pu7M6s2hyfeNVbO0IZ3zA1ZSkfIAezur4pqLLvO+lceOPtxc4LafsGYadB YDTG6C+Tz8L/0lfYzwSactKMTE8n+Q1nBDRoze+VAVJKqPB+hOKuq88CjDzXHrCFqY/c nb7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=W92EWp10cYvF5yPv3rgiIaePnIXh3IWgRNGaRIqgcR4=; b=twrtIQ5/HJ0z3hoLa9v6elETaT9qIPAA3ZUimnVK2Ffil0l1dmuWMvLABaq54q6ggI S1+lgeBNuO+qyj7jtIYAw1YOkNbqnjeyZKJhrpnJtUqmEIc0QgP7EaGPRXCvwCU8Yf7j 80fuxQczTdc5rbg4hWyevjyns/E7VPW3kaXzYD6ubbWwLWZxAnDSZbg7OCGgzmbxQfAn eGSF8Vx2h3zE9xu5DtBzrcLb7ZKpzlnwkPHA5Ld91sScd3nPWbQvvKLkgTPEgtGi1+po ncaWmpWGPLkYVJGn9jL5B9KWaMTfMEhcwEFUq/pUAawID+Pq9/fese/Ju7ygWMi6eXZ1 EnkQ== X-Gm-Message-State: AOAM531wfFk8FSct4EqyrYpI6H9DSh0wAGwdNs8bGisE7G3D8APu+wNw 9ZDtvVL5NzPnWEGrWCI9ppp91LqlNA== X-Google-Smtp-Source: ABdhPJxBMYeNdBT4AqELwSCvlFMkpdIbIdV0PyZym4hTP8g2vL+hg+JfUQWtBxMn/GgdmqZygND6iAfo1A== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a25:14c3:: with SMTP id 186mr28815290ybu.114.1598528983701; Thu, 27 Aug 2020 04:49:43 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:26 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-2-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 1/7] binfmt_elf_fdpic: Stop using dump_emit() on user pointers on !MMU From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: 3273D37604 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam01 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: dump_emit() is for kernel pointers, and VMAs describe userspace memory. Let's be tidy here and avoid accessing userspace pointers under KERNEL_DS, even if it probably doesn't matter much on !MMU systems - especially given that it looks like we can just use the same get_dump_page() as on MMU if we move it out of the CONFIG_MMU block. One small change we have to make in get_dump_page() is to use __get_user_pages_locked() instead of __get_user_pages(), since the latter doesn't exist on nommu. On mmu builds, __get_user_pages_locked() will just call __get_user_pages() for us. Signed-off-by: Jann Horn --- fs/binfmt_elf_fdpic.c | 8 ------ mm/gup.c | 57 +++++++++++++++++++++---------------------- 2 files changed, 28 insertions(+), 37 deletions(-) diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index 50f845702b92..a53f83830986 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1529,14 +1529,11 @@ static bool elf_fdpic_dump_segments(struct coredump_params *cprm) struct vm_area_struct *vma; for (vma = current->mm->mmap; vma; vma = vma->vm_next) { -#ifdef CONFIG_MMU unsigned long addr; -#endif if (!maydump(vma, cprm->mm_flags)) continue; -#ifdef CONFIG_MMU for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) { bool res; @@ -1552,11 +1549,6 @@ static bool elf_fdpic_dump_segments(struct coredump_params *cprm) if (!res) return false; } -#else - if (!dump_emit(cprm, (void *) vma->vm_start, - vma->vm_end - vma->vm_start)) - return false; -#endif } return true; } diff --git a/mm/gup.c b/mm/gup.c index ae096ea7583f..92519e5a44b3 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1495,35 +1495,6 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors) mmap_read_unlock(mm); return ret; /* 0 or negative error code */ } - -/** - * get_dump_page() - pin user page in memory while writing it to core dump - * @addr: user address - * - * Returns struct page pointer of user page pinned for dump, - * to be freed afterwards by put_page(). - * - * Returns NULL on any kind of failure - a hole must then be inserted into - * the corefile, to preserve alignment with its headers; and also returns - * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - - * allowing a hole to be left in the corefile to save diskspace. - * - * Called without mmap_lock, but after all other threads have been killed. - */ -#ifdef CONFIG_ELF_CORE -struct page *get_dump_page(unsigned long addr) -{ - struct vm_area_struct *vma; - struct page *page; - - if (__get_user_pages(current->mm, addr, 1, - FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma, - NULL) < 1) - return NULL; - flush_cache_page(vma, addr, page_to_pfn(page)); - return page; -} -#endif /* CONFIG_ELF_CORE */ #else /* CONFIG_MMU */ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start, unsigned long nr_pages, struct page **pages, @@ -1569,6 +1540,34 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start, } #endif /* !CONFIG_MMU */ +/** + * get_dump_page() - pin user page in memory while writing it to core dump + * @addr: user address + * + * Returns struct page pointer of user page pinned for dump, + * to be freed afterwards by put_page(). + * + * Returns NULL on any kind of failure - a hole must then be inserted into + * the corefile, to preserve alignment with its headers; and also returns + * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - + * allowing a hole to be left in the corefile to save diskspace. + * + * Called without mmap_lock, but after all other threads have been killed. + */ +#ifdef CONFIG_ELF_CORE +struct page *get_dump_page(unsigned long addr) +{ + struct vm_area_struct *vma; + struct page *page; + + if (__get_user_pages_locked(current->mm, addr, 1, &page, &vma, NULL, + FOLL_FORCE | FOLL_DUMP | FOLL_GET) < 1) + return NULL; + flush_cache_page(vma, addr, page_to_pfn(page)); + return page; +} +#endif /* CONFIG_ELF_CORE */ + #if defined(CONFIG_FS_DAX) || defined (CONFIG_CMA) static bool check_dax_vmas(struct vm_area_struct **vmas, long nr_pages) { From patchwork Thu Aug 27 11:49:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740585 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DA702722 for ; Thu, 27 Aug 2020 11:49:50 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A718322BF3 for ; Thu, 27 Aug 2020 11:49:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="EKUYAShS" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A718322BF3 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C150B8E0010; Thu, 27 Aug 2020 07:49:49 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B78038E0006; Thu, 27 Aug 2020 07:49:49 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A17578E0010; Thu, 27 Aug 2020 07:49:49 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0177.hostedemail.com [216.40.44.177]) by kanga.kvack.org (Postfix) with ESMTP id 8813C8E0006 for ; Thu, 27 Aug 2020 07:49:49 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 4CD4F2493 for ; Thu, 27 Aug 2020 11:49:49 +0000 (UTC) X-FDA: 77196179298.18.book20_38166422706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin18.hostedemail.com (Postfix) with ESMTP id 18CC7100EC675 for ; Thu, 27 Aug 2020 11:49:49 +0000 (UTC) X-Spam-Summary: 1,0,0,9a9910d3aabf1838,d41d8cd98f00b204,3251hxwukcdybsffzyggydw.ugedafmp-eecnsuc.gjy@flex--jannh.bounces.google.com,,RULES_HIT:41:69:152:355:379:541:800:960:973:988:989:1260:1277:1313:1314:1345:1359:1437:1516:1518:1534:1541:1593:1594:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3152:3352:3865:3867:3868:3872:3874:4321:5007:6261:6653:7903:9969:10004:10400:11658:11914:12043:12296:12297:12555:12895:13069:13311:13357:14096:14097:14181:14394:14659:14721:21080:21444:21627:21990:30054,0,RBL:209.85.218.74:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04yrp4875st6twp4ftbanftejqwh6opmh9stjrzsgx6no5der1j4eaxgo3aoeom.wgjyyf9s81oicmxw3kn9basdjncwmu8fpwoeay7w8rpckzwryq7f6yjqbbx891k.w-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: book20_38166422706c X-Filterd-Recvd-Size: 4193 Received: from mail-ej1-f74.google.com (mail-ej1-f74.google.com [209.85.218.74]) by imf10.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:49:48 +0000 (UTC) Received: by mail-ej1-f74.google.com with SMTP id fy10so2492932ejb.9 for ; Thu, 27 Aug 2020 04:49:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=nVHuYaY9UtfQ5ceDG+IOgNARKDklUVVN28+71pW1fmk=; b=EKUYAShSuw6vIw45vevo+Sy7e72QAp73T544NA5kbUcsd1iV/LN/8R2GxzPOpqGvdX pdndWsVd6Lr3QZFGe4mTnAYNnlEb32APpoNqKM8a2Y00vZHJs04X7m3Rg2VxyGHiu8WB 8/ObYeSz7+0b4daA5rCvUagkfihi0oU2h2O/4pWpaAnFNKDROaYdPONQnUqQY9kfrvGP uKFQPWrcbMbQHbVYb3WZx3sPPDGERlRi8cAKR52YtrWMhsNQhVHjm12Y5jeYEQeQQlti ugA2ZkmaIpOOKUnnVGqvAI0YlxcaE3BBSdIcXjisRZIOt2eBY+4cKnwSl5+8CX5Ld5vK EnZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=nVHuYaY9UtfQ5ceDG+IOgNARKDklUVVN28+71pW1fmk=; b=AvT2OEGCuoUrhuPDKQQNLpylVZWK6OsVNr/bV/dQwzDDTpvH15YZ/Y/9QvC912K5k1 bETveJVM47k7ohG+D7L/4N9My7KtXdd2fIp1W2uEtXqiCzHRoqOPxy0d4oxobcT1uD0M 4z0LiVbJD8qBy31bxfP9tb7chgyHnTxmpdfrGeCuN3dz3mLvH5J9vAvejpUq/RP2R/TY nTkPwwpJ44kx1j8BJUmEfYZot85cLhde9/Ck/G7Za33TPWWH1sli4jVuLUYF+ckCsh/s H/yqo2QRcrGwR3t/Qoo5xb6+ZUBhuVyiBiewj3bQXKWrWmdr2W0SvRxrzCLp9laAgQgb AbYg== X-Gm-Message-State: AOAM532ufu5L/fK+OD6x6QAZrjzf9vL17iDX5v+IA5r5ApcJpCKt1hG2 e6133dj4v4ANW6Nr48KSItf6U812sQ== X-Google-Smtp-Source: ABdhPJy2EioBHl5DQHnC+VEIiLoSJ394iMXy3ZaLk8kxlGOGKXykgo5D+wC/Tp3lDGd7qD4rKXOJxasNpA== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a17:906:dc03:: with SMTP id yy3mr21385084ejb.380.1598528987410; Thu, 27 Aug 2020 04:49:47 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:27 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-3-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 2/7] coredump: Let dump_emit() bail out on short writes From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: 18CC7100EC675 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam01 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: dump_emit() has a retry loop, but there seems to be no way for that retry logic to actually be used; and it was also buggy, writing the same data repeatedly after a short write. Let's just bail out on a short write. Suggested-by: Linus Torvalds Signed-off-by: Jann Horn --- fs/coredump.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/fs/coredump.c b/fs/coredump.c index 76e7c10edfc0..5e24c06092c9 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -840,17 +840,17 @@ int dump_emit(struct coredump_params *cprm, const void *addr, int nr) ssize_t n; if (cprm->written + nr > cprm->limit) return 0; - while (nr) { - if (dump_interrupted()) - return 0; - n = __kernel_write(file, addr, nr, &pos); - if (n <= 0) - return 0; - file->f_pos = pos; - cprm->written += n; - cprm->pos += n; - nr -= n; - } + + + if (dump_interrupted()) + return 0; + n = __kernel_write(file, addr, nr, &pos); + if (n != nr) + return 0; + file->f_pos = pos; + cprm->written += n; + cprm->pos += n; + return 1; } EXPORT_SYMBOL(dump_emit); From patchwork Thu Aug 27 11:49:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740587 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF6E114F6 for ; Thu, 27 Aug 2020 11:49:54 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7C9F222BF3 for ; Thu, 27 Aug 2020 11:49:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="mwLQkW2e" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7C9F222BF3 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9CA678E0011; Thu, 27 Aug 2020 07:49:53 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9549C8E0006; Thu, 27 Aug 2020 07:49:53 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7F64C8E0011; Thu, 27 Aug 2020 07:49:53 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0109.hostedemail.com [216.40.44.109]) by kanga.kvack.org (Postfix) with ESMTP id 662A58E0006 for ; Thu, 27 Aug 2020 07:49:53 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 2810A181AEF1A for ; Thu, 27 Aug 2020 11:49:53 +0000 (UTC) X-FDA: 77196179466.11.tub58_1c0ab282706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin11.hostedemail.com (Postfix) with ESMTP id D6A4C180F8B81 for ; Thu, 27 Aug 2020 11:49:52 +0000 (UTC) X-Spam-Summary: 1,0,0,66eaece4c109a92b,d41d8cd98f00b204,3351hxwukcdofwjjdckkcha.ykihejqt-iigrwyg.knc@flex--jannh.bounces.google.com,,RULES_HIT:41:69:152:355:379:541:800:960:973:988:989:1260:1277:1313:1314:1345:1359:1437:1516:1518:1535:1544:1593:1594:1711:1730:1747:1777:1792:2393:2559:2562:2898:3138:3139:3140:3141:3142:3152:3354:3865:3866:3867:3868:3872:3874:4118:4321:4605:5007:6261:6653:7974:9592:9969:10004:11026:11473:11658:11914:12043:12291:12294:12296:12297:12438:12555:12683:12895:12986:14096:14097:14110:14181:14394:14659:14721:21080:21444:21627:21990:30003:30054:30070,0,RBL:209.85.208.73:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-62.18.84.100 66.100.201.100;04yrnbs899rriqiydmfb1i159zekpyc394ch9k9ts5ufg9zkbfmmf9aq39er6kc.bqzm16sqznfpncbbqayeeusx9eqrkanjz7bbnzxfabe1b9cuydeebhyia3thfcp.4-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:no ne X-HE-Tag: tub58_1c0ab282706c X-Filterd-Recvd-Size: 7098 Received: from mail-ed1-f73.google.com (mail-ed1-f73.google.com [209.85.208.73]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:49:52 +0000 (UTC) Received: by mail-ed1-f73.google.com with SMTP id u11so1809605eds.23 for ; Thu, 27 Aug 2020 04:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=37TD1tOgbExZK4EAyJ2auNMattkG66IuBeoEm2InRhU=; b=mwLQkW2eu135cvu7FtBkDyZdvLbrR4ESlq2qN0k/6PXfOdym4GlJSHKkQgZRQIApvg TNU2TA7NTgAq8W1RjGrPU5cfU2v13XgwY1Y286YXZR1RIgU1rBATGkuwy3UnVnEfzVzS h9gBAOJscFv841RTN5OFqgz0ZDFn2mr100yymtqXnZ90QYDN1zdwYNj0w3mZsZbXRoJm ApvuTLxP25AdJ3uWDj7M1DUDci1qtgVkkFgUVcI2lcMwK3aGaOSzeLI4NpF3Su/JRTmH cP3DWhLkdT8DKJZ11vVYL16N1ngL13i5DCLHElAc6DNRzZ2vCnjXQVgEWJqe8SDAXd7p Xrpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=37TD1tOgbExZK4EAyJ2auNMattkG66IuBeoEm2InRhU=; b=DRWOu+vfoxOI8b7pVY+Y3QgKbRe+CTVgSWvSzuiKMq7iiCt0iUK6GUGpIUvZ07PPFD ZkF3vD1qPUqvEpxglVAhvV/CgDo5V+0GKxbsIthAZXxlvj/KLGNloemT+nDz70wwJFm+ AUe4QN9qMRtU7hRoZ5RgpyQBbRoHOvbq9yZL3SSLv2z+Dvdzf4cO70ZXci1n/9Obf3kt sb+URWOhF1VGcDTZkzmop7cBy/J3+ukbTTwK3Rv5khMNNFvpRWuEQ9DwZkTrHcRzPOZP eX7hhhLGkpIPWlHFMJnVJjhptZkGkOIGEh5Mg4Ko3GSWitSsEyZ8PuOINJiyKB1+RSeR LHlQ== X-Gm-Message-State: AOAM532qvKCo0RpE2v1g+NowdKANaA7yQcwMKFgbPf+8hwLYQUc6gt5L fbe6DsRgsmBeL2pYU8WbsVmwK1ZdmQ== X-Google-Smtp-Source: ABdhPJwLTnajnvbze0tWFXuS4/QQ3qxi89bpi+PbXi3ya8YRbpq6nxuMIwJcqnNQknEdlR+GAxHngz9mBw== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a17:906:c7cd:: with SMTP id dc13mr20150907ejb.446.1598528991398; Thu, 27 Aug 2020 04:49:51 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:28 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-4-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 3/7] coredump: Refactor page range dumping into common helper From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: D6A4C180F8B81 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam02 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Both fs/binfmt_elf.c and fs/binfmt_elf_fdpic.c need to dump ranges of pages into the coredump file. Extract that logic into a common helper. Signed-off-by: Jann Horn --- fs/binfmt_elf.c | 22 ++-------------------- fs/binfmt_elf_fdpic.c | 18 +++--------------- fs/coredump.c | 34 ++++++++++++++++++++++++++++++++++ include/linux/coredump.h | 2 ++ 4 files changed, 41 insertions(+), 35 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 13d053982dd7..5fd11a25d320 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -2419,26 +2419,8 @@ static int elf_core_dump(struct coredump_params *cprm) for (i = 0, vma = first_vma(current, gate_vma); vma != NULL; vma = next_vma(vma, gate_vma)) { - unsigned long addr; - unsigned long end; - - end = vma->vm_start + vma_filesz[i++]; - - for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) { - struct page *page; - int stop; - - page = get_dump_page(addr); - if (page) { - void *kaddr = kmap(page); - stop = !dump_emit(cprm, kaddr, PAGE_SIZE); - kunmap(page); - put_page(page); - } else - stop = !dump_skip(cprm, PAGE_SIZE); - if (stop) - goto end_coredump; - } + if (!dump_user_range(cprm, vma->vm_start, vma_filesz[i++])) + goto end_coredump; } dump_truncate(cprm); diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index a53f83830986..76e8c0defdc8 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1534,21 +1534,9 @@ static bool elf_fdpic_dump_segments(struct coredump_params *cprm) if (!maydump(vma, cprm->mm_flags)) continue; - for (addr = vma->vm_start; addr < vma->vm_end; - addr += PAGE_SIZE) { - bool res; - struct page *page = get_dump_page(addr); - if (page) { - void *kaddr = kmap(page); - res = dump_emit(cprm, kaddr, PAGE_SIZE); - kunmap(page); - put_page(page); - } else { - res = dump_skip(cprm, PAGE_SIZE); - } - if (!res) - return false; - } + if (!dump_user_range(cprm, vma->vm_start, + vma->vma_end - vma->vm_start)) + return false; } return true; } diff --git a/fs/coredump.c b/fs/coredump.c index 5e24c06092c9..6042d15acd51 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -876,6 +876,40 @@ int dump_skip(struct coredump_params *cprm, size_t nr) } EXPORT_SYMBOL(dump_skip); +#ifdef CONFIG_ELF_CORE +int dump_user_range(struct coredump_params *cprm, unsigned long start, + unsigned long len) +{ + unsigned long addr; + + for (addr = start; addr < start + len; addr += PAGE_SIZE) { + struct page *page; + int stop; + + /* + * To avoid having to allocate page tables for virtual address + * ranges that have never been used yet, and also to make it + * easy to generate sparse core files, use a helper that returns + * NULL when encountering an empty page table entry that would + * otherwise have been filled with the zero page. + */ + page = get_dump_page(addr); + if (page) { + void *kaddr = kmap(page); + + stop = !dump_emit(cprm, kaddr, PAGE_SIZE); + kunmap(page); + put_page(page); + } else { + stop = !dump_skip(cprm, PAGE_SIZE); + } + if (stop) + return 0; + } + return 1; +} +#endif + int dump_align(struct coredump_params *cprm, int align) { unsigned mod = cprm->pos & (align - 1); diff --git a/include/linux/coredump.h b/include/linux/coredump.h index 7a899e83835d..f0b71a74d0bc 100644 --- a/include/linux/coredump.h +++ b/include/linux/coredump.h @@ -16,6 +16,8 @@ extern int dump_skip(struct coredump_params *cprm, size_t nr); extern int dump_emit(struct coredump_params *cprm, const void *addr, int nr); extern int dump_align(struct coredump_params *cprm, int align); extern void dump_truncate(struct coredump_params *cprm); +int dump_user_range(struct coredump_params *cprm, unsigned long start, + unsigned long len); #ifdef CONFIG_COREDUMP extern void do_coredump(const kernel_siginfo_t *siginfo); #else From patchwork Thu Aug 27 11:49:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740589 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69A4614F6 for ; Thu, 27 Aug 2020 11:49:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1BC3922BF3 for ; Thu, 27 Aug 2020 11:49:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="pIw2Krxm" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1BC3922BF3 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 18EBF8E0012; Thu, 27 Aug 2020 07:49:57 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 117C18E0006; Thu, 27 Aug 2020 07:49:57 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F21448E0012; Thu, 27 Aug 2020 07:49:56 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0086.hostedemail.com [216.40.44.86]) by kanga.kvack.org (Postfix) with ESMTP id D27F28E0006 for ; Thu, 27 Aug 2020 07:49:56 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 91FB02465 for ; Thu, 27 Aug 2020 11:49:56 +0000 (UTC) X-FDA: 77196179592.19.paint79_510ffe72706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin19.hostedemail.com (Postfix) with ESMTP id 5DFF81AD1B2 for ; Thu, 27 Aug 2020 11:49:56 +0000 (UTC) X-Spam-Summary: 1,0,0,f0bbf860025aef62,d41d8cd98f00b204,34p1hxwukcd0izmmgfnnfkd.bnlkhmtw-lljuzbj.nqf@flex--jannh.bounces.google.com,,RULES_HIT:4:41:69:152:355:379:541:800:960:973:988:989:1260:1277:1313:1314:1345:1359:1437:1500:1516:1518:1593:1594:1605:1730:1747:1777:1792:1981:2194:2199:2380:2393:2553:2559:2562:2638:2898:2903:3138:3139:3140:3141:3142:3152:3622:3865:3866:3867:3868:3870:3871:3872:3874:4250:4321:4605:5007:6261:6653:7808:7875:7903:7974:8603:9592:9707:9969:10004:11026:11232:11473:11658:11914:12043:12291:12296:12297:12438:12555:12663:12683:12895:12986:13138:13161:13229:13231:13255:14096:14097:14394:14659:21080:21222:21324:21444:21451:21627:21740:21990:30012:30029:30051:30054:30070:30075:30090,0,RBL:209.85.222.201:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04yrdknb3fmnxgojeodbpocbdfgagoprqnf1awsnt5dw7jqsbionm6otns8pntj.8qhmcgw3oqe5epg685hoj7jh1dtcb97sy9oq66c87zhnuy4wd8igw4zxim5c957.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none, Bayesian X-HE-Tag: paint79_510ffe72706c X-Filterd-Recvd-Size: 15579 Received: from mail-qk1-f201.google.com (mail-qk1-f201.google.com [209.85.222.201]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:49:55 +0000 (UTC) Received: by mail-qk1-f201.google.com with SMTP id b76so4520966qkg.8 for ; Thu, 27 Aug 2020 04:49:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=E0DOGeEY62qLZ7uSRf9UfIpHkq96+GdwwPXcN18hEJ0=; b=pIw2KrxmoMV7baKQC5k3kTG9I/MArgQOkxCEyQg7puGJNupue/a7c2lTegkZWYpXVj MRoJDyBaISNiyxzjuMP2fAquo3bkxD+e3+sH+Yd1wAM03gSei7kjZIP/QrKrNYOCYXQK IBbYHcuB8qgRiist7/ix7HynPAXoB1j25Lww9igyKbfY12dDMb4gNWRmIxvFqnkL3c0b NBPxrFxAYADJVPolcWPSkobKgiUocNZEkcAIYX0Sk5a1Z1OPIe2DlKk/Rrw1qC/l3tML lkmyTkfxZ/Vd/6xkglPtQkPmQw3NhD0MMoGzP2oJC97CBNez/q+Q6jyD9yjvqJj41FFx nN9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=E0DOGeEY62qLZ7uSRf9UfIpHkq96+GdwwPXcN18hEJ0=; b=OaHwhSdzF4lrpd0IxG8enphrymFmNh18Q2JTdoA4YvJDBXvaDesztPKpYb+L42jCLX 7RkRGK1UqXq5+tbWDyyyhTk2DIyruKB9airjp+9xCkpv1yWWVjU/TFd5Vn8e23ZW6IK2 tZNyjQuK8gKpI0xcgIbPeTgZPV8cv2fQMywxSNOCl+ABtpqL0ExyokgHtIrSdlDX7861 FZmgQHEkiIABosbI4S3ipw4bpMnpdppWVzSz45Q8KriwK9KcUj/5da0zeAiSpAhBSGkW TyxImg/gKL5r39Jso41I2iz2AMnkYwNwp1Gg5F6Ciyonf4lmKpCDFOA5kOxoz+YdUoUm 8fRQ== X-Gm-Message-State: AOAM531iQU8XH8r1LnwNoNEOG9pB7qqyWClYvkGaPYj0T2L48/ubHaXm FFC/w8xbqWzEv1RIkGT1/Vz3gH/1uA== X-Google-Smtp-Source: ABdhPJx4bULUmnyYEEBJ3ZbFaInsMrqKtXp7IHbERfvfk/Yok8RrNFYczmlckohS0u+OxxH3yv4/+3jxKA== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a0c:f4cb:: with SMTP id o11mr18241148qvm.3.1598528994972; Thu, 27 Aug 2020 04:49:54 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:29 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-5-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 4/7] coredump: Rework elf/elf_fdpic vma_dump_size() into common helper From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: 5DFF81AD1B2 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam03 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly different code to figure out which VMAs should be dumped, and if so, whether the dump should contain the entire VMA or just its first page. Eliminate duplicate code by reworking the binfmt_elf version into a generic core dumping helper in coredump.c. As part of that, change the heuristic for detecting executable/library header pages to check whether the inode is executable instead of looking at the file mode. This is less problematic in terms of locking because it lets us avoid get_user() under the mmap_sem. (And arguably it looks nicer and makes more sense in generic code.) Adjust a little bit based on the binfmt_elf_fdpic version: ->anon_vma is only meaningful under CONFIG_MMU, otherwise we have to assume that the VMA has been written to. Suggested-by: Linus Torvalds Signed-off-by: Jann Horn --- fs/binfmt_elf.c | 120 --------------------------------------- fs/binfmt_elf_fdpic.c | 83 ++------------------------- fs/coredump.c | 101 ++++++++++++++++++++++++++++++++ include/linux/coredump.h | 1 + 4 files changed, 106 insertions(+), 199 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 5fd11a25d320..03478005a128 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1389,126 +1389,6 @@ static int load_elf_library(struct file *file) * Jeremy Fitzhardinge */ -/* - * The purpose of always_dump_vma() is to make sure that special kernel mappings - * that are useful for post-mortem analysis are included in every core dump. - * In that way we ensure that the core dump is fully interpretable later - * without matching up the same kernel and hardware config to see what PC values - * meant. These special mappings include - vDSO, vsyscall, and other - * architecture specific mappings - */ -static bool always_dump_vma(struct vm_area_struct *vma) -{ - /* Any vsyscall mappings? */ - if (vma == get_gate_vma(vma->vm_mm)) - return true; - - /* - * Assume that all vmas with a .name op should always be dumped. - * If this changes, a new vm_ops field can easily be added. - */ - if (vma->vm_ops && vma->vm_ops->name && vma->vm_ops->name(vma)) - return true; - - /* - * arch_vma_name() returns non-NULL for special architecture mappings, - * such as vDSO sections. - */ - if (arch_vma_name(vma)) - return true; - - return false; -} - -/* - * Decide what to dump of a segment, part, all or none. - */ -static unsigned long vma_dump_size(struct vm_area_struct *vma, - unsigned long mm_flags) -{ -#define FILTER(type) (mm_flags & (1UL << MMF_DUMP_##type)) - - /* always dump the vdso and vsyscall sections */ - if (always_dump_vma(vma)) - goto whole; - - if (vma->vm_flags & VM_DONTDUMP) - return 0; - - /* support for DAX */ - if (vma_is_dax(vma)) { - if ((vma->vm_flags & VM_SHARED) && FILTER(DAX_SHARED)) - goto whole; - if (!(vma->vm_flags & VM_SHARED) && FILTER(DAX_PRIVATE)) - goto whole; - return 0; - } - - /* Hugetlb memory check */ - if (is_vm_hugetlb_page(vma)) { - if ((vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_SHARED)) - goto whole; - if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE)) - goto whole; - return 0; - } - - /* Do not dump I/O mapped devices or special mappings */ - if (vma->vm_flags & VM_IO) - return 0; - - /* By default, dump shared memory if mapped from an anonymous file. */ - if (vma->vm_flags & VM_SHARED) { - if (file_inode(vma->vm_file)->i_nlink == 0 ? - FILTER(ANON_SHARED) : FILTER(MAPPED_SHARED)) - goto whole; - return 0; - } - - /* Dump segments that have been written to. */ - if (vma->anon_vma && FILTER(ANON_PRIVATE)) - goto whole; - if (vma->vm_file == NULL) - return 0; - - if (FILTER(MAPPED_PRIVATE)) - goto whole; - - /* - * If this looks like the beginning of a DSO or executable mapping, - * check for an ELF header. If we find one, dump the first page to - * aid in determining what was mapped here. - */ - if (FILTER(ELF_HEADERS) && - vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) { - u32 __user *header = (u32 __user *) vma->vm_start; - u32 word; - /* - * Doing it this way gets the constant folded by GCC. - */ - union { - u32 cmp; - char elfmag[SELFMAG]; - } magic; - BUILD_BUG_ON(SELFMAG != sizeof word); - magic.elfmag[EI_MAG0] = ELFMAG0; - magic.elfmag[EI_MAG1] = ELFMAG1; - magic.elfmag[EI_MAG2] = ELFMAG2; - magic.elfmag[EI_MAG3] = ELFMAG3; - if (unlikely(get_user(word, header))) - word = 0; - if (word == magic.cmp) - return PAGE_SIZE; - } - -#undef FILTER - - return 0; - -whole: - return vma->vm_end - vma->vm_start; -} - /* An ELF note in memory */ struct memelfnote { diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index 76e8c0defdc8..f531c6198864 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1215,76 +1215,6 @@ struct elf_prstatus_fdpic int pr_fpvalid; /* True if math co-processor being used. */ }; -/* - * Decide whether a segment is worth dumping; default is yes to be - * sure (missing info is worse than too much; etc). - * Personally I'd include everything, and use the coredump limit... - * - * I think we should skip something. But I am not sure how. H.J. - */ -static int maydump(struct vm_area_struct *vma, unsigned long mm_flags) -{ - int dump_ok; - - /* Do not dump I/O mapped devices or special mappings */ - if (vma->vm_flags & VM_IO) { - kdcore("%08lx: %08lx: no (IO)", vma->vm_start, vma->vm_flags); - return 0; - } - - /* If we may not read the contents, don't allow us to dump - * them either. "dump_write()" can't handle it anyway. - */ - if (!(vma->vm_flags & VM_READ)) { - kdcore("%08lx: %08lx: no (!read)", vma->vm_start, vma->vm_flags); - return 0; - } - - /* support for DAX */ - if (vma_is_dax(vma)) { - if (vma->vm_flags & VM_SHARED) { - dump_ok = test_bit(MMF_DUMP_DAX_SHARED, &mm_flags); - kdcore("%08lx: %08lx: %s (DAX shared)", vma->vm_start, - vma->vm_flags, dump_ok ? "yes" : "no"); - } else { - dump_ok = test_bit(MMF_DUMP_DAX_PRIVATE, &mm_flags); - kdcore("%08lx: %08lx: %s (DAX private)", vma->vm_start, - vma->vm_flags, dump_ok ? "yes" : "no"); - } - return dump_ok; - } - - /* By default, dump shared memory if mapped from an anonymous file. */ - if (vma->vm_flags & VM_SHARED) { - if (file_inode(vma->vm_file)->i_nlink == 0) { - dump_ok = test_bit(MMF_DUMP_ANON_SHARED, &mm_flags); - kdcore("%08lx: %08lx: %s (share)", vma->vm_start, - vma->vm_flags, dump_ok ? "yes" : "no"); - return dump_ok; - } - - dump_ok = test_bit(MMF_DUMP_MAPPED_SHARED, &mm_flags); - kdcore("%08lx: %08lx: %s (share)", vma->vm_start, - vma->vm_flags, dump_ok ? "yes" : "no"); - return dump_ok; - } - -#ifdef CONFIG_MMU - /* By default, if it hasn't been written to, don't write it out */ - if (!vma->anon_vma) { - dump_ok = test_bit(MMF_DUMP_MAPPED_PRIVATE, &mm_flags); - kdcore("%08lx: %08lx: %s (!anon)", vma->vm_start, - vma->vm_flags, dump_ok ? "yes" : "no"); - return dump_ok; - } -#endif - - dump_ok = test_bit(MMF_DUMP_ANON_PRIVATE, &mm_flags); - kdcore("%08lx: %08lx: %s", vma->vm_start, vma->vm_flags, - dump_ok ? "yes" : "no"); - return dump_ok; -} - /* An ELF note in memory */ struct memelfnote { @@ -1529,13 +1459,9 @@ static bool elf_fdpic_dump_segments(struct coredump_params *cprm) struct vm_area_struct *vma; for (vma = current->mm->mmap; vma; vma = vma->vm_next) { - unsigned long addr; - - if (!maydump(vma, cprm->mm_flags)) - continue; + unsigned long size = vma_dump_size(vma, cprm->mm_flags); - if (!dump_user_range(cprm, vma->vm_start, - vma->vma_end - vma->vm_start)) + if (!dump_user_range(cprm, vma->vm_start, size)) return false; } return true; @@ -1547,8 +1473,7 @@ static size_t elf_core_vma_data_size(unsigned long mm_flags) size_t size = 0; for (vma = current->mm->mmap; vma; vma = vma->vm_next) - if (maydump(vma, mm_flags)) - size += vma->vm_end - vma->vm_start; + size += vma_dump_size(vma, mm_flags); return size; } @@ -1694,7 +1619,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) phdr.p_offset = offset; phdr.p_vaddr = vma->vm_start; phdr.p_paddr = 0; - phdr.p_filesz = maydump(vma, cprm->mm_flags) ? sz : 0; + phdr.p_filesz = vma_dump_size(vma, cprm->mm_flags); phdr.p_memsz = sz; offset += phdr.p_filesz; phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0; diff --git a/fs/coredump.c b/fs/coredump.c index 6042d15acd51..4ef4c49a65b7 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -936,3 +936,104 @@ void dump_truncate(struct coredump_params *cprm) } } EXPORT_SYMBOL(dump_truncate); + +/* + * The purpose of always_dump_vma() is to make sure that special kernel mappings + * that are useful for post-mortem analysis are included in every core dump. + * In that way we ensure that the core dump is fully interpretable later + * without matching up the same kernel and hardware config to see what PC values + * meant. These special mappings include - vDSO, vsyscall, and other + * architecture specific mappings + */ +static bool always_dump_vma(struct vm_area_struct *vma) +{ + /* Any vsyscall mappings? */ + if (vma == get_gate_vma(vma->vm_mm)) + return true; + + /* + * Assume that all vmas with a .name op should always be dumped. + * If this changes, a new vm_ops field can easily be added. + */ + if (vma->vm_ops && vma->vm_ops->name && vma->vm_ops->name(vma)) + return true; + + /* + * arch_vma_name() returns non-NULL for special architecture mappings, + * such as vDSO sections. + */ + if (arch_vma_name(vma)) + return true; + + return false; +} + +/* + * Decide how much of @vma's contents should be included in a core dump. + */ +unsigned long vma_dump_size(struct vm_area_struct *vma, unsigned long mm_flags) +{ +#define FILTER(type) (mm_flags & (1UL << MMF_DUMP_##type)) + + /* always dump the vdso and vsyscall sections */ + if (always_dump_vma(vma)) + goto whole; + + if (vma->vm_flags & VM_DONTDUMP) + return 0; + + /* support for DAX */ + if (vma_is_dax(vma)) { + if ((vma->vm_flags & VM_SHARED) && FILTER(DAX_SHARED)) + goto whole; + if (!(vma->vm_flags & VM_SHARED) && FILTER(DAX_PRIVATE)) + goto whole; + return 0; + } + + /* Hugetlb memory check */ + if (is_vm_hugetlb_page(vma)) { + if ((vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_SHARED)) + goto whole; + if (!(vma->vm_flags & VM_SHARED) && FILTER(HUGETLB_PRIVATE)) + goto whole; + return 0; + } + + /* Do not dump I/O mapped devices or special mappings */ + if (vma->vm_flags & VM_IO) + return 0; + + /* By default, dump shared memory if mapped from an anonymous file. */ + if (vma->vm_flags & VM_SHARED) { + if (file_inode(vma->vm_file)->i_nlink == 0 ? + FILTER(ANON_SHARED) : FILTER(MAPPED_SHARED)) + goto whole; + return 0; + } + + /* Dump segments that have been written to. */ + if ((!IS_ENABLED(CONFIG_MMU) || vma->anon_vma) && FILTER(ANON_PRIVATE)) + goto whole; + if (vma->vm_file == NULL) + return 0; + + if (FILTER(MAPPED_PRIVATE)) + goto whole; + + /* + * If this is the beginning of an executable file mapping, + * dump the first page to aid in determining what was mapped here. + */ + if (FILTER(ELF_HEADERS) && + vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) && + (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0) + return PAGE_SIZE; + +#undef FILTER + + return 0; + +whole: + return vma->vm_end - vma->vm_start; +} diff --git a/include/linux/coredump.h b/include/linux/coredump.h index f0b71a74d0bc..bfecb8d79a7f 100644 --- a/include/linux/coredump.h +++ b/include/linux/coredump.h @@ -16,6 +16,7 @@ extern int dump_skip(struct coredump_params *cprm, size_t nr); extern int dump_emit(struct coredump_params *cprm, const void *addr, int nr); extern int dump_align(struct coredump_params *cprm, int align); extern void dump_truncate(struct coredump_params *cprm); +unsigned long vma_dump_size(struct vm_area_struct *vma, unsigned long mm_flags); int dump_user_range(struct coredump_params *cprm, unsigned long start, unsigned long len); #ifdef CONFIG_COREDUMP From patchwork Thu Aug 27 11:49:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740591 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 125A0913 for ; Thu, 27 Aug 2020 11:50:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AAA6D22CBE for ; Thu, 27 Aug 2020 11:50:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="RsIoKu7H" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AAA6D22CBE Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8E0488E0013; Thu, 27 Aug 2020 07:50:00 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 86A8B8E0006; Thu, 27 Aug 2020 07:50:00 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 70A7B8E0013; Thu, 27 Aug 2020 07:50:00 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0074.hostedemail.com [216.40.44.74]) by kanga.kvack.org (Postfix) with ESMTP id 4F1798E0006 for ; Thu, 27 Aug 2020 07:50:00 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 11855180AD815 for ; Thu, 27 Aug 2020 11:50:00 +0000 (UTC) X-FDA: 77196179760.09.time60_4017ffa2706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin09.hostedemail.com (Postfix) with ESMTP id E1140180AD811 for ; Thu, 27 Aug 2020 11:49:59 +0000 (UTC) X-Spam-Summary: 1,0,0,a7efc93f0b260642,d41d8cd98f00b204,35p1hxwukceemdqqkjrrjoh.frpolqx0-ppnydfn.ruj@flex--jannh.bounces.google.com,,RULES_HIT:4:41:69:152:355:379:541:800:960:966:973:988:989:1260:1277:1313:1314:1345:1359:1431:1437:1516:1518:1593:1594:1605:1730:1747:1777:1792:2196:2198:2199:2200:2393:2553:2559:2562:2693:2892:2898:3138:3139:3140:3141:3142:3152:3865:3866:3867:3868:3870:3871:3872:3873:3874:4250:4321:4385:4605:5007:6119:6261:6653:7875:7903:7974:8603:9010:9012:9592:9969:10004:10226:11026:11473:11658:11914:12043:12291:12294:12296:12297:12438:12555:12679:12683:12895:12986:13149:13161:13229:13230:13869:13972:14096:14097:14394:14659:21080:21220:21325:21444:21450:21451:21611:21627:21740:21939:21983:21987:21990:30005:30012:30034:30045:30054:30070:30075:30090,0,RBL:209.85.219.202:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04yfykhtrkfr33smorizmtcy7at6cych9ruaizw1ys59a9qgb68awt9px5615py.8u4woy13ik5j4wfn3raug33bqwias1bx4hoz4314t7hkynb9tn74 dmwhytpt X-HE-Tag: time60_4017ffa2706c X-Filterd-Recvd-Size: 19349 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:49:59 +0000 (UTC) Received: by mail-yb1-f202.google.com with SMTP id x6so6993440ybp.10 for ; Thu, 27 Aug 2020 04:49:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=ChG9OT8YpBy/wCXjoQcT7mcfe51RwJeE1qaxZBpgE38=; b=RsIoKu7HEtzc9XeEV23ejUsVaV3p8UmK+EwpIFL+HE1NwTMaR+TNkUTDPoDM4vO1ol GnAOgjIkY16Wfq9AM4ulWMMd4u+srziLQ3FZ80tRRY5bqSV4SEd2nILFG8h/pKzqKlOW biLLwesuG5kKNLdlvNx56XbAyPxHopJ2anheErk+et3xVsx25DHjW4wVoF1fMT8xX+0d gvzOfEeKy4UrSF6YUxrOiO6WxaOoEt5JXUXFwhPWDdQmU5M4PKXRTKqpOhIVsTbNvFCk DRSqhzIazVx39HPgVQx8FwZr8x4saxoU0DnIl3Y1LvRjEEqfvA6qn8+BxW0xGN6HuikU 8b+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=ChG9OT8YpBy/wCXjoQcT7mcfe51RwJeE1qaxZBpgE38=; b=lSGgnJPiAq9ktZkRZ7P8VHdIh1l2J0hoaxOw3wf376tqOycw96SAh9qOgtqLhQm3SK HZTyq17rgmfBnRyiEHJucAVFeDJnY8zHcpSUMJQY73f+qwCmcVgVvHwvC68FLXvx3zZY VpTw7uAOXzL+8CbqInpKjW5Yva9KiL9Jn1mp6iv1U9tWLySfOXMMD6Vnjpvjy5wblwmg SJFFH0HS690W5v1xaqUaoXrJC6YJoGRgQ6erAMb0uV2lsPUStLx8vuZOvIVc58kDGd3d e1JBrmaCrG3rwlMccnYb5eYsYu+Y9TSiaq3pXVGharuON7Ga6g+fACmxE+/4HjbN0Urs ziTg== X-Gm-Message-State: AOAM531vy56Vun/7MWBGIt7Dft+SCkwIgJfSMJ2egISWs2P8HLHDqWor +WPbX/eAI7Y4mhHdcoBbrwf8r4dbew== X-Google-Smtp-Source: ABdhPJwZD09uB5UfiGVB8q5Jr2M8er9OffaZKarHfoLl/X0H6SsrqCeMwjLDkIaGpXiSy6zG6H68lkvLHw== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a25:ad5a:: with SMTP id l26mr26698371ybe.510.1598528998652; Thu, 27 Aug 2020 04:49:58 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:30 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-6-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 5/7] binfmt_elf, binfmt_elf_fdpic: Use a VMA list snapshot From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: E1140180AD811 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam03 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In both binfmt_elf and binfmt_elf_fdpic, use a new helper dump_vma_snapshot() to take a snapshot of the VMA list (including the gate VMA, if we have one) while protected by the mmap_lock, and then use that snapshot instead of walking the VMA list without locking. An alternative approach would be to keep the mmap_lock held across the entire core dumping operation; however, keeping the mmap_lock locked while we may be blocked for an unbounded amount of time (e.g. because we're dumping to a FUSE filesystem or so) isn't really optimal; the mmap_lock blocks things like the ->release handler of userfaultfd, and we don't really want critical system daemons to grind to a halt just because someone "gifted" them SCM_RIGHTS to an eternally-locked userfaultfd, or something like that. Since both the normal ELF code and the FDPIC ELF code need this functionality (and if any other binfmt wants to add coredump support in the future, they'd probably need it, too), implement this with a common helper in fs/coredump.c. A downside of this approach is that we now need a bigger amount of kernel memory per userspace VMA in the normal ELF case, and that we need O(n) kernel memory in the FDPIC ELF case at all; but 40 bytes per VMA shouldn't be terribly bad. There currently is a data race between stack expansion and anything that reads ->vm_start or ->vm_end under the mmap_lock held in read mode; to mitigate that for core dumping, take the mmap_lock in write mode when taking a snapshot of the VMA hierarchy. (If we only took the mmap_lock in read mode, we could end up with a corrupted core dump if someone does get_user_pages_remote() concurrently. Not really a major problem, but taking the mmap_lock either way works here, so we might as well avoid the issue.) (This doesn't do anything about the existing data races with stack expansion in other mm code.) Signed-off-by: Jann Horn --- fs/binfmt_elf.c | 100 +++++++++------------------------------ fs/binfmt_elf_fdpic.c | 67 +++++++++++--------------- fs/coredump.c | 81 ++++++++++++++++++++++++++++++- include/linux/coredump.h | 10 +++- 4 files changed, 138 insertions(+), 120 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 03478005a128..40ec0b9b4b4f 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -2100,32 +2100,6 @@ static void free_note_info(struct elf_note_info *info) #endif -static struct vm_area_struct *first_vma(struct task_struct *tsk, - struct vm_area_struct *gate_vma) -{ - struct vm_area_struct *ret = tsk->mm->mmap; - - if (ret) - return ret; - return gate_vma; -} -/* - * Helper function for iterating across a vma list. It ensures that the caller - * will visit `gate_vma' prior to terminating the search. - */ -static struct vm_area_struct *next_vma(struct vm_area_struct *this_vma, - struct vm_area_struct *gate_vma) -{ - struct vm_area_struct *ret; - - ret = this_vma->vm_next; - if (ret) - return ret; - if (this_vma == gate_vma) - return NULL; - return gate_vma; -} - static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum, elf_addr_t e_shoff, int segs) { @@ -2152,9 +2126,8 @@ static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum, static int elf_core_dump(struct coredump_params *cprm) { int has_dumped = 0; - int segs, i; - size_t vma_data_size = 0; - struct vm_area_struct *vma, *gate_vma; + int vma_count, segs, i; + size_t vma_data_size; struct elfhdr elf; loff_t offset = 0, dataoff; struct elf_note_info info = { }; @@ -2162,30 +2135,16 @@ static int elf_core_dump(struct coredump_params *cprm) struct elf_shdr *shdr4extnum = NULL; Elf_Half e_phnum; elf_addr_t e_shoff; - elf_addr_t *vma_filesz = NULL; + struct core_vma_metadata *vma_meta; + + if (dump_vma_snapshot(cprm, &vma_count, &vma_meta, &vma_data_size)) + return 0; - /* - * We no longer stop all VM operations. - * - * This is because those proceses that could possibly change map_count - * or the mmap / vma pages are now blocked in do_exit on current - * finishing this core dump. - * - * Only ptrace can touch these memory addresses, but it doesn't change - * the map_count or the pages allocated. So no possibility of crashing - * exists while dumping the mm->vm_next areas to the core file. - */ - /* * The number of segs are recored into ELF header as 16bit value. * Please check DEFAULT_MAX_MAP_COUNT definition when you modify here. */ - segs = current->mm->map_count; - segs += elf_core_extra_phdrs(); - - gate_vma = get_gate_vma(current->mm); - if (gate_vma != NULL) - segs++; + segs = vma_count + elf_core_extra_phdrs(); /* for notes section */ segs++; @@ -2223,24 +2182,6 @@ static int elf_core_dump(struct coredump_params *cprm) dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE); - /* - * Zero vma process will get ZERO_SIZE_PTR here. - * Let coredump continue for register state at least. - */ - vma_filesz = kvmalloc(array_size(sizeof(*vma_filesz), (segs - 1)), - GFP_KERNEL); - if (!vma_filesz) - goto end_coredump; - - for (i = 0, vma = first_vma(current, gate_vma); vma != NULL; - vma = next_vma(vma, gate_vma)) { - unsigned long dump_size; - - dump_size = vma_dump_size(vma, cprm->mm_flags); - vma_filesz[i++] = dump_size; - vma_data_size += dump_size; - } - offset += vma_data_size; offset += elf_core_extra_data_size(); e_shoff = offset; @@ -2261,21 +2202,23 @@ static int elf_core_dump(struct coredump_params *cprm) goto end_coredump; /* Write program headers for segments dump */ - for (i = 0, vma = first_vma(current, gate_vma); vma != NULL; - vma = next_vma(vma, gate_vma)) { + for (i = 0; i < vma_count; i++) { + struct core_vma_metadata *meta = vma_meta + i; struct elf_phdr phdr; phdr.p_type = PT_LOAD; phdr.p_offset = offset; - phdr.p_vaddr = vma->vm_start; + phdr.p_vaddr = meta->start; phdr.p_paddr = 0; - phdr.p_filesz = vma_filesz[i++]; - phdr.p_memsz = vma->vm_end - vma->vm_start; + phdr.p_filesz = meta->dump_size; + phdr.p_memsz = meta->end - meta->start; offset += phdr.p_filesz; - phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0; - if (vma->vm_flags & VM_WRITE) + phdr.p_flags = 0; + if (meta->flags & VM_READ) + phdr.p_flags |= PF_R; + if (meta->flags & VM_WRITE) phdr.p_flags |= PF_W; - if (vma->vm_flags & VM_EXEC) + if (meta->flags & VM_EXEC) phdr.p_flags |= PF_X; phdr.p_align = ELF_EXEC_PAGESIZE; @@ -2297,9 +2240,10 @@ static int elf_core_dump(struct coredump_params *cprm) if (!dump_skip(cprm, dataoff - cprm->pos)) goto end_coredump; - for (i = 0, vma = first_vma(current, gate_vma); vma != NULL; - vma = next_vma(vma, gate_vma)) { - if (!dump_user_range(cprm, vma->vm_start, vma_filesz[i++])) + for (i = 0; i < vma_count; i++) { + struct core_vma_metadata *meta = vma_meta + i; + + if (!dump_user_range(cprm, meta->start, meta->dump_size)) goto end_coredump; } dump_truncate(cprm); @@ -2315,7 +2259,7 @@ static int elf_core_dump(struct coredump_params *cprm) end_coredump: free_note_info(&info); kfree(shdr4extnum); - kvfree(vma_filesz); + kvfree(vma_meta); kfree(phdr4note); return has_dumped; } diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index f531c6198864..be4062b8ba75 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1454,29 +1454,21 @@ static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum, /* * dump the segments for an MMU process */ -static bool elf_fdpic_dump_segments(struct coredump_params *cprm) +static bool elf_fdpic_dump_segments(struct coredump_params *cprm, + struct core_vma_metadata *vma_meta, + int vma_count) { - struct vm_area_struct *vma; + int i; - for (vma = current->mm->mmap; vma; vma = vma->vm_next) { - unsigned long size = vma_dump_size(vma, cprm->mm_flags); + for (i = 0; i < vma_count; i++) { + struct core_vma_metadata *meta = vma_meta + i; - if (!dump_user_range(cprm, vma->vm_start, size)) + if (!dump_user_range(cprm, meta->start, meta->dump_size)) return false; } return true; } -static size_t elf_core_vma_data_size(unsigned long mm_flags) -{ - struct vm_area_struct *vma; - size_t size = 0; - - for (vma = current->mm->mmap; vma; vma = vma->vm_next) - size += vma_dump_size(vma, mm_flags); - return size; -} - /* * Actual dumper * @@ -1487,9 +1479,8 @@ static size_t elf_core_vma_data_size(unsigned long mm_flags) static int elf_fdpic_core_dump(struct coredump_params *cprm) { int has_dumped = 0; - int segs; + int vma_count, segs; int i; - struct vm_area_struct *vma; struct elfhdr *elf = NULL; loff_t offset = 0, dataoff; struct memelfnote psinfo_note, auxv_note; @@ -1503,18 +1494,8 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) elf_addr_t e_shoff; struct core_thread *ct; struct elf_thread_status *tmp; - - /* - * We no longer stop all VM operations. - * - * This is because those proceses that could possibly change map_count - * or the mmap / vma pages are now blocked in do_exit on current - * finishing this core dump. - * - * Only ptrace can touch these memory addresses, but it doesn't change - * the map_count or the pages allocated. So no possibility of crashing - * exists while dumping the mm->vm_next areas to the core file. - */ + struct core_vma_metadata *vma_meta = NULL; + size_t vma_data_size; /* alloc memory for large data structures: too large to be on stack */ elf = kmalloc(sizeof(*elf), GFP_KERNEL); @@ -1524,6 +1505,9 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) if (!psinfo) goto end_coredump; + if (dump_vma_snapshot(cprm, &vma_count, &vma_meta, &vma_data_size)) + goto end_coredump; + for (ct = current->mm->core_state->dumper.next; ct; ct = ct->next) { tmp = elf_dump_thread_status(cprm->siginfo->si_signo, @@ -1543,8 +1527,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) tmp->next = thread_list; thread_list = tmp; - segs = current->mm->map_count; - segs += elf_core_extra_phdrs(); + segs = vma_count + elf_core_extra_phdrs(); /* for notes section */ segs++; @@ -1589,7 +1572,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) /* Page-align dumped data */ dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE); - offset += elf_core_vma_data_size(cprm->mm_flags); + offset += vma_data_size; offset += elf_core_extra_data_size(); e_shoff = offset; @@ -1609,23 +1592,26 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) goto end_coredump; /* write program headers for segments dump */ - for (vma = current->mm->mmap; vma; vma = vma->vm_next) { + for (i = 0; i < vma_count; i++) { + struct core_vma_metadata *meta = vma_meta + i; struct elf_phdr phdr; size_t sz; - sz = vma->vm_end - vma->vm_start; + sz = meta->end - meta->start; phdr.p_type = PT_LOAD; phdr.p_offset = offset; - phdr.p_vaddr = vma->vm_start; + phdr.p_vaddr = meta->start; phdr.p_paddr = 0; - phdr.p_filesz = vma_dump_size(vma, cprm->mm_flags); + phdr.p_filesz = meta->dump_size; phdr.p_memsz = sz; offset += phdr.p_filesz; - phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0; - if (vma->vm_flags & VM_WRITE) + phdr.p_flags = 0; + if (meta->flags & VM_READ) + phdr.p_flags |= PF_R; + if (meta->flags & VM_WRITE) phdr.p_flags |= PF_W; - if (vma->vm_flags & VM_EXEC) + if (meta->flags & VM_EXEC) phdr.p_flags |= PF_X; phdr.p_align = ELF_EXEC_PAGESIZE; @@ -1657,7 +1643,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) if (!dump_skip(cprm, dataoff - cprm->pos)) goto end_coredump; - if (!elf_fdpic_dump_segments(cprm)) + if (!elf_fdpic_dump_segments(cprm, vma_meta, vma_count)) goto end_coredump; if (!elf_core_write_extra_data(cprm)) @@ -1681,6 +1667,7 @@ static int elf_fdpic_core_dump(struct coredump_params *cprm) thread_list = thread_list->next; kfree(tmp); } + kvfree(vma_meta); kfree(phdr4note); kfree(elf); kfree(psinfo); diff --git a/fs/coredump.c b/fs/coredump.c index 4ef4c49a65b7..0cd9056d79cc 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -971,7 +971,8 @@ static bool always_dump_vma(struct vm_area_struct *vma) /* * Decide how much of @vma's contents should be included in a core dump. */ -unsigned long vma_dump_size(struct vm_area_struct *vma, unsigned long mm_flags) +static unsigned long vma_dump_size(struct vm_area_struct *vma, + unsigned long mm_flags) { #define FILTER(type) (mm_flags & (1UL << MMF_DUMP_##type)) @@ -1037,3 +1038,81 @@ unsigned long vma_dump_size(struct vm_area_struct *vma, unsigned long mm_flags) whole: return vma->vm_end - vma->vm_start; } + +static struct vm_area_struct *first_vma(struct task_struct *tsk, + struct vm_area_struct *gate_vma) +{ + struct vm_area_struct *ret = tsk->mm->mmap; + + if (ret) + return ret; + return gate_vma; +} + +/* + * Helper function for iterating across a vma list. It ensures that the caller + * will visit `gate_vma' prior to terminating the search. + */ +static struct vm_area_struct *next_vma(struct vm_area_struct *this_vma, + struct vm_area_struct *gate_vma) +{ + struct vm_area_struct *ret; + + ret = this_vma->vm_next; + if (ret) + return ret; + if (this_vma == gate_vma) + return NULL; + return gate_vma; +} + +/* + * Under the mmap_lock, take a snapshot of relevant information about the task's + * VMAs. + */ +int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count, + struct core_vma_metadata **vma_meta, + size_t *vma_data_size_ptr) +{ + struct vm_area_struct *vma, *gate_vma; + struct mm_struct *mm = current->mm; + int i; + size_t vma_data_size = 0; + + /* + * Once the stack expansion code is fixed to not change VMA bounds + * under mmap_lock in read mode, this can be changed to take the + * mmap_lock in read mode. + */ + if (mmap_write_lock_killable(mm)) + return -EINTR; + + gate_vma = get_gate_vma(mm); + *vma_count = mm->map_count + (gate_vma ? 1 : 0); + + *vma_meta = kvmalloc_array(*vma_count, sizeof(**vma_meta), GFP_KERNEL); + if (!*vma_meta) { + mmap_write_unlock(mm); + return -ENOMEM; + } + + for (i = 0, vma = first_vma(current, gate_vma); vma != NULL; + vma = next_vma(vma, gate_vma), i++) { + struct core_vma_metadata *m = (*vma_meta) + i; + + m->start = vma->vm_start; + m->end = vma->vm_end; + m->flags = vma->vm_flags; + m->dump_size = vma_dump_size(vma, cprm->mm_flags); + + vma_data_size += m->dump_size; + } + + mmap_write_unlock(mm); + + if (WARN_ON(i != *vma_count)) + return -EFAULT; + + *vma_data_size_ptr = vma_data_size; + return 0; +} diff --git a/include/linux/coredump.h b/include/linux/coredump.h index bfecb8d79a7f..e58e8c207782 100644 --- a/include/linux/coredump.h +++ b/include/linux/coredump.h @@ -7,6 +7,12 @@ #include #include +struct core_vma_metadata { + unsigned long start, end; + unsigned long flags; + unsigned long dump_size; +}; + /* * These are the only things you should do on a core-file: use only these * functions to write out all the necessary info. @@ -16,9 +22,11 @@ extern int dump_skip(struct coredump_params *cprm, size_t nr); extern int dump_emit(struct coredump_params *cprm, const void *addr, int nr); extern int dump_align(struct coredump_params *cprm, int align); extern void dump_truncate(struct coredump_params *cprm); -unsigned long vma_dump_size(struct vm_area_struct *vma, unsigned long mm_flags); int dump_user_range(struct coredump_params *cprm, unsigned long start, unsigned long len); +int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count, + struct core_vma_metadata **vma_meta, + size_t *vma_data_size_ptr); #ifdef CONFIG_COREDUMP extern void do_coredump(const kernel_siginfo_t *siginfo); #else From patchwork Thu Aug 27 11:49:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740593 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03AF114F6 for ; Thu, 27 Aug 2020 11:50:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C514C2177B for ; Thu, 27 Aug 2020 11:50:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="qAl3aYBu" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C514C2177B Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BC5548E0014; Thu, 27 Aug 2020 07:50:04 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B4F8F8E0006; Thu, 27 Aug 2020 07:50:04 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A177A8E0014; Thu, 27 Aug 2020 07:50:04 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id 86CB28E0006 for ; Thu, 27 Aug 2020 07:50:04 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 413EA181AEF0B for ; Thu, 27 Aug 2020 11:50:04 +0000 (UTC) X-FDA: 77196179928.19.flesh49_000aae32706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin19.hostedemail.com (Postfix) with ESMTP id 105AA1AD1B2 for ; Thu, 27 Aug 2020 11:50:04 +0000 (UTC) X-Spam-Summary: 1,0,0,f580c51989b9901e,d41d8cd98f00b204,36p1hxwukceuqhuuonvvnsl.jvtspu14-ttr2hjr.vyn@flex--jannh.bounces.google.com,,RULES_HIT:41:152:355:379:541:800:960:973:981:988:989:1260:1277:1313:1314:1345:1359:1437:1516:1518:1534:1541:1593:1594:1711:1730:1747:1777:1792:2393:2559:2562:2693:3138:3139:3140:3141:3142:3152:3353:3865:3867:3868:3870:3872:4250:4321:4605:5007:6261:6653:7903:9969:10004:10400:11026:11233:11473:11658:11914:12043:12296:12297:12438:12555:12895:13069:13311:13357:14096:14097:14181:14394:14659:14721:21080:21444:21627:21987:21990:30012:30054:30070,0,RBL:209.85.208.74:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.84.100;04yfqxuqzsfpxpkecstkqmcx4w67kyctzfmuur85an99ewp6frqgw8ojhxtwgig.qmo1sck9z4jb7ktm9z978knmg4xsh1sah4tmskjet5kc1obe58u6nw7i7ygcgap.c-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: flesh49_000aae32706c X-Filterd-Recvd-Size: 4632 Received: from mail-ed1-f74.google.com (mail-ed1-f74.google.com [209.85.208.74]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:50:03 +0000 (UTC) Received: by mail-ed1-f74.google.com with SMTP id z11so1831568edj.3 for ; Thu, 27 Aug 2020 04:50:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=7aZgefibZrTCs6XQP3SSXIsvAQ9RHGTwgc0UHLaNuJc=; b=qAl3aYBuljMXDgGHSGVDp+hgxe+gqGFmAWy4rRnhiga/eisuFw16UE3XkV1gF06/rw N49bdbuxV6EwXiDzAvesC/DoZmDX3L/5rN1/EUfXfknzL6Yq25+xirgAm6bTrx1qdzlg 48kPTyb1jWCxPQ5QRpbpvOLyB4xMZj9SQ33HjhROU6Y4JbC/ffa6Ppz1ujabxx0i/W7p biZ/pGZTE2ifWHW9M2HQ1aj7T23aompUwxKttebsptCS0lMr62sq7+TFDJi0bPS0vKEs U0e7G5uYqwdIfHjhgM/r/9A/u1lUQ1eGs8hqvXJNCAjlDAVw1eHVdErrdoNQ6173LzqE ujeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=7aZgefibZrTCs6XQP3SSXIsvAQ9RHGTwgc0UHLaNuJc=; b=b+OcKe1NTQ3VcTyjcSEQuDsxlL5W1i5X1SJ147gxAxB8BNCf+fBEArpCZJLiEw0EYa dp86hBoMBwzZOUQIl0tcsjVlwA1e4FE6Zn2NVyM1VXYq0HVfo3/rkfWlJgHHqQ8lPVRR Prx4QDFy6q3RNUn7ZfV73uXz7Xj/bM9pls2M4O9VGYD7SDAC0+qrGfa5zqNjuTnwUfqc ab9ZwZMZ364T6wz3GoEAkAOvS4bXKT8+HehXhfOtHHEpbwmEPz4sc6JoFHokDmJzirzT OWnLH2ckfqmimOdzO46hybn59xfr9zLLb/g6P6aXUD+Geewf392jwQ+4iHZINSGLp6Iw iBhA== X-Gm-Message-State: AOAM531VNPlFUxxeXbQKUqcyLyxFuMkqyflF3jGrRuvoKg1Jl3ebdmtG G9mVJc0ZFGbqlPXa2t4MZxLAmG+ALw== X-Google-Smtp-Source: ABdhPJwhBCyiHrz/+IMDKbZ52GDgcHe+j5akrd2gdlCfhjevzWrep4IKiEZ8xF0Uw2rMa5in+uH4gw5rKg== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a17:906:c792:: with SMTP id cw18mr12340246ejb.285.1598529002516; Thu, 27 Aug 2020 04:50:02 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:31 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-7-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 6/7] mm/gup: Take mmap_lock in get_dump_page() From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: 105AA1AD1B2 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Properly take the mmap_lock before calling into the GUP code from get_dump_page(); and play nice, allowing the GUP code to drop the mmap_lock if it has to sleep. As Linus pointed out, we don't actually need the VMA because __get_user_pages() will flush the dcache for us if necessary. Signed-off-by: Jann Horn --- mm/gup.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 92519e5a44b3..bd0f7311c5c6 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1552,19 +1552,23 @@ static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start, * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - * allowing a hole to be left in the corefile to save diskspace. * - * Called without mmap_lock, but after all other threads have been killed. + * Called without mmap_lock (takes and releases the mmap_lock by itself). */ #ifdef CONFIG_ELF_CORE struct page *get_dump_page(unsigned long addr) { - struct vm_area_struct *vma; + struct mm_struct *mm = current->mm; struct page *page; + int locked = 1; + int ret; - if (__get_user_pages_locked(current->mm, addr, 1, &page, &vma, NULL, - FOLL_FORCE | FOLL_DUMP | FOLL_GET) < 1) + if (mmap_read_lock_killable(mm)) return NULL; - flush_cache_page(vma, addr, page_to_pfn(page)); - return page; + ret = __get_user_pages_locked(mm, addr, 1, &page, NULL, &locked, + FOLL_FORCE | FOLL_DUMP | FOLL_GET); + if (locked) + mmap_read_unlock(mm); + return (ret == 1) ? page : NULL; } #endif /* CONFIG_ELF_CORE */ From patchwork Thu Aug 27 11:49:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jann Horn X-Patchwork-Id: 11740595 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8DCE614F6 for ; Thu, 27 Aug 2020 11:50:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4D2A12177B for ; Thu, 27 Aug 2020 11:50:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="g0HxF1iq" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4D2A12177B Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 229A78E0015; Thu, 27 Aug 2020 07:50:08 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 1B37C8E0006; Thu, 27 Aug 2020 07:50:08 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 07A3E8E0015; Thu, 27 Aug 2020 07:50:08 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0253.hostedemail.com [216.40.44.253]) by kanga.kvack.org (Postfix) with ESMTP id DF4E28E0006 for ; Thu, 27 Aug 2020 07:50:07 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id A21B21263 for ; Thu, 27 Aug 2020 11:50:07 +0000 (UTC) X-FDA: 77196180054.24.cakes28_1909a492706c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin24.hostedemail.com (Postfix) with ESMTP id 624881A4A5 for ; Thu, 27 Aug 2020 11:50:07 +0000 (UTC) X-Spam-Summary: 1,0,0,8c9c4c1dddae034a,d41d8cd98f00b204,37p1hxwukcekulyysrzzrwp.nzxwty58-xxv6lnv.z2r@flex--jannh.bounces.google.com,,RULES_HIT:1:41:69:152:355:379:541:800:960:966:968:973:988:989:1260:1277:1313:1314:1345:1359:1437:1516:1518:1593:1594:1605:1730:1747:1777:1792:1801:2194:2196:2199:2200:2393:2553:2559:2562:2636:2693:2736:2898:3138:3139:3140:3141:3142:3152:3865:3866:3867:3868:3870:3871:3872:3874:4250:4321:4385:4605:5007:6119:6261:6653:7903:8660:9010:9592:9969:10004:11026:11232:11473:11658:11914:12043:12296:12297:12438:12555:12895:12986:13148:13149:13161:13229:13230:13972:14096:14097:14394:14659:21080:21324:21433:21444:21451:21627:21939:21987:21990:30003:30012:30051:30054:30070:30090,0,RBL:209.85.219.202:@flex--jannh.bounces.google.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04yrkp6tebjcrurroyjmu7twte1w7ypqspkks9r9gy7x31d1te1c6z3jdzamefd.proko8q1b6fex9cca7szg8x19iwdq6phzmnih5sqy33tbhhrzdoknhofiaq6ax9.1-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian: 0.5,0.5, X-HE-Tag: cakes28_1909a492706c X-Filterd-Recvd-Size: 12986 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf20.hostedemail.com (Postfix) with ESMTP for ; Thu, 27 Aug 2020 11:50:06 +0000 (UTC) Received: by mail-yb1-f202.google.com with SMTP id a75so7003377ybg.15 for ; Thu, 27 Aug 2020 04:50:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=cVuWDkLHwa5/2KrhblG6pdzJasoQug9C22emcf5s8N8=; b=g0HxF1iqNkTG/iZ1NuqSDpSdtIFu/5LVjPFCvlEvPLfFc8Nt0Y9LUm/EDoP7OChPOn C5ScdaXQLX26RsmBnPBcgDGY1m9qNmtIkNvlA5DrglwYMX82L3zyjd8Fdzh29+iUvB27 BtGIl7ot7ZYiNlDygiqTmUalK/KQEBFOQLJ9cSG87Qeqm2Ixo/7XNv8xW300Din64vmz ZTtYOU5XGTP45neB/5x11Ia7kADtKZzy5kFigDbuwSL/n+cFWzuwtPPCh7/zHSMUoUg+ cEHxVAl3z9dhn5D6PTGY465up4NSdYwPDPX7B75uGRxdpC8rE6ySBOlHfjDmokL6hDcs azWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=cVuWDkLHwa5/2KrhblG6pdzJasoQug9C22emcf5s8N8=; b=TnUTADeqIn0P3x7ciLqJDB1yPUNm9vhV/AU3y6Hm5sNETZIl5TrwvgYKTeEmMtb8zn vttz3aSJlcEmsKjyAuBwf+UyeF11BkJkuzRQFBNOKGbMUs6Hz0d2qqxu9UXdwMTMNe0v 88gKLUGhKzbLMJGGFPxKCfgOFpjnzxXPNaih74aoPihEh5ZW8BGaLO7I0Cg9E5nATc4S KiOjE4oTGCy1VpFYJz+N0vShj31Kg+nwnguWbZ4nnQNvDb7KpLJxKGNRpKumde6H/Jx+ 0hvvOKEXCZk/3FEEnpcgl9NFCVHT0pb8aBXnsImvm2Vsf4xKF0agMRScsRn0FXrw89N/ lYEQ== X-Gm-Message-State: AOAM531GBTWndDSo4xkeRbgJoi+64YkB85bZxWZqJaX6qXoryIEJZJTT o/crAJ3mYF0mH09s+9tHDFw1BYwCng== X-Google-Smtp-Source: ABdhPJyMvXDO/j8HxgHB0/l9iea2k6+d1+5jncQNCmzp/jZuuFGmOCLYN1CMoCdszB/WupNt0Y317Ks0yQ== X-Received: from jannh2.zrh.corp.google.com ([2a00:79e0:1b:201:1a60:24ff:fea6:bf44]) (user=jannh job=sendgmr) by 2002:a25:5807:: with SMTP id m7mr8209201ybb.456.1598529006101; Thu, 27 Aug 2020 04:50:06 -0700 (PDT) Date: Thu, 27 Aug 2020 13:49:32 +0200 In-Reply-To: <20200827114932.3572699-1-jannh@google.com> Message-Id: <20200827114932.3572699-8-jannh@google.com> Mime-Version: 1.0 References: <20200827114932.3572699-1-jannh@google.com> X-Mailer: git-send-email 2.28.0.297.g1956fa8f8d-goog Subject: [PATCH v5 7/7] mm: Remove the now-unnecessary mmget_still_valid() hack From: Jann Horn To: Andrew Morton Cc: Linus Torvalds , Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Eric W . Biederman" , Oleg Nesterov X-Rspamd-Queue-Id: 624881A4A5 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: Jann Horn --- drivers/infiniband/core/uverbs_main.c | 3 --- drivers/vfio/pci/vfio_pci.c | 38 +++++++++++++-------------- fs/proc/task_mmu.c | 18 ------------- fs/userfaultfd.c | 28 +++++++------------- include/linux/sched/mm.h | 25 ------------------ mm/khugepaged.c | 2 +- mm/madvise.c | 17 ------------ mm/mmap.c | 5 +--- 8 files changed, 29 insertions(+), 107 deletions(-) diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 37794d88b1f3..a4ba0b87d6de 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -845,8 +845,6 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile) * will only be one mm, so no big deal. */ mmap_read_lock(mm); - if (!mmget_still_valid(mm)) - goto skip_mm; mutex_lock(&ufile->umap_lock); list_for_each_entry_safe (priv, next_priv, &ufile->umaps, list) { @@ -865,7 +863,6 @@ void uverbs_user_mmap_disassociate(struct ib_uverbs_file *ufile) } } mutex_unlock(&ufile->umap_lock); - skip_mm: mmap_read_unlock(mm); mmput(mm); } diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 620465c2a1da..27f11cc7ba6c 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -1480,31 +1480,29 @@ static int vfio_pci_zap_and_vma_lock(struct vfio_pci_device *vdev, bool try) } else { mmap_read_lock(mm); } - if (mmget_still_valid(mm)) { - if (try) { - if (!mutex_trylock(&vdev->vma_lock)) { - mmap_read_unlock(mm); - mmput(mm); - return 0; - } - } else { - mutex_lock(&vdev->vma_lock); + if (try) { + if (!mutex_trylock(&vdev->vma_lock)) { + mmap_read_unlock(mm); + mmput(mm); + return 0; } - list_for_each_entry_safe(mmap_vma, tmp, - &vdev->vma_list, vma_next) { - struct vm_area_struct *vma = mmap_vma->vma; + } else { + mutex_lock(&vdev->vma_lock); + } + list_for_each_entry_safe(mmap_vma, tmp, + &vdev->vma_list, vma_next) { + struct vm_area_struct *vma = mmap_vma->vma; - if (vma->vm_mm != mm) - continue; + if (vma->vm_mm != mm) + continue; - list_del(&mmap_vma->vma_next); - kfree(mmap_vma); + list_del(&mmap_vma->vma_next); + kfree(mmap_vma); - zap_vma_ptes(vma, vma->vm_start, - vma->vm_end - vma->vm_start); - } - mutex_unlock(&vdev->vma_lock); + zap_vma_ptes(vma, vma->vm_start, + vma->vm_end - vma->vm_start); } + mutex_unlock(&vdev->vma_lock); mmap_read_unlock(mm); mmput(mm); } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 5066b0251ed8..c43490aec95d 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1168,24 +1168,6 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf, count = -EINTR; goto out_mm; } - /* - * Avoid to modify vma->vm_flags - * without locked ops while the - * coredump reads the vm_flags. - */ - if (!mmget_still_valid(mm)) { - /* - * Silently return "count" - * like if get_task_mm() - * failed. FIXME: should this - * function have returned - * -ESRCH if get_task_mm() - * failed like if - * get_proc_task() fails? - */ - mmap_write_unlock(mm); - goto out_mm; - } for (vma = mm->mmap; vma; vma = vma->vm_next) { vma->vm_flags &= ~VM_SOFTDIRTY; vma_set_page_prot(vma); diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 0e4a3837da52..000b457ad087 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -601,8 +601,6 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx, /* the various vma->vm_userfaultfd_ctx still points to it */ mmap_write_lock(mm); - /* no task can run (and in turn coredump) yet */ - VM_WARN_ON(!mmget_still_valid(mm)); for (vma = mm->mmap; vma; vma = vma->vm_next) if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) { vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; @@ -842,7 +840,6 @@ static int userfaultfd_release(struct inode *inode, struct file *file) /* len == 0 means wake all */ struct userfaultfd_wake_range range = { .len = 0, }; unsigned long new_flags; - bool still_valid; WRITE_ONCE(ctx->released, true); @@ -858,7 +855,6 @@ static int userfaultfd_release(struct inode *inode, struct file *file) * taking the mmap_lock for writing. */ mmap_write_lock(mm); - still_valid = mmget_still_valid(mm); prev = NULL; for (vma = mm->mmap; vma; vma = vma->vm_next) { cond_resched(); @@ -869,17 +865,15 @@ static int userfaultfd_release(struct inode *inode, struct file *file) continue; } new_flags = vma->vm_flags & ~(VM_UFFD_MISSING | VM_UFFD_WP); - if (still_valid) { - prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end, - new_flags, vma->anon_vma, - vma->vm_file, vma->vm_pgoff, - vma_policy(vma), - NULL_VM_UFFD_CTX); - if (prev) - vma = prev; - else - prev = vma; - } + prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end, + new_flags, vma->anon_vma, + vma->vm_file, vma->vm_pgoff, + vma_policy(vma), + NULL_VM_UFFD_CTX); + if (prev) + vma = prev; + else + prev = vma; vma->vm_flags = new_flags; vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; } @@ -1309,8 +1303,6 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, goto out; mmap_write_lock(mm); - if (!mmget_still_valid(mm)) - goto out_unlock; vma = find_vma_prev(mm, start, &prev); if (!vma) goto out_unlock; @@ -1511,8 +1503,6 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, goto out; mmap_write_lock(mm); - if (!mmget_still_valid(mm)) - goto out_unlock; vma = find_vma_prev(mm, start, &prev); if (!vma) goto out_unlock; diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index f889e332912f..e9cd1e637d76 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -49,31 +49,6 @@ static inline void mmdrop(struct mm_struct *mm) __mmdrop(mm); } -/* - * This has to be called after a get_task_mm()/mmget_not_zero() - * followed by taking the mmap_lock for writing before modifying the - * vmas or anything the coredump pretends not to change from under it. - * - * It also has to be called when mmgrab() is used in the context of - * the process, but then the mm_count refcount is transferred outside - * the context of the process to run down_write() on that pinned mm. - * - * NOTE: find_extend_vma() called from GUP context is the only place - * that can modify the "mm" (notably the vm_start/end) under mmap_lock - * for reading and outside the context of the process, so it is also - * the only case that holds the mmap_lock for reading that must call - * this function. Generally if the mmap_lock is hold for reading - * there's no need of this check after get_task_mm()/mmget_not_zero(). - * - * This function can be obsoleted and the check can be removed, after - * the coredump code will hold the mmap_lock for writing before - * invoking the ->core_dump methods. - */ -static inline bool mmget_still_valid(struct mm_struct *mm) -{ - return likely(!mm->core_state); -} - /** * mmget() - Pin the address space associated with a &struct mm_struct. * @mm: The address space to pin. diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 15a9af791014..101b636c72b5 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -431,7 +431,7 @@ static void insert_to_mm_slots_hash(struct mm_struct *mm, static inline int khugepaged_test_exit(struct mm_struct *mm) { - return atomic_read(&mm->mm_users) == 0 || !mmget_still_valid(mm); + return atomic_read(&mm->mm_users) == 0; } static bool hugepage_vma_check(struct vm_area_struct *vma, diff --git a/mm/madvise.c b/mm/madvise.c index dd1d43cf026d..d5b33d9011f0 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1091,23 +1091,6 @@ int do_madvise(unsigned long start, size_t len_in, int behavior) if (write) { if (mmap_write_lock_killable(current->mm)) return -EINTR; - - /* - * We may have stolen the mm from another process - * that is undergoing core dumping. - * - * Right now that's io_ring, in the future it may - * be remote process management and not "current" - * at all. - * - * We need to fix core dumping to not do this, - * but for now we have the mmget_still_valid() - * model. - */ - if (!mmget_still_valid(current->mm)) { - mmap_write_unlock(current->mm); - return -EINTR; - } } else { mmap_read_lock(current->mm); } diff --git a/mm/mmap.c b/mm/mmap.c index 40248d84ad5f..c47abe460439 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2552,7 +2552,7 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr) if (vma && (vma->vm_start <= addr)) return vma; /* don't alter vm_end if the coredump is running */ - if (!prev || !mmget_still_valid(mm) || expand_stack(prev, addr)) + if (!prev || expand_stack(prev, addr)) return NULL; if (prev->vm_flags & VM_LOCKED) populate_vma_page_range(prev, addr, prev->vm_end, NULL); @@ -2578,9 +2578,6 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr) return vma; if (!(vma->vm_flags & VM_GROWSDOWN)) return NULL; - /* don't alter vm_start if the coredump is running */ - if (!mmget_still_valid(mm)) - return NULL; start = vma->vm_start; if (expand_stack(vma, addr)) return NULL;