From patchwork Thu Aug 27 11:49:25 2020
X-Patchwork-Submitter: Jann Horn
X-Patchwork-Id: 11740581
Date: Thu, 27 Aug 2020 13:49:25 +0200
Message-Id: <20200827114932.3572699-1-jannh@google.com>
Subject: [PATCH v5 0/7] Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there
From: Jann Horn
To: Andrew Morton
Cc: Linus Torvalds, Christoph Hellwig, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Alexander Viro,
    Eric W. Biederman, Oleg Nesterov

new in v5:
 - patches 1-3 and 6 are unchanged
 - added patch 4: rework vma_dump_size() into a common helper (Linus)
 - added patch 7: actually do the mmget_still_valid() removal (Linus)
 - for now, let dump_vma_snapshot() take the mmap_lock in write mode
   instead of read mode to avoid the data race with stack expansion

new in v4:
 - simplify patch 4/5 by replacing the heuristic for dumping the first
   pages of ELF mappings with what Linus suggested

At the moment, we have that rather ugly mmget_still_valid() helper as a
workaround: ELF core dumping doesn't take the mmap_lock while traversing
the task's VMAs, and if anything (like userfaultfd) then remotely messes
with the VMA tree, fireworks ensue. So at the moment we use
mmget_still_valid() to bail out in any writers that might be operating
on a remote mm's VMAs.

With this series, I'm trying to get rid of the need for that as cleanly
as possible. ("cleanly" meaning "avoid holding the mmap_lock across
unbounded sleeps".)

Patches 1, 2, 3 and 4 are relatively unrelated cleanups in the core
dumping code.

Patches 5 and 6 implement the main change: Instead of repeatedly
accessing the VMA list with sleeps in between, we snapshot it at the
start with proper locking, and then later we just use our copy of the
VMA list. This ensures that the kernel won't crash, that VMA metadata in
the coredump is consistent even in the presence of concurrent
modifications, and that any virtual addresses that aren't being
concurrently modified have their contents show up in the core dump
properly.

The disadvantage of this approach is that we need a bit more memory
during core dumping for storing metadata about all VMAs.


At the end of the series, patch 7 removes the old workaround for this
issue (mmget_still_valid()).

I have tested:

 - Creating a simple core dump on X86-64 still works.
 - The created coredump on X86-64 opens in GDB and looks plausible.
 - X86-64 core dumps contain the first page for executable mappings at
   offset 0, and don't contain the first page for non-executable file
   mappings or executable mappings at offset != 0.
 - NOMMU 32-bit ARM can still generate plausible-looking core dumps
   through the FDPIC implementation. (I can't test this with GDB because
   GDB is missing some structure definition for nommu ARM, but I've
   poked around in the hexdump and it looked decent.)

Jann Horn (7):
  binfmt_elf_fdpic: Stop using dump_emit() on user pointers on !MMU
  coredump: Let dump_emit() bail out on short writes
  coredump: Refactor page range dumping into common helper
  coredump: Rework elf/elf_fdpic vma_dump_size() into common helper
  binfmt_elf, binfmt_elf_fdpic: Use a VMA list snapshot
  mm/gup: Take mmap_lock in get_dump_page()
  mm: Remove the now-unnecessary mmget_still_valid() hack

 drivers/infiniband/core/uverbs_main.c |   3 -
 drivers/vfio/pci/vfio_pci.c           |  38 ++--
 fs/binfmt_elf.c                       | 238 +++-----------------------
 fs/binfmt_elf_fdpic.c                 | 162 +++---------------
 fs/coredump.c                         | 236 +++++++++++++++++++++++--
 fs/proc/task_mmu.c                    |  18 --
 fs/userfaultfd.c                      |  28 +--
 include/linux/coredump.h              |  11 ++
 include/linux/sched/mm.h              |  25 ---
 mm/gup.c                              |  61 +++----
 mm/khugepaged.c                       |   2 +-
 mm/madvise.c                          |  17 --
 mm/mmap.c                             |   5 +-
 13 files changed, 346 insertions(+), 498 deletions(-)

base-commit: 06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd