From patchwork Thu Oct 12 17:04:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13419481
From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton
Cc: Mike Kravetz, Muchun Song, Alexander Viro, Christian Brauner,
 Matthew Wilcox, Hugh Dickins, Andy Lutomirski, Jan Kara,
 linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org, Lorenzo Stoakes
Subject: [PATCH v4 0/3] permit write-sealed memfd read-only shared mappings
Date: Thu, 12 Oct 2023 18:04:27 +0100
X-Mailer: git-send-email 2.42.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

The man page for fcntl() describing memfd file seals states the following
about F_SEAL_WRITE:-

    Furthermore, trying to create new shared, writable memory-mappings via
    mmap(2) will also fail with EPERM.

With emphasis on 'writable'.

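The documented behaviour can be probed from userspace. What follows is only a
minimal sketch and not part of this series: probe_write_sealed_mmap() is a
hypothetical helper name, and it assumes a glibc recent enough to expose
memfd_create() and the F_SEAL_* constants under _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): create a one-page memfd, apply
 * F_SEAL_WRITE, then attempt mmap() with the given protection and flags.
 *
 * Returns 0 if the mapping succeeded, the errno from mmap() if it failed,
 * or -1 if the sealed memfd could not be set up in the first place.
 */
static int probe_write_sealed_mmap(int prot, int flags)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd, ret = 0;
	void *p;

	assert(page > 0);

	/* Sealing is only permitted if requested at creation time. */
	fd = memfd_create("seal-probe", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	/* Size the file first, then forbid any further writes to it. */
	if (ftruncate(fd, page) < 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, page, prot, flags, fd, 0);
	if (p == MAP_FAILED)
		ret = errno;	/* capture before close() can clobber it */
	else
		munmap(p, page);

	close(fd);
	return ret;
}
```

The call of interest is probe_write_sealed_mmap(PROT_READ, MAP_SHARED): going
by the man page wording it should return 0, since the mapping is shared but
not writable. Passing PROT_READ | PROT_WRITE with MAP_SHARED is expected to
yield EPERM.
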
It turns out that the kernel currently disallows all new shared memory
mappings for a memfd with F_SEAL_WRITE applied, rendering this documentation
inaccurate.

This matters because users are unable to obtain any shared mapping of a memfd
once it has been write-sealed, which limits the usefulness of write-sealed
memfds. This was reported in the discussion thread [1] originating from a bug
report [2].

This is a product of both using the struct address_space->i_mmap_writable
atomic counter to determine whether writing may be permitted, and the kernel
adjusting this counter when any VM_SHARED mapping is performed and, more
generally, implicitly assuming that VM_SHARED implies writable.

It seems sensible that we should only update this counter if VM_MAYWRITE is
specified, i.e. if it is possible that this mapping could at any point be
written to.

If we do so then all we need to do to permit write seals to function as
documented is to clear VM_MAYWRITE when mapping read-only. It turns out this
functionality already exists for F_SEAL_FUTURE_WRITE - we can therefore simply
adapt this logic to do the same for F_SEAL_WRITE.

We then hit a chicken and egg situation in mmap_region() where the check for
VM_MAYWRITE occurs before we are able to clear this flag. To work around this,
perform this check after we invoke call_mmap(), with careful consideration of
error paths.

Thanks to Andy Lutomirski for the suggestion!

[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238

v4:
- Revert to performing the writable check _after_ the call_mmap() invocation,
  as the only impact should be on internal mm checks rather than on
  call_mmap(), as suggested by Jan Kara.
- Additionally, fix up the error handling paths, which previously caused an
  i915 test failure by erroneously double-decrementing the i_mmap_writable
  counter. We do this by tracking whether we have in fact marked the mapping
  writable. This is also based on Jan's feedback.

v3:
- Don't defer the writable check until after call_mmap() in case this breaks
  f_ops->mmap() callbacks which assume this has been done first. Instead,
  separate the check and enforcement of it across the call, allowing for it to
  change vma->vm_flags in the meanwhile.
- Improve/correct commit messages and comments throughout.
https://lore.kernel.org/all/cover.1696709413.git.lstoakes@gmail.com

v2:
- Removed RFC tag.
- Correct incorrect goto pointed out by Jan.
- Reworded cover letter as suggested by Jan.
https://lore.kernel.org/all/cover.1682890156.git.lstoakes@gmail.com/

v1:
https://lore.kernel.org/all/cover.1680560277.git.lstoakes@gmail.com/

Lorenzo Stoakes (3):
  mm: drop the assumption that VM_SHARED always implies writable
  mm: update memfd seal write check to include F_SEAL_WRITE
  mm: perform the mapping_map_writable() check after call_mmap()

 fs/hugetlbfs/inode.c |  2 +-
 include/linux/fs.h   |  4 ++--
 include/linux/mm.h   | 26 +++++++++++++++++++-------
 kernel/fork.c        |  2 +-
 mm/filemap.c         |  2 +-
 mm/madvise.c         |  2 +-
 mm/mmap.c            | 27 ++++++++++++++++-----------
 mm/shmem.c           |  2 +-
 8 files changed, 42 insertions(+), 25 deletions(-)

---
2.42.0