From patchwork Sat Jul 8 19:12:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13305704
Date: Sat, 8 Jul 2023 12:12:12 -0700
In-Reply-To: <20230708191212.4147700-1-surenb@google.com>
References: <20230708191212.4147700-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.390.g38632f3daf-goog
Message-ID: <20230708191212.4147700-3-surenb@google.com>
Subject: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking
From: Suren Baghdasaryan
To: torvalds@linux-foundation.org
Cc: akpm@linux-foundation.org, regressions@leemhuis.info,
 bagasdotme@gmail.com, jacobly.alt@gmail.com, willy@infradead.org,
 liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
 ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
 gregkh@linuxfoundation.org, regressions@lists.linux.dev,
 Suren Baghdasaryan, Jiri Slaby, Holger Hoffstätte,
 stable@vger.kernel.org

When forking a child process, the parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
The parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because it is COW-shared with the child), this might lead to
some stale writable TLB entries targeting the wrong (old) page.
A similar issue happened in the past with userfaultfd (see the
flush_tlb_page() call inside do_wp_page()).

Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during the fork operation and avoids this issue.

This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show a noticeable regression on a 56-core machine, while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression.
If such a fork-time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore fork performance. Further optimizations
are possible if this regression proves to be problematic.

Suggested-by: David Hildenbrand
Reported-by: Jiri Slaby
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan
---
 kernel/fork.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index b85814e614a5..d2e12b6d2b18 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
 
+		vma_start_write(mpnt);
 		if (mpnt->vm_flags & VM_DONTCOPY) {
 			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
 			continue;
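
The stress test mentioned in the changelog (10000 VMAs, 5000 forks in a
tight loop) is not part of this posting. The following is only a minimal
user-space sketch of what such a test might look like, not the program
that produced the ~5% figure. The helper names map_vmas() and fork_loop(),
the FORK_STRESS_MAIN build guard, and the trick of alternating page
protections to keep adjacent VMAs from merging are all assumptions made
for illustration.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `count` distinct VMAs.
 * Adjacent anonymous mappings with identical protections can be merged
 * by the kernel, so map one region and make every odd page read-only.
 * Returns `count` on success, -1 on failure (the mapping is not undone).
 */
static long map_vmas(size_t count)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t page, i;
	char *base;

	if (ps <= 0 || count == 0)
		return -1;
	page = (size_t)ps;

	base = mmap(NULL, count * page, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	for (i = 0; i < count; i++) {
		base[i * page] = 1;	/* populate the page so fork copies a PTE */
		if ((i & 1) && mprotect(base + i * page, page, PROT_READ))
			return -1;
	}
	return (long)count;
}

/*
 * Hypothetical helper: fork `iterations` times; each child exits at once
 * and the parent reaps it before forking again.
 * Returns 0 if every fork and wait succeeded, -1 otherwise.
 */
static int fork_loop(int iterations)
{
	int i;

	for (i = 0; i < iterations; i++) {
		int status;
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0)
			_exit(0);
		if (waitpid(pid, &status, 0) != pid)
			return -1;
		if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
			return -1;
	}
	return 0;
}

#ifdef FORK_STRESS_MAIN
/* Build with -DFORK_STRESS_MAIN to get a standalone timing program. */
int main(void)
{
	struct timespec t0, t1;

	if (map_vmas(10000) < 0 || clock_gettime(CLOCK_MONOTONIC, &t0))
		return 1;
	if (fork_loop(5000) || clock_gettime(CLOCK_MONOTONIC, &t1))
		return 1;
	printf("5000 forks: %.3f s\n",
	       (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9);
	return 0;
}
#endif
```

Running the same binary on kernels built with and without
CONFIG_PER_VMA_LOCK would give a rough comparison of fork time. The
absolute numbers depend on the machine and are not expected to match the
figure quoted above.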