From patchwork Tue Oct 1 22:52:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13818830
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, peterz@infradead.org, oleg@redhat.com
Cc: rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
 willy@infradead.org, surenb@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, mjguzik@gmail.com, brauner@kernel.org, jannh@google.com,
 mhocko@kernel.org, vbabka@suse.cz, mingo@kernel.org, Andrii Nakryiko
Subject: [PATCH v2 tip/perf/core 1/5] mm: introduce mmap_lock_speculation_{start|end}
Date: Tue, 1 Oct 2024 15:52:03 -0700
Message-ID: <20241001225207.2215639-2-andrii@kernel.org>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20241001225207.2215639-1-andrii@kernel.org>
References: <20241001225207.2215639-1-andrii@kernel.org>

From: Suren Baghdasaryan

Add helper functions to speculatively perform operations without
read-locking mmap_lock, expecting that mmap_lock will not be
write-locked and mm is not modified from under us.

Suggested-by: Peter Zijlstra
Signed-off-by: Suren Baghdasaryan
Signed-off-by: Andrii Nakryiko
Link: https://lore.kernel.org/bpf/20240912210222.186542-1-surenb@google.com
---
 include/linux/mm_types.h  |  3 ++
 include/linux/mmap_lock.h | 72 ++++++++++++++++++++++++++++++++-------
 kernel/fork.c             |  3 --
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..d5e3f907eea4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -876,6 +876,9 @@ struct mm_struct {
 		 * Roughly speaking, incrementing the sequence number is
 		 * equivalent to releasing locks on VMAs; reading the sequence
 		 * number can be part of taking a read lock on a VMA.
+		 * Incremented every time mmap_lock is write-locked/unlocked.
+		 * Initialized to 0, therefore odd values indicate mmap_lock
+		 * is write-locked and even values that it's released.
 		 *
 		 * Can be modified under write mmap_lock using RELEASE
 		 * semantics.
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index de9dc20b01ba..9d23635bc701 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -71,39 +71,84 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
 }
 
 #ifdef CONFIG_PER_VMA_LOCK
+static inline void init_mm_lock_seq(struct mm_struct *mm)
+{
+	mm->mm_lock_seq = 0;
+}
+
 /*
- * Drop all currently-held per-VMA locks.
- * This is called from the mmap_lock implementation directly before releasing
- * a write-locked mmap_lock (or downgrading it to read-locked).
- * This should normally NOT be called manually from other places.
- * If you want to call this manually anyway, keep in mind that this will release
- * *all* VMA write locks, including ones from further up the stack.
+ * Increment mm->mm_lock_seq when mmap_lock is write-locked (ACQUIRE semantics)
+ * or write-unlocked (RELEASE semantics).
  */
-static inline void vma_end_write_all(struct mm_struct *mm)
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire)
 {
 	mmap_assert_write_locked(mm);
 	/*
 	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
 	 * mmap_lock being held.
-	 * We need RELEASE semantics here to ensure that preceding stores into
-	 * the VMA take effect before we unlock it with this store.
-	 * Pairs with ACQUIRE semantics in vma_start_read().
 	 */
-	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+
+	if (acquire) {
+		WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+		/*
+		 * For ACQUIRE semantics we should ensure no following stores are
+		 * reordered to appear before the mm->mm_lock_seq modification.
+		 */
+		smp_wmb();
+	} else {
+		/*
+		 * We need RELEASE semantics here to ensure that preceding stores
+		 * into the VMA take effect before we unlock it with this store.
+		 * Pairs with ACQUIRE semantics in vma_start_read().
+		 */
+		smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	}
+}
+
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
+{
+	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
+	*seq = smp_load_acquire(&mm->mm_lock_seq);
+	/* Allow speculation if mmap_lock is not write-locked */
+	return (*seq & 1) == 0;
+}
+
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
+{
+	/* Pairs with ACQUIRE semantics in inc_mm_lock_seq(). */
+	smp_rmb();
+	return seq == READ_ONCE(mm->mm_lock_seq);
 }
+
 #else
-static inline void vma_end_write_all(struct mm_struct *mm) {}
+static inline void init_mm_lock_seq(struct mm_struct *mm) {}
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire) {}
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
 #endif
 
+/*
+ * Drop all currently-held per-VMA locks.
+ * This is called from the mmap_lock implementation directly before releasing
+ * a write-locked mmap_lock (or downgrading it to read-locked).
+ * This should NOT be called manually from other places.
+ */
+static inline void vma_end_write_all(struct mm_struct *mm)
+{
+	inc_mm_lock_seq(mm, false);
+}
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
+	init_mm_lock_seq(mm);
 }
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -111,6 +156,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -120,6 +166,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 
 	__mmap_lock_trace_start_locking(mm, true);
 	ret = down_write_killable(&mm->mmap_lock);
+	if (!ret)
+		inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
 	return ret;
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index 18bdc87209d0..c44b71d354ee 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1259,9 +1259,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	seqcount_init(&mm->write_protect_seq);
 	mmap_init_lock(mm);
 	INIT_LIST_HEAD(&mm->mmlist);
-#ifdef CONFIG_PER_VMA_LOCK
-	mm->mm_lock_seq = 0;
-#endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;
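
For readers who want to see the odd/even protocol in isolation, here is a
rough userspace sketch of how a caller might pair
mmap_lock_speculation_start() with mmap_lock_speculation_end(). It is not
kernel code: struct model_mm and every model_* function are hypothetical
stand-ins invented for this illustration, C11 atomics and fences only
approximate smp_wmb()/smp_rmb()/smp_load_acquire()/smp_store_release(), and
the rwsem itself is not modelled, so nothing here excludes concurrent
writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of mm_struct that matter here. */
struct model_mm {
	atomic_int lock_seq;	/* models mm->mm_lock_seq */
	atomic_int data;	/* models state changed under the write lock */
};

static void model_init(struct model_mm *mm)
{
	atomic_init(&mm->lock_seq, 0);	/* even: not write-locked */
	atomic_init(&mm->data, 0);
}

/* Models inc_mm_lock_seq(mm, true); the real rwsem is not modelled. */
static void model_write_lock(struct model_mm *mm)
{
	int seq = atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* this model does not support nesting */
	atomic_store_explicit(&mm->lock_seq, seq + 1, memory_order_relaxed);
	/* Approximates smp_wmb(): later stores stay after the seq bump. */
	atomic_thread_fence(memory_order_release);
}

/* Models inc_mm_lock_seq(mm, false), as called via vma_end_write_all(). */
static void model_write_unlock(struct model_mm *mm)
{
	int seq = atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);

	assert(seq & 1);	/* must currently be "write-locked" */
	atomic_store_explicit(&mm->lock_seq, seq + 1, memory_order_release);
}

static bool model_speculation_start(struct model_mm *mm, int *seq)
{
	*seq = atomic_load_explicit(&mm->lock_seq, memory_order_acquire);
	return (*seq & 1) == 0;	/* odd means a writer holds the lock */
}

static bool model_speculation_end(struct model_mm *mm, int seq)
{
	/* Approximates smp_rmb(): earlier loads stay before the re-check. */
	atomic_thread_fence(memory_order_acquire);
	return seq == atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);
}

/*
 * Speculative read: returns true and fills *out only if no writer was
 * active at the start and none took the lock before the end check.
 * On false the caller is expected to fall back to a locked path.
 */
static bool model_read_speculative(struct model_mm *mm, int *out)
{
	int seq, val;

	if (!model_speculation_start(mm, &seq))
		return false;
	val = atomic_load_explicit(&mm->data, memory_order_relaxed);
	if (!model_speculation_end(mm, seq))
		return false;
	*out = val;
	return true;
}
```

The point of the sketch is the failure path: a value read between start and
end must be discarded whenever the end check fails, because a writer may
have modified it in the meantime.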