X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13831110
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org
Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
 willy@infradead.org, surenb@google.com, akpm@linux-foundation.org,
 mjguzik@gmail.com, brauner@kernel.org, jannh@google.com, mhocko@kernel.org,
 vbabka@suse.cz, shakeel.butt@linux.dev, hannes@cmpxchg.org,
 Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, Andrii Nakryiko
Subject: [PATCH v3 tip/perf/core 1/4] mm: introduce mmap_lock_speculation_{start|end}
Date: Thu, 10 Oct 2024 13:56:41 -0700
Message-ID: <20241010205644.3831427-2-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-1-andrii@kernel.org>
References: <20241010205644.3831427-1-andrii@kernel.org>

From: Suren Baghdasaryan

Add helper functions to speculatively perform operations without
read-locking mmap_lock, expecting that mmap_lock will not be
write-locked and mm is not modified from under us.

Suggested-by: Peter Zijlstra
Signed-off-by: Suren Baghdasaryan
Signed-off-by: Andrii Nakryiko
Link: https://lore.kernel.org/bpf/20240912210222.186542-1-surenb@google.com
Reviewed-by: Shakeel Butt
---
 include/linux/mm_types.h  |  3 ++
 include/linux/mmap_lock.h | 72 ++++++++++++++++++++++++++++++++-------
 kernel/fork.c             |  3 --
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..5d8cdebd42bc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -887,6 +887,9 @@ struct mm_struct {
 		 * Roughly speaking, incrementing the sequence number is
 		 * equivalent to releasing locks on VMAs; reading the sequence
 		 * number can be part of taking a read lock on a VMA.
+		 * Incremented every time mmap_lock is write-locked/unlocked.
+		 * Initialized to 0, therefore odd values indicate mmap_lock
+		 * is write-locked and even values that it's released.
 		 *
 		 * Can be modified under write mmap_lock using RELEASE
 		 * semantics.
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index de9dc20b01ba..9d23635bc701 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -71,39 +71,84 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
 }
 
 #ifdef CONFIG_PER_VMA_LOCK
+static inline void init_mm_lock_seq(struct mm_struct *mm)
+{
+	mm->mm_lock_seq = 0;
+}
+
 /*
- * Drop all currently-held per-VMA locks.
- * This is called from the mmap_lock implementation directly before releasing
- * a write-locked mmap_lock (or downgrading it to read-locked).
- * This should normally NOT be called manually from other places.
- * If you want to call this manually anyway, keep in mind that this will release
- * *all* VMA write locks, including ones from further up the stack.
+ * Increment mm->mm_lock_seq when mmap_lock is write-locked (ACQUIRE semantics)
+ * or write-unlocked (RELEASE semantics).
  */
-static inline void vma_end_write_all(struct mm_struct *mm)
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire)
 {
 	mmap_assert_write_locked(mm);
 	/*
 	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
 	 * mmap_lock being held.
-	 * We need RELEASE semantics here to ensure that preceding stores into
-	 * the VMA take effect before we unlock it with this store.
-	 * Pairs with ACQUIRE semantics in vma_start_read().
 	 */
-	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+
+	if (acquire) {
+		WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+		/*
+		 * For ACQUIRE semantics we should ensure no following stores are
+		 * reordered to appear before the mm->mm_lock_seq modification.
+		 */
+		smp_wmb();
+	} else {
+		/*
+		 * We need RELEASE semantics here to ensure that preceding stores
+		 * into the VMA take effect before we unlock it with this store.
+		 * Pairs with ACQUIRE semantics in vma_start_read().
+		 */
+		smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	}
+}
+
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
+{
+	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
+	*seq = smp_load_acquire(&mm->mm_lock_seq);
+	/* Allow speculation if mmap_lock is not write-locked */
+	return (*seq & 1) == 0;
+}
+
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
+{
+	/* Pairs with ACQUIRE semantics in inc_mm_lock_seq(). */
+	smp_rmb();
+	return seq == READ_ONCE(mm->mm_lock_seq);
 }
+
 #else
-static inline void vma_end_write_all(struct mm_struct *mm) {}
+static inline void init_mm_lock_seq(struct mm_struct *mm) {}
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire) {}
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
 #endif
 
+/*
+ * Drop all currently-held per-VMA locks.
+ * This is called from the mmap_lock implementation directly before releasing
+ * a write-locked mmap_lock (or downgrading it to read-locked).
+ * This should NOT be called manually from other places.
+ */
+static inline void vma_end_write_all(struct mm_struct *mm)
+{
+	inc_mm_lock_seq(mm, false);
+}
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
+	init_mm_lock_seq(mm);
 }
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -111,6 +156,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -120,6 +166,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 
 	__mmap_lock_trace_start_locking(mm, true);
 	ret = down_write_killable(&mm->mmap_lock);
+	if (!ret)
+		inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
 	return ret;
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index 89ceb4a68af2..dd1bded0294d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1261,9 +1261,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	seqcount_init(&mm->write_protect_seq);
 	mmap_init_lock(mm);
 	INIT_LIST_HEAD(&mm->mmlist);
-#ifdef CONFIG_PER_VMA_LOCK
-	mm->mm_lock_seq = 0;
-#endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;
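
Illustration only, not part of the patch: a minimal userspace sketch of
the speculation pattern that mmap_lock_speculation_{start|end} are meant
to support. Everything named toy_* (struct toy_mm, its payload field and
the helper functions) is hypothetical, and C11 atomics stand in for
WRITE_ONCE()/smp_wmb(), smp_store_release(), smp_load_acquire() and
smp_rmb(). A single writer is assumed; the write_locked flag merely
mimics mmap_assert_write_locked() and is not a real lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
	atomic_int lock_seq;	/* even: unlocked, odd: write-locked */
	bool write_locked;	/* mimics the writer holding mmap_lock */
	atomic_int payload;	/* data a speculative reader samples */
};

/* Writer side, lock: the sequence becomes odd before any payload store. */
static void toy_write_lock(struct toy_mm *mm)
{
	assert(!mm->write_locked);
	mm->write_locked = true;
	atomic_fetch_add_explicit(&mm->lock_seq, 1, memory_order_relaxed);
	/* Plays the role of smp_wmb() in the acquire branch of the patch. */
	atomic_thread_fence(memory_order_release);
}

/* Writer side, unlock: payload stores are published before seq turns even. */
static void toy_write_unlock(struct toy_mm *mm)
{
	assert(mm->write_locked);
	atomic_fetch_add_explicit(&mm->lock_seq, 1, memory_order_release);
	mm->write_locked = false;
}

/* Reader side: sample the sequence, refuse to speculate if it is odd. */
static bool toy_speculation_start(struct toy_mm *mm, int *seq)
{
	*seq = atomic_load_explicit(&mm->lock_seq, memory_order_acquire);
	return (*seq & 1) == 0;
}

/* Reader side: succeed only if no writer locked since the start. */
static bool toy_speculation_end(struct toy_mm *mm, int seq)
{
	/* Plays the role of smp_rmb(): order payload loads before the re-read. */
	atomic_thread_fence(memory_order_acquire);
	return seq == atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);
}

/* Returns true and fills *out only if the lockless read was not disturbed. */
static bool toy_read_payload(struct toy_mm *mm, int *out)
{
	int seq, val;

	if (!toy_speculation_start(mm, &seq))
		return false;
	val = atomic_load_explicit(&mm->payload, memory_order_relaxed);
	if (!toy_speculation_end(mm, seq))
		return false;
	*out = val;
	return true;
}
```

When toy_read_payload() returns false, a caller would presumably fall
back to a locked read. The same applies when either helper in the patch
returns false, which they always do without CONFIG_PER_VMA_LOCK.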