From: Byungchul Park
To: linux-kernel@vger.kernel.org
Cc: torvalds@linux-foundation.org, damien.lemoal@opensource.wdc.com,
	linux-ide@vger.kernel.org, adilger.kernel@dilger.ca,
	linux-ext4@vger.kernel.org, mingo@redhat.com, peterz@infradead.org,
	will@kernel.org, tglx@linutronix.de, rostedt@goodmis.org,
	joel@joelfernandes.org, sashal@kernel.org, daniel.vetter@ffwll.ch,
	duyuyang@gmail.com, johannes.berg@intel.com, tj@kernel.org,
	tytso@mit.edu, willy@infradead.org, david@fromorbit.com,
	amir73il@gmail.com, gregkh@linuxfoundation.org, kernel-team@lge.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
	minchan@kernel.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com,
	sj@kernel.org, jglisse@redhat.com, dennis@kernel.org, cl@linux.com,
	penberg@kernel.org, rientjes@google.com, vbabka@suse.cz,
	ngupta@vflare.org, linux-block@vger.kernel.org,
	paolo.valente@linaro.org, josef@toxicpanda.com,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz,
	jlayton@kernel.org, dan.j.williams@intel.com, hch@infradead.org,
	djwong@kernel.org, dri-devel@lists.freedesktop.org,
	rodrigosiqueiramelo@gmail.com, melissa.srw@gmail.com,
	hamohammed.sa@gmail.com, 42.hyeyoo@gmail.com,
	chris.p.wilson@intel.com, gwan-gyeong.mun@intel.com,
	max.byungchul.park@gmail.com, boqun.feng@gmail.com,
	longman@redhat.com, hdanton@sina.com
Subject: [PATCH v9 25/25] dept: Track the potential waits of PG_{locked,writeback}
Date: Tue, 31 Jan 2023 17:39:54 +0900
Message-Id: <1675154394-25598-26-git-send-email-max.byungchul.park@gmail.com>
In-Reply-To: <1675154394-25598-1-git-send-email-max.byungchul.park@gmail.com>
References: <1675154394-25598-1-git-send-email-max.byungchul.park@gmail.com>

Currently, Dept tracks only the real waits on PG_{locked,writeback}, that
is, the ones that actually went through __schedule(), to avoid false
positives.
However, this limits its deadlock detection capability, because many
more potential dependencies can arise from waits that have not happened
yet but may happen in the future and cause a deadlock.

So let Dept assume that whenever a PG_{locked,writeback} bit is cleared,
there might be waiters on that bit to be woken up. Although this
aggressive tracking may increase false positives, it is worth doing
because it is useful in practice. See the following link for an example:

   https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/

Signed-off-by: Byungchul Park
---
 include/linux/mm_types.h   |   3 ++
 include/linux/page-flags.h | 112 ++++++++++++++++++++++++++++++++++++++++-----
 include/linux/pagemap.h    |   7 ++-
 mm/filemap.c               |  11 ++++-
 mm/page_alloc.c            |   3 ++
 5 files changed, 121 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3b84750..61d982e 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -252,6 +253,8 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
 	int _last_cpupid;
 #endif
+	struct dept_ext_wgen PG_locked_wgen;
+	struct dept_ext_wgen PG_writeback_wgen;
 } _struct_page_alignment;
 
 /*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 69e93a0..d6ca114 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -202,6 +202,50 @@ enum pageflags {
 
 #ifndef __GENERATING_BOUNDS_H
 
+#ifdef CONFIG_DEPT
+#include
+#include
+
+extern struct dept_map PG_locked_map;
+extern struct dept_map PG_writeback_map;
+
+/*
+ * Place the following annotations in its suitable point in code:
+ *
+ *	Annotate dept_page_set_bit() around firstly set_bit*()
+ *	Annotate dept_page_clear_bit() around clear_bit*()
+ *	Annotate dept_page_wait_on_bit() around wait_on_bit*()
+ */
+
+static inline void dept_page_set_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_request_event(&PG_locked_map, &p->PG_locked_wgen);
+	else if (bit_nr == PG_writeback)
+		dept_request_event(&PG_writeback_map, &p->PG_writeback_wgen);
+}
+
+static inline void dept_page_clear_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_event(&PG_locked_map, 1UL, _RET_IP_, __func__, &p->PG_locked_wgen);
+	else if (bit_nr == PG_writeback)
+		dept_event(&PG_writeback_map, 1UL, _RET_IP_, __func__, &p->PG_writeback_wgen);
+}
+
+static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_wait(&PG_locked_map, 1UL, _RET_IP_, __func__, 0, -1L);
+	else if (bit_nr == PG_writeback)
+		dept_wait(&PG_writeback_map, 1UL, _RET_IP_, __func__, 0, -1L);
+}
+#else
+#define dept_page_set_bit(p, bit_nr)		do { } while (0)
+#define dept_page_clear_bit(p, bit_nr)		do { } while (0)
+#define dept_page_wait_on_bit(p, bit_nr)	do { } while (0)
+#endif
+
 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 
@@ -383,44 +427,88 @@ static unsigned long *folio_flags(struct folio *folio, unsigned n)
 #define SETPAGEFLAG(uname, lname, policy)	\
 static __always_inline	\
 void folio_set_##lname(struct folio *folio)	\
-{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));	\
+	dept_page_set_bit(&folio->page, PG_##lname);	\
+}	\
 static __always_inline void SetPage##uname(struct page *page)	\
-{ set_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	set_bit(PG_##lname, &policy(page, 1)->flags);	\
+	dept_page_set_bit(page, PG_##lname);	\
+}
 
 #define CLEARPAGEFLAG(uname, lname, policy)	\
 static __always_inline	\
 void folio_clear_##lname(struct folio *folio)	\
-{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));	\
+	dept_page_clear_bit(&folio->page, PG_##lname);	\
+}	\
 static __always_inline void ClearPage##uname(struct page *page)	\
-{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	clear_bit(PG_##lname, &policy(page, 1)->flags);	\
+	dept_page_clear_bit(page, PG_##lname);	\
+}
 
 #define __SETPAGEFLAG(uname, lname, policy)	\
 static __always_inline	\
 void __folio_set_##lname(struct folio *folio)	\
-{ __set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	__set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));	\
+	dept_page_set_bit(&folio->page, PG_##lname);	\
+}	\
 static __always_inline void __SetPage##uname(struct page *page)	\
-{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	__set_bit(PG_##lname, &policy(page, 1)->flags);	\
+	dept_page_set_bit(page, PG_##lname);	\
+}
 
 #define __CLEARPAGEFLAG(uname, lname, policy)	\
 static __always_inline	\
 void __folio_clear_##lname(struct folio *folio)	\
-{ __clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	__clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));	\
+	dept_page_clear_bit(&folio->page, PG_##lname);	\
+}	\
 static __always_inline void __ClearPage##uname(struct page *page)	\
-{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	__clear_bit(PG_##lname, &policy(page, 1)->flags);	\
+	dept_page_clear_bit(page, PG_##lname);	\
+}
 
 #define TESTSETFLAG(uname, lname, policy)	\
 static __always_inline	\
 bool folio_test_set_##lname(struct folio *folio)	\
-{ return test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	bool ret = test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));\
+	if (!ret)	\
+		dept_page_set_bit(&folio->page, PG_##lname);	\
+	return ret;	\
+}	\
 static __always_inline int TestSetPage##uname(struct page *page)	\
-{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	bool ret = test_and_set_bit(PG_##lname, &policy(page, 1)->flags);\
+	if (!ret)	\
+		dept_page_set_bit(page, PG_##lname);	\
+	return ret;	\
+}
 
 #define TESTCLEARFLAG(uname, lname, policy)	\
 static __always_inline	\
 bool folio_test_clear_##lname(struct folio *folio)	\
-{ return test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
+{	\
+	bool ret = test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));\
+	if (ret)	\
+		dept_page_clear_bit(&folio->page, PG_##lname);	\
+	return ret;	\
+}	\
 static __always_inline int TestClearPage##uname(struct page *page)	\
-{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{	\
+	bool ret = test_and_clear_bit(PG_##lname, &policy(page, 1)->flags);\
+	if (ret)	\
+		dept_page_clear_bit(page, PG_##lname);	\
+	return ret;	\
+}
 
 #define PAGEFLAG(uname, lname, policy)	\
 	TESTPAGEFLAG(uname, lname, policy)	\
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 29e1f9e..2843619 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -898,7 +898,12 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
  */
 static inline bool folio_trylock(struct folio *folio)
 {
-	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+	bool ret = !test_and_set_bit_lock(PG_locked, folio_flags(folio, 0));
+
+	if (ret)
+		dept_page_set_bit(&folio->page, PG_locked);
+
+	return likely(ret);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index adc49cb..b80c8e2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1097,6 +1097,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 		if (flags & WQ_FLAG_CUSTOM) {
 			if (test_and_set_bit(key->bit_nr, &key->folio->flags))
 				return -1;
+			dept_page_set_bit(&key->folio->page, key->bit_nr);
 			flags |= WQ_FLAG_DONE;
 		}
 	}
@@ -1206,6 +1207,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
 		if (test_and_set_bit(bit_nr, &folio->flags))
 			return false;
+		dept_page_set_bit(&folio->page, bit_nr);
 	} else if (test_bit(bit_nr, &folio->flags))
 		return false;
 
@@ -1216,8 +1218,10 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static struct dept_map __maybe_unused PG_locked_map = DEPT_MAP_INITIALIZER(PG_locked_map, NULL);
-static struct dept_map __maybe_unused PG_writeback_map = DEPT_MAP_INITIALIZER(PG_writeback_map, NULL);
+struct dept_map __maybe_unused PG_locked_map = DEPT_MAP_INITIALIZER(PG_locked_map, NULL);
+struct dept_map __maybe_unused PG_writeback_map = DEPT_MAP_INITIALIZER(PG_writeback_map, NULL);
+EXPORT_SYMBOL(PG_locked_map);
+EXPORT_SYMBOL(PG_writeback_map);
 
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
@@ -1230,6 +1234,7 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 	unsigned long pflags;
 	bool in_thrashing;
 
+	dept_page_wait_on_bit(&folio->page, bit_nr);
 	if (bit_nr == PG_locked)
 		sdt_might_sleep_start(&PG_locked_map);
 	else if (bit_nr == PG_writeback)
@@ -1327,6 +1332,7 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		wait->flags |= WQ_FLAG_DONE;
 		break;
 	}
+	dept_page_set_bit(&folio->page, bit_nr);
 
 	/*
 	 * If a signal happened, this 'finish_wait()' may remove the last
@@ -1534,6 +1540,7 @@ void folio_unlock(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	BUILD_BUG_ON(PG_locked > 7);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
+	dept_page_clear_bit(&folio->page, PG_locked);
 	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
 		folio_wake_bit(folio, PG_locked);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0745aed..57d6c82 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -76,6 +76,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1626,6 +1627,8 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn,
 	page_mapcount_reset(page);
 	page_cpupid_reset_last(page);
 	page_kasan_tag_reset(page);
+	dept_ext_wgen_init(&page->PG_locked_wgen);
+	dept_ext_wgen_init(&page->PG_writeback_wgen);
 
 	INIT_LIST_HEAD(&page->lru);
 #ifdef WANT_PAGE_VIRTUAL
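
Note on the TESTSETFLAG()/TESTCLEARFLAG() hunks: they annotate only when
the bit actually changes state, i.e. after a test-and-set that found the
bit clear, or a test-and-clear that found it set. The following is a
minimal userspace sketch of that rule, not Dept code. struct toy_page,
toy_test_and_set(), toy_test_and_clear() and the two counters are
hypothetical names made up for illustration, and C11 atomics stand in for
the kernel's test_and_set_bit()/test_and_clear_bit(). The counters are
plain integers, so the sketch assumes a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: one flags word plus counters
 * recording how often each toy annotation fired. */
struct toy_page {
	atomic_ulong flags;
	unsigned long nr_set_annotations;	/* where dept_page_set_bit() would go */
	unsigned long nr_clear_annotations;	/* where dept_page_clear_bit() would go */
};

enum { TOY_PG_LOCKED = 0, TOY_PG_WRITEBACK = 1 };

/* Sets the bit and returns its previous value. The annotation fires
 * only on a 0 -> 1 transition, mirroring the "if (!ret)" guard. */
static bool toy_test_and_set(struct toy_page *p, int bit)
{
	unsigned long mask;
	bool was_set;

	assert(bit >= 0 && bit < 32);
	mask = 1UL << bit;
	was_set = (atomic_fetch_or(&p->flags, mask) & mask) != 0;
	if (!was_set)
		p->nr_set_annotations++;
	return was_set;
}

/* Clears the bit and returns its previous value. The annotation fires
 * only on a 1 -> 0 transition, mirroring the "if (ret)" guard. */
static bool toy_test_and_clear(struct toy_page *p, int bit)
{
	unsigned long mask;
	bool was_set;

	assert(bit >= 0 && bit < 32);
	mask = 1UL << bit;
	was_set = (atomic_fetch_and(&p->flags, ~mask) & mask) != 0;
	if (was_set)
		p->nr_clear_annotations++;
	return was_set;
}
```

Calling toy_test_and_set() twice on a clear bit bumps nr_set_annotations
only once: the second call sees the bit already set and skips the
annotation, so a failed attempt to take the bit never registers as a new
event request.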