From patchwork Mon May 30 07:49:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12864356
From: Muchun Song
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
 shakeelb@google.com, akpm@linux-foundation.org
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com,
 longman@redhat.com, Muchun Song
Subject: [PATCH v5 06/11] mm: thp: make split queue lock safe when LRU pages
 are reparented
Date: Mon, 30 May 2022 15:49:14 +0800
Message-Id: <20220530074919.46352-7-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.32.1 (Apple Git-133)
In-Reply-To: <20220530074919.46352-1-songmuchun@bytedance.com>
References: <20220530074919.46352-1-songmuchun@bytedance.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Similar to the lruvec lock, we use the same approach to make the split
queue lock safe when LRU pages are reparented.

Signed-off-by: Muchun Song
Acked-by: Roman Gushchin
---
 include/linux/memcontrol.h |  10 ++++
 mm/huge_memory.c           | 116 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 100 insertions(+), 26 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e390aaa46776..56227603dcb8 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1650,6 +1650,11 @@ int alloc_shrinker_info(struct mem_cgroup *memcg);
 void free_shrinker_info(struct mem_cgroup *memcg);
 void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id);
 void reparent_shrinker_deferred(struct mem_cgroup *memcg);
+
+static inline int shrinker_id(struct shrinker *shrinker)
+{
+	return shrinker->id;
+}
 #else
 #define mem_cgroup_sockets_enabled 0
 static inline void mem_cgroup_sk_alloc(struct sock *sk) { };
@@ -1663,6 +1668,11 @@ static inline void set_shrinker_bit(struct mem_cgroup *memcg,
 				    int nid, int shrinker_id)
 {
 }
+
+static inline int shrinker_id(struct shrinker *shrinker)
+{
+	return -1;
+}
 #endif
 
 #ifdef CONFIG_MEMCG_KMEM
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b17b9d25d045..d3411dc291ab 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -503,25 +503,90 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 }
 
 #ifdef CONFIG_MEMCG
-static inline struct deferred_split *get_deferred_split_queue(struct page *page)
+static inline struct mem_cgroup *folio_split_queue_memcg(struct folio *folio,
+							 struct deferred_split *queue)
 {
-	struct mem_cgroup *memcg = page_memcg(compound_head(page));
-	struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+	if (mem_cgroup_disabled())
+		return NULL;
+	if (&NODE_DATA(folio_nid(folio))->deferred_split_queue == queue)
+		return NULL;
+	return container_of(queue, struct mem_cgroup, deferred_split_queue);
+}
 
-	if (memcg)
-		return &memcg->deferred_split_queue;
-	else
-		return &pgdat->deferred_split_queue;
+static inline struct deferred_split *folio_memcg_split_queue(struct folio *folio)
+{
+	struct mem_cgroup *memcg = folio_memcg(folio);
+
+	return memcg ? &memcg->deferred_split_queue : NULL;
 }
 #else
-static inline struct deferred_split *get_deferred_split_queue(struct page *page)
+static inline struct mem_cgroup *folio_split_queue_memcg(struct folio *folio,
+							 struct deferred_split *queue)
 {
-	struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+	return NULL;
+}
 
-	return &pgdat->deferred_split_queue;
+static inline struct deferred_split *folio_memcg_split_queue(struct folio *folio)
+{
+	return NULL;
 }
 #endif
 
+static struct deferred_split *folio_split_queue(struct folio *folio)
+{
+	struct deferred_split *queue = folio_memcg_split_queue(folio);
+
+	return queue ? : &NODE_DATA(folio_nid(folio))->deferred_split_queue;
+}
+
+static struct deferred_split *folio_split_queue_lock(struct folio *folio)
+{
+	struct deferred_split *queue;
+
+	rcu_read_lock();
+retry:
+	queue = folio_split_queue(folio);
+	spin_lock(&queue->split_queue_lock);
+
+	if (unlikely(folio_split_queue_memcg(folio, queue) != folio_memcg(folio))) {
+		spin_unlock(&queue->split_queue_lock);
+		goto retry;
+	}
+	rcu_read_unlock();
+
+	return queue;
+}
+
+static struct deferred_split *
+folio_split_queue_lock_irqsave(struct folio *folio, unsigned long *flags)
+{
+	struct deferred_split *queue;
+
+	rcu_read_lock();
+retry:
+	queue = folio_split_queue(folio);
+	spin_lock_irqsave(&queue->split_queue_lock, *flags);
+
+	if (unlikely(folio_split_queue_memcg(folio, queue) != folio_memcg(folio))) {
+		spin_unlock_irqrestore(&queue->split_queue_lock, *flags);
+		goto retry;
+	}
+	rcu_read_unlock();
+
+	return queue;
+}
+
+static inline void split_queue_unlock(struct deferred_split *queue)
+{
+	spin_unlock(&queue->split_queue_lock);
+}
+
+static inline void split_queue_unlock_irqrestore(struct deferred_split *queue,
+						 unsigned long flags)
+{
+	spin_unlock_irqrestore(&queue->split_queue_lock, flags);
+}
+
 void prep_transhuge_page(struct page *page)
 {
 	/*
@@ -2489,7 +2554,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 {
 	struct folio *folio = page_folio(page);
 	struct page *head = &folio->page;
-	struct deferred_split *ds_queue = get_deferred_split_queue(head);
+	struct deferred_split *ds_queue;
 	XA_STATE(xas, &head->mapping->i_pages, head->index);
 	struct anon_vma *anon_vma = NULL;
 	struct address_space *mapping = NULL;
@@ -2581,13 +2646,13 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	}
 
 	/* Prevent deferred_split_scan() touching ->_refcount */
-	spin_lock(&ds_queue->split_queue_lock);
+	ds_queue = folio_split_queue_lock(folio);
 	if (page_ref_freeze(head, 1 + extra_pins)) {
 		if (!list_empty(page_deferred_list(head))) {
 			ds_queue->split_queue_len--;
 			list_del(page_deferred_list(head));
 		}
-		spin_unlock(&ds_queue->split_queue_lock);
+		split_queue_unlock(ds_queue);
 		if (mapping) {
 			int nr = thp_nr_pages(head);
 
@@ -2605,7 +2670,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		__split_huge_page(page, list, end);
 		ret = 0;
 	} else {
-		spin_unlock(&ds_queue->split_queue_lock);
+		split_queue_unlock(ds_queue);
 fail:
 		if (mapping)
 			xas_unlock(&xas);
@@ -2630,25 +2695,23 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 void free_transhuge_page(struct page *page)
 {
-	struct deferred_split *ds_queue = get_deferred_split_queue(page);
+	struct deferred_split *ds_queue;
 	unsigned long flags;
 
-	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+	ds_queue = folio_split_queue_lock_irqsave(page_folio(page), &flags);
 	if (!list_empty(page_deferred_list(page))) {
 		ds_queue->split_queue_len--;
 		list_del(page_deferred_list(page));
 	}
-	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
+	split_queue_unlock_irqrestore(ds_queue, flags);
 	free_compound_page(page);
 }
 
 void deferred_split_huge_page(struct page *page)
 {
-	struct deferred_split *ds_queue = get_deferred_split_queue(page);
-#ifdef CONFIG_MEMCG
-	struct mem_cgroup *memcg = page_memcg(compound_head(page));
-#endif
+	struct deferred_split *ds_queue;
 	unsigned long flags;
+	struct folio *folio = page_folio(page);
 
 	VM_BUG_ON_PAGE(!PageTransHuge(page), page);
 
@@ -2665,18 +2728,19 @@ void deferred_split_huge_page(struct page *page)
 	if (PageSwapCache(page))
 		return;
 
-	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+	ds_queue = folio_split_queue_lock_irqsave(folio, &flags);
 	if (list_empty(page_deferred_list(page))) {
+		struct mem_cgroup *memcg;
+
+		memcg = folio_split_queue_memcg(folio, ds_queue);
 		count_vm_event(THP_DEFERRED_SPLIT_PAGE);
 		list_add_tail(page_deferred_list(page), &ds_queue->split_queue);
 		ds_queue->split_queue_len++;
-#ifdef CONFIG_MEMCG
 		if (memcg)
 			set_shrinker_bit(memcg, page_to_nid(page),
-					 deferred_split_shrinker.id);
-#endif
+					 shrinker_id(&deferred_split_shrinker));
 	}
-	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
+	split_queue_unlock_irqrestore(ds_queue, flags);
 }
 
 static unsigned long deferred_split_count(struct shrinker *shrink,