From patchwork Sun Aug 9 20:43:53 2020
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 11706801
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka,
 "Paul E. McKenney", Matthew Wilcox
Cc: "Theodore Y. Ts'o", Joel Fernandes, Sebastian Andrzej Siewior,
 Uladzislau Rezki, Oleksiy Avramchenko
Subject: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag
Date: Sun, 9 Aug 2020 22:43:53 +0200
Message-Id: <20200809204354.20137-2-urezki@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200809204354.20137-1-urezki@gmail.com>
References: <20200809204354.20137-1-urezki@gmail.com>

Some background and kfree_rcu()
===============================

The pointers to be freed are stored in per-CPU arrays to improve
performance, to enable an easier-to-use API, to accommodate vmalloc
memory, and to support the single-argument form of kfree_rcu() when
only a pointer is passed. More details are below.

Maintaining such per-CPU arrays requires dynamic allocation: when the
current array is fully populated, a new block is needed. See the
example below:

      0 1 2 3         0 1 2 3
     |p|p|p|p| ->    |p|p|p|p| -> NULL

There are two pointer blocks; each one can store 4 addresses that will
be freed after a grace period has passed. In reality each block stores
PAGE_SIZE / sizeof(void *) pointers.
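For illustration, here is a minimal sketch of one such pointer block,
modeled on struct kvfree_rcu_bulk_data referenced later in this series
(field names follow kernel/rcu/tree.c, but the order and details are
an approximation and may differ between kernel versions):

    /*
     * Simplified sketch of a pointer block; the real definition lives
     * in kernel/rcu/tree.c. One page holds the header plus the records
     * array. Blocks are chained via "next" into a list that grows until
     * the reclaim worker drains it.
     */
    struct kvfree_rcu_bulk_data {
            unsigned long nr_records;               /* how many pointers are stored */
            struct kvfree_rcu_bulk_data *next;      /* previously attached block */
            void *records[];                        /* pointers awaiting a grace period */
    };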
To maintain such blocks, a single page is obtained via the page
allocator:

    bnode = (struct kvfree_rcu_bulk_data *)
        __get_free_page(GFP_NOWAIT | __GFP_NOWARN);

After that it is attached as the new "head", with its "next" pointer
set to the previous "head", so the list of blocks is maintained and can
grow dynamically until it gets drained by the reclaiming thread.

Please note that there is always a fallback if an allocation fails. For
the single-argument case this is a call to synchronize_rcu(); for the
two-argument case it is to use the rcu_head structure embedded in the
object being freed, paying the cache-miss penalty and invoking kfree()
per object instead of kfree_bulk() on groups of objects (see the sketch
after the list below).

Why maintain arrays/blocks instead of linking objects through the
regular "struct rcu_head" technique? The main reasons are:

a) Memory can be reclaimed via the kfree_bulk() interface, which takes
   an array and the number of entries in it. That avoids the per-object
   overhead of calling kfree() on each object and reduces reclamation
   time.

b) It improves locality and reduces the number of cache misses caused
   by "pointer chasing" between objects, which can be spread far apart
   in memory.

c) It supports the "single argument" form of kvfree_rcu():

       void *ptr = kvmalloc(some_bytes, GFP_KERNEL);

       if (ptr)
           kvfree_rcu(ptr);

   We need it when an "rcu_head" is not embedded in a structure but the
   object must still be freed after a grace period; without the embedded
   rcu_head, such objects cannot be queued on a linked list. Today,
   since the single-argument form does not exist yet but there is demand
   for it, people work around it with a simple but inefficient sequence:

       synchronize_rcu(); /* Can be long and blocks the current context */
       kfree(p);

   More details are here: https://lkml.org/lkml/2020/4/28/1626

d) It distinguishes vmalloc pointers from SLAB ones, so the right
   freeing API can be invoked for the right kind of pointer:
   kfree_bulk(), or (TBD) vmalloc_bulk().

   Also, please have a look here: https://lkml.org/lkml/2020/7/30/1166
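As promised above, a condensed sketch of the allocate-or-fall-back
flow. This is an illustrative paraphrase, not the verbatim kernel code:
the real logic lives in kvfree_call_rcu_add_ptr_to_bulk() in
kernel/rcu/tree.c and carries extra locking and per-CPU state, and
add_ptr_to_block_list()/block_is_full() are hypothetical names standing
in for that logic and the PAGE_SIZE-derived capacity check:

    /* Paraphrase only: record "ptr" in the current block, growing the list. */
    static bool add_ptr_to_block_list(struct kvfree_rcu_bulk_data **head,
                                      void *ptr)
    {
            struct kvfree_rcu_bulk_data *bnode = *head;

            if (!bnode || block_is_full(bnode)) {
                    /* Current block is missing or full: grab a fresh page. */
                    bnode = (struct kvfree_rcu_bulk_data *)
                            __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
                    if (!bnode)
                            return false;   /* fall back: rcu_head or synchronize_rcu() */

                    bnode->nr_records = 0;
                    bnode->next = *head;    /* "next" points to the previous head */
                    *head = bnode;
            }

            bnode->records[bnode->nr_records++] = ptr;
            return true;
    }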
Limitations and concerns (Main part)
====================================

The current memory-allocation interface presents the following
difficulties that this patch is designed to overcome:

a) If built with CONFIG_PROVE_RAW_LOCK_NESTING, lockdep complains about
   a violation ("BUG: Invalid wait context") of the nesting rules. It
   performs raw_spinlock vs. spinlock nesting checks, i.e. it is not
   legal to acquire a spinlock_t while holding a raw_spinlock_t.
   Internally kfree_rcu() uses raw_spinlock_t (in the rcu-dev branch),
   whereas the page allocator internally deals with spinlock_t to
   access its zones. The code can also be broken from a higher-level
   point of view:

       raw_spin_lock(&some_lock);
       kfree_rcu(some_pointer, some_field_offset);

b) If built with CONFIG_PREEMPT_RT, spinlock_t is converted into a
   sleepable variant, so invoking the page allocator from atomic
   contexts leads to "BUG: scheduling while atomic".

Proposal
========

1) Add a GFP_* flag that makes the allocator return NULL rather than
   acquire its own spinlock_t. Such a flag addresses limitations (a)
   and (b) described above. It also makes the kfree_rcu() code common
   to RT and regular kernels, cleaner, with fewer corner cases to
   handle, and smaller.

Description: the page allocator has two phases, a fast path and a slow
one. We are interested in the fast path and order-0 allocations. The
fast path is in turn divided into two steps, a lock-less one and one
that takes a lock:

1) As a first step the page allocator tries to obtain a page from the
   per-cpu-list, of which each CPU has its own. This step is lock-less
   and fast: it merely disables irqs on the current CPU in order to
   access per-cpu data and removes the first element from the pcp-list.
   The element/page is then returned to the user.

2) If there is no available page in the per-cpu-list, the second step
   is involved. It removes a specified number of elements from the
   buddy allocator, transferring them to the per-cpu-list described in
   step [1]. The number of pre-fetched elements can be controlled via
   the percpu_pagelist_fraction sysctl; please see
   /proc/sys/vm/percpu_pagelist_fraction. This step is not lock-less:
   it uses spinlock_t to access the buddy zone. It is fully covered by
   the rmqueue_bulk() function.

Summarizing: __GFP_NO_LOCKS covers only step [1] and cannot perform
step [2], because step [2] acquires a spinlock_t. This implies that it
is super fast, but a higher failure rate is also expected. Having such
a flag addresses limitations (a) and (b) described above.

Usage:

    __get_free_page(__GFP_NO_LOCKS);
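For example, a hypothetical atomic-context caller would look like this
(illustrative only; "some_lock" and the fallback comment are
placeholders, not kernel code):

    unsigned long addr;

    raw_spin_lock(&some_lock);      /* atomic context, raw lock held */

    /*
     * Legal even under CONFIG_PROVE_RAW_LOCK_NESTING and on
     * PREEMPT_RT: only the lock-less pcp fast path is tried,
     * no spinlock_t is ever taken.
     */
    addr = __get_free_page(__GFP_NO_LOCKS);
    if (!addr) {
            /* pcp-list was empty: a higher failure rate is the trade-off. */
    }

    raw_spin_unlock(&some_lock);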
Signed-off-by: Uladzislau Rezki (Sony)
---
 include/linux/gfp.h            | 11 +++++++++--
 include/trace/events/mmflags.h |  1 +
 mm/page_alloc.c                | 31 +++++++++++++++++++++++++------
 tools/perf/builtin-kmem.c      |  1 +
 4 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 67a0774e080b..c6f11481c42a 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -39,8 +39,9 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x100000u
 #define ___GFP_THISNODE		0x200000u
 #define ___GFP_ACCOUNT		0x400000u
+#define ___GFP_NO_LOCKS		0x800000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP	0x800000u
+#define ___GFP_NOLOCKDEP	0x1000000u
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
@@ -215,16 +216,22 @@ struct vm_area_struct;
  * %__GFP_COMP address compound page metadata.
  *
  * %__GFP_ZERO returns a zeroed page on success.
+ *
+ * %__GFP_NO_LOCKS order-0 allocation without sleepable-locks.
+ * It obtains a page from the per-cpu-list and is considered
+ * lock-less. No other actions are performed, thus it returns
+ * NULL if the per-cpu-list is empty.
  */
 #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
+#define __GFP_NO_LOCKS	((__force gfp_t)___GFP_NO_LOCKS)
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 939092dbcb8b..653c56c478ad 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -45,6 +45,7 @@
 	{(unsigned long)__GFP_RECLAIMABLE,	"__GFP_RECLAIMABLE"},	\
 	{(unsigned long)__GFP_MOVABLE,		"__GFP_MOVABLE"},	\
 	{(unsigned long)__GFP_ACCOUNT,		"__GFP_ACCOUNT"},	\
+	{(unsigned long)__GFP_NO_LOCKS,		"__GFP_NO_LOCKS"},	\
 	{(unsigned long)__GFP_WRITE,		"__GFP_WRITE"},		\
 	{(unsigned long)__GFP_RECLAIM,		"__GFP_RECLAIM"},	\
 	{(unsigned long)__GFP_DIRECT_RECLAIM,	"__GFP_DIRECT_RECLAIM"},\
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4896e674594..8bf1e3a9a1c3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3305,7 +3305,8 @@ static inline void zone_statistics(struct zone *preferred_zone, struct zone *z)
 }
 
 /* Remove page from the per-cpu list, caller must protect the list */
-static struct page *__rmqueue_pcplist(struct zone *zone, int migratetype,
+static struct page *__rmqueue_pcplist(struct zone *zone, gfp_t gfp_flags,
+			int migratetype,
 			unsigned int alloc_flags,
 			struct per_cpu_pages *pcp,
 			struct list_head *list)
@@ -3314,7 +3315,8 @@ static struct page *__rmqueue_pcplist(struct zone *zone, int migratetype,
 
 	do {
 		if (list_empty(list)) {
-			pcp->count += rmqueue_bulk(zone, 0,
+			if (!(gfp_flags & __GFP_NO_LOCKS))
+				pcp->count += rmqueue_bulk(zone, 0,
 					pcp->batch, list,
 					migratetype, alloc_flags);
 			if (unlikely(list_empty(list)))
@@ -3341,8 +3343,20 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 
 	local_irq_save(flags);
 	pcp = &this_cpu_ptr(zone->pageset)->pcp;
-	list = &pcp->lists[migratetype];
-	page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list);
+
+	if (!(gfp_flags & __GFP_NO_LOCKS)) {
+		list = &pcp->lists[migratetype];
+		page = __rmqueue_pcplist(zone, gfp_flags, migratetype, alloc_flags, pcp, list);
+	} else {
+		/* Iterate over all migrate types of the pcp-lists. */
+		for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++) {
+			list = &pcp->lists[migratetype];
+			page = __rmqueue_pcplist(zone, gfp_flags, migratetype, alloc_flags, pcp, list);
+			if (page)
+				break;
+		}
+	}
+
 	if (page) {
 		__count_zid_vm_events(PGALLOC, page_zonenum(page), 1);
 		zone_statistics(preferred_zone, zone);
@@ -3790,7 +3804,8 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		 * grow this zone if it contains deferred pages.
		 */
		if (static_branch_unlikely(&deferred_pages)) {
-			if (_deferred_grow_zone(zone, order))
+			if (!(gfp_mask & __GFP_NO_LOCKS) &&
+				_deferred_grow_zone(zone, order))
 				goto try_this_zone;
 		}
 #endif
@@ -3835,7 +3850,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				reserve_highatomic_pageblock(page, zone, order);
 
 			return page;
-		} else {
+		} else if (!(gfp_mask & __GFP_NO_LOCKS)) {
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/* Try again if zone has deferred pages */
 			if (static_branch_unlikely(&deferred_pages)) {
@@ -4880,6 +4895,10 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
 	if (likely(page))
 		goto out;
 
+	/* Bypass slow path if __GFP_NO_LOCKS. */
+	if ((gfp_mask & __GFP_NO_LOCKS))
+		goto out;
+
 	/*
 	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
 	 * resp. GFP_NOIO which has to be inherited for all allocation requests
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 38a5ab683ebc..662e1d9a0e99 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -656,6 +656,7 @@ static const struct {
 	{ "__GFP_RECLAIMABLE",		"RC" },
 	{ "__GFP_MOVABLE",		"M" },
 	{ "__GFP_ACCOUNT",		"AC" },
+	{ "__GFP_NO_LOCKS",		"NL" },
 	{ "__GFP_WRITE",		"WR" },
 	{ "__GFP_RECLAIM",		"R" },
 	{ "__GFP_DIRECT_RECLAIM",	"DR" },

From patchwork Sun Aug 9 20:43:54 2020
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 11706803
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka,
 "Paul E. McKenney", Matthew Wilcox
Ts'o" , Joel Fernandes , Sebastian Andrzej Siewior , Uladzislau Rezki , Oleksiy Avramchenko Subject: [PATCH 2/2] rcu/tree: use __GFP_NO_LOCKS flag Date: Sun, 9 Aug 2020 22:43:54 +0200 Message-Id: <20200809204354.20137-3-urezki@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200809204354.20137-1-urezki@gmail.com> References: <20200809204354.20137-1-urezki@gmail.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: A65743D663 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Enter the page allocator with newly introduced __GFP_NO_LOCKS flag instead of former GFP_NOWAIT | __GFP_NOWARN sequence. Such approach address two concerns. See them below: a) If built with CONFIG_PROVE_RAW_LOCK_NESTING, the lockdep complains about violation("BUG: Invalid wait context") of the nesting rules. It does the raw_spinlock vs. spinlock nesting checks, i.e. it is not legal to acquire a spinlock_t while holding a raw_spinlock_t. Internally the kfree_rcu() uses raw_spinlock_t whereas the page allocator internally deals with spinlock_t to access to its zones. The code also can be broken from higher level of view: raw_spin_lock(&some_lock); kfree_rcu(some_pointer, some_field_offset); b) If built with CONFIG_PREEMPT_RT. Please note, in that case spinlock_t is converted into sleepable variant. Invoking the page allocator from atomic contexts leads to: "BUG: scheduling while atomic". Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 30e7e252b9e7..48cb64800108 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3327,7 +3327,7 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr) * pages are available. */ bnode = (struct kvfree_rcu_bulk_data *) - __get_free_page(GFP_NOWAIT | __GFP_NOWARN); + __get_free_page(__GFP_NO_LOCKS); } /* Switch to emergency path. */