From patchwork Mon Jul 27 21:10:12 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 11687653
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, linux-mm@kvack.org, "Paul E . McKenney"
Cc: Andrew Morton, "Theodore Y .
    Ts'o", Matthew Wilcox, Joel Fernandes, Sebastian Andrzej Siewior,
    Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 1/1] rcu/tree: Drop the lock before entering to page allocator
Date: Mon, 27 Jul 2020 23:10:12 +0200
Message-Id: <20200727211012.30948-1-urezki@gmail.com>
X-Mailer: git-send-email 2.20.1

If the kernel is built with the CONFIG_PROVE_RAW_LOCK_NESTING option,
lockdep complains about a violation of the nesting rules:

[   28.060389] =============================
[   28.060389] [ BUG: Invalid wait context ]
[   28.060389] 5.8.0-rc3-rcu #211 Tainted: G            E
[   28.060389] -----------------------------
[   28.060390] vmalloc_test/0/523 is trying to lock:
[   28.060390] ffff96df7ffe0228 (&zone->lock){-.-.}-{3:3}, at: get_page_from_freelist+0xcf0/0x16d0
[   28.060391] other info that might help us debug this:
[   28.060391] context-{5:5}
[   28.060392] 2 locks held by vmalloc_test/0/523:
[   28.060392]  #0: ffffffffc06750d0 (prepare_for_test_rwsem){++++}-{4:4}, at: test_func+0x76/0x240 [test_vmalloc]
[   28.060393]  #1: ffff96df5fa1d390 (krc.lock){..-.}-{2:2}, at: kvfree_call_rcu+0x5c/0x230
[   28.060395] stack backtrace:
[   28.060395] CPU: 0 PID: 523 Comm: vmalloc_test/0 Tainted: G            E     5.8.0-rc3-rcu #211
[   28.060395] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   28.060396] Call Trace:
[   28.060397]  dump_stack+0x96/0xd0
[   28.060397]  __lock_acquire.cold.65+0x166/0x2d7
[   28.060398]  ? find_held_lock+0x2d/0x90
[   28.060399]  lock_acquire+0xad/0x370
[   28.060400]  ? get_page_from_freelist+0xcf0/0x16d0
[   28.060401]  ? mark_held_locks+0x48/0x70
[   28.060402]  _raw_spin_lock+0x25/0x30
[   28.060403]  ? get_page_from_freelist+0xcf0/0x16d0
[   28.060404]  get_page_from_freelist+0xcf0/0x16d0
[   28.060405]  ? __lock_acquire+0x3ee/0x1b90
[   28.060407]  __alloc_pages_nodemask+0x16a/0x3a0
[   28.060408]  __get_free_pages+0xd/0x30
[   28.060409]  kvfree_call_rcu+0x18a/0x230

Internally, kfree_rcu() uses a raw_spinlock_t, whereas the page
allocator uses a spinlock_t to protect access to its zones. A
spinlock_t becomes a sleeping lock on PREEMPT_RT, so it must not be
acquired while a raw_spinlock_t is held. To prevent this violation,
drop the internal raw_spinlock_t before entering the page allocator.

Short changelog (v1 -> v2):
    - rework the commit message;
    - rework the patch making it smaller;
    - add more comments.

Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Tested-by: Chris Wilson
---
 kernel/rcu/tree.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 21c2fa5bd8c3..2de112404121 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3287,6 +3287,8 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 		return false;
 
 	lockdep_assert_held(&krcp->lock);
+	lockdep_assert_irqs_disabled();
+
 	idx = !!is_vmalloc_addr(ptr);
 
 	/* Check if a new block is required. */
@@ -3306,6 +3308,29 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 			if (IS_ENABLED(CONFIG_PREEMPT_RT))
 				return false;
 
+			/*
+			 * If built with CONFIG_PROVE_RAW_LOCK_NESTING option,
+			 * lockdep will complain about violation of the
+			 * nesting rules. It does the raw_spinlock vs. spinlock
+			 * nesting checks.
+			 *
+			 * That is why we drop the raw lock. Please note IRQs are
+			 * still disabled, which guarantees that "current" stays
+			 * on the same CPU later on when the raw lock is taken
+			 * back.
+			 *
+			 * It is important because if the page allocator is invoked
+			 * in fully preemptible context, it can be that we get a page
+			 * but end up on another CPU. That other CPU might not need
+			 * a page because of having some extra spots in its internal
+			 * array for pointer collecting. Staying on the same CPU
+			 * eliminates the described issue.
+			 *
+			 * Dropping the lock also reduces the critical section by
+			 * the time taken by the page allocator to obtain a page.
+			 */
+			raw_spin_unlock(&krcp->lock);
+
 			/*
 			 * NOTE: For one argument of kvfree_rcu() we can
 			 * drop the lock and get the page in sleepable
@@ -3315,6 +3340,8 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
 			 */
 			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
+
+			raw_spin_lock(&krcp->lock);
 		}
 
 		/* Switch to emergency path. */
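
For readers outside the kernel tree, the unlock/allocate/relock sequence
added by the hunks above can be modelled in userspace. This is only a
sketch under stated assumptions: a pthread mutex stands in for the raw
krcp->lock, calloc() stands in for __get_free_page(), and the names
cpu_cache, bulk, BULK_MAX, cache_init(), add_ptr_to_bulk() and
count_blocks() are hypothetical and do not exist in kernel/rcu/tree.c.
It does not model disabled IRQs or per-CPU data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define BULK_MAX 4	/* hypothetical per-block capacity */

/* Hypothetical stand-in for struct kvfree_rcu_bulk_data. */
struct bulk {
	struct bulk *next;
	int nr_records;
	void *records[BULK_MAX];
};

/* Hypothetical stand-in for struct kfree_rcu_cpu. */
struct cpu_cache {
	pthread_mutex_t lock;	/* models krcp->lock */
	struct bulk *head;
};

static void cache_init(struct cpu_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->head = NULL;
}

/* Must be called with c->lock held; returns with c->lock held. */
static bool add_ptr_to_bulk(struct cpu_cache *c, void *ptr)
{
	struct bulk *b;

	/* Check if a new block is required. */
	if (!c->head || c->head->nr_records == BULK_MAX) {
		/* Never enter the allocator with the lock held. */
		pthread_mutex_unlock(&c->lock);
		b = calloc(1, sizeof(*b));
		pthread_mutex_lock(&c->lock);

		/* Allocation failed: the caller takes a fallback path. */
		if (!b)
			return false;

		/*
		 * The list may have changed while the lock was dropped,
		 * so link the new block in front of the current head
		 * instead of relying on the value read before unlocking.
		 */
		b->next = c->head;
		c->head = b;
	}

	c->head->records[c->head->nr_records++] = ptr;
	return true;
}

/* Helper for inspection: number of blocks currently linked. */
static int count_blocks(const struct cpu_cache *c)
{
	int n = 0;

	for (const struct bulk *b = c->head; b; b = b->next)
		n++;
	return n;
}
```

A caller would initialise the cache with cache_init(), take c->lock,
call add_ptr_to_bulk() and release the lock afterwards. Unlike the
kernel code, nothing here pins the caller to a CPU, which is why the
sketch re-reads c->head after the lock is retaken.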