From patchwork Fri Oct 9 12:51:26 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11825781
From: Yafang Shao
To: david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com,
 willy@infradead.org, mhocko@kernel.org, akpm@linux-foundation.org
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, Yafang Shao
Subject: [PATCH v8 1/2] mm: Add become_kswapd and restore_kswapd
Date: Fri, 9 Oct 2020 20:51:26 +0800
Message-Id: <20201009125127.37435-2-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.17.2 (Apple Git-113)
In-Reply-To: <20201009125127.37435-1-laoar.shao@gmail.com>
References: <20201009125127.37435-1-laoar.shao@gmail.com>

From: "Matthew Wilcox (Oracle)"

Since XFS needs to pretend to be kswapd in some of its worker threads,
create methods to save & restore kswapd state.
Don't bother restoring kswapd state in kswapd -- the only time we reach
this code is when we're exiting and the task_struct is about to be
destroyed anyway.

Cc: Dave Chinner
Cc: Christoph Hellwig
Cc: Darrick J. Wong
Cc: Matthew Wilcox
Acked-by: Michal Hocko
Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Yafang Shao
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/xfs/libxfs/xfs_btree.c | 14 ++++++++------
 include/linux/sched/mm.h  | 23 +++++++++++++++++++++++
 mm/vmscan.c               | 16 +---------------
 3 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2d25bab68764..a04a44238aab 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -2813,8 +2813,9 @@ xfs_btree_split_worker(
 {
 	struct xfs_btree_split_args *args = container_of(work,
 						struct xfs_btree_split_args, work);
+	bool is_kswapd = args->kswapd;
 	unsigned long pflags;
-	unsigned long new_pflags = PF_MEMALLOC_NOFS;
+	int memalloc_nofs;
 
 	/*
 	 * we are in a transaction context here, but may also be doing work
@@ -2822,16 +2823,17 @@ xfs_btree_split_worker(
 	 * temporarily to ensure that we don't block waiting for memory reclaim
 	 * in any way.
 	 */
-	if (args->kswapd)
-		new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
-
-	current_set_flags_nested(&pflags, new_pflags);
+	if (is_kswapd)
+		pflags = become_kswapd();
+	memalloc_nofs = memalloc_nofs_save();
 
 	args->result = __xfs_btree_split(args->cur, args->level, args->ptrp,
 					 args->key, args->curp, args->stat);
 	complete(args->done);
 
-	current_restore_flags_nested(&pflags, new_pflags);
+	memalloc_nofs_restore(memalloc_nofs);
+	if (is_kswapd)
+		restore_kswapd(pflags);
 }
 
 /*
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index f889e332912f..b38fdcb977a4 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -303,6 +303,29 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 }
 #endif
 
+/*
+ * Tell the memory management code that this thread is working on behalf
+ * of background memory reclaim (like kswapd). That means that it will
+ * get access to memory reserves should it need to allocate memory in
+ * order to make forward progress. With this great power comes great
+ * responsibility to not exhaust those reserves.
+ */
+#define KSWAPD_PF_FLAGS (PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD)
+
+static inline unsigned long become_kswapd(void)
+{
+	unsigned long flags = current->flags & KSWAPD_PF_FLAGS;
+
+	current->flags |= KSWAPD_PF_FLAGS;
+
+	return flags;
+}
+
+static inline void restore_kswapd(unsigned long flags)
+{
+	current->flags &= ~(flags ^ KSWAPD_PF_FLAGS);
+}
+
 #ifdef CONFIG_MEMCG
 /**
  * memalloc_use_memcg - Starts the remote memcg charging scope.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 466fc3144fff..eb6f6e8103c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3867,19 +3867,7 @@ static int kswapd(void *p)
 	if (!cpumask_empty(cpumask))
 		set_cpus_allowed_ptr(tsk, cpumask);
 
-	/*
-	 * Tell the memory management that we're a "memory allocator",
-	 * and that if we need more memory we should get access to it
-	 * regardless (see "__alloc_pages()"). "kswapd" should
-	 * never get caught in the normal page freeing logic.
-	 *
-	 * (Kswapd normally doesn't need memory anyway, but sometimes
-	 * you need a small amount of memory in order to be able to
-	 * page out something else, and this flag essentially protects
-	 * us from recursively trying to free more memory as we're
-	 * trying to free the first piece of memory in the first place).
-	 */
-	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
+	become_kswapd();
 	set_freezable();
 
 	WRITE_ONCE(pgdat->kswapd_order, 0);
@@ -3929,8 +3917,6 @@ static int kswapd(void *p)
 			goto kswapd_try_sleep;
 	}
 
-	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
-
 	return 0;
 }
 

From patchwork Fri Oct 9 12:51:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11825785
From: Yafang Shao
To: david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com,
 willy@infradead.org, mhocko@kernel.org, akpm@linux-foundation.org
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, Yafang Shao
Subject: [PATCH v8 2/2] xfs: avoid transaction reservation recursion
Date: Fri, 9 Oct 2020 20:51:27 +0800
Message-Id: <20201009125127.37435-3-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.17.2 (Apple Git-113)
In-Reply-To: <20201009125127.37435-1-laoar.shao@gmail.com>
References: <20201009125127.37435-1-laoar.shao@gmail.com>

PF_FSTRANS, which was used to avoid transaction reservation recursion,
was dropped by commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
memalloc_nofs_{save,restore} API") and replaced by PF_MEMALLOC_NOFS,
which is meant to avoid filesystem reclaim recursion.

That change is subtle. Take the WARN_ON_ONCE(current->flags &
PF_MEMALLOC_NOFS) check as an example of why abstracting PF_FSTRANS into
PF_MEMALLOC_NOFS is not appropriate. The comment below is quoted from
Dave:

> It wasn't for memory allocation recursion protection in XFS - it was for
> transaction reservation recursion protection by something trying to flush
> data pages while holding a transaction reservation. Doing
> this could deadlock the journal because the existing reservation
> could prevent the nested reservation for being able to reserve space
> in the journal and that is a self-deadlock vector.
> IOWs, this check is not protecting against memory reclaim recursion
> bugs at all (that's the previous check [1]). This check is
> protecting against the filesystem calling writepages directly from a
> context where it can self-deadlock.
> So what we are seeing here is that the PF_FSTRANS ->
> PF_MEMALLOC_NOFS abstraction lost all the actual useful information
> about what type of error this check was protecting against.

As a result, we should reintroduce PF_FSTRANS. Since
current->journal_info isn't used in XFS, we can reuse it to indicate
whether the task is in a filesystem transaction, per Willy.

To achieve that, this patch introduces four new helpers, per Dave:
- xfs_trans_context_set()
  Used in xfs_trans_alloc()
- xfs_trans_context_clear()
  Used in xfs_trans_commit() and xfs_trans_cancel()
- xfs_trans_context_update()
  Used in xfs_trans_roll()
- xfs_trans_context_active()
  Checks whether current is in a filesystem transaction

[1]. The check below is there to avoid memory reclaim recursion:
if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
	PF_MEMALLOC))
	goto redirty;

Cc: Dave Chinner
Cc: Christoph Hellwig
Cc: Michal Hocko
Cc: Darrick J. Wong
Cc: Matthew Wilcox
Signed-off-by: Yafang Shao
Reviewed-by: Matthew Wilcox (Oracle)
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c |  7 -------
 fs/xfs/xfs_aops.c      | 23 +++++++++++++++++++++--
 fs/xfs/xfs_linux.h     |  4 ----
 fs/xfs/xfs_trans.c     | 19 +++++++++----------
 fs/xfs/xfs_trans.h     | 30 ++++++++++++++++++++++++++++++
 5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index bcfc288dba3f..3dc57a38bf0b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1498,13 +1498,6 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 			PF_MEMALLOC))
 		goto redirty;
 
-	/*
-	 * Given that we do not allow direct reclaim to call us, we should
-	 * never be called in a recursive filesystem reclaim context.
-	 */
-	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
-		goto redirty;
-
 	/*
 	 * Is this page beyond the end of the file?
 	 *
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index b35611882ff9..af7270f5f8a9 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -62,7 +62,8 @@ xfs_setfilesize_trans_alloc(
 	 * We hand off the transaction to the completion thread now, so
 	 * clear the flag here.
 	 */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);
+
 	return 0;
 }
 
@@ -125,7 +126,7 @@ xfs_setfilesize_ioend(
 	 * thus we need to mark ourselves as being in a transaction manually.
 	 * Similarly for freeze protection.
 	 */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_set(tp);
 	__sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);
 
 	/* we abort the update if there was an IO error */
@@ -564,6 +565,16 @@ xfs_vm_writepage(
 {
 	struct xfs_writepage_ctx wpc = { };
 
+	/*
+	 * Given that we do not allow direct reclaim to call us, we should
+	 * never be called while in a filesystem transaction.
+	 */
+	if (xfs_trans_context_active()) {
+		redirty_page_for_writepage(wbc, page);
+		unlock_page(page);
+		return 0;
+	}
+
 	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
 }
 
@@ -575,6 +586,14 @@ xfs_vm_writepages(
 	struct xfs_writepage_ctx wpc = { };
 
 	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
+
+	/*
+	 * Given that we do not allow direct reclaim to call us, we should
+	 * never be called while in a filesystem transaction.
+	 */
+	if (xfs_trans_context_active())
+		return 0;
+
 	return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
 }
 
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index ab737fed7b12..8a4f6db77e33 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -102,10 +102,6 @@ typedef __u32 xfs_nlink_t;
 #define xfs_cowb_secs xfs_params.cowb_timer.val
 
 #define current_cpu() (raw_smp_processor_id())
-#define current_set_flags_nested(sp, f) \
-		(*(sp) = current->flags, current->flags |= (f))
-#define current_restore_flags_nested(sp, f) \
-		(current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))
 
 #define NBBY 8 /* number of bits per byte */
 
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index ed72867b1a19..5f3a4ff51b3c 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -153,8 +153,6 @@ xfs_trans_reserve(
 	int error = 0;
 	bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
 
-	/* Mark this thread as being in a transaction */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
 
 	/*
 	 * Attempt to reserve the needed disk blocks by decrementing
@@ -163,10 +161,8 @@ xfs_trans_reserve(
 	 */
 	if (blocks > 0) {
 		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
-		if (error != 0) {
-			current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+		if (error != 0)
 			return -ENOSPC;
-		}
 		tp->t_blk_res += blocks;
 	}
 
@@ -241,8 +237,6 @@ xfs_trans_reserve(
 		tp->t_blk_res = 0;
 	}
 
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
-
 	return error;
 }
 
@@ -284,6 +278,8 @@ xfs_trans_alloc(
 	INIT_LIST_HEAD(&tp->t_dfops);
 	tp->t_firstblock = NULLFSBLOCK;
 
+	/* Mark this thread as being in a transaction */
+	xfs_trans_context_set(tp);
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
 	if (error) {
 		xfs_trans_cancel(tp);
@@ -878,7 +874,8 @@ __xfs_trans_commit(
 
 	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
 
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	if (!regrant)
+		xfs_trans_context_clear(tp);
 	xfs_trans_free(tp);
 
 	/*
@@ -910,7 +907,8 @@ __xfs_trans_commit(
 		xfs_log_ticket_ungrant(mp->m_log, tp->t_ticket);
 		tp->t_ticket = NULL;
 	}
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+
+	xfs_trans_context_clear(tp);
 	xfs_trans_free_items(tp, !!error);
 	xfs_trans_free(tp);
 
@@ -971,7 +969,7 @@ xfs_trans_cancel(
 	}
 
 	/* mark this thread as no longer being in a transaction */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);
 
 	xfs_trans_free_items(tp, dirty);
 	xfs_trans_free(tp);
@@ -1013,6 +1011,7 @@ xfs_trans_roll(
 	if (error)
 		return error;
 
+	xfs_trans_context_update(trans, *tpp);
 	/*
 	 * Reserve space in the log for the next transaction.
 	 * This also pushes items in the "AIL", the list of logged items,
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index b752501818d2..f84b563438f6 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -243,4 +243,34 @@ void xfs_trans_buf_copy_type(struct xfs_buf *dst_bp,
 
 extern kmem_zone_t *xfs_trans_zone;
 
+static inline void
+xfs_trans_context_set(struct xfs_trans *tp)
+{
+	ASSERT(!current->journal_info);
+	current->journal_info = tp;
+	tp->t_pflags = memalloc_nofs_save();
+}
+
+static inline void
+xfs_trans_context_update(struct xfs_trans *old, struct xfs_trans *new)
+{
+	ASSERT(current->journal_info == old);
+	current->journal_info = new;
+}
+
+static inline void
+xfs_trans_context_clear(struct xfs_trans *tp)
+{
+	ASSERT(current->journal_info == tp);
+	current->journal_info = NULL;
+	memalloc_nofs_restore(tp->t_pflags);
+}
+
+static inline bool
+xfs_trans_context_active(void)
+{
+	/* Use journal_info to indicate current is in a transaction */
+	return current->journal_info != NULL;
+}
+
 #endif /* __XFS_TRANS_H__ */
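
For readers who want to see the journal_info bookkeeping in isolation, here is a rough user-space sketch of the set/update/clear/active lifecycle. It is not kernel code and makes several assumptions: struct task, struct trans, the global current_task and the ctx_*() names are hypothetical stand-ins for task_struct, struct xfs_trans, current and the xfs_trans_context_*() helpers; assert() stands in for ASSERT(); and the memalloc_nofs_save()/memalloc_nofs_restore() half of the helpers is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_trans. */
struct trans {
	int id;
};

/* Hypothetical stand-in for task_struct; only journal_info is modelled. */
struct task {
	void *journal_info;	/* non-NULL while a transaction is active */
};

/* Stand-in for the kernel's "current"; a single task is enough here. */
static struct task current_task;

/* Mark the task as being in transaction tp; nesting is not allowed. */
static inline void ctx_set(struct trans *tp)
{
	assert(!current_task.journal_info);
	current_task.journal_info = tp;
}

/* Hand the context from old_tp to new_tp, as a transaction roll would. */
static inline void ctx_update(struct trans *old_tp, struct trans *new_tp)
{
	assert(current_task.journal_info == old_tp);
	current_task.journal_info = new_tp;
}

/* Leave the transaction context; tp must be the one that is active. */
static inline void ctx_clear(struct trans *tp)
{
	assert(current_task.journal_info == tp);
	current_task.journal_info = NULL;
}

/* True while the task holds a transaction context. */
static inline bool ctx_active(void)
{
	return current_task.journal_info != NULL;
}

/* Models the guard added to the writepage paths: true means "do not write". */
static inline bool writeback_must_skip(void)
{
	return ctx_active();
}
```

In this model, ctx_set(&a) followed by ctx_update(&a, &b) leaves the task pointing at b, so the matching ctx_clear() has to be given &b; writeback_must_skip() returns true for the whole span between set and clear.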