From patchwork Tue May 29 11:50:47 2018
X-Patchwork-Submitter: Mike Rapoport
X-Patchwork-Id: 10435055
Date: Tue, 29 May 2018 14:50:47 +0300
From: Mike Rapoport
To: Michal Hocko
Cc: Jonathan Corbet, Dave Chinner, Randy Dunlap, LKML,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko
Subject: Re: [PATCH v2] doc: document scope NOFS, NOIO APIs
References: <20180524114341.1101-1-mhocko@kernel.org>
 <20180529082644.26192-1-mhocko@kernel.org>
In-Reply-To: <20180529082644.26192-1-mhocko@kernel.org>
Message-Id: <20180529115047.GC13092@rapoport-lnx>

On Tue, May 29, 2018 at 10:26:44AM +0200, Michal Hocko wrote:
> From: Michal Hocko
>
> Although the api is documented in the source code Ted has pointed out
> that there is no mention in the core-api Documentation and there are
> people looking there to find answers how to use a specific API.
>
> Changes since v1
> - add kerneldoc for the api - suggested by Johnatan
> - review feedback from Dave and Johnatan
> - feedback from Dave about more general critical context rather than
>   locking
> - feedback from Mike
> - typo fixed - Randy, Dave
>
> Requested-by: "Theodore Y. Ts'o"
> Signed-off-by: Michal Hocko

I believe it's worth linking the kernel-doc part with the text.
e.g. with the diff at the end of this message.

Other than that, all looks good to me.

Reviewed-by: Mike Rapoport

> ---
>  .../core-api/gfp_mask-from-fs-io.rst | 61 +++++++++++++++++++
>  Documentation/core-api/index.rst     |  1 +
>  include/linux/sched/mm.h             | 38 ++++++++++++
>  3 files changed, 100 insertions(+)
>  create mode 100644 Documentation/core-api/gfp_mask-from-fs-io.rst
>
> diff --git a/Documentation/core-api/gfp_mask-from-fs-io.rst b/Documentation/core-api/gfp_mask-from-fs-io.rst
> new file mode 100644
> index 000000000000..2dc442b04a77
> --- /dev/null
> +++ b/Documentation/core-api/gfp_mask-from-fs-io.rst
> @@ -0,0 +1,61 @@
> +=================================
> +GFP masks used from FS/IO context
> +=================================
> +
> +:Date: May, 2018
> +:Author: Michal Hocko
> +
> +Introduction
> +============
> +
> +Code paths in the filesystem and IO stacks must be careful when
> +allocating memory to prevent recursion deadlocks caused by direct
> +memory reclaim calling back into the FS or IO paths and blocking on
> +already held resources (e.g. locks - most commonly those used for the
> +transaction context).
> +
> +The traditional way to avoid this deadlock problem is to clear __GFP_FS
> +respectively __GFP_IO (note the latter implies clearing the first as well) in
> +the gfp mask when calling an allocator. GFP_NOFS respectively GFP_NOIO can be
> +used as shortcut. It turned out though that above approach has led to
> +abuses when the restricted gfp mask is used "just in case" without a
> +deeper consideration which leads to problems because an excessive use
> +of GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or other memory
> +reclaim issues.
> +
> +New API
> +========
> +
> +Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> +``memalloc_nofs_save``, ``memalloc_nofs_restore`` respectively ``memalloc_noio_save``,
> +``memalloc_noio_restore`` which allow to mark a scope to be a critical
> +section from a filesystem or I/O point of view. Any allocation from that
> +scope will inherently drop __GFP_FS respectively __GFP_IO from the given
> +mask so no memory allocation can recurse back in the FS/IO.
> +
> +FS/IO code then simply calls the appropriate save function before
> +any critical section with respect to the reclaim is started - e.g.
> +lock shared with the reclaim context or when a transaction context
> +nesting would be possible via reclaim. The restore function should be
> +called when the critical section ends. All that ideally along with an
> +explanation what is the reclaim context for easier maintenance.
> +
> +Please note that the proper pairing of save/restore functions
> +allows nesting so it is safe to call ``memalloc_noio_save`` or
> +``memalloc_noio_restore`` respectively from an existing NOIO or NOFS
> +scope.
> +
> +What about __vmalloc(GFP_NOFS)
> +==============================
> +
> +vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> +GFP_KERNEL allocations deep inside the allocator which are quite non-trivial
> +to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> +almost always a bug. The good news is that the NOFS/NOIO semantic can be
> +achieved by the scope API.
> +
> +In the ideal world, upper layers should already mark dangerous contexts
> +and so no special care is required and vmalloc should be called without
> +any problems. Sometimes if the context is not really clear or there are
> +layering violations then the recommended way around that is to wrap ``vmalloc``
> +by the scope API with a comment explaining the problem.
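
To illustrate the vmalloc paragraph quoted above, here is a minimal
userspace sketch, not kernel code. Everything prefixed mock_ or MOCK_ is
hypothetical, including the bit values. The save/restore bodies follow the
mm.h hunks quoted in this patch, except that the line setting the flag in
save is not visible in the hunk context and is assumed here. The
mock_effective_gfp() filter is likewise an assumption about how an
allocator would honour the scope. The sketch shows why passing a NOFS mask
to a helper with a hardcoded GFP_KERNEL allocation does not help, while
wrapping the call in a scope does.

```c
#include <assert.h>

/* Userspace stand-ins, for illustration only; values are hypothetical. */
#define MOCK_GFP_IO		0x1u
#define MOCK_GFP_FS		0x2u
#define MOCK_GFP_KERNEL		(MOCK_GFP_IO | MOCK_GFP_FS)
#define MOCK_PF_MEMALLOC_NOFS	0x4u

static unsigned int mock_task_flags;	/* stands in for current->flags */

static unsigned int mock_memalloc_nofs_save(void)
{
	unsigned int flags = mock_task_flags & MOCK_PF_MEMALLOC_NOFS;

	mock_task_flags |= MOCK_PF_MEMALLOC_NOFS;	/* assumed, not in the hunk */
	return flags;
}

static void mock_memalloc_nofs_restore(unsigned int flags)
{
	mock_task_flags = (mock_task_flags & ~MOCK_PF_MEMALLOC_NOFS) | flags;
}

/* Assumed allocator-side filter: inside a NOFS scope, drop the FS bit. */
static unsigned int mock_effective_gfp(unsigned int gfp)
{
	if (mock_task_flags & MOCK_PF_MEMALLOC_NOFS)
		gfp &= ~MOCK_GFP_FS;
	return gfp;
}

/* Models an allocation deep in a callee that hardcodes GFP_KERNEL. */
static unsigned int mock_inner_alloc_mask(void)
{
	return mock_effective_gfp(MOCK_GFP_KERNEL);
}

/* The caller's restricted mask never reaches the hardcoded allocation. */
static unsigned int mock_call_with_nofs_mask(unsigned int gfp)
{
	(void)gfp;
	return mock_inner_alloc_mask();
}

/* Wrapping the call in a scope restricts the inner allocation as well. */
static unsigned int mock_call_in_nofs_scope(void)
{
	unsigned int flags, seen;

	/* Real code would say here which reclaim recursion is being avoided. */
	flags = mock_memalloc_nofs_save();
	seen = mock_inner_alloc_mask();
	mock_memalloc_nofs_restore(flags);
	return seen;
}
```

In this model, mock_call_with_nofs_mask() still reports a mask with the FS
bit set, while mock_call_in_nofs_scope() reports one without it and leaves
mock_task_flags as it found them.
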
> diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
> index c670a8031786..8a5f48ef16f2 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -25,6 +25,7 @@ Core utilities
>     genalloc
>     errseq
>     printk-formats
> +   gfp_mask-from-fs-io
>
>  Interfaces for kernel debugging
>  ===============================
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index e1f8411e6b80..af5ba077bbc4 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -166,6 +166,17 @@ static inline void fs_reclaim_acquire(gfp_t gfp_mask) { }
>  static inline void fs_reclaim_release(gfp_t gfp_mask) { }
>  #endif
>
> +/**
> + * memalloc_noio_save - Marks implicit GFP_NOIO allocation scope.
> + *
> + * This functions marks the beginning of the GFP_NOIO allocation scope.
> + * All further allocations will implicitly drop __GFP_IO flag and so
> + * they are safe for the IO critical section from the allocation recursion
> + * point of view. Use memalloc_noio_restore to end the scope with flags
> + * returned by this function.
> + *
> + * This function is safe to be used from any context.
> + */
>  static inline unsigned int memalloc_noio_save(void)
>  {
>  	unsigned int flags = current->flags & PF_MEMALLOC_NOIO;
> @@ -173,11 +184,30 @@ static inline unsigned int memalloc_noio_save(void)
>  	return flags;
>  }
>
> +/**
> + * memalloc_noio_restore - Ends the implicit GFP_NOIO scope.
> + * @flags: Flags to restore.
> + *
> + * Ends the implicit GFP_NOIO scope started by memalloc_noio_save function.
> + * Always make sure that that the given flags is the return value from the
> + * pairing memalloc_noio_save call.
> + */
>  static inline void memalloc_noio_restore(unsigned int flags)
>  {
>  	current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
>  }
>
> +/**
> + * memalloc_nofs_save - Marks implicit GFP_NOFS allocation scope.
> + *
> + * This functions marks the beginning of the GFP_NOFS allocation scope.
> + * All further allocations will implicitly drop __GFP_FS flag and so
> + * they are safe for the FS critical section from the allocation recursion
> + * point of view. Use memalloc_nofs_restore to end the scope with flags
> + * returned by this function.
> + *
> + * This function is safe to be used from any context.
> + */
>  static inline unsigned int memalloc_nofs_save(void)
>  {
>  	unsigned int flags = current->flags & PF_MEMALLOC_NOFS;
> @@ -185,6 +215,14 @@ static inline unsigned int memalloc_nofs_save(void)
>  	return flags;
>  }
>
> +/**
> + * memalloc_nofs_restore - Ends the implicit GFP_NOFS scope.
> + * @flags: Flags to restore.
> + *
> + * Ends the implicit GFP_NOFS scope started by memalloc_nofs_save function.
> + * Always make sure that that the given flags is the return value from the
> + * pairing memalloc_nofs_save call.
> + */
>  static inline void memalloc_nofs_restore(unsigned int flags)
>  {
>  	current->flags = (current->flags & ~PF_MEMALLOC_NOFS) | flags;
> --
> 2.17.0
>

diff --git a/Documentation/core-api/gfp_mask-from-fs-io.rst b/Documentation/core-api/gfp_mask-from-fs-io.rst
index 2dc442b..b001f5f 100644
--- a/Documentation/core-api/gfp_mask-from-fs-io.rst
+++ b/Documentation/core-api/gfp_mask-from-fs-io.rst
@@ -59,3 +59,10 @@ and so no special care is required and vmalloc should be called without
 any problems. Sometimes if the context is not really clear or there are
 layering violations then the recommended way around that is to wrap ``vmalloc``
 by the scope API with a comment explaining the problem.
+
+Reference
+=========
+
+.. kernel-doc:: include/linux/sched/mm.h
+   :functions: memalloc_nofs_save memalloc_nofs_restore \
+               memalloc_noio_save memalloc_noio_restore
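
The pairing rule in the kernel-doc comments quoted above ("the given flags
is the return value from the pairing ... save call") may be easier to see
with a small userspace model. This is only a sketch. The mock_ names and
the bit value are hypothetical, and the line that sets the flag in save is
assumed, since the quoted hunk does not show it. It compares a correctly
paired nested scope with one where the inner restore is handed the outer
save's return value.

```c
#include <assert.h>

#define MOCK_PF_MEMALLOC_NOIO	0x1u	/* hypothetical bit value */

static unsigned int mock_task_flags;	/* stands in for current->flags */

static unsigned int mock_memalloc_noio_save(void)
{
	unsigned int flags = mock_task_flags & MOCK_PF_MEMALLOC_NOIO;

	mock_task_flags |= MOCK_PF_MEMALLOC_NOIO;	/* assumed, not in the hunk */
	return flags;
}

static void mock_memalloc_noio_restore(unsigned int flags)
{
	mock_task_flags = (mock_task_flags & ~MOCK_PF_MEMALLOC_NOIO) | flags;
}

/*
 * Correct pairing: each restore gets the value its own save returned.
 * Returns nonzero if the outer scope is still in force after the inner
 * scope has ended.
 */
static int mock_paired_outer_survives(void)
{
	unsigned int outer = mock_memalloc_noio_save();
	unsigned int inner = mock_memalloc_noio_save();
	int still_noio;

	mock_memalloc_noio_restore(inner);
	still_noio = !!(mock_task_flags & MOCK_PF_MEMALLOC_NOIO);
	mock_memalloc_noio_restore(outer);
	return still_noio;
}

/*
 * Mispairing: the inner restore is handed the outer save's value, which
 * clears the flag while the outer critical section is still running.
 */
static int mock_mispaired_outer_survives(void)
{
	unsigned int outer = mock_memalloc_noio_save();
	unsigned int inner = mock_memalloc_noio_save();
	int still_noio;

	(void)inner;				/* deliberately ignored */
	mock_memalloc_noio_restore(outer);	/* wrong value for this scope */
	still_noio = !!(mock_task_flags & MOCK_PF_MEMALLOC_NOIO);
	mock_memalloc_noio_restore(outer);
	return still_noio;
}
```

In this model the paired variant keeps the outer scope in force until its
own restore, while the mispaired variant drops it early, which is the
situation the kernel-doc warns about.
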