From patchwork Tue Jan 1 02:16:50 2019
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 10745633
Subject: [PATCH 00/12] xfs: deferred inode inactivation
From: "Darrick J. Wong"
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 31 Dec 2018 18:16:50 -0800
Message-ID: <154630901076.16693.13111277988041606505.stgit@magnolia>

Hi all,

This is a new patch series implementing deferred inode inactivation.
Inactivation is the process of updating all on-disk metadata when a file is deleted -- freeing the data/attr/COW fork extent allocations, removing the inode from the unlinked hash, marking the inode record itself free, and updating the inode btrees so that they show the inode as not being in use.

Currently, all this inactivation is performed during in-core inode reclaim, which creates two big headaches: first, this makes direct memory reclamation /really/ slow, and second, it prohibits us from partially freezing the filesystem for online fsck activity because scrub can hit direct memory reclaim. It's ok for scrub to fail with ENOMEM, but it's not ok for scrub to deadlock memory reclaim. :)

The implementation will be familiar to those who have studied how XFS scans for reclaimable in-core inodes -- we create a couple more inode state flags to mark an inode as needing inactivation and being in the middle of inactivation. When inodes need inactivation, we set iflags, set the RECLAIM radix tree tag, update a count of how many resources will be freed by the pending inactivations, and schedule a deferred work item. The deferred work item scans the inode radix tree for inodes to inactivate, and does all the on-disk metadata updates. Once the inode has been inactivated, it is left in the reclaim state and the background reclaim worker (or direct reclaim) will get to it eventually.

Patch 1 fixes fs freeze to clean out COW extents when possible.

Patches 2-3 refactor some of the inactivation predicates.

Patches 4-5 implement the count of blocks/quota that can be freed by running inactivation; this is necessary to preserve the behavior where you rm a file and the fs counters update immediately.

Patches 6-7 refactor more inode reclaim code so that we can reuse some of it for inactivation.

Patch 8 delivers the core of the inactivation changes by altering the inode lifetime state machine to include the new inode flags and background workers.

Patches 9-10 make it so that if an allocation attempt hits ENOSPC it will force inactivation to free resources and try again.

Patch 11 converts the per-fs inactivation scanner to be tracked on a per-AG basis so that we can be more targeted in our inactivation.

Patch 12 makes it so that a process deleting a directory tree or a very fragmented file will wait for inactivation to happen so that a deltree cannot flood the system with inactive inodes.

If you're going to start using this mess, you probably ought to just pull from my git trees. The kernel patches[1] should apply against 4.20. xfsprogs[2] and xfstests[3] can be found in their usual places. The git trees contain all four series' worth of changes.

This is an extraordinary way to destroy everything. Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel
[2] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel
[3] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=djwong-devel
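
For readers who want a mental model of the lifecycle described above, here is a rough userspace sketch in C. It is not the kernel code from this series. Every name in it (the SK_* flags, struct sk_inode, struct sk_mount, and the sk_* helpers) is hypothetical, a flat array stands in for the inode radix tree, and a boolean stands in for the deferred work item. It only illustrates the ordering: queue the inode and account for the pending frees, let a worker do the "on-disk" updates later, and force that worker when an allocation would otherwise hit ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state flags; the real series defines its own inode flags. */
#define SK_NEED_INACTIVE (1u << 0) /* queued, on-disk updates still pending */
#define SK_INACTIVATING  (1u << 1) /* worker is doing the on-disk updates */
#define SK_RECLAIMABLE   (1u << 2) /* inactivated, waiting for reclaim */

struct sk_inode {
    unsigned int flags;
    uint64_t     nblocks;        /* blocks that inactivation would free */
};

struct sk_mount {
    uint64_t free_blocks;        /* blocks actually free right now */
    uint64_t pending_blocks;     /* blocks queued inactivations will free */
    bool     work_scheduled;     /* stand-in for a deferred work item */
};

/* Queue an inode for later inactivation instead of doing it inline. */
static void sk_queue_inactivation(struct sk_mount *mp, struct sk_inode *ip)
{
    ip->flags |= SK_NEED_INACTIVE;
    mp->pending_blocks += ip->nblocks;
    mp->work_scheduled = true;
}

/* Free space as a user would see it: pending frees already count. */
static uint64_t sk_reported_free(const struct sk_mount *mp)
{
    return mp->free_blocks + mp->pending_blocks;
}

/* Deferred worker: scan for queued inodes and perform the updates. */
static size_t sk_inactivate_worker(struct sk_mount *mp,
                                   struct sk_inode *inodes, size_t n)
{
    size_t done = 0;

    for (size_t i = 0; i < n; i++) {
        struct sk_inode *ip = &inodes[i];

        if (!(ip->flags & SK_NEED_INACTIVE))
            continue;
        ip->flags = (ip->flags & ~SK_NEED_INACTIVE) | SK_INACTIVATING;
        /* Stand-in for freeing extents and updating the inode btrees. */
        mp->free_blocks += ip->nblocks;
        mp->pending_blocks -= ip->nblocks;
        ip->nblocks = 0;
        /* Leave the inode for the (not modelled) reclaim worker. */
        ip->flags = (ip->flags & ~SK_INACTIVATING) | SK_RECLAIMABLE;
        done++;
    }
    mp->work_scheduled = false;
    return done;
}

/* Allocation that forces inactivation once before giving up with ENOSPC. */
static int sk_alloc_blocks(struct sk_mount *mp, struct sk_inode *inodes,
                           size_t n, uint64_t want)
{
    if (mp->free_blocks < want && mp->pending_blocks > 0)
        sk_inactivate_worker(mp, inodes, n);
    if (mp->free_blocks < want)
        return -ENOSPC;
    mp->free_blocks -= want;
    return 0;
}
```

The sketch keeps pending_blocks separate from free_blocks so that the reported free space can change as soon as an inode is queued, while allocations still see only what has really been freed. Locking, per-AG tracking, and the throttling of heavy deleters are deliberately left out.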