From patchwork Tue Nov 14 21:56:53 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10058439
From: Josef Bacik
To: hannes@cmpxchg.org, linux-mm@kvack.org, akpm@linux-foundation.org,
	jack@suse.cz, linux-fsdevel@vger.kernel.org, kernel-team@fb.com,
	linux-btrfs@vger.kernel.org
Cc: Josef Bacik
Subject: [PATCH 07/10] writeback: introduce super_operations->write_metadata
Date: Tue, 14 Nov 2017 16:56:53 -0500
Message-Id: <1510696616-8489-7-git-send-email-josef@toxicpanda.com>
X-Mailer: git-send-email 2.7.5
In-Reply-To: <1510696616-8489-1-git-send-email-josef@toxicpanda.com>
References: <1510696616-8489-1-git-send-email-josef@toxicpanda.com>

From: Josef Bacik

Now that we have metadata counters in the VM, we need to provide a way to
kick writeback on dirty metadata.  Introduce
super_operations->write_metadata.  This allows file systems to deal with
writing back any dirty metadata we need based on the writeback needs of
the system.  Since there is no inode to key off of we need a list in the
bdi for dirty super blocks to be added.  From there we can find any dirty
sb's on the bdi we are currently doing writeback on and call into their
->write_metadata callback.

Signed-off-by: Josef Bacik
Reviewed-by: Jan Kara
Reviewed-by: Tejun Heo
---
 fs/fs-writeback.c                | 72 ++++++++++++++++++++++++++++++++++++----
 fs/super.c                       |  6 ++++
 include/linux/backing-dev-defs.h |  2 ++
 include/linux/fs.h               |  4 +++
 mm/backing-dev.c                 |  2 ++
 5 files changed, 80 insertions(+), 6 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 987448ed7698..fba703dff678 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1479,6 +1479,31 @@ static long writeback_chunk_size(struct bdi_writeback *wb,
 	return pages;
 }
 
+static long writeback_sb_metadata(struct super_block *sb,
+				  struct bdi_writeback *wb,
+				  struct wb_writeback_work *work)
+{
+	struct writeback_control wbc = {
+		.sync_mode		= work->sync_mode,
+		.tagged_writepages	= work->tagged_writepages,
+		.for_kupdate		= work->for_kupdate,
+		.for_background		= work->for_background,
+		.for_sync		= work->for_sync,
+		.range_cyclic		= work->range_cyclic,
+		.range_start		= 0,
+		.range_end		= LLONG_MAX,
+	};
+	long write_chunk;
+
+	write_chunk = writeback_chunk_size(wb, work);
+	wbc.nr_to_write = write_chunk;
+	sb->s_op->write_metadata(sb, &wbc);
+	work->nr_pages -= write_chunk - wbc.nr_to_write;
+
+	return write_chunk - wbc.nr_to_write;
+}
+
+
 /*
  * Write a portion of b_io inodes which belong to @sb.
  *
@@ -1505,6 +1530,7 @@ static long writeback_sb_inodes(struct super_block *sb,
 	unsigned long start_time = jiffies;
 	long write_chunk;
 	long wrote = 0;  /* count both pages and inodes */
+	bool done = false;
 
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);
@@ -1621,12 +1647,18 @@ static long writeback_sb_inodes(struct super_block *sb,
 		 * background threshold and other termination conditions.
 		 */
 		if (wrote) {
-			if (time_is_before_jiffies(start_time + HZ / 10UL))
-				break;
-			if (work->nr_pages <= 0)
+			if (time_is_before_jiffies(start_time + HZ / 10UL) ||
+			    work->nr_pages <= 0) {
+				done = true;
 				break;
+			}
 		}
 	}
+	if (!done && sb->s_op->write_metadata) {
+		spin_unlock(&wb->list_lock);
+		wrote += writeback_sb_metadata(sb, wb, work);
+		spin_lock(&wb->list_lock);
+	}
 	return wrote;
 }
 
@@ -1635,6 +1667,7 @@ static long __writeback_inodes_wb(struct bdi_writeback *wb,
 {
 	unsigned long start_time = jiffies;
 	long wrote = 0;
+	bool done = false;
 
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);
@@ -1654,12 +1687,39 @@ static long __writeback_inodes_wb(struct bdi_writeback *wb,
 
 		/* refer to the same tests at the end of writeback_sb_inodes */
 		if (wrote) {
-			if (time_is_before_jiffies(start_time + HZ / 10UL))
-				break;
-			if (work->nr_pages <= 0)
+			if (time_is_before_jiffies(start_time + HZ / 10UL) ||
+			    work->nr_pages <= 0) {
+				done = true;
 				break;
+			}
 		}
 	}
+
+	if (!done && wb_stat(wb, WB_METADATA_DIRTY_BYTES)) {
+		LIST_HEAD(list);
+
+		spin_unlock(&wb->list_lock);
+		spin_lock(&wb->bdi->sb_list_lock);
+		list_splice_init(&wb->bdi->dirty_sb_list, &list);
+		while (!list_empty(&list)) {
+			struct super_block *sb;
+
+			sb = list_first_entry(&list, struct super_block,
+					      s_bdi_dirty_list);
+			list_move_tail(&sb->s_bdi_dirty_list,
+				       &wb->bdi->dirty_sb_list);
+			if (!sb->s_op->write_metadata)
+				continue;
+			if (!trylock_super(sb))
+				continue;
+			spin_unlock(&wb->bdi->sb_list_lock);
+			wrote += writeback_sb_metadata(sb, wb, work);
+			spin_lock(&wb->bdi->sb_list_lock);
+			up_read(&sb->s_umount);
+		}
+		spin_unlock(&wb->bdi->sb_list_lock);
+		spin_lock(&wb->list_lock);
+	}
 	/* Leave any unwritten inodes on b_io */
 	return wrote;
 }
diff --git a/fs/super.c b/fs/super.c
index 166c4ee0d0ed..2290bef486a3 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -214,6 +214,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	spin_lock_init(&s->s_inode_list_lock);
 	INIT_LIST_HEAD(&s->s_inodes_wb);
 	spin_lock_init(&s->s_inode_wblist_lock);
+	INIT_LIST_HEAD(&s->s_bdi_dirty_list);
 
 	if (list_lru_init_memcg(&s->s_dentry_lru))
 		goto fail;
@@ -446,6 +447,11 @@ void generic_shutdown_super(struct super_block *sb)
 	spin_unlock(&sb_lock);
 	up_write(&sb->s_umount);
 	if (sb->s_bdi != &noop_backing_dev_info) {
+		if (!list_empty(&sb->s_bdi_dirty_list)) {
+			spin_lock(&sb->s_bdi->sb_list_lock);
+			list_del_init(&sb->s_bdi_dirty_list);
+			spin_unlock(&sb->s_bdi->sb_list_lock);
+		}
 		bdi_put(sb->s_bdi);
 		sb->s_bdi = &noop_backing_dev_info;
 	}
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 78c65e2910dc..a961f9a51a38 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -176,6 +176,8 @@ struct backing_dev_info {
 
 	struct timer_list laptop_mode_wb_timer;
 
+	spinlock_t sb_list_lock;
+	struct list_head dirty_sb_list;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 	struct dentry *debug_stats;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..298a28eaed2b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1440,6 +1440,8 @@ struct super_block {
 
 	spinlock_t		s_inode_wblist_lock;
 	struct list_head	s_inodes_wb;	/* writeback inodes */
+
+	struct list_head	s_bdi_dirty_list;
 } __randomize_layout;
 
 /* Helper functions so that in most cases filesystems will
@@ -1830,6 +1832,8 @@ struct super_operations {
 				  struct shrink_control *);
 	long (*free_cached_objects)(struct super_block *,
 				    struct shrink_control *);
+	void (*write_metadata)(struct super_block *sb,
+			       struct writeback_control *wbc);
 };
 
 /*
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 8961ac364f06..1de7c49c5839 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -839,6 +839,8 @@ static int bdi_init(struct backing_dev_info *bdi)
 	bdi->max_prop_frac = FPROP_FRAC_BASE * PAGE_SIZE;
 	INIT_LIST_HEAD(&bdi->bdi_list);
 	INIT_LIST_HEAD(&bdi->wb_list);
+	INIT_LIST_HEAD(&bdi->dirty_sb_list);
+	spin_lock_init(&bdi->sb_list_lock);
 	init_waitqueue_head(&bdi->wb_waitq);
 
 	ret = cgwb_bdi_init(bdi);
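Illustration only, not part of the patch: the accounting contract that writeback_sb_metadata() appears to rely on is that ->write_metadata decrements wbc->nr_to_write by however much it wrote, so the caller can derive the amount written as write_chunk - wbc.nr_to_write. Below is a minimal userspace sketch of that contract. The names model_wbc, model_sb, demo_write_metadata and model_writeback_sb_metadata are hypothetical stand-ins invented for this sketch, and the callback's behaviour is an assumption about how a filesystem might implement the hook, not code from any real filesystem.

```c
/* Hypothetical userspace stand-ins; only the fields this sketch needs. */
struct model_wbc {
	long nr_to_write;		/* remaining budget, as in writeback_control */
};

struct model_sb {
	long dirty_metadata_pages;	/* assumed per-sb dirty metadata count */
};

/*
 * Assumed shape of a filesystem's ->write_metadata: write at most
 * wbc->nr_to_write units and reduce the budget by what was written.
 */
static void demo_write_metadata(struct model_sb *sb, struct model_wbc *wbc)
{
	long n = sb->dirty_metadata_pages;

	if (n > wbc->nr_to_write)
		n = wbc->nr_to_write;
	sb->dirty_metadata_pages -= n;
	wbc->nr_to_write -= n;
}

/*
 * Mirrors the arithmetic in writeback_sb_metadata(): hand the callback a
 * budget of write_chunk, then charge the difference to the remaining
 * work (*nr_pages) and return it as the amount written.
 */
static long model_writeback_sb_metadata(struct model_sb *sb, long write_chunk,
					long *nr_pages)
{
	struct model_wbc wbc = { .nr_to_write = write_chunk };

	demo_write_metadata(sb, &wbc);
	*nr_pages -= write_chunk - wbc.nr_to_write;

	return write_chunk - wbc.nr_to_write;
}
```

A callback that leaves nr_to_write untouched would make the caller account zero progress under this scheme, which is why the sketch decrements it explicitly.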