[v3,5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

Message ID 20171012170628.GS3301751@devbig577.frc2.facebook.com
State New

Commit Message

Tejun Heo Oct. 12, 2017, 5:06 p.m. UTC
Issuing metadata or otherwise shared IOs from !root cgroup can lead to
priority inversion.  This patch ensures that those IOs are always
issued from the root cgroup.

v3: Dropped unnecessary btree_inode handling as suggested by David
    Sterba.

v2: Fixed missing @bh in submit_bh_blkcg_css() call.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
---
 fs/btrfs/check-integrity.c |    2 +-
 fs/btrfs/disk-io.c         |    4 ++++
 fs/btrfs/ioctl.c           |    4 ++++
 3 files changed, 9 insertions(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Sterba Oct. 12, 2017, 6:56 p.m. UTC | #1
On Thu, Oct 12, 2017 at 10:06:28AM -0700, Tejun Heo wrote:
> Issuing metadata or otherwise shared IOs from !root cgroup can lead to
> priority inversion.  This patch ensures that those IOs are always
> issued from the root cgroup.
> 
> v3: Dropped unnecessary btree_inode handling as suggested by David
>     Sterba.
> 
> v2: Fixed missing @bh in submit_bh_blkcg_css() call.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
> Cc: David Sterba <dsterba@suse.cz>

Acked-by: David Sterba <dsterba@suse.com>

Patch

--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2741,7 +2741,7 @@  int btrfsic_submit_bh(int op, int op_fla
 	struct btrfsic_dev_state *dev_state;
 
 	if (!btrfsic_is_initialized)
-		return submit_bh(op, op_flags, bh);
+		return submit_bh_blkcg_css(op, op_flags, bh, blkcg_root_css);
 
 	mutex_lock(&btrfsic_mutex);
 	/* since btrfsic_submit_bh() might also be called before
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1025,6 +1025,8 @@  static blk_status_t btree_submit_bio_hoo
 	int async = check_async_write(bio_flags);
 	blk_status_t ret;
 
+	bio_associate_blkcg(bio, blkcg_root_css);
+
 	if (bio_op(bio) != REQ_OP_WRITE) {
 		/*
 		 * called for a read, do the setup so that checksum validation
@@ -3512,6 +3514,8 @@  static void write_dev_flush(struct btrfs
 		return;
 
 	bio_reset(bio);
+	bio_associate_blkcg(bio, blkcg_root_css);
+
 	bio->bi_end_io = btrfs_end_empty_barrier;
 	bio_set_dev(bio, device->bdev);
 	bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH;
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -150,6 +150,10 @@  void btrfs_update_iflags(struct inode *i
 		new_fl |= S_NOATIME;
 	if (ip->flags & BTRFS_INODE_DIRSYNC)
 		new_fl |= S_DIRSYNC;
+	/*
+	 * The btree_inode will always be in the root cgroup. Cgroup
+	 * writeback can be enabled selectively on regular inodes.
+	 */
 	new_fl |= S_CGROUPWB;
 
 	set_mask_bits(&inode->i_flags,