From patchwork Tue Mar 10 19:45:16 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josef Bacik X-Patchwork-Id: 5980241 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 004ABBF440 for ; Tue, 10 Mar 2015 19:47:35 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1784B20165 for ; Tue, 10 Mar 2015 19:47:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1B1F120108 for ; Tue, 10 Mar 2015 19:47:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753459AbbCJTrR (ORCPT ); Tue, 10 Mar 2015 15:47:17 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:59287 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753123AbbCJTrO (ORCPT ); Tue, 10 Mar 2015 15:47:14 -0400 Received: from pps.filterd (m0044010 [127.0.0.1]) by mx0a-00082601.pphosted.com (8.14.5/8.14.5) with SMTP id t2AJj8k7019997; Tue, 10 Mar 2015 12:47:09 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=facebook; bh=GdVybfWYpNgAXAbbQzs01QuLaC+FIGZdg9shgaZP+gA=; b=j9xxFYUYBhJHy/N5ltZG9uXOS6hiaVs2LPdReqJRPjikfL3IjV1YvmeShmKOWUzWfls2 7f1ARMkH9s97l6/eK1ss+ww485nd1jedFp7fUf7KbOmM4xykdMWybMhb2pxQoKJMkjvM mRuQ+sD6XBSvNC1JqBUvaDszpjtx3dhqGX4= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 1t270a807w-10 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Tue, 10 Mar 2015 12:47:08 -0700 Received: from localhost (192.168.54.13) by mail.thefacebook.com (192.168.16.15) with Microsoft SMTP Server (TLS) id 14.3.195.1; Tue, 10 Mar 2015 12:45:32 -0700 From: Josef Bacik To: , , , , CC: Dave Chinner Subject: [PATCH 1/9] writeback: plug writeback at a high level Date: Tue, 10 Mar 2015 15:45:16 -0400 Message-ID: <1426016724-23912-2-git-send-email-jbacik@fb.com> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1426016724-23912-1-git-send-email-jbacik@fb.com> References: <1426016724-23912-1-git-send-email-jbacik@fb.com> MIME-Version: 1.0 X-Originating-IP: [192.168.54.13] X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.13.68, 1.0.33, 0.0.0000 definitions=2015-03-10_07:2015-03-10, 2015-03-10, 1970-01-01 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 kscore.is_bulkscore=0 kscore.compositescore=0 circleOfTrustscore=0 compositescore=0.925924926977281 suspectscore=0 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0 kscore.is_spamscore=0 rbsscore=0.925924926977281 recipient_to_sender_totalscore=0 recipient_domain_to_sender_domain_totalscore=0 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.925924926977281 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1402240000 definitions=main-1503100201 X-FB-Internal: deliver Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner Doing writeback on lots of little files causes terrible IOPS storms because of the per-mapping writeback plugging we do. This essentially causes imeediate dispatch of IO for each mapping, regardless of the context in which writeback is occurring. IOWs, running a concurrent write-lots-of-small 4k files using fsmark on XFS results in a huge number of IOPS being issued for data writes. Metadata writes are sorted and plugged at a high level by XFS, so aggregate nicely into large IOs. However, data writeback IOs are dispatched in individual 4k IOs, even when the blocks of two consecutively written files are adjacent. Test VM: 8p, 8GB RAM, 4xSSD in RAID0, 100TB sparse XFS filesystem, metadata CRCs enabled. Kernel: 3.10-rc5 + xfsdev + my 3.11 xfs queue (~70 patches) Test: $ ./fs_mark -D 10000 -S0 -n 10000 -s 4096 -L 120 -d /mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d /mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d /mnt/scratch/6 -d /mnt/scratch/7 Result: wall sys create rate Physical write IO time CPU (avg files/s) IOPS Bandwidth ----- ----- ------------ ------ --------- unpatched 6m56s 15m47s 24,000+/-500 26,000 130MB/s patched 5m06s 13m28s 32,800+/-600 1,500 180MB/s improvement -26.44% -14.68% +36.67% -94.23% +38.46% If I use zero length files, this workload at about 500 IOPS, so plugging drops the data IOs from roughly 25,500/s to 1000/s. 3 lines of code, 35% better throughput for 15% less CPU. The benefits of plugging at this layer are likely to be higher for spinning media as the IO patterns for this workload are going make a much bigger difference on high IO latency devices..... Signed-off-by: Dave Chinner Reviewed-by: Jan Kara --- fs/fs-writeback.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index e907052..a9ff2b7 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -659,7 +659,9 @@ static long writeback_sb_inodes(struct super_block *sb, unsigned long start_time = jiffies; long write_chunk; long wrote = 0; /* count both pages and inodes */ + struct blk_plug plug; + blk_start_plug(&plug); while (!list_empty(&wb->b_io)) { struct inode *inode = wb_inode(wb->b_io.prev); @@ -756,6 +758,7 @@ static long writeback_sb_inodes(struct super_block *sb, break; } } + blk_finish_plug(&plug); return wrote; }