From patchwork Thu Jul 18 11:34:24 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Younger Liu X-Patchwork-Id: 2829626 Return-Path: X-Original-To: patchwork-ocfs2-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 56A7FC0AB2 for ; Thu, 18 Jul 2013 11:41:48 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 569362021E for ; Thu, 18 Jul 2013 11:41:47 +0000 (UTC) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A71C720218 for ; Thu, 18 Jul 2013 11:41:45 +0000 (UTC) Received: from acsinet21.oracle.com (acsinet21.oracle.com [141.146.126.237]) by aserp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r6IBfYAM021545 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 18 Jul 2013 11:41:35 GMT Received: from oss.oracle.com (oss-external.oracle.com [137.254.96.51]) by acsinet21.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r6IBfV8N006604 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 18 Jul 2013 11:41:32 GMT Received: from localhost ([127.0.0.1] helo=oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1UzmaR-0007OM-Jq; Thu, 18 Jul 2013 04:41:31 -0700 Received: from ucsinet22.oracle.com ([156.151.31.94]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1Uzma9-0007O0-AK for ocfs2-devel@oss.oracle.com; Thu, 18 Jul 2013 04:41:13 -0700 Received: from aserp1030.oracle.com (aserp1030.oracle.com [141.146.126.68]) by ucsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r6IBfCH8029171 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 18 Jul 2013 11:41:12 GMT Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [119.145.14.65]) by aserp1030.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r6IBf8cF013020 (version=TLSv1/SSLv3 cipher=DES-CBC3-SHA bits=168 verify=FAIL) for ; Thu, 18 Jul 2013 11:41:11 GMT Received: from 172.24.2.119 (EHLO szxeml212-edg.china.huawei.com) ([172.24.2.119]) by szxrg02-dlp.huawei.com (MOS 4.3.4-GA FastPath queued) with ESMTP id BEU29150; Thu, 18 Jul 2013 19:35:28 +0800 (CST) Received: from SZXEML404-HUB.china.huawei.com (10.82.67.59) by szxeml212-edg.china.huawei.com (172.24.2.181) with Microsoft SMTP Server (TLS) id 14.1.323.7; Thu, 18 Jul 2013 19:34:27 +0800 Received: from [127.0.0.1] (10.135.69.19) by szxeml404-hub.china.huawei.com (10.82.67.59) with Microsoft SMTP Server id 14.1.323.7; Thu, 18 Jul 2013 19:34:26 +0800 Message-ID: <51E7D2C0.50602@huawei.com> Date: Thu, 18 Jul 2013 19:34:24 +0800 From: Younger Liu User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: Andrew Morton X-Originating-IP: [10.135.69.19] X-CFilter-Loop: Reflected X-Flow-Control-Info: class=Pass-to-MM reputation=ipRisk-All ip=119.145.14.65 ct-class=R3 ct-vol1=0 ct-vol2=6 ct-vol3=5 ct-risk=10 ct-spam1=0 ct-spam2=0 ct-bulk=87 rcpts=1 size=4497 X-Sendmail-CM-Score: 0.00% X-Sendmail-CM-Analysis: v=2.1 cv=IZS+Bwaa c=1 sm=1 tr=0 a=qbZWUeANkjeORAZY4leFnw==:117 a=qbZWUeANkjeORAZY4leFnw==:17 a=5VX5zscz4CYA:10 a=mEcZkHQFRz0A:10 a=CaQsjZ6uBp0A:10 a=O9dq5j03pVQA:10 a=8nJEP1OIZ-IA:10 a=i0EeH86SAAAA:8 a=fNr5SzA9L5gA:10 a=yPCof4ZbAAAA:8 a= 8oBYv7vgZYH1YLscUY4A:9 a=wPNLvfGTeEIA:10 a=hPjdaMEvmhQA:10 a=7DSvI1NPTFQA:10 X-Sendmail-CT-Classification: not spam X-Sendmail-CT-RefID: str=0001.0A020206.51E7D457.0133:SCFSTAT1612107, ss=1, re=-4.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 Cc: Ocfs2-Devel Subject: [Ocfs2-devel] [PATCH V2 RESENT] ocfs2: lighten up allocate transaction X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Source-IP: acsinet21.oracle.com [141.146.126.237] X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The issue scenario is as following: When fallocating a very large disk space for a small file, __ocfs2_extend_allocation attempts to get a very large transaction. For some journal sizes, there may be not enough room for this transaction, and the fallocate will fail. The patch below extents & restarts the transaction as necessary while allocating space, and should work with even the smallest journal. This patch refers ext4 resize. Test: # mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc ...(jounral size is 32M) # mount.ocfs2 /dev/sdc /mnt/ocfs2/ # touch /mnt/ocfs2/1.log # fallocate -o 0 -l 400G /mnt/ocfs2/1.log fallocate: /mnt/ocfs2/1.log: fallocate failed: Cannot allocate memory # tail -f /var/log/messages [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048) [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12 [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12 [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12 ^C With this patch, the test works well. Signed-off-by: Younger Liu Tested-by: Jie Liu --- fs/ocfs2/file.c | 6 +----- fs/ocfs2/journal.c | 34 ++++++++++++++++++++++++++++++++++ fs/ocfs2/journal.h | 11 +++++++++++ fs/ocfs2/ocfs2_trace.h | 2 ++ 4 files changed, 48 insertions(+), 5 deletions(-) -- 1.7.9.7 diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 05c0bb1..4943918 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -666,11 +666,7 @@ restarted_transaction: } else { BUG_ON(why != RESTART_TRANS); - /* TODO: This can be more intelligent. */ - credits = ocfs2_calc_extend_credits(osb->sb, - &fe->id2.i_list, - clusters_to_add); - status = ocfs2_extend_trans(handle, credits); + status = ocfs2_allocate_extend_trans(handle, 1); if (status < 0) { /* handle still has to be committed at * this point. */ diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index 8eccfab..46bce79 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -455,6 +455,40 @@ bail: return status; } +/* + * If we have fewer than thresh credits, extend by OCFS2_MAX_TRANS_DATA. + * If that fails, restart the transaction & regain write access for the + * buffer head which is used for metadata modifications. + * Taken from Ext4: extend_or_restart_transaction() + */ +int ocfs2_allocate_extend_trans(handle_t *handle, int thresh) +{ + int status, old_nblks; + + BUG_ON(!handle); + + old_nblks = handle->h_buffer_credits; + trace_ocfs2_allocate_extend_trans(old_nblks, thresh); + + if (old_nblks < thresh) + return 0; + + status = jbd2_journal_extend(handle, OCFS2_MAX_TRANS_DATA); + if (status < 0) { + mlog_errno(status); + goto bail; + } + + if (status > 0) { + status = jbd2_journal_restart(handle, OCFS2_MAX_TRANS_DATA); + if (status < 0) + mlog_errno(status); + } + +bail: + return status; +} + struct ocfs2_triggers { struct jbd2_buffer_trigger_type ot_triggers; int ot_offset; diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h index a3385b6..62a26bd 100644 --- a/fs/ocfs2/journal.h +++ b/fs/ocfs2/journal.h @@ -259,6 +259,17 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb, int ocfs2_commit_trans(struct ocfs2_super *osb, handle_t *handle); int ocfs2_extend_trans(handle_t *handle, int nblocks); +int ocfs2_allocate_extend_trans(handle_t *handle, + int thresh); + +/* + * Define an arbitrary limit for the amount of data we will anticipate + * writing to any given transaction. For unbounded transactions such as + * fallocate(2) we can write more than this, but we always + * start off at the maximum transaction size and grow the transaction + * optimistically as we go. + */ +#define OCFS2_MAX_TRANS_DATA 64U /* * Create access is for when we get a newly created buffer and we're diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h index 3b481f4..1b60c62 100644 --- a/fs/ocfs2/ocfs2_trace.h +++ b/fs/ocfs2/ocfs2_trace.h @@ -2579,6 +2579,8 @@ DEFINE_OCFS2_INT_INT_EVENT(ocfs2_extend_trans); DEFINE_OCFS2_INT_EVENT(ocfs2_extend_trans_restart); +DEFINE_OCFS2_INT_INT_EVENT(ocfs2_allocate_extend_trans); + DEFINE_OCFS2_ULL_ULL_UINT_UINT_EVENT(ocfs2_journal_access); DEFINE_OCFS2_ULL_EVENT(ocfs2_journal_dirty);