From patchwork Fri Dec 28 08:29:50 2018
X-Patchwork-Submitter: Jia Guo
X-Patchwork-Id: 10744263
From: Jia Guo
To: "mark@fasheh.com", "jlbec@evilplan.org", "junxiao.bi@oracle.com",
 "jiangqi903@gmail.com", "akpm@linux-foundation.org"
Cc: zren@suse.com, tao.ma@oracle.com, "ocfs2-devel@oss.oracle.com"
Date: Fri, 28 Dec 2018 16:29:50 +0800
Subject: [Ocfs2-devel] [PATCH v2] ocfs2: flush truncate log when main bitmap
 run out of space

Truncate log problem has been described in commit 50308d813bf2 ("ocfs2: Try to free truncate log
when meeting ENOSPC in write.") and commit 2070ad1aebff ("ocfs2: retry on
ENOSPC if sufficient space in truncate log"), but the following situations
are still not handled:

Case 1: The main bitmap has been updated, but the transaction that deleted
the blocks has not been fully committed. In this case,
ocfs2_reserve_cluster_bitmap_bits() returns success while
__ocfs2_claim_clusters() returns ENOSPC, because the deleted blocks cannot
be reused before the transaction is committed (this is tested by
ocfs2_test_bg_bit_allocatable()). This can be reproduced with the following
steps:

a. Prepare a 50G file on a volume that still has 30G of free space.
b. Open the file with the O_TRUNC flag.
c. Sleep 5 seconds.
d. fallocate a 50G file; the fallocate will fail.

Case 2: Neither the main bitmap nor the truncate log has enough free bits
on its own, but the main bitmap plus the truncate log does. We do not try
to flush the truncate log in this case, which is not reasonable.

So force a commit of the transaction when flushing the truncate log for
case 1. For case 2, the value of osb->truncated_clusters does not seem to
be meaningful; flushing whenever we run out of space seems more reasonable.

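For reference, steps b to d of the case 1 reproducer could be scripted
roughly as below. This is only a sketch and is not part of the patch: the
helper name truncate_then_fallocate() and its parameters are made up for
illustration, posix_fallocate() stands in for whatever fallocate tool was
used in the original test, and step a (preparing the 50G file on a nearly
full OCFS2 volume) is assumed to have been done by hand.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64	/* keep off_t 64-bit so 50G sizes fit */

#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): truncate old_path, wait
 * delay_sec seconds, then preallocate size bytes in new_path.
 * Returns 0 if the allocation succeeded, or a positive errno value
 * (for example ENOSPC) if any step failed.
 */
int truncate_then_fallocate(const char *old_path, const char *new_path,
			    off_t size, unsigned int delay_sec)
{
	int fd, err;

	/* Step b: open the existing large file with O_TRUNC. */
	fd = open(old_path, O_WRONLY | O_TRUNC);
	if (fd < 0)
		return errno;
	close(fd);

	/* Step c: give the truncate some time before allocating again. */
	sleep(delay_sec);

	/* Step d: try to allocate the space for a new file. */
	fd = open(new_path, O_WRONLY | O_CREAT, 0644);
	if (fd < 0)
		return errno;
	err = posix_fallocate(fd, 0, size);	/* returns an errno value */
	close(fd);

	return err;
}
```

With both paths on the test volume, a call along the lines of
truncate_then_fallocate(old, new, (off_t)50 << 30, 5) would be expected to
return ENOSPC on an affected kernel, going by the description above.
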
Signed-off-by: Jia Guo
Reviewed-by: Yiwen Jiang
Reviewed-by: Gang He
Acked-by: Joseph Qi
---
 fs/ocfs2/alloc.c    | 43 +++----------------------------------------
 fs/ocfs2/alloc.h    |  2 --
 fs/ocfs2/aops.c     |  8 +++-----
 fs/ocfs2/ocfs2.h    |  5 -----
 fs/ocfs2/suballoc.c |  6 +++---
 5 files changed, 9 insertions(+), 55 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index d1cbb27..8b6938d 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -5921,7 +5921,6 @@ int ocfs2_truncate_log_append(struct ocfs2_super *osb,
 
 	ocfs2_journal_dirty(handle, tl_bh);
 
-	osb->truncated_clusters += num_clusters;
 bail:
 	return status;
 }
@@ -5940,6 +5939,7 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
 	struct inode *tl_inode = osb->osb_tl_inode;
 	struct buffer_head *tl_bh = osb->osb_tl_bh;
 	handle_t *handle;
+	tid_t target;
 
 	di = (struct ocfs2_dinode *) tl_bh->b_data;
 	tl = &di->id2.i_dealloc;
@@ -5989,8 +5989,8 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
 		ocfs2_commit_trans(osb, handle);
 		i--;
 	}
-
-	osb->truncated_clusters = 0;
+	if (jbd2_journal_start_commit(osb->journal->j_journal, &target))
+		jbd2_log_wait_commit(osb->journal->j_journal, target);
 
 bail:
 	return status;
@@ -6102,43 +6102,6 @@ void ocfs2_schedule_truncate_log_flush(struct ocfs2_super *osb,
 	}
 }
 
-/*
- * Try to flush truncate logs if we can free enough clusters from it.
- * As for return value, "< 0" means error, "0" no space and "1" means
- * we have freed enough spaces and let the caller try to allocate again.
- */
-int ocfs2_try_to_free_truncate_log(struct ocfs2_super *osb,
-					unsigned int needed)
-{
-	tid_t target;
-	int ret = 0;
-	unsigned int truncated_clusters;
-
-	inode_lock(osb->osb_tl_inode);
-	truncated_clusters = osb->truncated_clusters;
-	inode_unlock(osb->osb_tl_inode);
-
-	/*
-	 * Check whether we can succeed in allocating if we free
-	 * the truncate log.
-	 */
-	if (truncated_clusters < needed)
-		goto out;
-
-	ret = ocfs2_flush_truncate_log(osb);
-	if (ret) {
-		mlog_errno(ret);
-		goto out;
-	}
-
-	if (jbd2_journal_start_commit(osb->journal->j_journal, &target)) {
-		jbd2_log_wait_commit(osb->journal->j_journal, target);
-		ret = 1;
-	}
-out:
-	return ret;
-}
-
 static int ocfs2_get_truncate_log_info(struct ocfs2_super *osb,
 				       int slot_num,
 				       struct inode **tl_inode,
diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h
index 250bcac..b343a6f 100644
--- a/fs/ocfs2/alloc.h
+++ b/fs/ocfs2/alloc.h
@@ -188,8 +188,6 @@ int ocfs2_truncate_log_append(struct ocfs2_super *osb,
 			      u64 start_blk,
 			      unsigned int num_clusters);
 int __ocfs2_flush_truncate_log(struct ocfs2_super *osb);
-int ocfs2_try_to_free_truncate_log(struct ocfs2_super *osb,
-				   unsigned int needed);
 
 /*
  * Process local structure which describes the block unlinks done
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index eb1ce30..1a1af0c 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1674,7 +1674,7 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 			     struct buffer_head *di_bh, struct page *mmap_page)
 {
 	int ret, cluster_of_pages, credits = OCFS2_INODE_UPDATE_CREDITS;
-	unsigned int clusters_to_alloc, extents_to_split, clusters_need = 0;
+	unsigned int clusters_to_alloc, extents_to_split;
 	struct ocfs2_write_ctxt *wc;
 	struct inode *inode = mapping->host;
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
@@ -1723,7 +1723,6 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 			mlog_errno(ret);
 			goto out;
 		} else if (ret == 1) {
-			clusters_need = wc->w_clen;
 			ret = ocfs2_refcount_cow(inode, di_bh,
 						 wc->w_cpos, wc->w_clen, UINT_MAX);
 			if (ret) {
@@ -1738,7 +1737,6 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 		mlog_errno(ret);
 		goto out;
 	}
-	clusters_need += clusters_to_alloc;
 
 	di = (struct ocfs2_dinode *)wc->w_di_bh->b_data;
 
@@ -1893,8 +1891,8 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 		 */
 		try_free = 0;
 
-		ret1 = ocfs2_try_to_free_truncate_log(osb, clusters_need);
-		if (ret1 == 1)
+		ret1 = ocfs2_flush_truncate_log(osb);
+		if (!ret1)
 			goto try_again;
 
 		if (ret1 < 0)
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 4f86ac0..f913647 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -440,11 +440,6 @@ struct ocfs2_super
 	struct buffer_head *osb_tl_bh;
 	struct delayed_work osb_truncate_log_wq;
 	atomic_t osb_tl_disable;
-	/*
-	 * How many clusters in our truncate log.
-	 * It must be protected by osb_tl_inode->i_mutex.
-	 */
-	unsigned int truncated_clusters;
 
 	struct ocfs2_node_map osb_recovering_orphan_dirs;
 	unsigned int *osb_orphan_wipes;
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index f7c972f..d086bb9 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1187,14 +1187,14 @@ static int ocfs2_reserve_clusters_with_limit(struct ocfs2_super *osb,
 	if (status == -ENOSPC) {
 retry:
 		status = ocfs2_reserve_cluster_bitmap_bits(osb, *ac);
-		/* Retry if there is sufficient space cached in truncate log */
+		/* Retry after flush truncate log */
 		if (status == -ENOSPC && !retried) {
 			retried = 1;
 			ocfs2_inode_unlock((*ac)->ac_inode, 1);
 			inode_unlock((*ac)->ac_inode);
 
-			ret = ocfs2_try_to_free_truncate_log(osb, bits_wanted);
-			if (ret == 1) {
+			ret = ocfs2_flush_truncate_log(osb);
+			if (!ret) {
 				iput((*ac)->ac_inode);
 				(*ac)->ac_inode = NULL;
 				goto retry;