From patchwork Sat May 27 05:54:06 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Zhangguanghui
X-Patchwork-Id: 9751595
From: Zhangguanghui
To: "ocfs2-devel@oss.oracle.com"
Cc: Mark Fasheh , "ryan.ding"
Date: Sat, 27 May 2017 05:54:06 +0000
Subject: [Ocfs2-devel] ocfs2: fix sparse file & data ordering issue in
 direct io. review

Comments and questions are, as always, welcome.

Thanks
________________________________
All the best wishes for you.
zhangguanghui

From: zhangguanghui 10102 (Cloud)
Date: 2017-05-26 16:21
To: ocfs2-devel@oss.oracle.com
CC: ryan.ding; Andrew Morton; wangww631; Joel Becker; Mark Fasheh
Subject: Re: ocfs2: fix sparse file & data ordering issue in direct io. review

Hi,

This patch replaces ocfs2_direct_IO_get_blocks with ocfs2_get_block in
ocfs2_direct_IO and, in doing so, removes the ip_alloc_sem locking. I
think ip_alloc_sem is still needed, because protecting allocation
changes is still the right thing to do. A BUG_ON has now been triggered
while testing direct io.

Comments and questions are, as always, welcome.
Thanks

As wangww631 described:

In ocfs2, ip_alloc_sem is used to protect allocation changes on the
node. In direct IO, we add ip_alloc_sem to protect data consistency in
the race between direct-io and ocfs2_truncate_file (buffered io already
uses ip_alloc_sem). Although the inode->i_mutex lock is used to avoid
concurrency in the above situation, I think ip_alloc_sem is still
needed because protecting allocation changes is significant.

Other filesystems such as ext4 also use an rw_semaphore, by other
means, to protect data consistency in the get_block-vs-truncate race.
So ip_alloc_sem is needed in ocfs2 direct io.

Date: Fri, 11 Sep 2015 16:19:18 +0800
From: Ryan Ding
Subject: [Ocfs2-devel] [PATCH 7/8] ocfs2: fix sparse file & data
 ordering issue in direct io.
To: ocfs2-devel@oss.oracle.com
Cc: mfasheh@suse.de
Message-ID: <1441959559-29947-8-git-send-email-ryan.ding@oracle.com>

There are mainly three issues in the direct io code path after commit
24c40b329e03 ("ocfs2: implement ocfs2_direct_IO_write"):

  * Sparse files are not supported.
  * Data ordering is not supported. E.g., when writing to a file hole,
    the extent is allocated first. If the system crashes before the io
    finishes, data will be corrupted.
  * Potential risk when doing aio+dio. The -EIOCBQUEUED return value is
    likely to be ignored by ocfs2_direct_IO_write().

To resolve the above problems, re-design the direct io code with the
following ideas:

  * Use buffered io to fill in holes. This also gives better
    performance.
  * Clear unwritten after the direct write has finished, so we can make
    sure metadata changes only after the data is written to disk. (An
    unwritten extent is invisible to the user; from the user's view,
    metadata is not changed when an unwritten extent is allocated.)
  * Clear ocfs2_direct_IO_write(). Do all ending work in end_io.

This patch has passed the fs, dio, ltp-aiodio.part1, ltp-aiodio.part2
and ltp-aiodio.part4 test cases of ltp.

Signed-off-by: Ryan Ding
Reviewed-by: Junxiao Bi
cc: Joseph Qi
________________________________
All the best wishes for you.
zhangguanghui
------------------------------------------------------------------------
This e-mail and its attachments contain confidential information from
New H3C, which is intended only for the person or entity whose address
is listed above. Any use of the information contained herein in any way
(including, but not limited to, total or partial disclosure,
reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please
notify the sender by phone or email immediately and delete it!

--- a/aops.c	2017-05-27 01:23:35.591274026 -0400
+++ b/aops.c	2017-05-27 01:29:44.743285821 -0400
@@ -2396,6 +2396,35 @@
 	return 0;
 }
 
+/*
+ * TODO: Make this into a generic get_blocks function.
+ *
+ * In ocfs2, ip_alloc_sem is used to protect allocation changes on the node.
+ * In direct IO, we add ip_alloc_sem to protect data consistency in the
+ * race between direct-io and ocfs2_truncate_file (buffered io already
+ * uses ip_alloc_sem). Although the inode->i_mutex lock is used to avoid
+ * concurrency in the above situation, I think ip_alloc_sem is still
+ * needed because protecting allocation changes is significant.
+ *
+ * This function is called directly from get_more_blocks in direct-io.c.
+ *
+ * called like this: dio->get_blocks(dio->inode, fs_startblk,
+ * 					fs_count, map_bh, dio->rw == READ);
+ */
+static int ocfs2_dio_read_get_block(struct inode *inode, sector_t iblock,
+				    struct buffer_head *bh_result, int create)
+{
+	struct ocfs2_inode_info *oi = OCFS2_I(inode);
+	int ret = 0;
+
+	down_read(&oi->ip_alloc_sem);
+	/* This is the fast path for direct-io reading. */
+	ret = ocfs2_get_block(inode, iblock, bh_result, create);
+	up_read(&oi->ip_alloc_sem);
+
+	return ret;
+}
+
 static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct file *file = iocb->ki_filp;
@@ -2416,7 +2445,7 @@
 		return 0;
 
 	if (iov_iter_rw(iter) == READ)
-		get_block = ocfs2_get_block;
+		get_block = ocfs2_dio_read_get_block;
 	else
 		get_block = ocfs2_dio_get_block;
 