From patchwork Thu Nov 22 15:54:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia Guo X-Patchwork-Id: 10694445 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9148D5A4 for ; Thu, 22 Nov 2018 15:56:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7FF972A428 for ; Thu, 22 Nov 2018 15:56:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 73EB82B773; Thu, 22 Nov 2018 15:56:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id EBAF82A428 for ; Thu, 22 Nov 2018 15:56:15 +0000 (UTC) Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.22/8.16.0.22) with SMTP id wAMFoUw9011506; Thu, 22 Nov 2018 15:56:03 GMT Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by userp2120.oracle.com with ESMTP id 2ntbmqyj1q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 22 Nov 2018 15:56:03 +0000 Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id wAMFtvPQ021713 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 22 Nov 2018 15:55:58 GMT Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1gPrKf-0002W3-Sb; Thu, 22 Nov 2018 07:55:57 -0800 Received: from userv0021.oracle.com ([156.151.31.71]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1gPrKF-0002S3-LJ for ocfs2-devel@oss.oracle.com; Thu, 22 Nov 2018 07:55:32 -0800 Received: from userp2030.oracle.com (userp2030.oracle.com [156.151.31.89]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id wAMFtVjq021040 (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256 verify=OK) for ; Thu, 22 Nov 2018 15:55:31 GMT Received: from pps.filterd (userp2030.oracle.com [127.0.0.1]) by userp2030.oracle.com (8.16.0.22/8.16.0.22) with SMTP id wAMFt49x021424 for ; Thu, 22 Nov 2018 15:55:31 GMT Received: from huawei.com (szxga05-in.huawei.com [45.249.212.191]) by userp2030.oracle.com with ESMTP id 2nwua8cqyb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Thu, 22 Nov 2018 15:55:30 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id 48EA5798577AD; Thu, 22 Nov 2018 23:55:22 +0800 (CST) Received: from [10.177.218.160] (10.177.218.160) by smtp.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.408.0; Thu, 22 Nov 2018 23:55:14 +0800 To: "mark@fasheh.com" , "jlbec@evilplan.org" , "junxiao.bi@oracle.com" , "jiangqi903@gmail.com" , "akpm@linux-foundation.org" From: Jia Guo Message-ID: <5292e287-8f1a-fd4a-1a14-661e555e0bed@huawei.com> Date: Thu, 22 Nov 2018 23:54:50 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 Content-Language: en-US X-Originating-IP: [10.177.218.160] X-CFilter-Loop: Reflected X-CLX-Shades: MLX X-CLX-Response: 1TFkXGBkfEQpMehcbHB4RCllNF2dmchEKWUkXGnEaEBp3BhgbH3EbGRMQGnc GGBoGGhEKWV4XaG5mEQpJRhdFWEtJRk91WlhFTl9JXkNFRBl1T0sRCkNOF2FYW3hzf0kTZXN/XW llZVlgWBxgbEd7aGEdcBNZE3l7EQpYXBcfBBoEGx8fBxwSExtJHEhOBRsaBBsaGgQeEgQfEBseG h8aEQpeWRd+ZmJzZBEKTVwXGRITEQpMWhdpaGlCTV0RCk1OF2gRCkNaFx4fBBgeEwQYGxgEGxMb EQpCXhcbEQpEXhcbGhEKREkXHBEKQkYXZxNtYBtbZUIffn0RCkJcFxoRCkJFF2NBYBJZQh1kGEc ZEQpCThdsQkhZUxpNZXh4HREKQkwXa3BaS0hGBV1TbhIRCkJsF2BtE2wBGkFoThx7EQpCQBdkSU BIfkBhXB9mHREKQlgXYn1veQFPGBlwcHsRClpYFxsRCnBoF25gHWBQR2hZeGMBEBoRCnBoF2RDU BxrSE8BR097EBoRCnBoF20ZGmFnWUcaHk5BEBoRCnBoF2JtTl9CEx1uaE9aEBoRCnBoF2xAb1x8 e14ZG1xQEBoRCnBsF2B6GX4cRF5rWgEZEB4SEQptfhcaEQpYTRdLESA= X-PDR: PASS X-Source-IP: 45.249.212.191 X-ServerName: szxga05-in.huawei.com X-Proofpoint-SPF-Result: pass X-Proofpoint-SPF-Record: v=spf1 ip4:45.249.212.32 ip4:45.249.212.35 ip4:119.145.14.93 ip4:58.251.152.93 ip4:206.16.17.72 ip4:45.249.212.255 ip4:45.249.212.187/29 ip4:45.249.212.191 ip4:185.176.76.210 ~all X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9084 signatures=668683 X-Proofpoint-DMARC-Record: none X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=164 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=235 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=764 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1811220141 X-Spam: Clean Cc: "ocfs2-devel@oss.oracle.com" Subject: [Ocfs2-devel] [PATCH] ocfs2: clear zero in unaligned direct IO X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9084 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1811220141 X-Virus-Scanned: ClamAV using ClamSMTP Unused portion of a part-written fs-block-sized block is not set to zero in unaligned append direct write.This can lead to serious data inconsistencies. Ocfs2 manage disk with cluster size(for example, 1M), part-written in one cluster will change the cluster state from UN-WRITTEN to WRITTEN, VFS(function dio_zero_block) doesn't do the cleaning because bh's state is not set to NEW in function ocfs2_dio_wr_get_block when we write a WRITTEN cluster. For example, the cluster size is 1M, file size is 8k and we direct write from 14k to 15k, then 12k~14k and 15k~16k will contain dirty data. We have to deal with two cases: 1.The starting position of direct write is outside the file. 2.The starting position of direct write is located in the file. We need set bh's state to NEW in the first case. In the second case, we need mapped twice because bh's state of area out file should be set to NEW while area in file not. Signed-off-by: Jia Guo Reviewed-by: Yiwen Jiang --- fs/ocfs2/aops.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index eb1ce30..6e1f902 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2152,13 +2152,30 @@ static int ocfs2_dio_wr_get_block(struct inode *inode, sector_t iblock, struct ocfs2_dio_write_ctxt *dwc = NULL; struct buffer_head *di_bh = NULL; u64 p_blkno; - loff_t pos = iblock << inode->i_sb->s_blocksize_bits; + unsigned i_blkbits = inode->i_sb->s_blocksize_bits; + loff_t pos = iblock << i_blkbits; + sector_t endblk = (i_size_read(inode) - 1) >> i_blkbits; unsigned len, total_len = bh_result->b_size; int ret = 0, first_get_block = 0; len = osb->s_clustersize - (pos & (osb->s_clustersize - 1)); len = min(total_len, len); + /* + * bh_result->b_size is count in get_more_blocks according to write + * "pos" and "end", we need map twice to return different buffer state: + * 1. area in file size, not set NEW; + * 2. area out file size, set NEW. + * + * iblock endblk + * |--------|---------|---------|--------- + * |<-------area in file------->| + */ + + if ((iblock <= endblk) && + ((iblock + ((len - 1) >> i_blkbits)) > endblk)) + len = (endblk - iblock + 1) << i_blkbits; + mlog(0, "get block of %lu at %llu:%u req %u\n", inode->i_ino, pos, len, total_len); @@ -2242,6 +2259,9 @@ static int ocfs2_dio_wr_get_block(struct inode *inode, sector_t iblock, if (desc->c_needs_zero) set_buffer_new(bh_result); + if (iblock > endblk) + set_buffer_new(bh_result); + /* May sleep in end_io. It should not happen in a irq context. So defer * it to dio work queue. */ set_buffer_defer_completion(bh_result);