From patchwork Thu Oct 11 04:14:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10635993 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0565E14DB for ; Thu, 11 Oct 2018 04:15:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DDEA72787C for ; Thu, 11 Oct 2018 04:15:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CE9C12A655; Thu, 11 Oct 2018 04:15:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5073E2787C for ; Thu, 11 Oct 2018 04:15:08 +0000 (UTC) Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w9B4E21W002930; Thu, 11 Oct 2018 04:14:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : date : message-id : in-reply-to : references : mime-version : cc : subject : list-id : list-unsubscribe : list-archive : list-post : list-help : list-subscribe : content-type : content-transfer-encoding : sender; s=corp-2018-07-02; bh=BuIl7GdwOyYF/ltcnZ2tT41IQTaG/KtV6tbxW5o9t6A=; b=OeYaZf5zog3+lSosUR/BQepbjj4o/vzSdPJiWVyjlViGZXm1QqKMCJtLO99b2zf+QQZU /h+A3HhPbXYb8qxp8Z89uTJyp+Hf7KWX7CExF+zHOPlx2huIuA99PbZvFKN2+lUpG3GQ gQ99MSIknj8s1bpHPoGklwAdwFW076omqBumFnf0Y21tflf1UHOi0VG+fnVKMm5KA/VS 5YiGrVgdGIuKrYwSnpk9Qy49BUebUhwSFFccqlgmtyvZkrFjpsnwEahLcImJ2HKYGpm1 +69RYkqKWdbYhsqU3+gx4UT/NZTk9mRx3Vd3U9sKL86lB0JfMc15QPnsDWbEyLyIVKS4 CQ== Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by userp2120.oracle.com with ESMTP id 2mxnpr994d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Oct 2018 04:14:54 +0000 Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id w9B4ErCt020280 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 11 Oct 2018 04:14:53 GMT Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1gASNB-00072l-9m; Wed, 10 Oct 2018 21:14:53 -0700 Received: from userv0022.oracle.com ([156.151.31.74]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1gASN6-00072N-02 for ocfs2-devel@oss.oracle.com; Wed, 10 Oct 2018 21:14:48 -0700 Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id w9B4El7w024822 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Oct 2018 04:14:47 GMT Received: from abhmp0016.oracle.com (abhmp0016.oracle.com [141.146.116.22]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w9B4ElOM030326; Thu, 11 Oct 2018 04:14:47 GMT Received: from localhost (/10.159.132.249) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 11 Oct 2018 04:14:46 +0000 From: "Darrick J. Wong" To: david@fromorbit.com, darrick.wong@oracle.com Date: Wed, 10 Oct 2018 21:14:45 -0700 Message-ID: <153923128529.5546.6430455638279784448.stgit@magnolia> In-Reply-To: <153923113649.5546.9840926895953408273.stgit@magnolia> References: <153923113649.5546.9840926895953408273.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Cc: sandeen@redhat.com, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, Amir Goldstein , linux-unionfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, ocfs2-devel@oss.oracle.com, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [Ocfs2-devel] [PATCH 19/25] vfs: implement opportunistic short dedupe X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9042 signatures=668706 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810110040 X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong For a given dedupe request, the bytes_deduped field in the control structure tells userspace if we managed to deduplicate some, but not all of, the requested regions starting from the file offsets supplied. However, due to sloppy coding, the current dedupe code returns FILE_DEDUPE_RANGE_DIFFERS if any part of the range is different. Fix this so that we can actually support partial request completion. Signed-off-by: Darrick J. Wong Reviewed-by: Amir Goldstein --- fs/read_write.c | 48 ++++++++++++++++++++++++++++++++++++++---------- include/linux/fs.h | 7 +++++-- 2 files changed, 43 insertions(+), 12 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index c88a443d9eb2..de055cb9c5ae 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1737,13 +1737,26 @@ static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset) return page; } +static unsigned int vfs_dedupe_memcmp(const char *s1, const char *s2, + unsigned int len) +{ + const char *orig_s1; + + for (orig_s1 = s1; len > 0; s1++, s2++, len--) + if (*s1 != *s2) + break; + + return s1 - orig_s1; +} + /* * Compare extents of two files to see if they are the same. * Caller must have locked both inodes to prevent write races. */ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, - loff_t len, bool *is_same) + loff_t *req_len, + unsigned int remap_flags) { loff_t src_poff; loff_t dest_poff; @@ -1751,8 +1764,11 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, void *dest_addr; struct page *src_page; struct page *dest_page; - loff_t cmp_len; + loff_t len = *req_len; + loff_t same_len = 0; bool same; + unsigned int cmp_len; + unsigned int cmp_same; int error; error = -EINVAL; @@ -1762,7 +1778,7 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, dest_poff = destoff & (PAGE_SIZE - 1); cmp_len = min(PAGE_SIZE - src_poff, PAGE_SIZE - dest_poff); - cmp_len = min(cmp_len, len); + cmp_len = min_t(loff_t, cmp_len, len); if (cmp_len <= 0) goto out_error; @@ -1784,7 +1800,10 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, flush_dcache_page(src_page); flush_dcache_page(dest_page); - if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len)) + cmp_same = vfs_dedupe_memcmp(src_addr + src_poff, + dest_addr + dest_poff, cmp_len); + same_len += cmp_same; + if (cmp_same != cmp_len) same = false; kunmap_atomic(dest_addr); @@ -1802,7 +1821,17 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, len -= cmp_len; } - *is_same = same; + /* + * If less than the whole range matched, we have to back down to the + * nearest block boundary. + */ + if (*req_len != same_len) { + if (!(remap_flags & RFR_SHORT_DEDUPE)) + return -EBADE; + + *req_len = ALIGN_DOWN(same_len, dest->i_sb->s_blocksize); + } + return 0; out_error: @@ -1881,13 +1910,11 @@ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, * Check that the extents are the same. */ if (is_dedupe) { - bool is_same = false; - ret = vfs_dedupe_file_range_compare(inode_in, pos_in, - inode_out, pos_out, *len, &is_same); + inode_out, pos_out, len, remap_flags); if (ret) return ret; - if (!is_same) + if (*len == 0) return -EBADE; } @@ -2013,7 +2040,8 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos, { loff_t ret; - WARN_ON_ONCE(remap_flags & ~(RFR_SAME_DATA)); + WARN_ON_ONCE(remap_flags & ~(RFR_SAME_DATA | RFR_CAN_SHORTEN | + RFR_SHORT_DEDUPE)); ret = mnt_want_write_file(dst_file); if (ret) diff --git a/include/linux/fs.h b/include/linux/fs.h index f0603ed007e9..18b6db85ab64 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1727,16 +1727,19 @@ struct block_device_operations; * RFR_SAME_DATA: only remap if contents identical (i.e. deduplicate) * RFR_TO_SRC_EOF: remap to the end of the source file * RFR_CAN_SHORTEN: caller can handle a shortened request + * RFR_SHORT_DEDUPE: deduplicate from byte 0 until the file data don't match */ #define RFR_SAME_DATA (1 << 0) #define RFR_TO_SRC_EOF (1 << 1) #define RFR_CAN_SHORTEN (1 << 2) +#define RFR_SHORT_DEDUPE (1 << 3) #define RFR_VALID_FLAGS (RFR_SAME_DATA | RFR_TO_SRC_EOF | \ - RFR_CAN_SHORTEN) + RFR_CAN_SHORTEN | RFR_SHORT_DEDUPE) /* Implemented by the VFS, so these are advisory. */ -#define RFR_VFS_FLAGS (RFR_TO_SRC_EOF | RFR_CAN_SHORTEN) +#define RFR_VFS_FLAGS (RFR_TO_SRC_EOF | RFR_CAN_SHORTEN | \ + RFR_SHORT_DEDUPE) /* * Filesystem remapping implementations should call this helper on their