fsdax: dedupe should compare the min of two iters' length

Message ID 1679469958-2-1-git-send-email-ruansy.fnst@fujitsu.com (mailing list archive)
State Accepted
Commit e900ba10d15041a6236cc75778cc6e06c3590a58

Commit Message

Shiyang Ruan March 22, 2023, 7:25 a.m. UTC
In a dedupe comparison iter loop, the length of iomap_iter decreases
because it represents the remaining length after each iteration.  The
compare function should use the min length of the current iters, not the
total length.

Fixes: 0e79e3736d54 ("fsdax: dedupe: iter two files at the same time")
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 fs/dax.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Darrick J. Wong March 22, 2023, 2:44 p.m. UTC | #1
On Wed, Mar 22, 2023 at 07:25:58AM +0000, Shiyang Ruan wrote:
> In a dedupe comparison iter loop, the length of iomap_iter decreases
> because it represents the remaining length after each iteration.  The
> compare function should use the min length of the current iters, not the
> total length.
> 
> Fixes: 0e79e3736d54 ("fsdax: dedupe: iter two files at the same time")
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>

Makes sense,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/dax.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 3e457a16c7d1..9800b93ee14d 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -2022,8 +2022,8 @@ int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
>  
>  	while ((ret = iomap_iter(&src_iter, ops)) > 0 &&
>  	       (ret = iomap_iter(&dst_iter, ops)) > 0) {
> -		compared = dax_range_compare_iter(&src_iter, &dst_iter, len,
> -						  same);
> +		compared = dax_range_compare_iter(&src_iter, &dst_iter,
> +				min(src_iter.len, dst_iter.len), same);
>  		if (compared < 0)
>  			return ret;
>  		src_iter.processed = dst_iter.processed = compared;
> -- 
> 2.39.2
>
Andrew Morton March 22, 2023, 11:12 p.m. UTC | #2
On Wed, 22 Mar 2023 07:25:58 +0000 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> In a dedupe comparison iter loop, the length of iomap_iter decreases
> because it represents the remaining length after each iteration.  The
> compare function should use the min length of the current iters, not the
> total length.

Please describe the user-visible runtime effects of this flaw, thanks.
Shiyang Ruan March 23, 2023, 6:48 a.m. UTC | #3
On 2023/3/23 7:12, Andrew Morton wrote:
> On Wed, 22 Mar 2023 07:25:58 +0000 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> 
>> In a dedupe comparison iter loop, the length of iomap_iter decreases
>> because it represents the remaining length after each iteration.  The
>> compare function should use the min length of the current iters, not the
>> total length.
> 
> Please describe the user-visible runtime effects of this flaw, thanks.

This patch fixes the failure of generic/561 with the following test config:

export TEST_DEV=/dev/pmem0
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/pmem1
export SCRATCH_MNT=/mnt/scratch
export MKFS_OPTIONS="-m reflink=1,rmapbt=1"
export MOUNT_OPTIONS="-o dax"
export XFS_MOUNT_OPTIONS="-o dax"


--
Thanks,
Ruan.
Andrew Morton March 23, 2023, 10:12 p.m. UTC | #4
On Thu, 23 Mar 2023 14:48:25 +0800 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 
> 
> On 2023/3/23 7:12, Andrew Morton wrote:
> > On Wed, 22 Mar 2023 07:25:58 +0000 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> > 
> >> In a dedupe comparison iter loop, the length of iomap_iter decreases
> >> because it represents the remaining length after each iteration.  The
> >> compare function should use the min length of the current iters, not the
> >> total length.
> > 
> > Please describe the user-visible runtime effects of this flaw, thanks.
> 
> This patch fixes the failure of generic/561 with the following test config:
> 
> export TEST_DEV=/dev/pmem0
> export TEST_DIR=/mnt/test
> export SCRATCH_DEV=/dev/pmem1
> export SCRATCH_MNT=/mnt/scratch
> export MKFS_OPTIONS="-m reflink=1,rmapbt=1"
> export MOUNT_OPTIONS="-o dax"
> export XFS_MOUNT_OPTIONS="-o dax"

Again, how does the bug impact real-world kernel users?

Thanks.
Shiyang Ruan March 24, 2023, 4:19 a.m. UTC | #5
On 2023/3/24 6:12, Andrew Morton wrote:
> On Thu, 23 Mar 2023 14:48:25 +0800 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> 
>>
>>
>> On 2023/3/23 7:12, Andrew Morton wrote:
>>> On Wed, 22 Mar 2023 07:25:58 +0000 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
>>>
>>>> In a dedupe comparison iter loop, the length of iomap_iter decreases
>>>> because it represents the remaining length after each iteration.  The
>>>> compare function should use the min length of the current iters, not the
>>>> total length.
>>>
>>> Please describe the user-visible runtime effects of this flaw, thanks.
>>
>> This patch fixes the failure of generic/561 with the following test config:
>>
>> export TEST_DEV=/dev/pmem0
>> export TEST_DIR=/mnt/test
>> export SCRATCH_DEV=/dev/pmem1
>> export SCRATCH_MNT=/mnt/scratch
>> export MKFS_OPTIONS="-m reflink=1,rmapbt=1"
>> export MOUNT_OPTIONS="-o dax"
>> export XFS_MOUNT_OPTIONS="-o dax"
> 
> Again, how does the bug impact real-world kernel users?

The dedupe command will fail with -EIO if the range is larger than one
page and is not aligned to the page size.  It also triggers a warning in
dmesg:

[ 4338.498374] ------------[ cut here ]------------
[ 4338.498689] WARNING: CPU: 3 PID: 1415645 at fs/iomap/iter.c:16 iomap_iter+0x2b2/0x2c0
[ 4338.499216] Modules linked in: bfq ext4 mbcache jbd2 auth_rpcgss oid_registry nfsv4 algif_hash af_alg af_packet nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter ip_tables x_tables nd_pmem nd_btt dax_pmem sch_fq_codel configfs xfs libcrc32c fuse [last unloaded: scsi_debug]
[ 4338.501419] CPU: 3 PID: 1415645 Comm: pool Kdump: loaded Tainted: G       W          6.1.0-rc4+ #118 242c3ad9724cd53a53c9a3b3cd3050ef1060e37a
[ 4338.502093] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[ 4338.502610] RIP: 0010:iomap_iter+0x2b2/0x2c0
[ 4338.502933] Code: 0d 63 6f ce 7e 0f 85 d0 fe ff ff e8 f4 ea cc ff e9 c6 fe ff ff 0f 0b e9 a0 fe ff ff 0f 0b e9 a5 fe ff ff 0f 0b e9 85 fe ff ff <0f> 0b b8 fb ff ff ff e9 aa fe ff ff cc cc 0f 1f 44 00 00 48 8b 42
[ 4338.503963] RSP: 0018:ffffc9000317faa8 EFLAGS: 00010287
[ 4338.504318] RAX: 0000000000000178 RBX: ffffc9000317fba8 RCX: 0000000000001000
[ 4338.504701] RDX: 0000000000000178 RSI: 0000000000399000 RDI: ffffc9000317fba8
[ 4338.505062] RBP: ffffffffa0321b30 R08: ffffc9000317fae0 R09: 0000000000000000
[ 4338.505490] R10: 0000000000000004 R11: ffff888102577740 R12: ffffc9000317fd00
[ 4338.505956] R13: 0000000000399000 R14: 0000000000001000 R15: 0000000000000000
[ 4338.506460] FS:  00007f57ce200640(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
[ 4338.507126] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4338.507490] CR2: 00007f57b4a00e88 CR3: 0000000153669002 CR4: 00000000003706e0
[ 4338.507887] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4338.508288] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4338.508717] Call Trace:
[ 4338.508962]  <TASK>
[ 4338.509234]  dax_dedupe_file_range_compare+0xd9/0x210
[ 4338.509589]  __generic_remap_file_range_prep+0x2af/0x760
[ 4338.509942]  xfs_reflink_remap_prep+0xeb/0x240 [xfs 4155ff90e551a4608ed7504e9f2aa737a690cfa3]
[ 4338.510505]  xfs_file_remap_range+0x83/0x320 [xfs 4155ff90e551a4608ed7504e9f2aa737a690cfa3]
[ 4338.511118]  vfs_dedupe_file_range_one+0x196/0x1b0
[ 4338.511407]  vfs_dedupe_file_range+0x170/0x1e0
[ 4338.511719]  do_vfs_ioctl+0x48d/0x8f0
[ 4338.511975]  ? kmem_cache_free+0x2a1/0x460
[ 4338.512278]  ? do_sys_openat2+0x7d/0x150
[ 4338.512556]  __x64_sys_ioctl+0x40/0xa0
[ 4338.512829]  do_syscall_64+0x2b/0x80
[ 4338.513109]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 4338.513421] RIP: 0033:0x7f57ce50748f
[ 4338.513687] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 4338.514582] RSP: 002b:00007f57ce1ffbb0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 4338.515067] RAX: ffffffffffffffda RBX: 00007f57b4885870 RCX: 00007f57ce50748f
[ 4338.515441] RDX: 00007f57b4885870 RSI: 00000000c0189436 RDI: 000000000000000e
[ 4338.516300] RBP: 00007f57b486f648 R08: 0000000000000001 R09: 00007f57b486f650
[ 4338.516694] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f57b486f650
[ 4338.517108] R13: 00007f57b486f610 R14: 00007f57b486f650 R15: 00007f57b486f610
[ 4338.518781]  </TASK>
[ 4338.519037] ---[ end trace 0000000000000000 ]---

--
Thanks,
Ruan.

> 
> Thanks.
Andrew Morton March 24, 2023, 8:34 p.m. UTC | #6
On Fri, 24 Mar 2023 12:19:46 +0800 Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> > Again, how does the bug impact real-world kernel users?
> 
> The dedupe command will fail with -EIO if the range is larger than one
> page and is not aligned to the page size.  It also triggers a warning in
> dmesg:
> 
> [ 4338.498374] ------------[ cut here ]------------
> [ 4338.498689] WARNING: CPU: 3 PID: 1415645 at fs/iomap/iter.c:16 

OK, thanks.  I added the above to the changelog and added cc:stable.

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 3e457a16c7d1..9800b93ee14d 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -2022,8 +2022,8 @@  int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
 
 	while ((ret = iomap_iter(&src_iter, ops)) > 0 &&
 	       (ret = iomap_iter(&dst_iter, ops)) > 0) {
-		compared = dax_range_compare_iter(&src_iter, &dst_iter, len,
-						  same);
+		compared = dax_range_compare_iter(&src_iter, &dst_iter,
+				min(src_iter.len, dst_iter.len), same);
 		if (compared < 0)
 			return ret;
 		src_iter.processed = dst_iter.processed = compared;
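
The effect of the one-line change in the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, each mapping is assumed to be a single 4 KiB page, and the advance step only imitates the "processed must not exceed the remaining length" check that produces the warning and -EIO reported in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_EXTENT 4096u   /* assumption: each mapping covers one 4 KiB page */
#define TOY_EIO    5       /* stand-in for the kernel's EIO */

/* Hypothetical stand-in for struct iomap_iter: len is the bytes remaining. */
struct toy_iter {
	uint64_t pos;
	uint64_t len;
};

static uint64_t toy_min(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Models the compare helper: it clamps the requested length to the mapping. */
static uint64_t toy_compare(uint64_t len)
{
	return toy_min(len, TOY_EXTENT);
}

/*
 * Models the advance step: reporting more processed bytes than remain is an
 * error; otherwise return 1 while data remains and 0 when the range is done.
 */
static int toy_advance(struct toy_iter *it, uint64_t processed)
{
	assert(it != NULL);
	if (processed > it->len)
		return -TOY_EIO;
	it->pos += processed;
	it->len -= processed;
	return it->len > 0;
}

/* use_min == 0 models the old call (total len), use_min != 0 the patched one. */
int toy_dedupe_compare(uint64_t total_len, int use_min)
{
	struct toy_iter src = { 0, total_len };
	struct toy_iter dst = { 0, total_len };
	int ret = total_len > 0;

	while (ret > 0) {
		uint64_t want = use_min ? toy_min(src.len, dst.len) : total_len;
		uint64_t compared = toy_compare(want);

		ret = toy_advance(&src, compared);
		if (ret < 0)
			break;
		ret = toy_advance(&dst, compared);
	}
	return ret;
}
```

In this model, toy_dedupe_compare(4096 + 0x178, 0) returns -TOY_EIO on the second pass, because the helper still reports a full page while only 0x178 bytes remain. The same call with use_min set to 1 finishes with 0. Ranges shorter than one page, or page-aligned ranges, complete in both modes, which matches the condition described in the thread (larger than one page and not page-aligned).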