diff mbox series

fstests: fix fssum to actually ignore file holes when supposed to

Message ID 20181029094340.18154-1-fdmanana@kernel.org (mailing list archive)
State New, archived
Headers show
Series fstests: fix fssum to actually ignore file holes when supposed to | expand

Commit Message

Filipe Manana Oct. 29, 2018, 9:43 a.m. UTC
From: Filipe Manana <fdmanana@suse.com>

Unless the '-s' option is passed to fssum, it should not detect file holes
and have their existence influence the computed checksum for a file. This
tool was added to test btrfs' send/receive feature, so that it checks for
any metadata and data differences between the original filesystem and the
filesystem that receives send streams.

For a long time the test btrfs/007, which tests btrfs' send/receive with
fsstress, fails sporadically reporting data differences between files.
However the md5sum/sha1sum from the reported files in the original and
new filesystems are the same. The reason why fssum fails is because even
in normal mode it still accounts for number of holes that exist in the
file and their respective lengths. This is done using the SEEK_DATA mode
of lseek. The btrfs send feature does not preserve holes nor prealloc
extents (not supported by the current protocol), so whenever a hole or
prealloc (unwritten) extent is detected in the source filesystem, it
issues a write command full of zeroes, which will translate to a regular
(written) extent in the destination filesystem. This is why fssum reports
a different checksum. A prealloc extent also counts as hole when using
lseek.

For example when passing a seed of 1540592967 to fsstress in btrfs/007,
the test fails, as file p0/d0/f7 has a prealloc extent in the original
filesystem (in the incr snapshot).

Fix this by making fssum just read the hole file and feed its data to the
digest calculation function when option '-s' is not given. If we ever get
btrfs' send/receive to support holes and fallocate, we can just change
the test and pass the '-s' option to all fssum calls.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 src/fssum.c | 65 +++++--------------------------------------------------------
 1 file changed, 5 insertions(+), 60 deletions(-)

Comments

Josef Bacik Oct. 31, 2018, 4:05 p.m. UTC | #1
On Mon, Oct 29, 2018 at 09:43:40AM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Unless the '-s' option is passed to fssum, it should not detect file holes
> and have their existence influence the computed checksum for a file. This
> tool was added to test btrfs' send/receive feature, so that it checks for
> any metadata and data differences between the original filesystem and the
> filesystem that receives send streams.
> 
> For a long time the test btrfs/007, which tests btrfs' send/receive with
> fsstress, fails sporadically reporting data differences between files.
> However the md5sum/sha1sum from the reported files in the original and
> new filesystems are the same. The reason why fssum fails is because even
> in normal mode it still accounts for number of holes that exist in the
> file and their respective lengths. This is done using the SEEK_DATA mode
> of lseek. The btrfs send feature does not preserve holes nor prealloc
> extents (not supported by the current protocol), so whenever a hole or
> prealloc (unwritten) extent is detected in the source filesystem, it
> issues a write command full of zeroes, which will translate to a regular
> (written) extent in the destination filesystem. This is why fssum reports
> a different checksum. A prealloc extent also counts as hole when using
> lseek.
> 
> For example when passing a seed of 1540592967 to fsstress in btrfs/007,
> the test fails, as file p0/d0/f7 has a prealloc extent in the original
> filesystem (in the incr snapshot).
> 
> Fix this by making fssum just read the hole file and feed its data to the
> digest calculation function when option '-s' is not given. If we ever get
> btrfs' send/receive to support holes and fallocate, we can just change
> the test and pass the '-s' option to all fssum calls.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
Qu Wenruo Feb. 13, 2019, 5:44 a.m. UTC | #2
On 2018/10/29 下午5:43, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Unless the '-s' option is passed to fssum, it should not detect file holes
> and have their existence influence the computed checksum for a file. This
> tool was added to test btrfs' send/receive feature, so that it checks for
> any metadata and data differences between the original filesystem and the
> filesystem that receives send streams.
> 
> For a long time the test btrfs/007, which tests btrfs' send/receive with
> fsstress, fails sporadically reporting data differences between files.
> However the md5sum/sha1sum from the reported files in the original and
> new filesystems are the same. The reason why fssum fails is because even
> in normal mode it still accounts for number of holes that exist in the
> file and their respective lengths. This is done using the SEEK_DATA mode
> of lseek. The btrfs send feature does not preserve holes nor prealloc
> extents (not supported by the current protocol), so whenever a hole or
> prealloc (unwritten) extent is detected in the source filesystem, it
> issues a write command full of zeroes, which will translate to a regular
> (written) extent in the destination filesystem. This is why fssum reports
> a different checksum. A prealloc extent also counts as hole when using
> lseek.
> 
> For example when passing a seed of 1540592967 to fsstress in btrfs/007,
> the test fails, as file p0/d0/f7 has a prealloc extent in the original
> filesystem (in the incr snapshot).
> 
> Fix this by making fssum just read the hole file and feed its data to the
> digest calculation function when option '-s' is not given. If we ever get
> btrfs' send/receive to support holes and fallocate, we can just change
> the test and pass the '-s' option to all fssum calls.

However this is causing more problem here.

Now since fssum doesn't skip holes, for a super large sparse file, fssum
will try to fill all the holes with zero and spend tons of CPU time
calculating the result.

E.g. when seed = 1550703281.
We will have a PiB level file:
$ ls /mnt/scratch/base/p0/f3 -alh
-rw-rw-rw- 1 1441 2774 666P Feb 13 13:39 /mnt/scratch/base/p0/f3

(well, 666, no wonder this will fail)

But it only takes aroud 1M:
$ du /mnt/scratch/base/p0/f3
1044	/mnt/scratch/base/p0/f3

This is even more annoying than test failure.

Can't we just filter out the fallocate/hole punching part in the test
itself?

Thanks,
Qu

> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  src/fssum.c | 65 +++++--------------------------------------------------------
>  1 file changed, 5 insertions(+), 60 deletions(-)
> 
> diff --git a/src/fssum.c b/src/fssum.c
> index 5da39abf..f1da72fb 100644
> --- a/src/fssum.c
> +++ b/src/fssum.c
> @@ -224,71 +224,16 @@ int
>  sum_file_data_permissive(int fd, sum_t *dst)
>  {
>  	int ret;
> -	off_t pos;
> -	off_t old;
> -	int i;
> -	uint64_t zeros = 0;
> -
> -	pos = lseek(fd, 0, SEEK_CUR);
> -	if (pos == (off_t)-1)
> -		return errno == ENXIO ? 0 : -2;
>  
>  	while (1) {
> -		old = pos;
> -		pos = lseek(fd, pos, SEEK_DATA);
> -		if (pos == (off_t)-1) {
> -			if (errno == ENXIO) {
> -				ret = 0;
> -				pos = lseek(fd, 0, SEEK_END);
> -				if (pos != (off_t)-1)
> -					zeros += pos - old;
> -			} else {
> -				ret = -2;
> -			}
> -			break;
> -		}
>  		ret = read(fd, buf, sizeof(buf));
> -		assert(ret); /* eof found by lseek */
> -		if (ret <= 0)
> +		if (ret < 0)
> +			return -errno;
> +		sum_add(dst, buf, ret);
> +		if (ret < sizeof(buf))
>  			break;
> -		if (old < pos) /* hole */
> -			zeros += pos - old;
> -		for (i = 0; i < ret; ++i) {
> -			for (old = i; buf[i] == 0 && i < ret; ++i)
> -				;
> -			if (old < i) /* code like a hole */
> -				zeros += i - old;
> -			if (i == ret)
> -				break;
> -			if (zeros) {
> -				if (verbose >= 2)
> -					fprintf(stderr,
> -						"adding %llu zeros to sum\n",
> -						(unsigned long long)zeros);
> -				sum_add_u64(dst, 0);
> -				sum_add_u64(dst, zeros);
> -				zeros = 0;
> -			}
> -			for (old = i; buf[i] != 0 && i < ret; ++i)
> -				;
> -			if (verbose >= 2)
> -				fprintf(stderr, "adding %u non-zeros to sum\n",
> -					i - (int)old);
> -			sum_add(dst, buf + old, i - old);
> -		}
> -		pos += ret;
>  	}
> -
> -	if (zeros) {
> -		if (verbose >= 2)
> -			fprintf(stderr,
> -				"adding %llu zeros to sum (finishing)\n",
> -				(unsigned long long)zeros);
> -		sum_add_u64(dst, 0);
> -		sum_add_u64(dst, zeros);
> -	}
> -
> -	return ret;
> +	return 0;
>  }
>  
>  int
>
Filipe Manana Feb. 13, 2019, 10:21 a.m. UTC | #3
On Wed, Feb 13, 2019 at 5:44 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2018/10/29 下午5:43, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > Unless the '-s' option is passed to fssum, it should not detect file holes
> > and have their existence influence the computed checksum for a file. This
> > tool was added to test btrfs' send/receive feature, so that it checks for
> > any metadata and data differences between the original filesystem and the
> > filesystem that receives send streams.
> >
> > For a long time the test btrfs/007, which tests btrfs' send/receive with
> > fsstress, fails sporadically reporting data differences between files.
> > However the md5sum/sha1sum from the reported files in the original and
> > new filesystems are the same. The reason why fssum fails is because even
> > in normal mode it still accounts for number of holes that exist in the
> > file and their respective lengths. This is done using the SEEK_DATA mode
> > of lseek. The btrfs send feature does not preserve holes nor prealloc
> > extents (not supported by the current protocol), so whenever a hole or
> > prealloc (unwritten) extent is detected in the source filesystem, it
> > issues a write command full of zeroes, which will translate to a regular
> > (written) extent in the destination filesystem. This is why fssum reports
> > a different checksum. A prealloc extent also counts as hole when using
> > lseek.
> >
> > For example when passing a seed of 1540592967 to fsstress in btrfs/007,
> > the test fails, as file p0/d0/f7 has a prealloc extent in the original
> > filesystem (in the incr snapshot).
> >
> > Fix this by making fssum just read the hole file and feed its data to the
> > digest calculation function when option '-s' is not given. If we ever get
> > btrfs' send/receive to support holes and fallocate, we can just change
> > the test and pass the '-s' option to all fssum calls.
>
> However this is causing more problem here.
>
> Now since fssum doesn't skip holes, for a super large sparse file, fssum
> will try to fill all the holes with zero and spend tons of CPU time
> calculating the result.

Yes. So will md5sum for example and many other things.

The problem is well known and it's due to the addition of the splice
operation to fsstress:

https://www.spinics.net/lists/fstests/msg11257.html

It was also making other tests slower.
This got fixed recently in the last Sunday's update by:

https://www.spinics.net/lists/fstests/msg11266.html

>
> E.g. when seed = 1550703281.
> We will have a PiB level file:
> $ ls /mnt/scratch/base/p0/f3 -alh
> -rw-rw-rw- 1 1441 2774 666P Feb 13 13:39 /mnt/scratch/base/p0/f3
>
> (well, 666, no wonder this will fail)
>
> But it only takes aroud 1M:
> $ du /mnt/scratch/base/p0/f3
> 1044    /mnt/scratch/base/p0/f3
>
> This is even more annoying than test failure.
>
> Can't we just filter out the fallocate/hole punching part in the test
> itself?

No. Removing fallocate and hole punching operations from btrfs/007, or
any other test, limits coverage.
There were bugs in the past uncovered by that test exactly because
fsstress exercises those operations.

Further, that would not prevent creation of sparse files. Operations
like pwrite, dwrite, clone, etc also cause sparse files.

>
> Thanks,
> Qu
>
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  src/fssum.c | 65 +++++--------------------------------------------------------
> >  1 file changed, 5 insertions(+), 60 deletions(-)
> >
> > diff --git a/src/fssum.c b/src/fssum.c
> > index 5da39abf..f1da72fb 100644
> > --- a/src/fssum.c
> > +++ b/src/fssum.c
> > @@ -224,71 +224,16 @@ int
> >  sum_file_data_permissive(int fd, sum_t *dst)
> >  {
> >       int ret;
> > -     off_t pos;
> > -     off_t old;
> > -     int i;
> > -     uint64_t zeros = 0;
> > -
> > -     pos = lseek(fd, 0, SEEK_CUR);
> > -     if (pos == (off_t)-1)
> > -             return errno == ENXIO ? 0 : -2;
> >
> >       while (1) {
> > -             old = pos;
> > -             pos = lseek(fd, pos, SEEK_DATA);
> > -             if (pos == (off_t)-1) {
> > -                     if (errno == ENXIO) {
> > -                             ret = 0;
> > -                             pos = lseek(fd, 0, SEEK_END);
> > -                             if (pos != (off_t)-1)
> > -                                     zeros += pos - old;
> > -                     } else {
> > -                             ret = -2;
> > -                     }
> > -                     break;
> > -             }
> >               ret = read(fd, buf, sizeof(buf));
> > -             assert(ret); /* eof found by lseek */
> > -             if (ret <= 0)
> > +             if (ret < 0)
> > +                     return -errno;
> > +             sum_add(dst, buf, ret);
> > +             if (ret < sizeof(buf))
> >                       break;
> > -             if (old < pos) /* hole */
> > -                     zeros += pos - old;
> > -             for (i = 0; i < ret; ++i) {
> > -                     for (old = i; buf[i] == 0 && i < ret; ++i)
> > -                             ;
> > -                     if (old < i) /* code like a hole */
> > -                             zeros += i - old;
> > -                     if (i == ret)
> > -                             break;
> > -                     if (zeros) {
> > -                             if (verbose >= 2)
> > -                                     fprintf(stderr,
> > -                                             "adding %llu zeros to sum\n",
> > -                                             (unsigned long long)zeros);
> > -                             sum_add_u64(dst, 0);
> > -                             sum_add_u64(dst, zeros);
> > -                             zeros = 0;
> > -                     }
> > -                     for (old = i; buf[i] != 0 && i < ret; ++i)
> > -                             ;
> > -                     if (verbose >= 2)
> > -                             fprintf(stderr, "adding %u non-zeros to sum\n",
> > -                                     i - (int)old);
> > -                     sum_add(dst, buf + old, i - old);
> > -             }
> > -             pos += ret;
> >       }
> > -
> > -     if (zeros) {
> > -             if (verbose >= 2)
> > -                     fprintf(stderr,
> > -                             "adding %llu zeros to sum (finishing)\n",
> > -                             (unsigned long long)zeros);
> > -             sum_add_u64(dst, 0);
> > -             sum_add_u64(dst, zeros);
> > -     }
> > -
> > -     return ret;
> > +     return 0;
> >  }
> >
> >  int
> >
>
Qu Wenruo Feb. 13, 2019, 10:44 a.m. UTC | #4
On 2019/2/13 下午6:21, Filipe Manana wrote:
> On Wed, Feb 13, 2019 at 5:44 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2018/10/29 下午5:43, fdmanana@kernel.org wrote:
>>> From: Filipe Manana <fdmanana@suse.com>
>>>
>>> Unless the '-s' option is passed to fssum, it should not detect file holes
>>> and have their existence influence the computed checksum for a file. This
>>> tool was added to test btrfs' send/receive feature, so that it checks for
>>> any metadata and data differences between the original filesystem and the
>>> filesystem that receives send streams.
>>>
>>> For a long time the test btrfs/007, which tests btrfs' send/receive with
>>> fsstress, fails sporadically reporting data differences between files.
>>> However the md5sum/sha1sum from the reported files in the original and
>>> new filesystems are the same. The reason why fssum fails is because even
>>> in normal mode it still accounts for number of holes that exist in the
>>> file and their respective lengths. This is done using the SEEK_DATA mode
>>> of lseek. The btrfs send feature does not preserve holes nor prealloc
>>> extents (not supported by the current protocol), so whenever a hole or
>>> prealloc (unwritten) extent is detected in the source filesystem, it
>>> issues a write command full of zeroes, which will translate to a regular
>>> (written) extent in the destination filesystem. This is why fssum reports
>>> a different checksum. A prealloc extent also counts as hole when using
>>> lseek.
>>>
>>> For example when passing a seed of 1540592967 to fsstress in btrfs/007,
>>> the test fails, as file p0/d0/f7 has a prealloc extent in the original
>>> filesystem (in the incr snapshot).
>>>
>>> Fix this by making fssum just read the hole file and feed its data to the
>>> digest calculation function when option '-s' is not given. If we ever get
>>> btrfs' send/receive to support holes and fallocate, we can just change
>>> the test and pass the '-s' option to all fssum calls.
>>
>> However this is causing more problem here.
>>
>> Now since fssum doesn't skip holes, for a super large sparse file, fssum
>> will try to fill all the holes with zero and spend tons of CPU time
>> calculating the result.
> 
> Yes. So will md5sum for example and many other things.
> 
> The problem is well known and it's due to the addition of the splice
> operation to fsstress:
> 
> https://www.spinics.net/lists/fstests/msg11257.html
> 
> It was also making other tests slower.
> This got fixed recently in the last Sunday's update by:
> 
> https://www.spinics.net/lists/fstests/msg11266.html
> 
>>
>> E.g. when seed = 1550703281.
>> We will have a PiB level file:
>> $ ls /mnt/scratch/base/p0/f3 -alh
>> -rw-rw-rw- 1 1441 2774 666P Feb 13 13:39 /mnt/scratch/base/p0/f3
>>
>> (well, 666, no wonder this will fail)
>>
>> But it only takes aroud 1M:
>> $ du /mnt/scratch/base/p0/f3
>> 1044    /mnt/scratch/base/p0/f3
>>
>> This is even more annoying than test failure.
>>
>> Can't we just filter out the fallocate/hole punching part in the test
>> itself?
> 
> No. Removing fallocate and hole punching operations from btrfs/007, or
> any other test, limits coverage.
> There were bugs in the past uncovered by that test exactly because
> fsstress exercises those operations.
> 
> Further, that would not prevent creation of sparse files. Operations
> like pwrite, dwrite, clone, etc also cause sparse files.

Although fsstress get fixed, I'm considering zero page detector for
fssum. Check hole first, then check all zero page.

Then we could handle large sparse file along with zero filled receive.

Thanks,
Qu

> 
>>
>> Thanks,
>> Qu
>>
>>>
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>> ---
>>>  src/fssum.c | 65 +++++--------------------------------------------------------
>>>  1 file changed, 5 insertions(+), 60 deletions(-)
>>>
>>> diff --git a/src/fssum.c b/src/fssum.c
>>> index 5da39abf..f1da72fb 100644
>>> --- a/src/fssum.c
>>> +++ b/src/fssum.c
>>> @@ -224,71 +224,16 @@ int
>>>  sum_file_data_permissive(int fd, sum_t *dst)
>>>  {
>>>       int ret;
>>> -     off_t pos;
>>> -     off_t old;
>>> -     int i;
>>> -     uint64_t zeros = 0;
>>> -
>>> -     pos = lseek(fd, 0, SEEK_CUR);
>>> -     if (pos == (off_t)-1)
>>> -             return errno == ENXIO ? 0 : -2;
>>>
>>>       while (1) {
>>> -             old = pos;
>>> -             pos = lseek(fd, pos, SEEK_DATA);
>>> -             if (pos == (off_t)-1) {
>>> -                     if (errno == ENXIO) {
>>> -                             ret = 0;
>>> -                             pos = lseek(fd, 0, SEEK_END);
>>> -                             if (pos != (off_t)-1)
>>> -                                     zeros += pos - old;
>>> -                     } else {
>>> -                             ret = -2;
>>> -                     }
>>> -                     break;
>>> -             }
>>>               ret = read(fd, buf, sizeof(buf));
>>> -             assert(ret); /* eof found by lseek */
>>> -             if (ret <= 0)
>>> +             if (ret < 0)
>>> +                     return -errno;
>>> +             sum_add(dst, buf, ret);
>>> +             if (ret < sizeof(buf))
>>>                       break;
>>> -             if (old < pos) /* hole */
>>> -                     zeros += pos - old;
>>> -             for (i = 0; i < ret; ++i) {
>>> -                     for (old = i; buf[i] == 0 && i < ret; ++i)
>>> -                             ;
>>> -                     if (old < i) /* code like a hole */
>>> -                             zeros += i - old;
>>> -                     if (i == ret)
>>> -                             break;
>>> -                     if (zeros) {
>>> -                             if (verbose >= 2)
>>> -                                     fprintf(stderr,
>>> -                                             "adding %llu zeros to sum\n",
>>> -                                             (unsigned long long)zeros);
>>> -                             sum_add_u64(dst, 0);
>>> -                             sum_add_u64(dst, zeros);
>>> -                             zeros = 0;
>>> -                     }
>>> -                     for (old = i; buf[i] != 0 && i < ret; ++i)
>>> -                             ;
>>> -                     if (verbose >= 2)
>>> -                             fprintf(stderr, "adding %u non-zeros to sum\n",
>>> -                                     i - (int)old);
>>> -                     sum_add(dst, buf + old, i - old);
>>> -             }
>>> -             pos += ret;
>>>       }
>>> -
>>> -     if (zeros) {
>>> -             if (verbose >= 2)
>>> -                     fprintf(stderr,
>>> -                             "adding %llu zeros to sum (finishing)\n",
>>> -                             (unsigned long long)zeros);
>>> -             sum_add_u64(dst, 0);
>>> -             sum_add_u64(dst, zeros);
>>> -     }
>>> -
>>> -     return ret;
>>> +     return 0;
>>>  }
>>>
>>>  int
>>>
>>
diff mbox series

Patch

diff --git a/src/fssum.c b/src/fssum.c
index 5da39abf..f1da72fb 100644
--- a/src/fssum.c
+++ b/src/fssum.c
@@ -224,71 +224,16 @@  int
 sum_file_data_permissive(int fd, sum_t *dst)
 {
 	int ret;
-	off_t pos;
-	off_t old;
-	int i;
-	uint64_t zeros = 0;
-
-	pos = lseek(fd, 0, SEEK_CUR);
-	if (pos == (off_t)-1)
-		return errno == ENXIO ? 0 : -2;
 
 	while (1) {
-		old = pos;
-		pos = lseek(fd, pos, SEEK_DATA);
-		if (pos == (off_t)-1) {
-			if (errno == ENXIO) {
-				ret = 0;
-				pos = lseek(fd, 0, SEEK_END);
-				if (pos != (off_t)-1)
-					zeros += pos - old;
-			} else {
-				ret = -2;
-			}
-			break;
-		}
 		ret = read(fd, buf, sizeof(buf));
-		assert(ret); /* eof found by lseek */
-		if (ret <= 0)
+		if (ret < 0)
+			return -errno;
+		sum_add(dst, buf, ret);
+		if (ret < sizeof(buf))
 			break;
-		if (old < pos) /* hole */
-			zeros += pos - old;
-		for (i = 0; i < ret; ++i) {
-			for (old = i; buf[i] == 0 && i < ret; ++i)
-				;
-			if (old < i) /* code like a hole */
-				zeros += i - old;
-			if (i == ret)
-				break;
-			if (zeros) {
-				if (verbose >= 2)
-					fprintf(stderr,
-						"adding %llu zeros to sum\n",
-						(unsigned long long)zeros);
-				sum_add_u64(dst, 0);
-				sum_add_u64(dst, zeros);
-				zeros = 0;
-			}
-			for (old = i; buf[i] != 0 && i < ret; ++i)
-				;
-			if (verbose >= 2)
-				fprintf(stderr, "adding %u non-zeros to sum\n",
-					i - (int)old);
-			sum_add(dst, buf + old, i - old);
-		}
-		pos += ret;
 	}
-
-	if (zeros) {
-		if (verbose >= 2)
-			fprintf(stderr,
-				"adding %llu zeros to sum (finishing)\n",
-				(unsigned long long)zeros);
-		sum_add_u64(dst, 0);
-		sum_add_u64(dst, zeros);
-	}
-
-	return ret;
+	return 0;
 }
 
 int