diff mbox series

[2/3] fsx: support eof page pollution for eof zeroing test coverage

Message ID 20240822144422.188462-3-bfoster@redhat.com (mailing list archive)
State New
Headers show
Series fstests/fsx: test coverage for eof zeroing | expand

Commit Message

Brian Foster Aug. 22, 2024, 2:44 p.m. UTC
File ranges that are newly exposed via size changing operations are
expected to return zeroes until written to. This behavior tends to
be difficult to regression test as failures can be racy and
transient. fsx is probably the best tool for this type of test
coverage, but uncovering issues can require running for a
significantly longer period of time than is typically invoked
through fstests tests. As a result, these types of regressions tend
to go unnoticed for an unfortunate amount of time.

To facilitate uncovering these problems more quickly, implement an
eof pollution mode in fsx that opportunistically injects post-eof
data prior to operations that change file size. Since data injection
occurs immediately before the size changing operation, it can be
used to detect problems in partial eof page/block zeroing associated
with each relevant operation.

The implementation takes advantage of the fact that mapped writes
can place data beyond eof so long as the page starts within eof. The
main reason for the isolated per-operation approach (vs. something
like allowing mapped writes to write beyond eof, for example) is to
accommodate the fact that writeback zeroes post-eof data on the eof
page. The current approach is therefore not necessarily guaranteed
to detect all problems, but provides more generic and broad test
coverage than the alternative of testing explicit command sequences
and doesn't require significant changes to how fsx works. If this
proves useful long term, further enhancements can be considered that
might facilitate the presence of post-eof data across operations.

Enable the feature with the -e command line option. It is disabled
by default because zeroing behavior is inconsistent across
filesystems. This can also be revisited in the future if zeroing
behavior is refined for the major filesystems that rely on fstests
for regression testing.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 ltp/fsx.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 72 insertions(+), 2 deletions(-)

Comments

Darrick J. Wong Aug. 22, 2024, 8:52 p.m. UTC | #1
On Thu, Aug 22, 2024 at 10:44:21AM -0400, Brian Foster wrote:
> File ranges that are newly exposed via size changing operations are
> expected to return zeroes until written to. This behavior tends to
> be difficult to regression test as failures can be racy and
> transient. fsx is probably the best tool for this type of test
> coverage, but uncovering issues can require running for a
> significantly longer period of time than is typically invoked
> through fstests tests. As a result, these types of regressions tend
> to go unnoticed for an unfortunate amount of time.
> 
> To facilitate uncovering these problems more quickly, implement an
> eof pollution mode in fsx that opportunistically injects post-eof
> data prior to operations that change file size. Since data injection
> occurs immediately before the size changing operation, it can be
> used to detect problems in partial eof page/block zeroing associated
> with each relevant operation.
> 
> The implementation takes advantage of the fact that mapped writes
> can place data beyond eof so long as the page starts within eof. The
> main reason for the isolated per-operation approach (vs. something
> like allowing mapped writes to write beyond eof, for example) is to
> accommodate the fact that writeback zeroes post-eof data on the eof
> page. The current approach is therefore not necessarily guaranteed
> to detect all problems, but provides more generic and broad test
> coverage than the alternative of testing explicit command sequences
> and doesn't require significant changes to how fsx works. If this
> proves useful long term, further enhancements can be considered that
> might facilitate the presence of post-eof data across operations.
> 
> Enable the feature with the -e command line option. It is disabled
> by default because zeroing behavior is inconsistent across
> filesystems. This can also be revisited in the future if zeroing
> behavior is refined for the major filesystems that rely on fstests
> for regression testing.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  ltp/fsx.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 72 insertions(+), 2 deletions(-)
> 
> diff --git a/ltp/fsx.c b/ltp/fsx.c
> index 1389c51d..20b8cd9f 100644
> --- a/ltp/fsx.c
> +++ b/ltp/fsx.c
> @@ -178,6 +178,7 @@ int	dedupe_range_calls = 1;		/* -B flag disables */
>  int	copy_range_calls = 1;		/* -E flag disables */
>  int	exchange_range_calls = 1;	/* -0 flag disables */
>  int	integrity = 0;			/* -i flag */
> +int	pollute_eof = 0;		/* -e flag */
>  int	fsxgoodfd = 0;
>  int	o_direct;			/* -Z */
>  int	aio = 0;
> @@ -983,6 +984,58 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
>  	}
>  }
>  
> +/*
> + * Pollute the EOF page with data beyond EOF prior to size change operations.
> + * This provides additional test coverage for partial EOF block/page zeroing.
> + * If the upcoming operation does not correctly zero, incorrect file data will
> + * be detected.
> + */
> +void
> +pollute_eofpage(unsigned int maxoff)
> +{
> +	unsigned offset = file_size;
> +	unsigned pg_offset;
> +	unsigned write_size;
> +	char    *p;
> +
> +	if (!pollute_eof)
> +		return;
> +
> +	/* write up to specified max or the end of the eof page */
> +	pg_offset = offset & mmap_mask;
> +	write_size = MIN(PAGE_SIZE - pg_offset, maxoff - offset);
> +
> +	if (!pg_offset)
> +		return;
> +
> +	if (!quiet &&
> +	    ((progressinterval && testcalls % progressinterval == 0) ||
> +	    (debug &&
> +	     (monitorstart == -1 ||
> +	     (offset + write_size > monitorstart &&
> +	      (monitorend == -1 || offset <= monitorend)))))) {
> +		prt("%lld pollute_eof\t0x%x thru\t0x%x\t(0x%x bytes)\n",
> +			testcalls, offset, offset + write_size - 1, write_size);
> +	}
> +
> +	if ((p = (char *)mmap(0, PAGE_SIZE, PROT_READ | PROT_WRITE,
> +			      MAP_FILE | MAP_SHARED, fd,
> +			      (off_t)(offset - pg_offset))) == (char *)-1) {

Nit:

if (mmap(...) == MAP_FAILED)?

Otherwise I like the concept here. :)

--D

> +		prterr("pollute_eofpage: mmap");
> +		return;
> +	}
> +
> +	/*
> +	 * Write to a range just past EOF of the test file. Do not update the
> +	 * good buffer because the upcoming operation is expected to zero this
> +	 * range of the file.
> +	 */
> +	gendata(original_buf, p, pg_offset, write_size);
> +
> +	if (munmap(p, PAGE_SIZE) != 0)
> +		prterr("pollute_eofpage: munmap");
> +}
> +
>  /*
>   * Helper to update the tracked file size. If the offset begins beyond current
>   * EOF, zero the range from EOF to offset in the good buffer.
> @@ -990,8 +1043,10 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
>  void
>  update_file_size(unsigned offset, unsigned size)
>  {
> -	if (offset > file_size)
> +	if (offset > file_size) {
> +		pollute_eofpage(offset + size);
>  		memset(good_buf + file_size, '\0', offset - file_size);
> +	}
>  	file_size = offset + size;
>  }
>  
> @@ -1143,6 +1198,9 @@ dotruncate(unsigned size)
>  
>  	log4(OP_TRUNCATE, 0, size, FL_NONE);
>  
> +	/* pollute the current EOF before a truncate down */
> +	if (size < file_size)
> +		pollute_eofpage(maxfilelen);
>  	update_file_size(size, 0);
>  
>  	if (testcalls <= simulatedopcount)
> @@ -1305,6 +1363,9 @@ do_collapse_range(unsigned offset, unsigned length)
>  
>  	log4(OP_COLLAPSE_RANGE, offset, length, FL_NONE);
>  
> +	/* pollute current eof before collapse truncates down */
> +	pollute_eofpage(maxfilelen);
> +
>  	if (testcalls <= simulatedopcount)
>  		return;
>  
> @@ -1356,6 +1417,9 @@ do_insert_range(unsigned offset, unsigned length)
>  
>  	log4(OP_INSERT_RANGE, offset, length, FL_NONE);
>  
> +	/* pollute current eof before insert truncates up */
> +	pollute_eofpage(maxfilelen);
> +
>  	if (testcalls <= simulatedopcount)
>  		return;
>  
> @@ -2385,6 +2449,7 @@ usage(void)
>  	-b opnum: beginning operation number (default 1)\n\
>  	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
>  	-d: debug output for all operations\n\
> +	-e: pollute post-eof on size changes (default 0)\n\
>  	-f: flush and invalidate cache after I/O\n\
>  	-g X: write character X instead of random generated data\n\
>  	-i logdev: do integrity testing, logdev is the dm log writes device\n\
> @@ -2783,7 +2848,7 @@ main(int argc, char **argv)
>  	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
>  
>  	while ((ch = getopt_long(argc, argv,
> -				 "0b:c:dfg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> +				 "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
>  				 longopts, NULL)) != EOF)
>  		switch (ch) {
>  		case 'b':
> @@ -2805,6 +2870,11 @@ main(int argc, char **argv)
>  		case 'd':
>  			debug = 1;
>  			break;
> +		case 'e':
> +			pollute_eof = getnum(optarg, &endp);
> +			if (pollute_eof < 0 || pollute_eof > 1)
> +				usage();
> +			break;
>  		case 'f':
>  			flush = 1;
>  			break;
> -- 
> 2.45.0
> 
>
Brian Foster Aug. 26, 2024, 2:04 p.m. UTC | #2
On Thu, Aug 22, 2024 at 01:52:57PM -0700, Darrick J. Wong wrote:
> On Thu, Aug 22, 2024 at 10:44:21AM -0400, Brian Foster wrote:
> > File ranges that are newly exposed via size changing operations are
> > expected to return zeroes until written to. This behavior tends to
> > be difficult to regression test as failures can be racy and
> > transient. fsx is probably the best tool for this type of test
> > coverage, but uncovering issues can require running for a
> > significantly longer period of time than is typically invoked
> > through fstests tests. As a result, these types of regressions tend
> > to go unnoticed for an unfortunate amount of time.
> > 
> > To facilitate uncovering these problems more quickly, implement an
> > eof pollution mode in fsx that opportunistically injects post-eof
> > data prior to operations that change file size. Since data injection
> > occurs immediately before the size changing operation, it can be
> > used to detect problems in partial eof page/block zeroing associated
> > with each relevant operation.
> > 
> > The implementation takes advantage of the fact that mapped writes
> > can place data beyond eof so long as the page starts within eof. The
> > main reason for the isolated per-operation approach (vs. something
> > like allowing mapped writes to write beyond eof, for example) is to
> > accommodate the fact that writeback zeroes post-eof data on the eof
> > page. The current approach is therefore not necessarily guaranteed
> > to detect all problems, but provides more generic and broad test
> > coverage than the alternative of testing explicit command sequences
> > and doesn't require significant changes to how fsx works. If this
> > proves useful long term, further enhancements can be considered that
> > might facilitate the presence of post-eof data across operations.
> > 
> > Enable the feature with the -e command line option. It is disabled
> > by default because zeroing behavior is inconsistent across
> > filesystems. This can also be revisited in the future if zeroing
> > behavior is refined for the major filesystems that rely on fstests
> > for regression testing.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  ltp/fsx.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 72 insertions(+), 2 deletions(-)
> > 
> > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > index 1389c51d..20b8cd9f 100644
> > --- a/ltp/fsx.c
> > +++ b/ltp/fsx.c
> > @@ -178,6 +178,7 @@ int	dedupe_range_calls = 1;		/* -B flag disables */
> >  int	copy_range_calls = 1;		/* -E flag disables */
> >  int	exchange_range_calls = 1;	/* -0 flag disables */
> >  int	integrity = 0;			/* -i flag */
> > +int	pollute_eof = 0;		/* -e flag */
> >  int	fsxgoodfd = 0;
> >  int	o_direct;			/* -Z */
> >  int	aio = 0;
> > @@ -983,6 +984,58 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
> >  	}
> >  }
> >  
> > +/*
> > + * Pollute the EOF page with data beyond EOF prior to size change operations.
> > + * This provides additional test coverage for partial EOF block/page zeroing.
> > + * If the upcoming operation does not correctly zero, incorrect file data will
> > + * be detected.
> > + */
> > +void
> > +pollute_eofpage(unsigned int maxoff)
> > +{
> > +	unsigned offset = file_size;
> > +	unsigned pg_offset;
> > +	unsigned write_size;
> > +	char    *p;
> > +
> > +	if (!pollute_eof)
> > +		return;
> > +
> > +	/* write up to specified max or the end of the eof page */
> > +	pg_offset = offset & mmap_mask;
> > +	write_size = MIN(PAGE_SIZE - pg_offset, maxoff - offset);
> > +
> > +	if (!pg_offset)
> > +		return;
> > +
> > +	if (!quiet &&
> > +	    ((progressinterval && testcalls % progressinterval == 0) ||
> > +	    (debug &&
> > +	     (monitorstart == -1 ||
> > +	     (offset + write_size > monitorstart &&
> > +	      (monitorend == -1 || offset <= monitorend)))))) {
> > +		prt("%lld pollute_eof\t0x%x thru\t0x%x\t(0x%x bytes)\n",
> > +			testcalls, offset, offset + write_size - 1, write_size);
> > +	}
> > +
> > +	if ((p = (char *)mmap(0, PAGE_SIZE, PROT_READ | PROT_WRITE,
> > +			      MAP_FILE | MAP_SHARED, fd,
> > +			      (off_t)(offset - pg_offset))) == (char *)-1) {
> 
> Nit:
> 
> if (mmap(...) == MAP_FAILED)?
> 
> Otherwise I like the concept here. :)
> 

Yep, copy paste fail. Will fix.

Brian

> --D
> 
> > +		prterr("pollute_eofpage: mmap");
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * Write to a range just past EOF of the test file. Do not update the
> > +	 * good buffer because the upcoming operation is expected to zero this
> > +	 * range of the file.
> > +	 */
> > +	gendata(original_buf, p, pg_offset, write_size);
> > +
> > +	if (munmap(p, PAGE_SIZE) != 0)
> > +		prterr("pollute_eofpage: munmap");
> > +}
> > +
> >  /*
> >   * Helper to update the tracked file size. If the offset begins beyond current
> >   * EOF, zero the range from EOF to offset in the good buffer.
> > @@ -990,8 +1043,10 @@ gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
> >  void
> >  update_file_size(unsigned offset, unsigned size)
> >  {
> > -	if (offset > file_size)
> > +	if (offset > file_size) {
> > +		pollute_eofpage(offset + size);
> >  		memset(good_buf + file_size, '\0', offset - file_size);
> > +	}
> >  	file_size = offset + size;
> >  }
> >  
> > @@ -1143,6 +1198,9 @@ dotruncate(unsigned size)
> >  
> >  	log4(OP_TRUNCATE, 0, size, FL_NONE);
> >  
> > +	/* pollute the current EOF before a truncate down */
> > +	if (size < file_size)
> > +		pollute_eofpage(maxfilelen);
> >  	update_file_size(size, 0);
> >  
> >  	if (testcalls <= simulatedopcount)
> > @@ -1305,6 +1363,9 @@ do_collapse_range(unsigned offset, unsigned length)
> >  
> >  	log4(OP_COLLAPSE_RANGE, offset, length, FL_NONE);
> >  
> > +	/* pollute current eof before collapse truncates down */
> > +	pollute_eofpage(maxfilelen);
> > +
> >  	if (testcalls <= simulatedopcount)
> >  		return;
> >  
> > @@ -1356,6 +1417,9 @@ do_insert_range(unsigned offset, unsigned length)
> >  
> >  	log4(OP_INSERT_RANGE, offset, length, FL_NONE);
> >  
> > +	/* pollute current eof before insert truncates up */
> > +	pollute_eofpage(maxfilelen);
> > +
> >  	if (testcalls <= simulatedopcount)
> >  		return;
> >  
> > @@ -2385,6 +2449,7 @@ usage(void)
> >  	-b opnum: beginning operation number (default 1)\n\
> >  	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
> >  	-d: debug output for all operations\n\
> > +	-e: pollute post-eof on size changes (default 0)\n\
> >  	-f: flush and invalidate cache after I/O\n\
> >  	-g X: write character X instead of random generated data\n\
> >  	-i logdev: do integrity testing, logdev is the dm log writes device\n\
> > @@ -2783,7 +2848,7 @@ main(int argc, char **argv)
> >  	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
> >  
> >  	while ((ch = getopt_long(argc, argv,
> > -				 "0b:c:dfg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > +				 "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> >  				 longopts, NULL)) != EOF)
> >  		switch (ch) {
> >  		case 'b':
> > @@ -2805,6 +2870,11 @@ main(int argc, char **argv)
> >  		case 'd':
> >  			debug = 1;
> >  			break;
> > +		case 'e':
> > +			pollute_eof = getnum(optarg, &endp);
> > +			if (pollute_eof < 0 || pollute_eof > 1)
> > +				usage();
> > +			break;
> >  		case 'f':
> >  			flush = 1;
> >  			break;
> > -- 
> > 2.45.0
> > 
> > 
>
diff mbox series

Patch

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 1389c51d..20b8cd9f 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -178,6 +178,7 @@  int	dedupe_range_calls = 1;		/* -B flag disables */
 int	copy_range_calls = 1;		/* -E flag disables */
 int	exchange_range_calls = 1;	/* -0 flag disables */
 int	integrity = 0;			/* -i flag */
+int	pollute_eof = 0;		/* -e flag */
 int	fsxgoodfd = 0;
 int	o_direct;			/* -Z */
 int	aio = 0;
@@ -983,6 +984,58 @@  gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
 	}
 }
 
+/*
+ * Pollute the EOF page with data beyond EOF prior to size change operations.
+ * This provides additional test coverage for partial EOF block/page zeroing.
+ * If the upcoming operation does not correctly zero, incorrect file data will
+ * be detected.
+ */
+void
+pollute_eofpage(unsigned int maxoff)
+{
+	unsigned offset = file_size;
+	unsigned pg_offset;
+	unsigned write_size;
+	char    *p;
+
+	if (!pollute_eof)
+		return;
+
+	/* write up to specified max or the end of the eof page */
+	pg_offset = offset & mmap_mask;
+	write_size = MIN(PAGE_SIZE - pg_offset, maxoff - offset);
+
+	if (!pg_offset)
+		return;
+
+	if (!quiet &&
+	    ((progressinterval && testcalls % progressinterval == 0) ||
+	    (debug &&
+	     (monitorstart == -1 ||
+	     (offset + write_size > monitorstart &&
+	      (monitorend == -1 || offset <= monitorend)))))) {
+		prt("%lld pollute_eof\t0x%x thru\t0x%x\t(0x%x bytes)\n",
+			testcalls, offset, offset + write_size - 1, write_size);
+	}
+
+	if ((p = (char *)mmap(0, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			      MAP_FILE | MAP_SHARED, fd,
+			      (off_t)(offset - pg_offset))) == (char *)-1) {
+		prterr("pollute_eofpage: mmap");
+		return;
+	}
+
+	/*
+	 * Write to a range just past EOF of the test file. Do not update the
+	 * good buffer because the upcoming operation is expected to zero this
+	 * range of the file.
+	 */
+	gendata(original_buf, p, pg_offset, write_size);
+
+	if (munmap(p, PAGE_SIZE) != 0)
+		prterr("pollute_eofpage: munmap");
+}
+
 /*
  * Helper to update the tracked file size. If the offset begins beyond current
  * EOF, zero the range from EOF to offset in the good buffer.
@@ -990,8 +1043,10 @@  gendata(char *original_buf, char *good_buf, unsigned offset, unsigned size)
 void
 update_file_size(unsigned offset, unsigned size)
 {
-	if (offset > file_size)
+	if (offset > file_size) {
+		pollute_eofpage(offset + size);
 		memset(good_buf + file_size, '\0', offset - file_size);
+	}
 	file_size = offset + size;
 }
 
@@ -1143,6 +1198,9 @@  dotruncate(unsigned size)
 
 	log4(OP_TRUNCATE, 0, size, FL_NONE);
 
+	/* pollute the current EOF before a truncate down */
+	if (size < file_size)
+		pollute_eofpage(maxfilelen);
 	update_file_size(size, 0);
 
 	if (testcalls <= simulatedopcount)
@@ -1305,6 +1363,9 @@  do_collapse_range(unsigned offset, unsigned length)
 
 	log4(OP_COLLAPSE_RANGE, offset, length, FL_NONE);
 
+	/* pollute current eof before collapse truncates down */
+	pollute_eofpage(maxfilelen);
+
 	if (testcalls <= simulatedopcount)
 		return;
 
@@ -1356,6 +1417,9 @@  do_insert_range(unsigned offset, unsigned length)
 
 	log4(OP_INSERT_RANGE, offset, length, FL_NONE);
 
+	/* pollute current eof before insert truncates up */
+	pollute_eofpage(maxfilelen);
+
 	if (testcalls <= simulatedopcount)
 		return;
 
@@ -2385,6 +2449,7 @@  usage(void)
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
+	-e: pollute post-eof on size changes (default 0)\n\
 	-f: flush and invalidate cache after I/O\n\
 	-g X: write character X instead of random generated data\n\
 	-i logdev: do integrity testing, logdev is the dm log writes device\n\
@@ -2783,7 +2848,7 @@  main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
 
 	while ((ch = getopt_long(argc, argv,
-				 "0b:c:dfg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
+				 "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ",
 				 longopts, NULL)) != EOF)
 		switch (ch) {
 		case 'b':
@@ -2805,6 +2870,11 @@  main(int argc, char **argv)
 		case 'd':
 			debug = 1;
 			break;
+		case 'e':
+			pollute_eof = getnum(optarg, &endp);
+			if (pollute_eof < 0 || pollute_eof > 1)
+				usage();
+			break;
 		case 'f':
 			flush = 1;
 			break;