Randomly fail readahead I/Os

Message ID 20201003012110.GC20115@casper.infradead.org (mailing list archive)
State New

Commit Message

Matthew Wilcox (Oracle) Oct. 3, 2020, 1:21 a.m. UTC
I have a patch in my THP development tree that fails 10% of the readahead
I/Os in order to make sure the fallback paths are tested.  This has
exposed three distinct problems so far, resulting in three patches that
are scheduled for 5.10:

    iomap: Set all uptodate bits for an Uptodate page
    iomap: Mark read blocks uptodate in write_begin
    iomap: Clear page error before beginning a write

I've hit a fourth problem when running generic/127:

XFS (sdb): Internal error isnullstartblock(got.br_startblock) at line 5807 of file fs/xfs/libxfs/xfs_bmap.c.  Caller xfs_bmap_collapse_extents+0x2bd/0x370
CPU: 4 PID: 4493 Comm: fsx Kdump: loaded Not tainted 5.9.0-rc3-00178-g35daf53935c9-dirty #765
Call Trace:

That finally persuaded me to port the patch to the current iomap for-next
tree (see below).  Unfortunately, the problem doesn't reproduce there, but
I wonder if it's simply that a 4kB page size is too small.  Would anyone like to
give this a shot on a 64kB page size system?  It usually takes less than
15 minutes to reproduce with my THP patchset, but doesn't reproduce in
2 hours without the THP patchset.
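
For anyone who wants to see the failure rate without reading the kernel
code, here is a minimal userspace sketch of the tagging scheme the patch
uses: each readahead I/O gets the next value of a counter that wraps at 10,
and the completion side fails the I/O whose tag is 7.  The function names
(next_readahead_tag, should_fail, count_failures) are hypothetical and
exist only for this illustration; the kernel patch stores the tag in
bio->bi_private instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the value the patch stores in bi_private.
 * Like the static counter in the patch, this is not thread-safe.
 */
static uintptr_t next_readahead_tag(void)
{
	static uintptr_t counter;
	uintptr_t tag = counter++;

	/* Wrap so that the tags cycle through 0..9. */
	if (counter == 10)
		counter = 0;
	return tag;
}

/* Completion-side check: exactly one tag in every ten is failed. */
static bool should_fail(uintptr_t tag)
{
	return tag == 7;
}

/* Submit nr_ios simulated readahead I/Os and count the failed ones. */
int count_failures(int nr_ios)
{
	int failed = 0;

	assert(nr_ios >= 0);
	for (int i = 0; i < nr_ios; i++)
		if (should_fail(next_readahead_tag()))
			failed++;
	return failed;
}
```

Because the counter is deterministic rather than random, the failures are
evenly spaced: every tenth readahead I/O fails, starting with the eighth.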

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 8180061b9e16..2e67631a12ce 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -193,6 +193,8 @@  iomap_read_end_io(struct bio *bio)
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
+	if (bio->bi_private == (void *)7)
+		error = -EIO;
 	bio_for_each_segment_all(bvec, bio, iter_all)
 		iomap_read_page_end_io(bvec, error);
@@ -286,6 +288,12 @@  iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		if (ctx->rac) /* same as readahead_gfp_mask */
 			gfp |= __GFP_NORETRY | __GFP_NOWARN;
 		ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
+		if (ctx->rac) {
+			static int error = 0;
+			ctx->bio->bi_private = (void *)(error++);
+			if (error == 10)
+				error = 0;
+		}
 		/*
 		 * If the bio_alloc fails, try it again for a single page to
 		 * avoid having to deal with partial page reads.  This emulates