
[BUG] pvmove corrupting XFS filesystems (was Re: [BUG] Internal error xfs_dir2_data_reada_verify)

Message ID 20130308015723.GA23616@dastard (mailing list archive)
State Deferred, archived

Commit Message

Dave Chinner March 8, 2013, 1:57 a.m. UTC
On Thu, Mar 07, 2013 at 07:09:31PM -0500, Matteo Frigo wrote:
> Dave Chinner <david@fromorbit.com> writes:
> 
> > You need the XFS patch I posted so that readahead buffer
> > verification is avoided in the case of an error being returned from
> > the readahead.
> 
> I apologize if I was not clear in my previous post.  I mean to say that
> returning -EIO from dm, even in conjunction with your patch, is not
> sufficient to fix the problem.
> 
> Specifically, I repeated the experiment with v3.8.2 patched as discussed
> below, running my original script (repeated here for completeness):
> 
>    pvcreate /dev/vd[bc]
>    vgcreate test /dev/vd[bc]
>    lvcreate -L 8G -n vol test /dev/vdb
>    mkfs.xfs -f /dev/mapper/test-vol
>    mount -o noatime /dev/mapper/test-vol /mnt
>    cd /mnt
>    git clone ~/linux-stable
>    cd /
>    umount /mnt
> 
>    mount -o noatime /dev/mapper/test-vol /mnt
>    pvmove -b /dev/vdb /dev/vdc
>    sleep 2
>    rm -rf /mnt/linux-stable
> 
> I obtained a string of errors that starts with this:
> 
>   [  166.596574] XFS (dm-1): metadata I/O error: block 0x805060 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.599556] XFS (dm-1): metadata I/O error: block 0x805060 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.604845] XFS (dm-1): metadata I/O error: block 0x5285b8 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.607894] XFS (dm-1): metadata I/O error: block 0x5285b8 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.614242] XFS (dm-1): metadata I/O error: block 0x54f2b0 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.617307] XFS (dm-1): metadata I/O error: block 0x54f2b0 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.651373] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.653517] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.655545] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.657614] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.659685] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.661731] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.663761] XFS (dm-1): Corruption detected. Unmount and run xfs_repair

Add the patch below. If you still see errors, then they are real
IO errors from the block device.

Cheers,

Dave.

Comments

Matteo Frigo March 8, 2013, 11:38 a.m. UTC | #1
Dave Chinner <david@fromorbit.com> writes:

> Add the patch below. If you still see errors, then they are real
> IO errors from the block device.

This patch fixes the problem for me.

The patch works both when dm-raid1 returns -EIO and when it returns
-EWOULDBLOCK.

Thanks for your help.

Cheers,
Matteo


Patch

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 50eb603..82b70bd 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1336,6 +1336,12 @@  _xfs_buf_ioapply(
 	int		size;
 	int		i;
 
+	/*
+	 * Make sure we capture only current IO errors rather than stale errors
+	 * left over from previous use of the buffer (e.g. failed readahead).
+	 */
+	bp->b_error = 0;
+
 	if (bp->b_flags & XBF_WRITE) {
 		if (bp->b_flags & XBF_SYNCIO)
 			rw = WRITE_SYNC;
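
The effect of the hunk above can be illustrated outside the kernel with a
small userspace sketch. It is not XFS code: struct toy_buf and the toy_*
helpers are hypothetical names, and the model assumes that I/O completion
only records b_error when the I/O fails, so an error left behind by an
earlier use of the buffer (e.g. failed readahead) survives a later
successful read unless the submit path clears it first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_buf; only the error field is modelled. */
struct toy_buf {
	int	b_error;	/* result of the most recent I/O, 0 on success */
};

/*
 * Simulated I/O completion. Assumption: an error is recorded only when
 * the I/O fails; a successful I/O leaves b_error untouched.
 */
static void toy_do_io(struct toy_buf *bp, int io_result)
{
	if (io_result)
		bp->b_error = io_result;
}

/* Submit path without the fix: a stale b_error is left in place. */
static int toy_submit_unpatched(struct toy_buf *bp, int io_result)
{
	assert(bp != NULL);
	toy_do_io(bp, io_result);
	return bp->b_error;
}

/* Submit path with the fix: clear b_error before issuing the new I/O. */
static int toy_submit_patched(struct toy_buf *bp, int io_result)
{
	assert(bp != NULL);
	bp->b_error = 0;	/* capture only errors from this I/O */
	toy_do_io(bp, io_result);
	return bp->b_error;
}
```

With a buffer whose b_error was left at EIO by an earlier failure,
toy_submit_unpatched() still reports EIO after a successful I/O, while
toy_submit_patched() reports 0 and returns EIO only when the new I/O itself
fails. In this model that matches the behaviour described in the thread:
with the patch applied, any remaining errors come from the block device.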