
[v4,RESEND,2/2] buffer: record blockdev write errors in super_block that it backs

Message ID 20200414120409.293749-3-jlayton@kernel.org
State New, archived
Series vfs: have syncfs() return error when there are writeback errors

Commit Message

Jeff Layton April 14, 2020, 12:04 p.m. UTC
From: Jeff Layton <jlayton@redhat.com>

When syncing out a block device (a'la __sync_blockdev), any error
encountered will only be recorded in the bd_inode's mapping. When the
blockdev contains a filesystem however, we'd like to also record the
error in the super_block that's stored there.

Make mark_buffer_write_io_error also record the error in the
corresponding super_block when a writeback error occurs and the block
device contains a mounted superblock.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/buffer.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jan Kara April 14, 2020, 4:26 p.m. UTC | #1
On Tue 14-04-20 08:04:09, Jeff Layton wrote:
> From: Jeff Layton <jlayton@redhat.com>
> 
> When syncing out a block device (a'la __sync_blockdev), any error
> encountered will only be recorded in the bd_inode's mapping. When the
> blockdev contains a filesystem however, we'd like to also record the
> error in the super_block that's stored there.
> 
> Make mark_buffer_write_io_error also record the error in the
> corresponding super_block when a writeback error occurs and the block
> device contains a mounted superblock.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>

The patch looks good to me. I'd just note that bh->b_bdev->bd_super
dereference is safe only because we will flush all dirty data when
unmounting a filesystem which is somewhat tricky. Maybe that warrants a
comment? Otherwise feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/buffer.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index f73276d746bb..a9d986d27fa1 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1160,6 +1160,8 @@ void mark_buffer_write_io_error(struct buffer_head *bh)
>  		mapping_set_error(bh->b_page->mapping, -EIO);
>  	if (bh->b_assoc_map)
>  		mapping_set_error(bh->b_assoc_map, -EIO);
> +	if (bh->b_bdev->bd_super)
> +		errseq_set(&bh->b_bdev->bd_super->s_wb_err, -EIO);
>  }
>  EXPORT_SYMBOL(mark_buffer_write_io_error);
>  
> -- 
> 2.25.2
>
Jeff Layton April 14, 2020, 6:37 p.m. UTC | #2
On Tue, 2020-04-14 at 18:26 +0200, Jan Kara wrote:
> On Tue 14-04-20 08:04:09, Jeff Layton wrote:
> > From: Jeff Layton <jlayton@redhat.com>
> > 
> > When syncing out a block device (a'la __sync_blockdev), any error
> > encountered will only be recorded in the bd_inode's mapping. When the
> > blockdev contains a filesystem however, we'd like to also record the
> > error in the super_block that's stored there.
> > 
> > Make mark_buffer_write_io_error also record the error in the
> > corresponding super_block when a writeback error occurs and the block
> > device contains a mounted superblock.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> 
> The patch looks good to me. I'd just note that bh->b_bdev->bd_super
> dereference is safe only because we will flush all dirty data when
> unmounting a filesystem which is somewhat tricky. Maybe that warrants a
> comment? Otherwise feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Oh, hmm...now that I look again, I'm not sure this is actually safe.

bh->b_bdev gets cleared out as we discard the buffer, so I don't think
that could end up getting zeroed while we're still using it.

The bd_super pointer gets zeroed out in kill_block_super, and after that
point it calls sync_blockdev(). Could writeback error processing race
with kill_block_super such that bd_super gets set to NULL after we test
it but before we dereference it?

Thanks,
Jan Kara April 15, 2020, 9:17 a.m. UTC | #3
On Tue 14-04-20 14:37:21, Jeff Layton wrote:
> On Tue, 2020-04-14 at 18:26 +0200, Jan Kara wrote:
> > On Tue 14-04-20 08:04:09, Jeff Layton wrote:
> > > From: Jeff Layton <jlayton@redhat.com>
> > > 
> > > When syncing out a block device (a'la __sync_blockdev), any error
> > > encountered will only be recorded in the bd_inode's mapping. When the
> > > blockdev contains a filesystem however, we'd like to also record the
> > > error in the super_block that's stored there.
> > > 
> > > Make mark_buffer_write_io_error also record the error in the
> > > corresponding super_block when a writeback error occurs and the block
> > > device contains a mounted superblock.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > 
> > The patch looks good to me. I'd just note that bh->b_bdev->bd_super
> > dereference is safe only because we will flush all dirty data when
> > unmounting a filesystem which is somewhat tricky. Maybe that warrants a
> > comment? Otherwise feel free to add:
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> 
> Oh, hmm...now that I look again, I'm not sure this is actually safe.
> 
> bh->b_bdev gets cleared out as we discard the buffer, so I don't think
> that could end up getting zeroed while we're still using it.

Correct.

> The bd_super pointer gets zeroed out in kill_block_super, and after that
> point it calls sync_blockdev(). Could writeback error processing race
> with kill_block_super such that bd_super gets set to NULL after we test
> it but before we dereference it?

Yeah, you're right. But you can avoid the race with
READ_ONCE(bh->b_bdev->bd_super) and a big fat comment explaining why it is
safe... :)
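
For illustration only, here is a minimal userspace sketch of the
"load once, then test and use the local copy" pattern that READ_ONCE
gives you. It is not the kernel code: struct sb_sketch, struct
bdev_sketch and record_wb_error() are hypothetical stand-ins, a plain
int stands in for the errseq_t field, and a relaxed C11 atomic load
stands in for READ_ONCE. It only shows why a single load avoids the
test-then-dereference NULL race; whether the superblock is still alive
when it is written to is a separate question that this sketch does not
answer.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct sb_sketch {
	int s_wb_err;		/* stand-in for the errseq_t field */
};

struct bdev_sketch {
	/* May be set to NULL concurrently by the unmount path. */
	_Atomic(struct sb_sketch *) bd_super;
};

/*
 * Load bd_super exactly once, then test and dereference the local copy.
 * Testing bdev->bd_super and then re-reading it for the dereference
 * would allow the pointer to become NULL between the two loads.
 *
 * Returns 1 if the error was recorded, 0 if no superblock was attached.
 */
static int record_wb_error(struct bdev_sketch *bdev)
{
	struct sb_sketch *sb =
		atomic_load_explicit(&bdev->bd_super, memory_order_relaxed);

	if (!sb)
		return 0;

	sb->s_wb_err = -EIO;
	return 1;
}
```

The single load guarantees that the pointer tested is the pointer
dereferenced. It says nothing about the lifetime of the object it
points to, which is why the approach still needs the explanatory
comment mentioned above.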

Or you could be less daring and put rcu protection there because
superblocks are RCU freed...

								Honza

Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index f73276d746bb..a9d986d27fa1 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1160,6 +1160,8 @@  void mark_buffer_write_io_error(struct buffer_head *bh)
 		mapping_set_error(bh->b_page->mapping, -EIO);
 	if (bh->b_assoc_map)
 		mapping_set_error(bh->b_assoc_map, -EIO);
+	if (bh->b_bdev->bd_super)
+		errseq_set(&bh->b_bdev->bd_super->s_wb_err, -EIO);
 }
 EXPORT_SYMBOL(mark_buffer_write_io_error);