btrfs: return -EIO on error in btree_write_cache_pages

Message ID 20200710140619.2366724-1-josef@toxicpanda.com
State New

Commit Message

Josef Bacik July 10, 2020, 2:06 p.m. UTC
Eric reported seeing this message while running generic/475:

BTRFS: error (device dm-3) in btrfs_sync_log:3084: errno=-117 Filesystem corrupted

This ret came from btrfs_write_marked_extents().  If we get an aborted
transaction via an -EIO somewhere, we'll see it in
btree_write_cache_pages() and return -EUCLEAN, which we spit out as
"Filesystem corrupted".  Except we shouldn't be returning -EUCLEAN here,
we need to be returning -EIO.  -EUCLEAN is reserved for actual
corruption, not IO errors.

Reported-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
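
The errno in the reported log line can be checked outside the kernel. The following is a minimal userspace sketch, not the btrfs code: `decode_error_sketch` is a hypothetical name, and only the "Filesystem corrupted" string is taken from the log message above; the other strings are illustrative assumptions. It shows that 117 is EUCLEAN on Linux, so returning -EUCLEAN for an IO failure yields a misleading corruption message.

```c
#include <assert.h>   /* for the usage checks mentioned below */
#include <errno.h>
#include <string.h>

/* EUCLEAN is Linux-specific ("Structure needs cleaning", value 117). */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/*
 * Hypothetical stand-in for the step that turns an errno into log text.
 * Assumption: the strings for -EIO and the default case are made up for
 * this sketch; only "Filesystem corrupted" appears in the report.
 */
const char *decode_error_sketch(int err)
{
	switch (err) {
	case -EIO:
		return "IO failure";
	case -EUCLEAN:
		return "Filesystem corrupted";
	default:
		return "unknown";
	}
}
```

With this model, `decode_error_sketch(-EUCLEAN)` gives the text seen in the report, while `decode_error_sketch(-EIO)` gives a message that points at the IO path instead.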

Comments

Eric Sandeen July 10, 2020, 2:50 p.m. UTC | #1
On 7/10/20 9:06 AM, Josef Bacik wrote:
> Eric reported seeing this message while running generic/475
> 
> BTRFS: error (device dm-3) in btrfs_sync_log:3084: errno=-117 Filesystem corrupted
> 
> This ret came from btrfs_write_marked_extents().  If we get an aborted
> transaction via an -EIO somewhere, we'll see it in
> btree_write_cache_pages() and return -EUCLEAN, which we spit out as
> "Filesystem corrupted".  Except we shouldn't be returning -EUCLEAN here,
> we need to be returning -EIO.  -EUCLEAN is reserved for actual
> corruption, not IO errors.

Is BTRFS_FS_STATE_ERROR only set for IO errors, or could it also be
set for an actual corruption state?

> Reported-by: Eric Sandeen <esandeen@redhat.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/extent_io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index a76b7da91aa6..6f0dd15729cc 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4122,7 +4122,7 @@ int btree_write_cache_pages(struct address_space *mapping,
>  	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
>  		ret = flush_write_bio(&epd);
>  	} else {
> -		ret = -EUCLEAN;
> +		ret = -EIO;
>  		end_write_bio(&epd, ret);
>  	}
>  	return ret;
>

Josef Bacik July 10, 2020, 2:54 p.m. UTC | #2
On 7/10/20 10:50 AM, Eric Sandeen wrote:
> On 7/10/20 9:06 AM, Josef Bacik wrote:
>> Eric reported seeing this message while running generic/475
>>
>> BTRFS: error (device dm-3) in btrfs_sync_log:3084: errno=-117 Filesystem corrupted
>>
>> This ret came from btrfs_write_marked_extents().  If we get an aborted
>> transaction via an -EIO somewhere, we'll see it in
>> btree_write_cache_pages() and return -EUCLEAN, which we spit out as
>> "Filesystem corrupted".  Except we shouldn't be returning -EUCLEAN here,
>> we need to be returning -EIO.  -EUCLEAN is reserved for actual
>> corruption, not IO errors.
> 
> Is BTRFS_FS_STATE_ERROR only set for IO errors, or could it also be
> set for an actual corruption state?

It's set when we abort the transaction, which can happen for either reason, I
suppose.  At this point we don't have the offending error, but the transaction
abort _would_ have it.  So if there was a corruption you would see it higher up
in the logs.

Thanks,

Josef
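
The point made in this reply can be sketched as a small userspace model. Everything here is hypothetical (`struct fs_model`, `abort_transaction_model`, `finish_writeback_model`); it is not the real `struct btrfs_fs_info` or the real abort path. The assumption is only what the reply states: the abort reports the real cause, and later callers see just the error flag, so they return a generic -EIO.

```c
#include <assert.h>   /* for the usage checks mentioned below */
#include <errno.h>
#include <stdbool.h>

/* EUCLEAN is Linux-specific (value 117). */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/* Hypothetical model, not the kernel structure. */
struct fs_model {
	bool state_error;  /* stands in for BTRFS_FS_STATE_ERROR */
	int logged_errno;  /* what the abort reported, 0 if nothing yet */
};

/* Abort: report the real cause, then leave the error flag behind. */
void abort_transaction_model(struct fs_model *fs, int err)
{
	fs->logged_errno = err;
	fs->state_error = true;
}

/*
 * Tail of the writeback path after the patch: the flag alone does not
 * say why the transaction was aborted, so return -EIO rather than
 * claiming corruption with -EUCLEAN.
 */
int finish_writeback_model(const struct fs_model *fs)
{
	if (!fs->state_error)
		return 0;  /* stands in for a successful flush_write_bio() */
	return -EIO;
}
```

In this model, aborting with -EUCLEAN still leaves -EUCLEAN in `logged_errno` (the "higher up in the logs" record), while `finish_writeback_model` returns -EIO.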

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a76b7da91aa6..6f0dd15729cc 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4122,7 +4122,7 @@  int btree_write_cache_pages(struct address_space *mapping,
 	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
 		ret = flush_write_bio(&epd);
 	} else {
-		ret = -EUCLEAN;
+		ret = -EIO;
 		end_write_bio(&epd, ret);
 	}
 	return ret;