direct-io: propagate -ENOSPC errors

Message ID	x4960wosr4n.fsf@segfault.boston.devel.redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-fsdevel-owner@kernel.org> From: Jeff Moyer <jmoyer@redhat.com> To: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [patch] direct-io: propagate -ENOSPC errors Date: Mon, 14 Mar 2016 17:10:00 -0400 Message-ID: <x4960wosr4n.fsf@segfault.boston.devel.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk

Message ID

x4960wosr4n.fsf@segfault.boston.devel.redhat.com (mailing list archive)

State

New, archived

Headers

From: Jeff Moyer <jmoyer@redhat.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [patch] direct-io: propagate -ENOSPC errors
Date: Mon, 14 Mar 2016 17:10:00 -0400
Message-ID: <x4960wosr4n.fsf@segfault.boston.devel.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk

Commit Message

Jeff Moyer March 14, 2016, 9:10 p.m. UTC

dio_bio_complete turns all errors into -EIO.  This is historical,
since you used to only get 1 bit precision for errors (BIO_UPTODATE).
Now that we get actual error codes, we can return the appropriate
code to userspace.  File systems seem to only propagate either EIO
or ENOSPC, so I've followed suit in this patch.

This fixes an issue where -ENOSPC was being turned into -EIO when
testing dm-thin.

Reported-by: Carlos Maiolino <cmaiolin@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Carlos Maiolino March 16, 2016, 1:25 p.m. UTC | #1

This looks good to me.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO.  This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace.  File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.
> 
> This fixes an issue where -ENOSPC was being turned into -EIO when
> testing dm-thin.
> 
> Reported-by: Carlos Maiolino <cmaiolin@redhat.com>
> Tested-by: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index d6a9012..990e0aa 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -466,13 +466,15 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
>  {
>  	struct bio_vec *bvec;
>  	unsigned i;
> -	int err;
>  
> -	if (bio->bi_error)
> +	/* Only EIO and ENOSPC should be returned to userspace. */
> +	if (bio->bi_error == 0 ||
> +	    bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
> +		dio->io_error = bio->bi_error;
> +	else
>  		dio->io_error = -EIO;
>  
>  	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
> -		err = bio->bi_error;
>  		bio_check_pages_dirty(bio);	/* transfers ownership */
>  	} else {
>  		bio_for_each_segment_all(bvec, bio, i) {
> @@ -483,10 +485,9 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
>  				set_page_dirty_lock(page);
>  			page_cache_release(page);
>  		}
> -		err = bio->bi_error;
>  		bio_put(bio);
>  	}
> -	return err;
> +	return dio->io_error;
>  }
>  
>  /*
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Christoph Hellwig March 21, 2016, 4:02 p.m. UTC | #2

On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO.  This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace.  File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.

Do we?  Just propagating some errors defintively seems odd.  Even
if we do we should have a central helper doing that mapping instead
of opencoding it in various places.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jeff Moyer March 21, 2016, 8:22 p.m. UTC | #3

Hi, Christoph,

Christoph Hellwig <hch@infradead.org> writes:

> On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
>> dio_bio_complete turns all errors into -EIO.  This is historical,
>> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
>> Now that we get actual error codes, we can return the appropriate
>> code to userspace.  File systems seem to only propagate either EIO
>> or ENOSPC, so I've followed suit in this patch.
>
> Do we?

Sometimes!  :)  I was referring to this:

static inline void mapping_set_error(struct address_space *mapping, int error)
{
        if (unlikely(error)) {
                if (error == -ENOSPC)
                        set_bit(AS_ENOSPC, &mapping->flags);
                else
                        set_bit(AS_EIO, &mapping->flags);
        }
}

> Just propagating some errors defintively seems odd.

Not really.  read, write, etc only expect a subset of errnos to be
returned.  The goal was not to leak kernel-internal or unexpected error
numbers to userspace, and I didn't think I would be able to successfully
audit all code paths that lead here.  So, I opted for a more
conservative patch that just allows one more errno through.

> Even if we do we should have a central helper doing that mapping
> instead of opencoding it in various places.

OK.  I would argue that the right place to filter out errno's would be
up in the vfs.  I do wonder if I'm just being overly paranoid, though.

Al, do you have any opinions, here?

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Todd Vierling April 21, 2016, 2:17 p.m. UTC | #4

On 03/21/2016 04:22 PM, Jeff Moyer wrote:
>> Just propagating some errors defintively seems odd.
> 
> Not really.  read, write, etc only expect a subset of errnos to be
> returned.  The goal was not to leak kernel-internal or unexpected error
> numbers to userspace, and I didn't think I would be able to successfully
> audit all code paths that lead here.  So, I opted for a more
> conservative patch that just allows one more errno through.

We have a use case which makes heavy use of dm to map plain files into
block devs. Propagating ENOSPC is useful here, as a sparse backing file
might run into this when extending, and the downstream app would be able
to see something other than EIO.

I'm ambivalent on whether or not to allow all errnos through, or just a
peephole like ENOSPC here. However, there's another errno which is
effectively the same cause as ENOSPC but triggered by a logical rather
than physical limit: EDQUOT.

If we're looking at a peephole for ENOSPC to be let through, I'd suggest
also allowing EDQUOT through. To your concerns about confusing
applications with too many possible errnos, EDQUOT could be translated
to ENOSPC before being let through to the block dev layer. (The net
effect is the same in any case, as the backing store has denied a write
due to some kind of space limit.)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index d6a9012..990e0aa 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -466,13 +466,15 @@  static int dio_bio_complete(struct dio *dio, struct bio *bio)
 {
 	struct bio_vec *bvec;
 	unsigned i;
-	int err;
 
-	if (bio->bi_error)
+	/* Only EIO and ENOSPC should be returned to userspace. */
+	if (bio->bi_error == 0 ||
+	    bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
+		dio->io_error = bio->bi_error;
+	else
 		dio->io_error = -EIO;
 
 	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
-		err = bio->bi_error;
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
 		bio_for_each_segment_all(bvec, bio, i) {
@@ -483,10 +485,9 @@  static int dio_bio_complete(struct dio *dio, struct bio *bio)
 				set_page_dirty_lock(page);
 			page_cache_release(page);
 		}
-		err = bio->bi_error;
 		bio_put(bio);
 	}
-	return err;
+	return dio->io_error;
 }
 
 /*

direct-io: propagate -ENOSPC errors

Commit Message

Comments

Patch