diff mbox

direct-io: propagate -ENOSPC errors

Message ID x4960wosr4n.fsf@segfault.boston.devel.redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jeff Moyer March 14, 2016, 9:10 p.m. UTC
dio_bio_complete turns all errors into -EIO.  This is historical,
since you used to only get 1 bit precision for errors (BIO_UPTODATE).
Now that we get actual error codes, we can return the appropriate
code to userspace.  File systems seem to only propagate either EIO
or ENOSPC, so I've followed suit in this patch.

This fixes an issue where -ENOSPC was being turned into -EIO when
testing dm-thin.

Reported-by: Carlos Maiolino <cmaiolin@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Carlos Maiolino March 16, 2016, 1:25 p.m. UTC | #1
This looks good to me.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO.  This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace.  File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.
> 
> This fixes an issue where -ENOSPC was being turned into -EIO when
> testing dm-thin.
> 
> Reported-by: Carlos Maiolino <cmaiolin@redhat.com>
> Tested-by: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index d6a9012..990e0aa 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -466,13 +466,15 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
>  {
>  	struct bio_vec *bvec;
>  	unsigned i;
> -	int err;
>  
> -	if (bio->bi_error)
> +	/* Only EIO and ENOSPC should be returned to userspace. */
> +	if (bio->bi_error == 0 ||
> +	    bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
> +		dio->io_error = bio->bi_error;
> +	else
>  		dio->io_error = -EIO;
>  
>  	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
> -		err = bio->bi_error;
>  		bio_check_pages_dirty(bio);	/* transfers ownership */
>  	} else {
>  		bio_for_each_segment_all(bvec, bio, i) {
> @@ -483,10 +485,9 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
>  				set_page_dirty_lock(page);
>  			page_cache_release(page);
>  		}
> -		err = bio->bi_error;
>  		bio_put(bio);
>  	}
> -	return err;
> +	return dio->io_error;
>  }
>  
>  /*
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig March 21, 2016, 4:02 p.m. UTC | #2
On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO.  This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace.  File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.

Do we?  Just propagating some errors defintively seems odd.  Even
if we do we should have a central helper doing that mapping instead
of opencoding it in various places.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Moyer March 21, 2016, 8:22 p.m. UTC | #3
Hi, Christoph,

Christoph Hellwig <hch@infradead.org> writes:

> On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
>> dio_bio_complete turns all errors into -EIO.  This is historical,
>> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
>> Now that we get actual error codes, we can return the appropriate
>> code to userspace.  File systems seem to only propagate either EIO
>> or ENOSPC, so I've followed suit in this patch.
>
> Do we?

Sometimes!  :)  I was referring to this:

static inline void mapping_set_error(struct address_space *mapping, int error)
{
        if (unlikely(error)) {
                if (error == -ENOSPC)
                        set_bit(AS_ENOSPC, &mapping->flags);
                else
                        set_bit(AS_EIO, &mapping->flags);
        }
}

> Just propagating some errors defintively seems odd.

Not really.  read, write, etc only expect a subset of errnos to be
returned.  The goal was not to leak kernel-internal or unexpected error
numbers to userspace, and I didn't think I would be able to successfully
audit all code paths that lead here.  So, I opted for a more
conservative patch that just allows one more errno through.

> Even if we do we should have a central helper doing that mapping
> instead of opencoding it in various places.

OK.  I would argue that the right place to filter out errno's would be
up in the vfs.  I do wonder if I'm just being overly paranoid, though.

Al, do you have any opinions, here?

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Todd Vierling April 21, 2016, 2:17 p.m. UTC | #4
On 03/21/2016 04:22 PM, Jeff Moyer wrote:
>> Just propagating some errors defintively seems odd.
> 
> Not really.  read, write, etc only expect a subset of errnos to be
> returned.  The goal was not to leak kernel-internal or unexpected error
> numbers to userspace, and I didn't think I would be able to successfully
> audit all code paths that lead here.  So, I opted for a more
> conservative patch that just allows one more errno through.

We have a use case which makes heavy use of dm to map plain files into
block devs. Propagating ENOSPC is useful here, as a sparse backing file
might run into this when extending, and the downstream app would be able
to see something other than EIO.

I'm ambivalent on whether or not to allow all errnos through, or just a
peephole like ENOSPC here. However, there's another errno which is
effectively the same cause as ENOSPC but triggered by a logical rather
than physical limit: EDQUOT.

If we're looking at a peephole for ENOSPC to be let through, I'd suggest
also allowing EDQUOT through. To your concerns about confusing
applications with too many possible errnos, EDQUOT could be translated
to ENOSPC before being let through to the block dev layer. (The net
effect is the same in any case, as the backing store has denied a write
due to some kind of space limit.)
diff mbox

Patch

diff --git a/fs/direct-io.c b/fs/direct-io.c
index d6a9012..990e0aa 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -466,13 +466,15 @@  static int dio_bio_complete(struct dio *dio, struct bio *bio)
 {
 	struct bio_vec *bvec;
 	unsigned i;
-	int err;
 
-	if (bio->bi_error)
+	/* Only EIO and ENOSPC should be returned to userspace. */
+	if (bio->bi_error == 0 ||
+	    bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
+		dio->io_error = bio->bi_error;
+	else
 		dio->io_error = -EIO;
 
 	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
-		err = bio->bi_error;
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
 		bio_for_each_segment_all(bvec, bio, i) {
@@ -483,10 +485,9 @@  static int dio_bio_complete(struct dio *dio, struct bio *bio)
 				set_page_dirty_lock(page);
 			page_cache_release(page);
 		}
-		err = bio->bi_error;
 		bio_put(bio);
 	}
-	return err;
+	return dio->io_error;
 }
 
 /*