
fs: fix lost error code in dio_complete

Message ID 20181030215739.4557-1-mheyne@amazon.de (mailing list archive)
State New, archived
Series fs: fix lost error code in dio_complete

Commit Message

Maximilian Heyne Oct. 30, 2018, 9:57 p.m. UTC
Commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the
generic_write_sync prototype") reworked callers of generic_write_sync(),
and ended up dropping the error return for the direct I/O path. Prior to
that commit, in dio_complete(), an error would be bubbled up the stack,
but after that commit, errors passed on to dio_complete() were swallowed.

This was reported on the list earlier, and a fix was proposed in
https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but
it was never followed up on.  We recently hit this bug in our testing,
where fencing I/O errors, which previously errored out with EIO, were
being returned as successful operations after this commit.

The fix proposed on the list earlier fell a little short: it would
still have called generic_write_sync() in case `ret` already contained
an error.  This fix ensures generic_write_sync() is only called when
there's no pending error in the write.

CC: stable@vger.kernel.org
Reported-by: Ravi Nankani <rnankani@amazon.com>
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Torsten Mehlan <tomeh@amazon.de>
Signed-off-by: Uwe Dannowski <uwed@amazon.de>
Signed-off-by: Amit Shah <aams@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 fs/direct-io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
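
The regression described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: model_write_sync(), model_complete_before() and model_complete_after() are hypothetical stand-ins, and the model assumes that the sync step succeeds and simply returns the byte count it was given. Under those assumptions it shows how the unguarded call replaces a pending -EIO with the transferred count, while the guarded call leaves the error intact.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for generic_write_sync(). Assumption: the sync
 * itself succeeds, so the byte count is handed straight back.
 */
static ssize_t model_write_sync(ssize_t count)
{
	assert(count >= 0);
	return count;
}

/* Model of the completion path before the fix: the sync is unconditional. */
static ssize_t model_complete_before(ssize_t ret, ssize_t transferred,
				     int is_write)
{
	if (ret == 0)
		ret = transferred;
	if (is_write)
		ret = model_write_sync(transferred); /* overwrites an error in ret */
	return ret;
}

/* Model of the completion path after the fix: sync only if ret > 0. */
static ssize_t model_complete_after(ssize_t ret, ssize_t transferred,
				    int is_write)
{
	if (ret == 0)
		ret = transferred;
	if (ret > 0 && is_write)
		ret = model_write_sync(ret);
	return ret;
}
```

In this model, a failed write with ret = -EIO and transferred = 0 comes back as 0 from model_complete_before(), which a caller would read as success, and as -EIO from model_complete_after().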

Comments

Christoph Hellwig Oct. 31, 2018, 5:46 a.m. UTC | #1
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Shah, Amit Oct. 31, 2018, 9:24 a.m. UTC | #2
On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
> [...]
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 093fb54cd316..199146036093 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
>  		 */
>  		dio->iocb->ki_pos += transferred;
>  
> -		if (dio->op == REQ_OP_WRITE)
> -			ret = generic_write_sync(dio->iocb,  transferred);
> +		if (ret > 0 && dio->op == REQ_OP_WRITE)
> +			ret = generic_write_sync(dio->iocb, ret);

Is the s/transferred/ret/ change necessary?  Needs explaining, at least.

>  		dio->iocb->ki_complete(dio->iocb, ret, 0);
>  	}
>  

Thanks,



				Amit
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
Main office: Krausenstr. 38, 10117 Berlin
Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at Amtsgericht Charlottenburg, HRB 149173 B
Maximilian Heyne Nov. 1, 2018, 8:03 a.m. UTC | #3
On 10/31/18 10:24 AM, Shah, Amit wrote:
> On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
>> [...]
>>
>> diff --git a/fs/direct-io.c b/fs/direct-io.c
>> index 093fb54cd316..199146036093 100644
>> --- a/fs/direct-io.c
>> +++ b/fs/direct-io.c
>> @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
>>   		 */
>>   		dio->iocb->ki_pos += transferred;
>>   
>> -		if (dio->op == REQ_OP_WRITE)
>> -			ret = generic_write_sync(dio->iocb,  transferred);
>> +		if (ret > 0 && dio->op == REQ_OP_WRITE)
>> +			ret = generic_write_sync(dio->iocb, ret);
> Is the s/transferred/ret/ change necessary?  Needs explaining, at least.

Earlier in the function, `ret` is set to `transferred`, so the change is
a no-op. In my opinion, though, the construct looks cleaner this way.

>>   		dio->iocb->ki_complete(dio->iocb, ret, 0);
>>   	}
>>   
> Thanks,
>
>
>
> 				Amit



Shah, Amit Nov. 1, 2018, 9:06 a.m. UTC | #4
On Thu, 2018-11-01 at 09:03 +0100, Maximilian Heyne wrote:
> On 10/31/18 10:24 AM, Shah, Amit wrote:
> > 
> > On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
> > > 
> > > [...]
> > > 
> > > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > > index 093fb54cd316..199146036093 100644
> > > --- a/fs/direct-io.c
> > > +++ b/fs/direct-io.c
> > > @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
> > >   		 */
> > >   		dio->iocb->ki_pos += transferred;
> > >   
> > > -		if (dio->op == REQ_OP_WRITE)
> > > -			ret = generic_write_sync(dio->iocb,  transferred);
> > > +		if (ret > 0 && dio->op == REQ_OP_WRITE)
> > > +			ret = generic_write_sync(dio->iocb, ret);
> > Is the s/transferred/ret/ change necessary?  Needs explaining, at least.
> In an above code line `ret` is set to `transferred`. So the change is
> a no op. However, in my opinion the construct then looks cleaner.

Yes, it also brings this in line with the other callers, so this is good, thanks.
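
The s/transferred/ret/ exchange relies on `ret` having been set to `transferred` before the sync call. A rough userspace sketch of that reasoning follows. It assumes a simplified fallback chain (incoming ret, then page errors, then I/O error, then the transferred count) and that the incoming ret and the recorded errors are never positive. struct model_dio and both helpers are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical, trimmed-down view of the state the completion path reads. */
struct model_dio {
	int page_errors;   /* 0 or a negative errno (assumption) */
	int io_error;      /* 0 or a negative errno (assumption) */
	ssize_t result;    /* bytes transferred */
};

/* Pick the value that ends up in ret: first recorded error, else the count. */
static ssize_t model_pick_ret(const struct model_dio *dio, ssize_t ret)
{
	ssize_t transferred = dio->result;

	assert(ret <= 0);
	assert(dio->page_errors <= 0 && dio->io_error <= 0);

	if (ret == 0)
		ret = dio->page_errors;
	if (ret == 0)
		ret = dio->io_error;
	if (ret == 0)
		ret = transferred;
	return ret;
}

/*
 * Under the guard "ret > 0", check that passing ret or transferred to the
 * sync call would be the same thing. Returns 1 when they agree or when the
 * guard is false and no sync call would be made.
 */
static int model_sync_args_agree(const struct model_dio *dio, ssize_t ret)
{
	ssize_t picked = model_pick_ret(dio, ret);

	if (picked > 0)
		return picked == dio->result;
	return 1;
}
```

Under these assumptions a positive `ret` can only have come from the final fallback, so inside the `ret > 0` guard the two arguments are interchangeable, which matches the "no-op" description in the reply.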




Patch

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..199146036093 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
 		 */
 		dio->iocb->ki_pos += transferred;
 
-		if (dio->op == REQ_OP_WRITE)
-			ret = generic_write_sync(dio->iocb,  transferred);
+		if (ret > 0 && dio->op == REQ_OP_WRITE)
+			ret = generic_write_sync(dio->iocb, ret);
 		dio->iocb->ki_complete(dio->iocb, ret, 0);
 	}