diff mbox series

[-next,1/1] iomap: Fix a false positive of UBSAN in iomap_seek_data()

Message ID 20210702092109.2601-1-thunder.leizhen@huawei.com (mailing list archive)
State New, archived
Headers show
Series [-next,1/1] iomap: Fix a false positive of UBSAN in iomap_seek_data() | expand

Commit Message

Zhen Lei July 2, 2021, 9:21 a.m. UTC
Move the evaluation expression "size - offset" after the "if (offset < 0)"
judgment statement to eliminate a false positive produced by the UBSAN.

No functional changes.

==========================================================================
UBSAN: Undefined behaviour in fs/iomap.c:1435:9
signed integer overflow:
0 - -9223372036854775808 cannot be represented in type 'long long int'
CPU: 1 PID: 462 Comm: syz-executor852 Tainted: G ---------r-  - 4.18.0+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ...
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xca/0x13e lib/dump_stack.c:113
 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
 handle_overflow+0x193/0x1e2 lib/ubsan.c:190
 iomap_seek_data+0x128/0x140 fs/iomap.c:1435
 ext4_llseek+0x1e3/0x290 fs/ext4/file.c:494
 vfs_llseek fs/read_write.c:300 [inline]
 ksys_lseek+0xe9/0x160 fs/read_write.c:313
 do_syscall_64+0xca/0x5b0 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x6a/0xdf
==========================================================================

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 fs/iomap/seek.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig July 2, 2021, 9:34 a.m. UTC | #1
We might as well just kill off the length variable while we're at it:


diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index dab1b02eba5b7f..942e354e9e13e6 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -35,23 +35,21 @@ loff_t
 iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
 {
 	loff_t size = i_size_read(inode);
-	loff_t length = size - offset;
 	loff_t ret;
 
 	/* Nothing to be found before or beyond the end of the file. */
 	if (offset < 0 || offset >= size)
 		return -ENXIO;
 
-	while (length > 0) {
-		ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
-				  &offset, iomap_seek_hole_actor);
+	while (offset < size) {
+		ret = iomap_apply(inode, offset, size - offset, IOMAP_REPORT,
+				  ops, &offset, iomap_seek_hole_actor);
 		if (ret < 0)
 			return ret;
 		if (ret == 0)
 			break;
 
 		offset += ret;
-		length -= ret;
 	}
 
 	return offset;
Zhen Lei July 2, 2021, 11:50 a.m. UTC | #2
On 2021/7/2 17:34, Christoph Hellwig wrote:
> We might as well just kill off the length variable while we're at it:

Hi, Christoph:
  Maybe you need to write a separate patch. Because the patch I sent is
to modify function iomap_seek_data(). I didn't look at the other functions.
In fact, both iomap_seek_data() and iomap_seek_hole() need to be modified.
The iomap_seek_data() may not be intuitive to delete the variable 'length'.

I'm now analyzing if the "if (length <= 0)" statement in iomap_seek_data()
is redundant (the condition is never true).

> 
> 
> diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
> index dab1b02eba5b7f..942e354e9e13e6 100644
> --- a/fs/iomap/seek.c
> +++ b/fs/iomap/seek.c
> @@ -35,23 +35,21 @@ loff_t
>  iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
>  {
>  	loff_t size = i_size_read(inode);
> -	loff_t length = size - offset;
>  	loff_t ret;
>  
>  	/* Nothing to be found before or beyond the end of the file. */
>  	if (offset < 0 || offset >= size)
>  		return -ENXIO;
>  
> -	while (length > 0) {
> -		ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
> -				  &offset, iomap_seek_hole_actor);
> +	while (offset < size) {
> +		ret = iomap_apply(inode, offset, size - offset, IOMAP_REPORT,
> +				  ops, &offset, iomap_seek_hole_actor);
>  		if (ret < 0)
>  			return ret;
>  		if (ret == 0)
>  			break;
>  
>  		offset += ret;
> -		length -= ret;
>  	}
>  
>  	return offset;
> 
> .
>
Matthew Wilcox July 2, 2021, 7:56 p.m. UTC | #3
On Fri, Jul 02, 2021 at 05:21:09PM +0800, Zhen Lei wrote:
> Move the evaluation expression "size - offset" after the "if (offset < 0)"
> judgment statement to eliminate a false positive produced by the UBSAN.
> 
> No functional changes.
> 
> ==========================================================================
> UBSAN: Undefined behaviour in fs/iomap.c:1435:9
> signed integer overflow:
> 0 - -9223372036854775808 cannot be represented in type 'long long int'

I don't understand.  I thought we defined the behaviour of signed
integer overflow in the kernel with whatever-the-gcc-flag-is?
Matthew Wilcox July 4, 2021, 1:51 p.m. UTC | #4
On Fri, Jul 02, 2021 at 10:34:23AM +0100, Christoph Hellwig wrote:
> We might as well just kill off the length variable while we're at it:

Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
> index dab1b02eba5b7f..942e354e9e13e6 100644
> --- a/fs/iomap/seek.c
> +++ b/fs/iomap/seek.c
> @@ -35,23 +35,21 @@ loff_t
>  iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
>  {
>  	loff_t size = i_size_read(inode);
> -	loff_t length = size - offset;
>  	loff_t ret;
>  
>  	/* Nothing to be found before or beyond the end of the file. */
>  	if (offset < 0 || offset >= size)
>  		return -ENXIO;
>  
> -	while (length > 0) {
> -		ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
> -				  &offset, iomap_seek_hole_actor);
> +	while (offset < size) {
> +		ret = iomap_apply(inode, offset, size - offset, IOMAP_REPORT,
> +				  ops, &offset, iomap_seek_hole_actor);
>  		if (ret < 0)
>  			return ret;
>  		if (ret == 0)
>  			break;
>  
>  		offset += ret;
> -		length -= ret;
>  	}
>  
>  	return offset;
Zhen Lei July 5, 2021, 3:29 a.m. UTC | #5
On 2021/7/2 19:50, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/7/2 17:34, Christoph Hellwig wrote:
>> We might as well just kill off the length variable while we're at it:
> 
> Hi, Christoph:
>   Maybe you need to write a separate patch. Because the patch I sent is
> to modify function iomap_seek_data(). I didn't look at the other functions.
> In fact, both iomap_seek_data() and iomap_seek_hole() need to be modified.
> The iomap_seek_data() may not be intuitive to delete the variable 'length'.
> 
> I'm now analyzing if the "if (length <= 0)" statement in iomap_seek_data()
> is redundant (the condition is never true).

I've thought about it, and that "if" statement can be removed as follows:

diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index dab1b02eba5b..dc55f9ecd948 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -96,14 +96,13 @@ iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
 		if (ret < 0)
 			return ret;
 		if (ret == 0)
-			break;
+			return offset;

 		offset += ret;
 		length -= ret;
 	}

-	if (length <= 0)
-		return -ENXIO;
-	return offset;
+	/* The end of the file is reached, and no data is found */
+	return -ENXIO;
 }
 EXPORT_SYMBOL_GPL(iomap_seek_data);



> 
>>
>>
>> diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
>> index dab1b02eba5b7f..942e354e9e13e6 100644
>> --- a/fs/iomap/seek.c
>> +++ b/fs/iomap/seek.c
>> @@ -35,23 +35,21 @@ loff_t
>>  iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
>>  {
>>  	loff_t size = i_size_read(inode);
>> -	loff_t length = size - offset;
>>  	loff_t ret;
>>  
>>  	/* Nothing to be found before or beyond the end of the file. */
>>  	if (offset < 0 || offset >= size)
>>  		return -ENXIO;
>>  
>> -	while (length > 0) {
>> -		ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
>> -				  &offset, iomap_seek_hole_actor);
>> +	while (offset < size) {
>> +		ret = iomap_apply(inode, offset, size - offset, IOMAP_REPORT,
>> +				  ops, &offset, iomap_seek_hole_actor);
>>  		if (ret < 0)
>>  			return ret;
>>  		if (ret == 0)
>>  			break;
>>  
>>  		offset += ret;
>> -		length -= ret;
>>  	}
>>  
>>  	return offset;
>>
>> .
>>
Zhen Lei July 5, 2021, 3:35 a.m. UTC | #6
On 2021/7/3 3:56, Matthew Wilcox wrote:
> On Fri, Jul 02, 2021 at 05:21:09PM +0800, Zhen Lei wrote:
>> Move the evaluation expression "size - offset" after the "if (offset < 0)"
>> judgment statement to eliminate a false positive produced by the UBSAN.
>>
>> No functional changes.
>>
>> ==========================================================================
>> UBSAN: Undefined behaviour in fs/iomap.c:1435:9
>> signed integer overflow:
>> 0 - -9223372036854775808 cannot be represented in type 'long long int'
> 
> I don't understand.  I thought we defined the behaviour of signed
> integer overflow in the kernel with whatever-the-gcc-flag-is?

-9223372036854775808 ==> 0x8000000000000000 ==> -0

I don't fully understand what you mean. This is triggered by explicit error
injection '-0' at runtime, which should not be detected by compilation options.

lseek(r1, 0x8000000000000000, 0x3)

> 
> 
> .
>
Matthew Wilcox July 5, 2021, 3:43 a.m. UTC | #7
On Mon, Jul 05, 2021 at 11:29:44AM +0800, Leizhen (ThunderTown) wrote:
> I've thought about it, and that "if" statement can be removed as follows:

I think this really misses Christoph's point.  He's looking for
something more like this:

@@ -83,27 +83,23 @@ loff_t
 iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
 {
        loff_t size = i_size_read(inode);
-       loff_t length = size - offset;
        loff_t ret;

        /* Nothing to be found before or beyond the end of the file. */
        if (offset < 0 || offset >= size)
                return -ENXIO;

-       while (length > 0) {
+       while (offset < size) {
                ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
                                  &offset, iomap_seek_data_actor);
                if (ret < 0)
                        return ret;
                if (ret == 0)
-                       break;
+                       return offset;

                offset += ret;
-               length -= ret;
        }

-       if (length <= 0)
-               return -ENXIO;
-       return offset;
+       return -ENXIO;
 }
 EXPORT_SYMBOL_GPL(iomap_seek_data);

(not even slightly tested)
Zhen Lei July 5, 2021, 4:05 a.m. UTC | #8
On 2021/7/5 11:43, Matthew Wilcox wrote:
> On Mon, Jul 05, 2021 at 11:29:44AM +0800, Leizhen (ThunderTown) wrote:
>> I've thought about it, and that "if" statement can be removed as follows:
> 
> I think this really misses Christoph's point.  He's looking for
> something more like this:

Yes, I know that. I need to get rid of the "if" judgment first, and then I can
modify iomap_seek_data() according to Christoph's point. Otherwise there are too
many conversions from "length" to "size - offset" and the code doesn't look clear.

> 
> @@ -83,27 +83,23 @@ loff_t
>  iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
>  {
>         loff_t size = i_size_read(inode);
> -       loff_t length = size - offset;
>         loff_t ret;
> 
>         /* Nothing to be found before or beyond the end of the file. */
>         if (offset < 0 || offset >= size)
>                 return -ENXIO;
> 
> -       while (length > 0) {
> +       while (offset < size) {
>                 ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
>                                   &offset, iomap_seek_data_actor);
>                 if (ret < 0)
>                         return ret;
>                 if (ret == 0)
> -                       break;
> +                       return offset;
> 
>                 offset += ret;
> -               length -= ret;
>         }
> 
> -       if (length <= 0)
> -               return -ENXIO;
> -       return offset;
> +       return -ENXIO;
>  }
>  EXPORT_SYMBOL_GPL(iomap_seek_data);
> 
> (not even slightly tested)
> 
> .
>
Matthew Wilcox July 6, 2021, 11:08 a.m. UTC | #9
On Mon, Jul 05, 2021 at 11:35:08AM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/7/3 3:56, Matthew Wilcox wrote:
> > On Fri, Jul 02, 2021 at 05:21:09PM +0800, Zhen Lei wrote:
> >> Move the evaluation expression "size - offset" after the "if (offset < 0)"
> >> judgment statement to eliminate a false positive produced by the UBSAN.
> >>
> >> No functional changes.
> >>
> >> ==========================================================================
> >> UBSAN: Undefined behaviour in fs/iomap.c:1435:9
> >> signed integer overflow:
> >> 0 - -9223372036854775808 cannot be represented in type 'long long int'
> > 
> > I don't understand.  I thought we defined the behaviour of signed
> > integer overflow in the kernel with whatever-the-gcc-flag-is?
> 
> -9223372036854775808 ==> 0x8000000000000000 ==> -0
> 
> I don't fully understand what you mean. This is triggered by explicit error
> injection '-0' at runtime, which should not be detected by compilation options.

We use -fwrapv on the gcc command line:

'-fwrapv'
     This option instructs the compiler to assume that signed arithmetic
     overflow of addition, subtraction and multiplication wraps around
     using twos-complement representation.  This flag enables some
     optimizations and disables others.

> lseek(r1, 0x8000000000000000, 0x3)

I'll see about adding this to xfstests ...
Matthew Wilcox July 6, 2021, 11:59 a.m. UTC | #10
On Tue, Jul 06, 2021 at 12:08:14PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 05, 2021 at 11:35:08AM +0800, Leizhen (ThunderTown) wrote:
> > 
> > 
> > On 2021/7/3 3:56, Matthew Wilcox wrote:
> > > On Fri, Jul 02, 2021 at 05:21:09PM +0800, Zhen Lei wrote:
> > >> Move the evaluation expression "size - offset" after the "if (offset < 0)"
> > >> judgment statement to eliminate a false positive produced by the UBSAN.
> > >>
> > >> No functional changes.
> > >>
> > >> ==========================================================================
> > >> UBSAN: Undefined behaviour in fs/iomap.c:1435:9
> > >> signed integer overflow:
> > >> 0 - -9223372036854775808 cannot be represented in type 'long long int'
> > > 
> > > I don't understand.  I thought we defined the behaviour of signed
> > > integer overflow in the kernel with whatever-the-gcc-flag-is?
> > 
> > -9223372036854775808 ==> 0x8000000000000000 ==> -0

(actually, this is incorrect.  think about how twos-complement
arithmetic works.  first you negate every bit, so 8000..000 turns into
7fff..fff, then you add one, returning to 8000..000, so -LLONG_MIN ==
LLONG_MIN)

> > I don't fully understand what you mean. This is triggered by explicit error
> > injection '-0' at runtime, which should not be detected by compilation options.
> 
> We use -fwrapv on the gcc command line:
> 
> '-fwrapv'
>      This option instructs the compiler to assume that signed arithmetic
>      overflow of addition, subtraction and multiplication wraps around
>      using twos-complement representation.  This flag enables some
>      optimizations and disables others.
> 
> > lseek(r1, 0x8000000000000000, 0x3)
> 
> I'll see about adding this to xfstests ...

I have and it doesn't produce the problem.  My config:

CONFIG_UBSAN=y
# CONFIG_UBSAN_TRAP is not set
CONFIG_CC_HAS_UBSAN_BOUNDS=y
CONFIG_UBSAN_BOUNDS=y
CONFIG_UBSAN_ONLY_BOUNDS=y
CONFIG_UBSAN_SHIFT=y
CONFIG_UBSAN_DIV_ZERO=y
CONFIG_UBSAN_BOOL=y
CONFIG_UBSAN_ENUM=y
# CONFIG_UBSAN_ALIGNMENT is not set
CONFIG_UBSAN_SANITIZE_ALL=y
# CONFIG_TEST_UBSAN is not set

I even went as far as adding printks to be sure I'm hitting it:

hole length 0x8000000000000000
data length 0x8000000000000000

Are you compiling with:
KBUILD_CFLAGS   += -fno-strict-overflow

Or have you done something weird?  What compiler version are you using?
diff mbox series

Patch

diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index dab1b02eba5b..778e3e84c95e 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -83,13 +83,14 @@  loff_t
 iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops *ops)
 {
 	loff_t size = i_size_read(inode);
-	loff_t length = size - offset;
+	loff_t length;
 	loff_t ret;
 
 	/* Nothing to be found before or beyond the end of the file. */
 	if (offset < 0 || offset >= size)
 		return -ENXIO;
 
+	length = size - offset;
 	while (length > 0) {
 		ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops,
 				  &offset, iomap_seek_data_actor);