diff mbox

[RFC] btrfs: make max inline data can be equal to sectorsize

Message ID 20161011064742.17364-1-wangxg.fnst@cn.fujitsu.com (mailing list archive)
State Superseded
Headers show

Commit Message

Xiaoguang Wang Oct. 11, 2016, 6:47 a.m. UTC
If we use mount option "-o max_inline=sectorsize", say 4096, indeed
even for a fresh fs, say nodesize is 16k, we can not make the first
4k data completely inline, I found this conditon causing this issue:
  !compressed_size && (actual_end & (root->sectorsize - 1)) == 0

If it retuns true, we'll not make data inline. For 4k sectorsize,
0~4094 dara range, we can make it inline, but 0~4095, it can not.
I don't think this limition is useful, so here remove it which will
make max inline data can be equal to sectorsize.

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
---
 fs/btrfs/inode.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Chris Murphy Oct. 11, 2016, 3:49 p.m. UTC | #1
On Tue, Oct 11, 2016 at 12:47 AM, Wang Xiaoguang
<wangxg.fnst@cn.fujitsu.com> wrote:
> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> even for a fresh fs, say nodesize is 16k, we can not make the first
> 4k data completely inline, I found this conditon causing this issue:
>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>
> If it retuns true, we'll not make data inline. For 4k sectorsize,
> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
> I don't think this limition is useful, so here remove it which will
> make max inline data can be equal to sectorsize.
>
> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
> ---
>  fs/btrfs/inode.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index ea15520..c0db393 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
>         if (start > 0 ||
>             actual_end > root->sectorsize ||
>             data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
> -           (!compressed_size &&
> -           (actual_end & (root->sectorsize - 1)) == 0) ||
>             end + 1 < isize ||
>             data_len > root->fs_info->max_inline) {
>                 return 1;
> --
> 2.9.0


Before making any further changes to inline data, does it make sense
to find the source of corruption Zygo has been experiencing? That's in
the "btrfs rare silent data corruption with kernel data leak" thread.
Xiaoguang Wang Oct. 12, 2016, 3:35 a.m. UTC | #2
hi,

On 10/11/2016 11:49 PM, Chris Murphy wrote:
> On Tue, Oct 11, 2016 at 12:47 AM, Wang Xiaoguang
> <wangxg.fnst@cn.fujitsu.com> wrote:
>> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
>> even for a fresh fs, say nodesize is 16k, we can not make the first
>> 4k data completely inline, I found this conditon causing this issue:
>>    !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>>
>> If it retuns true, we'll not make data inline. For 4k sectorsize,
>> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
>> I don't think this limition is useful, so here remove it which will
>> make max inline data can be equal to sectorsize.
>>
>> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
>> ---
>>   fs/btrfs/inode.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index ea15520..c0db393 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
>>          if (start > 0 ||
>>              actual_end > root->sectorsize ||
>>              data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
>> -           (!compressed_size &&
>> -           (actual_end & (root->sectorsize - 1)) == 0) ||
>>              end + 1 < isize ||
>>              data_len > root->fs_info->max_inline) {
>>                  return 1;
>> --
>> 2.9.0
>
> Before making any further changes to inline data, does it make sense
> to find the source of corruption Zygo has been experiencing? That's in
> the "btrfs rare silent data corruption with kernel data leak" thread.
Yes, agree.
Also Zygo has sent a patch to fix that bug this morning :)

Regards,
XIaoguang Wang

>
>



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zygo Blaxell Oct. 15, 2016, 10:28 p.m. UTC | #3
On Wed, Oct 12, 2016 at 11:35:46AM +0800, Wang Xiaoguang wrote:
> hi,
> 
> On 10/11/2016 11:49 PM, Chris Murphy wrote:
> >On Tue, Oct 11, 2016 at 12:47 AM, Wang Xiaoguang
> ><wangxg.fnst@cn.fujitsu.com> wrote:
> >>If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> >>even for a fresh fs, say nodesize is 16k, we can not make the first
> >>4k data completely inline, I found this conditon causing this issue:
> >>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
> >>
> >>If it retuns true, we'll not make data inline. For 4k sectorsize,
> >>0~4094 dara range, we can make it inline, but 0~4095, it can not.
> >>I don't think this limition is useful, so here remove it which will
> >>make max inline data can be equal to sectorsize.
> >>
> >>Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
> >>---
> >>  fs/btrfs/inode.c | 2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >>diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> >>index ea15520..c0db393 100644
> >>--- a/fs/btrfs/inode.c
> >>+++ b/fs/btrfs/inode.c
> >>@@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
> >>         if (start > 0 ||
> >>             actual_end > root->sectorsize ||
> >>             data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
> >>-           (!compressed_size &&
> >>-           (actual_end & (root->sectorsize - 1)) == 0) ||
> >>             end + 1 < isize ||
> >>             data_len > root->fs_info->max_inline) {
> >>                 return 1;
> >>--
> >>2.9.0
> >
> >Before making any further changes to inline data, does it make sense
> >to find the source of corruption Zygo has been experiencing? That's in
> >the "btrfs rare silent data corruption with kernel data leak" thread.
> Yes, agree.
> Also Zygo has sent a patch to fix that bug this morning :)

FWIW I don't see any connection between this and the problem I found.
A page-sized inline extent wouldn't have any room for uninitialized
bytes.  If anthing, it's the one rare case that already worked.  ;)

> Regards,
> XIaoguang Wang
> 
> >
> >
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaoguang Wang Oct. 26, 2016, 7:32 a.m. UTC | #4
hi,

On 10/11/2016 02:47 PM, Wang Xiaoguang wrote:
> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> even for a fresh fs, say nodesize is 16k, we can not make the first
> 4k data completely inline, I found this conditon causing this issue:
>    !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>
> If it retuns true, we'll not make data inline. For 4k sectorsize,
> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
> I don't think this limition is useful, so here remove it which will
> make max inline data can be equal to sectorsize.
>
> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
> ---
>   fs/btrfs/inode.c | 2 --
>   1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index ea15520..c0db393 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
>   	if (start > 0 ||
>   	    actual_end > root->sectorsize ||
>   	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
> -	    (!compressed_size &&
> -	    (actual_end & (root->sectorsize - 1)) == 0) ||
>   	    end + 1 < isize ||
>   	    data_len > root->fs_info->max_inline) {
>   		return 1;
Any comments about this patch?

Regards,
Xiaoguang Wang




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo Nov. 11, 2016, 8:22 p.m. UTC | #5
On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> even for a fresh fs, say nodesize is 16k, we can not make the first
> 4k data completely inline, I found this conditon causing this issue:
>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
> 
> If it retuns true, we'll not make data inline. For 4k sectorsize,
> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
> I don't think this limition is useful, so here remove it which will
> make max inline data can be equal to sectorsize.

It's difficult to tell whether we need this, I'm not a big fan of using
max_inline size more than the default size 2048, given that most reports
about ENOSPC is due to metadata and inline may make it worse.

Thanks,

-liubo

> 
> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
> ---
>  fs/btrfs/inode.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index ea15520..c0db393 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
>  	if (start > 0 ||
>  	    actual_end > root->sectorsize ||
>  	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
> -	    (!compressed_size &&
> -	    (actual_end & (root->sectorsize - 1)) == 0) ||
>  	    end + 1 < isize ||
>  	    data_len > root->fs_info->max_inline) {
>  		return 1;
> -- 
> 2.9.0
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo Nov. 14, 2016, 1:55 a.m. UTC | #6
At 11/12/2016 04:22 AM, Liu Bo wrote:
> On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
>> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
>> even for a fresh fs, say nodesize is 16k, we can not make the first
>> 4k data completely inline, I found this conditon causing this issue:
>>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>>
>> If it retuns true, we'll not make data inline. For 4k sectorsize,
>> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
>> I don't think this limition is useful, so here remove it which will
>> make max inline data can be equal to sectorsize.
>
> It's difficult to tell whether we need this, I'm not a big fan of using
> max_inline size more than the default size 2048, given that most reports
> about ENOSPC is due to metadata and inline may make it worse.

IMHO if we can use inline data extents to trigger ENOSPC more easily, 
then we should allow it to dig the problem further.

Just ignoring it because it may cause more bug will not solve the real 
problem anyway.

Thanks,
Qu

>
> Thanks,
>
> -liubo
>
>>
>> Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
>> ---
>>  fs/btrfs/inode.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index ea15520..c0db393 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -267,8 +267,6 @@ static noinline int cow_file_range_inline(struct btrfs_root *root,
>>  	if (start > 0 ||
>>  	    actual_end > root->sectorsize ||
>>  	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
>> -	    (!compressed_size &&
>> -	    (actual_end & (root->sectorsize - 1)) == 0) ||
>>  	    end + 1 < isize ||
>>  	    data_len > root->fs_info->max_inline) {
>>  		return 1;
>> --
>> 2.9.0
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Nov. 16, 2016, 4:10 p.m. UTC | #7
On Mon, Nov 14, 2016 at 09:55:34AM +0800, Qu Wenruo wrote:
> At 11/12/2016 04:22 AM, Liu Bo wrote:
> > On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
> >> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> >> even for a fresh fs, say nodesize is 16k, we can not make the first
> >> 4k data completely inline, I found this conditon causing this issue:
> >>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
> >>
> >> If it retuns true, we'll not make data inline. For 4k sectorsize,
> >> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
> >> I don't think this limition is useful, so here remove it which will
> >> make max inline data can be equal to sectorsize.
> >
> > It's difficult to tell whether we need this, I'm not a big fan of using
> > max_inline size more than the default size 2048, given that most reports
> > about ENOSPC is due to metadata and inline may make it worse.
> 
> IMHO if we can use inline data extents to trigger ENOSPC more easily, 
> then we should allow it to dig the problem further.
> 
> Just ignoring it because it may cause more bug will not solve the real 
> problem anyway.

Not allowing the full 4k value as max_inline looks artificial to me.
We've removed other similar limitation in the past so I'd tend to agree
to do the same here. There's no significant use for it as far as I can
tell, if you want to exhaust metadata, the difference to max_inline=4095
would be really tiny in the end. So, I'm okay with merging it. If
anybody feels like adding his reviewed-by, please do so.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Mason Nov. 18, 2016, 8:58 p.m. UTC | #8
On 11/16/2016 11:10 AM, David Sterba wrote:
> On Mon, Nov 14, 2016 at 09:55:34AM +0800, Qu Wenruo wrote:
>> At 11/12/2016 04:22 AM, Liu Bo wrote:
>>> On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
>>>> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
>>>> even for a fresh fs, say nodesize is 16k, we can not make the first
>>>> 4k data completely inline, I found this conditon causing this issue:
>>>>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>>>>
>>>> If it retuns true, we'll not make data inline. For 4k sectorsize,
>>>> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
>>>> I don't think this limition is useful, so here remove it which will
>>>> make max inline data can be equal to sectorsize.
>>>
>>> It's difficult to tell whether we need this, I'm not a big fan of using
>>> max_inline size more than the default size 2048, given that most reports
>>> about ENOSPC is due to metadata and inline may make it worse.
>>
>> IMHO if we can use inline data extents to trigger ENOSPC more easily,
>> then we should allow it to dig the problem further.
>>
>> Just ignoring it because it may cause more bug will not solve the real
>> problem anyway.
>
> Not allowing the full 4k value as max_inline looks artificial to me.
> We've removed other similar limitation in the past so I'd tend to agree
> to do the same here. There's no significant use for it as far as I can
> tell, if you want to exhaust metadata, the difference to max_inline=4095
> would be really tiny in the end. So, I'm okay with merging it. If
> anybody feels like adding his reviewed-by, please do so.

The check is there because in practice it doesn't make sense to inline 
an extent if it fits perfectly in a data block.  You could argue its 
saving seeks, but we're also adding seeks by spreading out the metadata 
in general.  So, I'd want to see benchmarks before deciding.

If we're using it for debugging, I'd rather stick with max_inline=4095.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zygo Blaxell Nov. 19, 2016, 8:27 a.m. UTC | #9
On Fri, Nov 18, 2016 at 03:58:06PM -0500, Chris Mason wrote:
> 
> 
> On 11/16/2016 11:10 AM, David Sterba wrote:
> >On Mon, Nov 14, 2016 at 09:55:34AM +0800, Qu Wenruo wrote:
> >>At 11/12/2016 04:22 AM, Liu Bo wrote:
> >>>On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
> >>>>If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> >>>>even for a fresh fs, say nodesize is 16k, we can not make the first
> >>>>4k data completely inline, I found this conditon causing this issue:
> >>>>  !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
> >>>>
> >>>>If it retuns true, we'll not make data inline. For 4k sectorsize,
> >>>>0~4094 dara range, we can make it inline, but 0~4095, it can not.
> >>>>I don't think this limition is useful, so here remove it which will
> >>>>make max inline data can be equal to sectorsize.
> >>>
> >>>It's difficult to tell whether we need this, I'm not a big fan of using
> >>>max_inline size more than the default size 2048, given that most reports
> >>>about ENOSPC is due to metadata and inline may make it worse.
> >>
> >>IMHO if we can use inline data extents to trigger ENOSPC more easily,
> >>then we should allow it to dig the problem further.
> >>
> >>Just ignoring it because it may cause more bug will not solve the real
> >>problem anyway.
> >
> >Not allowing the full 4k value as max_inline looks artificial to me.
> >We've removed other similar limitation in the past so I'd tend to agree
> >to do the same here. There's no significant use for it as far as I can
> >tell, if you want to exhaust metadata, the difference to max_inline=4095
> >would be really tiny in the end. So, I'm okay with merging it. If
> >anybody feels like adding his reviewed-by, please do so.
> 
> The check is there because in practice it doesn't make sense to inline an
> extent if it fits perfectly in a data block.  You could argue its saving
> seeks, but we're also adding seeks by spreading out the metadata in general.
> So, I'd want to see benchmarks before deciding.

Does that limit kick in before or after compression?  A compressed extent
could easily have 4096 bytes of data in 200 bytes.  If a filesystem
contained a whole lot of exactly-4096-byte compressible files that extra
byte might be worth something.

> If we're using it for debugging, I'd rather stick with max_inline=4095.
> 
> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaoguang Wang Nov. 22, 2016, 7:54 a.m. UTC | #10
hello,

On 11/19/2016 04:58 AM, Chris Mason wrote:
>
>
> On 11/16/2016 11:10 AM, David Sterba wrote:
>> On Mon, Nov 14, 2016 at 09:55:34AM +0800, Qu Wenruo wrote:
>>> At 11/12/2016 04:22 AM, Liu Bo wrote:
>>>> On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
>>>>> If we use mount option "-o max_inline=sectorsize", say 4096, indeed
>>>>> even for a fresh fs, say nodesize is 16k, we can not make the first
>>>>> 4k data completely inline, I found this conditon causing this issue:
>>>>>   !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
>>>>>
>>>>> If it retuns true, we'll not make data inline. For 4k sectorsize,
>>>>> 0~4094 dara range, we can make it inline, but 0~4095, it can not.
>>>>> I don't think this limition is useful, so here remove it which will
>>>>> make max inline data can be equal to sectorsize.
>>>>
>>>> It's difficult to tell whether we need this, I'm not a big fan of 
>>>> using
>>>> max_inline size more than the default size 2048, given that most 
>>>> reports
>>>> about ENOSPC is due to metadata and inline may make it worse.
>>>
>>> IMHO if we can use inline data extents to trigger ENOSPC more easily,
>>> then we should allow it to dig the problem further.
>>>
>>> Just ignoring it because it may cause more bug will not solve the real
>>> problem anyway.
>>
>> Not allowing the full 4k value as max_inline looks artificial to me.
>> We've removed other similar limitation in the past so I'd tend to agree
>> to do the same here. There's no significant use for it as far as I can
>> tell, if you want to exhaust metadata, the difference to max_inline=4095
>> would be really tiny in the end. So, I'm okay with merging it. If
>> anybody feels like adding his reviewed-by, please do so.
>
> The check is there because in practice it doesn't make sense to inline 
> an extent if it fits perfectly in a data block. 
I see, thanks for this clarification.

> You could argue its saving seeks, but we're also adding seeks by 
> spreading out the metadata in general.  So, I'd want to see benchmarks 
> before deciding.
I had tried to construct some benchmark tests, such as create and read 
plenty of
small files, copy linux source codes, I don't see any obvious 
difference, it maybe
reasonable, after all, this patch just results in one bytes difference.

>
> If we're using it for debugging, I'd rather stick with max_inline=4095.
I was just curious that we could make inline for 4095, but not allow 
4096 before,
just one bytes :)

Regards,
Xiaoguang Wang

>
> -chris
>
>



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Jan. 2, 2017, 5:21 p.m. UTC | #11
On Sat, Nov 19, 2016 at 03:27:02AM -0500, Zygo Blaxell wrote:
> On Fri, Nov 18, 2016 at 03:58:06PM -0500, Chris Mason wrote:
> > 
> > 
> > On 11/16/2016 11:10 AM, David Sterba wrote:
> > >On Mon, Nov 14, 2016 at 09:55:34AM +0800, Qu Wenruo wrote:
> > >>At 11/12/2016 04:22 AM, Liu Bo wrote:
> > >>>On Tue, Oct 11, 2016 at 02:47:42PM +0800, Wang Xiaoguang wrote:
> > >>>>If we use mount option "-o max_inline=sectorsize", say 4096, indeed
> > >>>>even for a fresh fs, say nodesize is 16k, we can not make the first
> > >>>>4k data completely inline, I found this conditon causing this issue:
> > >>>>  !compressed_size && (actual_end & (root->sectorsize - 1)) == 0
> > >>>>
> > >>>>If it retuns true, we'll not make data inline. For 4k sectorsize,
> > >>>>0~4094 dara range, we can make it inline, but 0~4095, it can not.
> > >>>>I don't think this limition is useful, so here remove it which will
> > >>>>make max inline data can be equal to sectorsize.
> > >>>
> > >>>It's difficult to tell whether we need this, I'm not a big fan of using
> > >>>max_inline size more than the default size 2048, given that most reports
> > >>>about ENOSPC is due to metadata and inline may make it worse.
> > >>
> > >>IMHO if we can use inline data extents to trigger ENOSPC more easily,
> > >>then we should allow it to dig the problem further.
> > >>
> > >>Just ignoring it because it may cause more bug will not solve the real
> > >>problem anyway.
> > >
> > >Not allowing the full 4k value as max_inline looks artificial to me.
> > >We've removed other similar limitation in the past so I'd tend to agree
> > >to do the same here. There's no significant use for it as far as I can
> > >tell, if you want to exhaust metadata, the difference to max_inline=4095
> > >would be really tiny in the end. So, I'm okay with merging it. If
> > >anybody feels like adding his reviewed-by, please do so.
> > 
> > The check is there because in practice it doesn't make sense to inline an
> > extent if it fits perfectly in a data block.  You could argue its saving
> > seeks, but we're also adding seeks by spreading out the metadata in general.
> > So, I'd want to see benchmarks before deciding.
> 
> Does that limit kick in before or after compression?  A compressed extent
> could easily have 4096 bytes of data in 200 bytes.  If a filesystem
> contained a whole lot of exactly-4096-byte compressible files that extra
> byte might be worth something.

A 4k file that compresses to a small number of bytes will be inlined.
The exact constraints are:
 * file size < sectorsize,
 * compressed result <= inline limit (either in the leaf space or max_inline).
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ea15520..c0db393 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -267,8 +267,6 @@  static noinline int cow_file_range_inline(struct btrfs_root *root,
 	if (start > 0 ||
 	    actual_end > root->sectorsize ||
 	    data_len > BTRFS_MAX_INLINE_DATA_SIZE(root) ||
-	    (!compressed_size &&
-	    (actual_end & (root->sectorsize - 1)) == 0) ||
 	    end + 1 < isize ||
 	    data_len > root->fs_info->max_inline) {
 		return 1;