
[1/2] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

Message ID 20180515073622.18732-2-wqu@suse.com (mailing list archive)
State New, archived

Commit Message

Qu Wenruo May 15, 2018, 7:36 a.m. UTC
As btrfs(5) specifies:

	Note
	If nodatacow or nodatasum are enabled, compression is disabled.

If NODATASUM or NODATACOW is set, we should not compress the extent.

Normally NODATACOW is detected properly in run_delalloc_range() so
compression won't happen for NODATACOW.

However for NODATASUM we don't have any check, and it's pretty easy to
create a compressed extent without csum:
------
mkfs.btrfs -f $dev
mount $dev $mnt -o nodatasum
touch $mnt/foobar
mount -o remount,datasum,compress $mnt
xfs_io -f -c "pwrite 0 128K" $mnt/foobar
------

And in fact, we have a bug report about a corrupted compressed extent
without proper data checksum, where even RAID1 can't recover the corruption.
(https://bugzilla.kernel.org/show_bug.cgi?id=199707)

Running compression without a proper checksum could cause more damage
when corruption happens, so there is no need to allow compression for
NODATASUM.

Reported-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Nikolay Borisov May 15, 2018, 8:21 a.m. UTC | #1
On 15.05.2018 10:36, Qu Wenruo wrote:
> As btrfs(5) specified:
> 
> 	Note
> 	If nodatacow or nodatasum are enabled, compression is disabled.
> 
> If NODATASUM or NODATACOW set, we should not compress the extent.
> 
> Normally NODATACOW is detected properly in run_delalloc_range() so
> compression won't happen for NODATACOW.
> 
> However for NODATASUM we don't have any check, and it can cause
> compressed extent without csum pretty easily, just by:
> ------
> mkfs.btrfs -f $dev
> mount $dev $mnt -o nodatasum
> touch $mnt/foobar
> mount -o remount,datasum,compress $mnt
> xfs_io -f -c "pwrite 0 128K" $mnt/foobar
> ------
> 
> And in fact, we have bug report about corrupted compressed extent
> without proper data checksum so even RAID1 can't recover the corruption.
> (https://bugzilla.kernel.org/show_bug.cgi?id=199707)
> 
> Running compression without proper checksum could cause more damage when
> corruption happens, so there is no need to allow compression for
> NODATASUM.
> 
> Reported-by: James Harvey <jamespharvey20@gmail.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/inode.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index d241285a0d2a..dbef3f404559 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -396,6 +396,14 @@ static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
>  {
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  
> +	/*
> +	 * Btrfs doesn't support compression without csum or CoW.
> +	 * This should have the highest priority.
> +	 */
> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
> +	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
> +		return 0;
> +

How is this not buggy, given that inode_need_compress, as called from
compress_file_range, will return zero, meaning we jump to the cont: label?
Then in the case of an inline extent we can execute:

ret = cow_file_range_inline(inode, start, end,          
                           total_compressed,           
                           compress_type, pages);   

where compress_type would have been set at the beginning of the 
function unconditionally to fs_info->compress_type. 

For non-inline extents I guess we are ok, given that will_compress 
will not be set. However, this code is rather messy and I'm not sure 
it's well defined what's going to happen in this case with inline extents. 
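
For reference, the cont: path being discussed looks roughly like this
(a simplified sketch of the compress_file_range() code in question;
the condition is paraphrased):

cont:
	if (start == 0) {
		/* Try to make an inline extent. */
		if (ret || total_in < actual_end) {
			/* Not (fully) compressed: inline the
			 * uncompressed data. */
			ret = cow_file_range_inline(inode, start, end, 0,
						    BTRFS_COMPRESS_NONE, NULL);
		} else {
			/* Inline the compressed data, with compress_type
			 * initialized unconditionally at the top of the
			 * function. */
			ret = cow_file_range_inline(inode, start, end,
						    total_compressed,
						    compress_type, pages);
		}
	}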

OTOH, I think there is something fundamentally wrong in calling
inode_need_compress in compress_file_range, i.e. they work at different
abstraction levels. IMO compress_file_range should only be called if we
know we have to compress the range.

So looking around the code in run_delalloc_range (the only function 
which calls cow_file_range_async) we already have : 

 } else if (!inode_need_compress(inode, start, end)) {                   
                ret = cow_file_range(inode, locked_page, start, end, end,       
                                      page_started, nr_written, 1, NULL);   

and in the else branch we have the cow_file_range_async. So the code 
is sort of half-way there to actually decoupling compression checking from 
performing the actual compression. 


>  	/* force compress */
>  	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
>  		return 1;

One more thing: in inode_need_compress, shouldn't the inode-specific
checks come first? Something like:


static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
{                                                                               
        struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);                  
                                                                                
        /* defrag ioctl */                                                      
        if (BTRFS_I(inode)->defrag_compress)                                    
                return 1;                                                       
        /* bad compression ratios */                                            
        if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
                return 0;                                                       
        /* force compress */                                                    
        if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
                return 1;                                                       
        if (btrfs_test_opt(fs_info, COMPRESS) ||                                
            BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
            BTRFS_I(inode)->prop_compress)                                      
                return btrfs_compress_heuristic(inode, start, end);             
        return 0;                                                               
}             

> 
Qu Wenruo May 15, 2018, 8:30 a.m. UTC | #2
On 2018年05月15日 16:21, Nikolay Borisov wrote:
> 
> 
> On 15.05.2018 10:36, Qu Wenruo wrote:
>> As btrfs(5) specified:
>>
>> 	Note
>> 	If nodatacow or nodatasum are enabled, compression is disabled.
>>
>> If NODATASUM or NODATACOW set, we should not compress the extent.
>>
>> Normally NODATACOW is detected properly in run_delalloc_range() so
>> compression won't happen for NODATACOW.
>>
>> However for NODATASUM we don't have any check, and it can cause
>> compressed extent without csum pretty easily, just by:
>> ------
>> mkfs.btrfs -f $dev
>> mount $dev $mnt -o nodatasum
>> touch $mnt/foobar
>> mount -o remount,datasum,compress $mnt
>> xfs_io -f -c "pwrite 0 128K" $mnt/foobar
>> ------
>>
>> And in fact, we have bug report about corrupted compressed extent
>> without proper data checksum so even RAID1 can't recover the corruption.
>> (https://bugzilla.kernel.org/show_bug.cgi?id=199707)
>>
>> Running compression without proper checksum could cause more damage when
>> corruption happens, so there is no need to allow compression for
>> NODATASUM.
>>
>> Reported-by: James Harvey <jamespharvey20@gmail.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/inode.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index d241285a0d2a..dbef3f404559 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -396,6 +396,14 @@ static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
>>  {
>>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>  
>> +	/*
>> +	 * Btrfs doesn't support compression without csum or CoW.
>> +	 * This should have the highest priority.
>> +	 */
>> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
>> +	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
>> +		return 0;
>> +
> 
> How is this not buggy, given that if inode_need_compress as called from 
> compress_file_range will return zero, meaning we jump to cont: label. 
> Then in the case of an inline extent we can execute : 

In that case, you won't go into compress_file_range() at all.

The only caller of compress_file_range() is async_cow_start(), which
gets queued in cow_file_range_async().

And cow_file_range_async() traces back to run_delalloc_range().
That's where we (basically) determine how each dirty range is handled.

The modification in inode_need_compress() mostly affects the decision in
run_delalloc_range(), so we won't go into cow_file_range_async(), and
thus we won't hit the problem you described.
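
Roughly, the dispatch in run_delalloc_range() looks like this (a
simplified sketch; arguments and error handling elided):

	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) {
		/* NODATACOW writes, never compressed. */
		ret = run_delalloc_nocow(...);
	} else if (BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC) {
		ret = run_delalloc_nocow(...);
	} else if (!inode_need_compress(inode, start, end)) {
		/* Plain CoW write, no compression. */
		ret = cow_file_range(...);
	} else {
		/* The only path that ends up queuing
		 * compress_file_range(). */
		ret = cow_file_range_async(...);
	}

With the new checks, NODATASUM inodes take the plain cow_file_range()
branch here.
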
> 
> ret = cow_file_range_inline(inode, start, end,          
>                            total_compressed,           
>                            compress_type, pages);   
> 
> where compress_type would have been set at the beginning of the 
> function unconditionally to fs_info->compress_type. 
> 
> For non-inline extents I guess we are ok, given that will_compress 
> will not be set. However, this code is rather messy and I'm not sure 
> it's well defined what's going to happen in this case with inline extents. 
> 
> OTOH, I think there is something fundamentally wrong in calling 
> inode_need_compress in compress_file_range. I.e they work at different 
> abstractions. IMO compress_file_range should only be called if we know 
> we have to compress the range. 
> 
> So looking around the code in run_delalloc_range (the only function 
> which calls cow_file_range_async) we already have : 
> 
>  } else if (!inode_need_compress(inode, start, end)) {                   
>                 ret = cow_file_range(inode, locked_page, start, end, end,       
>                                       page_started, nr_written, 1, NULL);   
> 
> and in the else branch we have the cow_file_range_async. So the code 
> is sort of half-way there to actually decoupling compression checking from 
> performing the actual compression. 
> 
> 
>>  	/* force compress */
>>  	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
>>  		return 1;
> 
> One more thing, in inode_need_compress shouldn't the inode specific
> checks come first something like :
> 
> 
> static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
> {                                                                               
>         struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);                  
>                                                                                 
>         /* defrag ioctl */                                                      
>         if (BTRFS_I(inode)->defrag_compress)                                    
>                 return 1;                                                       
>         /* bad compression ratios */                                            
>         if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
>                 return 0;                                                       

Not exactly.
Force-compress should let us bypass the bad compression ratio check, so
it should be at least before the ratio check.

Thanks,
Qu

>         /* force compress */                                                    
>         if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
>                 return 1;                                                       
>         if (btrfs_test_opt(fs_info, COMPRESS) ||                                
>             BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
>             BTRFS_I(inode)->prop_compress)                                      
>                 return btrfs_compress_heuristic(inode, start, end);             
>         return 0;                                                               
> }             
> 
>>
Nikolay Borisov May 15, 2018, 8:35 a.m. UTC | #3
On 15.05.2018 11:30, Qu Wenruo wrote:
> 
> 
> On 2018年05月15日 16:21, Nikolay Borisov wrote:
>>
>>
>> On 15.05.2018 10:36, Qu Wenruo wrote:
>>> As btrfs(5) specified:
>>>
>>> 	Note
>>> 	If nodatacow or nodatasum are enabled, compression is disabled.
>>>
>>> If NODATASUM or NODATACOW set, we should not compress the extent.
>>>
>>> Normally NODATACOW is detected properly in run_delalloc_range() so
>>> compression won't happen for NODATACOW.
>>>
>>> However for NODATASUM we don't have any check, and it can cause
>>> compressed extent without csum pretty easily, just by:
>>> ------
>>> mkfs.btrfs -f $dev
>>> mount $dev $mnt -o nodatasum
>>> touch $mnt/foobar
>>> mount -o remount,datasum,compress $mnt
>>> xfs_io -f -c "pwrite 0 128K" $mnt/foobar
>>> ------
>>>
>>> And in fact, we have bug report about corrupted compressed extent
>>> without proper data checksum so even RAID1 can't recover the corruption.
>>> (https://bugzilla.kernel.org/show_bug.cgi?id=199707)
>>>
>>> Running compression without proper checksum could cause more damage when
>>> corruption happens, so there is no need to allow compression for
>>> NODATASUM.
>>>
>>> Reported-by: James Harvey <jamespharvey20@gmail.com>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>>  fs/btrfs/inode.c | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>>> index d241285a0d2a..dbef3f404559 100644
>>> --- a/fs/btrfs/inode.c
>>> +++ b/fs/btrfs/inode.c
>>> @@ -396,6 +396,14 @@ static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
>>>  {
>>>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>  
>>> +	/*
>>> +	 * Btrfs doesn't support compression without csum or CoW.
>>> +	 * This should have the highest priority.
>>> +	 */
>>> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
>>> +	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
>>> +		return 0;
>>> +
>>
>> How is this not buggy, given that if inode_need_compress as called from 
>> compress_file_range will return zero, meaning we jump to cont: label. 
>> Then in the case of an inline extent we can execute : 
> 
> In that case, you won't go into compress_file_range() at all.
> 
> As the only caller of compress_file_range() is async_cow_start(), which
> get queued in cow_file_range_async().
> 
> And cow_file_range_async() traces back to run_delalloc_range().
> Here we determine (basically) where some dirty range goes.
> 
> The modification in inode_need_compress() mostly affects the decision in
> run_delalloc_range(), so we won't go cow_file_range_async(), thus we
> won't hit the problem you described.

So you have reiterated what I've described further below. This means it
should be possible to remove the invocation of inode_need_compress in
compress_file_range and simplify the code there, no? Perhaps
will_compress can also be removed, etc.? As it stands currently it's
spaghetti code.

>>
>> ret = cow_file_range_inline(inode, start, end,          
>>                            total_compressed,           
>>                            compress_type, pages);   
>>
>> where compress_type would have been set at the beginning of the 
>> function unconditionally to fs_info->compress_type. 
>>
>> For non-inline extents I guess we are ok, given that will_compress 
>> will not be set. However, this code is rather messy and I'm not sure 
>> it's well defined what's going to happen in this case with inline extents. 
>>
>> OTOH, I think there is something fundamentally wrong in calling 
>> inode_need_compress in compress_file_range. I.e they work at different 
>> abstractions. IMO compress_file_range should only be called if we know 
>> we have to compress the range. 
>>
>> So looking around the code in run_delalloc_range (the only function 
>> which calls cow_file_range_async) we already have : 
>>
>>  } else if (!inode_need_compress(inode, start, end)) {                   
>>                 ret = cow_file_range(inode, locked_page, start, end, end,       
>>                                       page_started, nr_written, 1, NULL);   
>>
>> and in the else branch we have the cow_file_range_async. So the code 
>> is sort of half-way there to actually decoupling compression checking from 
>> performing the actual compression. 
>>
>>
>>>  	/* force compress */
>>>  	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
>>>  		return 1;
>>
>> One more thing, in inode_need_compress shouldn't the inode specific
>> checks come first something like :
>>
>>
>> static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
>> {                                                                               
>>         struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);                  
>>                                                                                 
>>         /* defrag ioctl */                                                      
>>         if (BTRFS_I(inode)->defrag_compress)                                    
>>                 return 1;                                                       
>>         /* bad compression ratios */                                            
>>         if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
>>                 return 0;                                                       
> 
> Not exactly.
> Force-compress should let us bypass the bad compression ratio check, so
> it should be at least before the ratio check.
> 
> Thanks,
> Qu
> 
>>         /* force compress */                                                    
>>         if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
>>                 return 1;                                                       
>>         if (btrfs_test_opt(fs_info, COMPRESS) ||                                
>>             BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
>>             BTRFS_I(inode)->prop_compress)                                      
>>                 return btrfs_compress_heuristic(inode, start, end);             
>>         return 0;                                                               
>> }             
>>
>>>
Qu Wenruo May 15, 2018, 8:48 a.m. UTC | #4
On 2018年05月15日 16:35, Nikolay Borisov wrote:
> 
> 
> On 15.05.2018 11:30, Qu Wenruo wrote:
>>
>>
>> On 2018年05月15日 16:21, Nikolay Borisov wrote:
>>>
>>>
>>> On 15.05.2018 10:36, Qu Wenruo wrote:
>>>> As btrfs(5) specified:
>>>>
>>>> 	Note
>>>> 	If nodatacow or nodatasum are enabled, compression is disabled.
>>>>
>>>> If NODATASUM or NODATACOW set, we should not compress the extent.
>>>>
>>>> Normally NODATACOW is detected properly in run_delalloc_range() so
>>>> compression won't happen for NODATACOW.
>>>>
>>>> However for NODATASUM we don't have any check, and it can cause
>>>> compressed extent without csum pretty easily, just by:
>>>> ------
>>>> mkfs.btrfs -f $dev
>>>> mount $dev $mnt -o nodatasum
>>>> touch $mnt/foobar
>>>> mount -o remount,datasum,compress $mnt
>>>> xfs_io -f -c "pwrite 0 128K" $mnt/foobar
>>>> ------
>>>>
>>>> And in fact, we have bug report about corrupted compressed extent
>>>> without proper data checksum so even RAID1 can't recover the corruption.
>>>> (https://bugzilla.kernel.org/show_bug.cgi?id=199707)
>>>>
>>>> Running compression without proper checksum could cause more damage when
>>>> corruption happens, so there is no need to allow compression for
>>>> NODATASUM.
>>>>
>>>> Reported-by: James Harvey <jamespharvey20@gmail.com>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>  fs/btrfs/inode.c | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>>>> index d241285a0d2a..dbef3f404559 100644
>>>> --- a/fs/btrfs/inode.c
>>>> +++ b/fs/btrfs/inode.c
>>>> @@ -396,6 +396,14 @@ static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
>>>>  {
>>>>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>>  
>>>> +	/*
>>>> +	 * Btrfs doesn't support compression without csum or CoW.
>>>> +	 * This should have the highest priority.
>>>> +	 */
>>>> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
>>>> +	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
>>>> +		return 0;
>>>> +
>>>
>>> How is this not buggy, given that if inode_need_compress as called from 
>>> compress_file_range will return zero, meaning we jump to cont: label. 
>>> Then in the case of an inline extent we can execute : 
>>
>> In that case, you won't go into compress_file_range() at all.
>>
>> As the only caller of compress_file_range() is async_cow_start(), which
>> get queued in cow_file_range_async().
>>
>> And cow_file_range_async() traces back to run_delalloc_range().
>> Here we determine (basically) where some dirty range goes.
>>
>> The modification in inode_need_compress() mostly affects the decision in
>> run_delalloc_range(), so we won't go cow_file_range_async(), thus we
>> won't hit the problem you described.
> 
> So you have re-iterated what I've described further below. This means it
> should be possible to remove the invocation of inode_need_compress in
> compress_file_range and simplify the code there, no?

Yep, that's true.

> Perhaps
> will_compress can also be removed etc?  As it stands currently it's
> spaghetti code.

Nice idea to further clean this code up.

I'll update both patches after receiving enough feedback.

Thanks,
Qu

> 
>>>
>>> ret = cow_file_range_inline(inode, start, end,
>>>                            total_compressed,           
>>>                            compress_type, pages);   
>>>
>>> where compress_type would have been set at the beginning of the 
>>> function unconditionally to fs_info->compress_type. 
>>>
>>> For non-inline extents I guess we are ok, given that will_compress 
>>> will not be set. However, this code is rather messy and I'm not sure 
>>> it's well defined what's going to happen in this case with inline extents. 
>>>
>>> OTOH, I think there is something fundamentally wrong in calling 
>>> inode_need_compress in compress_file_range. I.e they work at different 
>>> abstractions. IMO compress_file_range should only be called if we know 
>>> we have to compress the range. 
>>>
>>> So looking around the code in run_delalloc_range (the only function 
>>> which calls cow_file_range_async) we already have : 
>>>
>>>  } else if (!inode_need_compress(inode, start, end)) {                   
>>>                 ret = cow_file_range(inode, locked_page, start, end, end,
>>>                                       page_started, nr_written, 1, NULL);   
>>>
>>> and in the else branch we have the cow_file_range_async. So the code 
>>> is sort of half-way there to actually decoupling compression checking from 
>>> performing the actual compression. 
>>>
>>>
>>>>  	/* force compress */
>>>>  	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
>>>>  		return 1;
>>>
>>> One more thing, in inode_need_compress shouldn't the inode specific
>>> checks come first something like :
>>>
>>>
>>> static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
>>> {                                                                               
>>>         struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>                                                                                 
>>>         /* defrag ioctl */                                                      
>>>         if (BTRFS_I(inode)->defrag_compress)                                    
>>>                 return 1;                                                       
>>>         /* bad compression ratios */                                            
>>>         if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
>>>                 return 0;                                                       
>>
>> Not exactly.
>> Force-compress should let us bypass the bad compression ratio check, so
>> it should be at least before the ratio check.
>>
>> Thanks,
>> Qu
>>
>>>         /* force compress */                                                    
>>>         if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
>>>                 return 1;                                                       
>>>         if (btrfs_test_opt(fs_info, COMPRESS) ||                                
>>>             BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
>>>             BTRFS_I(inode)->prop_compress)                                      
>>>                 return btrfs_compress_heuristic(inode, start, end);             
>>>         return 0;                                                               
>>> }             
>>>
>>>>
Nikolay Borisov May 15, 2018, 10:36 a.m. UTC | #5
On 15.05.2018 11:48, Qu Wenruo wrote:
<SNIP>

>>>>
>>>>
>>>> static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
>>>> {                                                                               
>>>>         struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>>                                                                                 
>>>>         /* defrag ioctl */                                                      
>>>>         if (BTRFS_I(inode)->defrag_compress)                                    
>>>>                 return 1;                                                       
>>>>         /* bad compression ratios */                                            
>>>>         if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
>>>>                 return 0;                                                       
>>>
>>> Not exactly.
>>> Force-compress should let us bypass the bad compression ratio check, so
>>> it should be at least before the ratio check.

Fair enough. What prompted me to suggest this is that perhaps the
check for BTRFS_INODE_NOCOMPRESS should be somewhere at the top of the
function (alongside the newly added two checks for inode flags), no?
INODE_NOCOMPRESS can be set via ioctl, not necessarily only due to bad
compression ratio. See the sketch below.
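
For illustration, a minimal userspace sketch of that path (assuming the
standard FS_IOC_SETFLAGS interface; btrfs maps FS_NOCOMP_FL to
BTRFS_INODE_NOCOMPRESS in its setflags ioctl handler, and I believe
chattr +m goes through the same flag):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	int fd, flags;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0 || ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {
		perror("FS_IOC_GETFLAGS");
		return 1;
	}
	flags |= FS_NOCOMP_FL;	/* "do not compress this inode" */
	if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0) {
		perror("FS_IOC_SETFLAGS");
		return 1;
	}
	close(fd);
	return 0;
}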

>>>
>>> Thanks,
>>> Qu
>>>
>>>>         /* force compress */                                                    
>>>>         if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
>>>>                 return 1;                                                       
>>>>         if (btrfs_test_opt(fs_info, COMPRESS) ||                                
>>>>             BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
>>>>             BTRFS_I(inode)->prop_compress)                                      
>>>>                 return btrfs_compress_heuristic(inode, start, end);             
>>>>         return 0;                                                               
>>>> }             

> 
Qu Wenruo May 15, 2018, 10:48 a.m. UTC | #6
On 2018年05月15日 18:36, Nikolay Borisov wrote:
> 
> 
> On 15.05.2018 11:48, Qu Wenruo wrote:
> <SNIP>
> 
>>>>>
>>>>>
>>>>> static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)  
>>>>> {                                                                               
>>>>>         struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>>>                                                                                 
>>>>>         /* defrag ioctl */                                                      
>>>>>         if (BTRFS_I(inode)->defrag_compress)                                    
>>>>>                 return 1;                                                       
>>>>>         /* bad compression ratios */                                            
>>>>>         if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)                     
>>>>>                 return 0;                                                       
>>>>
>>>> Not exactly.
>>>> Force-compress should let us bypass the bad compression ratio check, so
>>>> it should be at least before the ratio check.
> 
> Fair enough. What prompted me to suggest this is that perhaps the
> check for BTRFS_INODE_NOCOMPRESS should be somewhere at the top of the
> function (alongside the newly added two checks for inode flags), no?
> INODE_NOCOMPRESS can be set via ioctl, not necessarily only due to bad
> compression ratio.

This is much trickier than expected.

The point here is: what's the correct behavior for compress-force?
Should it override a manually set NOCOMPRESS?

Unfortunately I have no idea at all.
So I can only leave it as is for now.

Thanks,
Qu

> 
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>         /* force compress */                                                    
>>>>>         if (btrfs_test_opt(fs_info, FORCE_COMPRESS))                            
>>>>>                 return 1;                                                       
>>>>>         if (btrfs_test_opt(fs_info, COMPRESS) ||                                
>>>>>             BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||                     
>>>>>             BTRFS_I(inode)->prop_compress)                                      
>>>>>                 return btrfs_compress_heuristic(inode, start, end);             
>>>>>         return 0;                                                               
>>>>> }             
> 
>>
Qu Wenruo June 25, 2019, 8:24 a.m. UTC | #7
Ping?

This patch should fix the problem of compressed extents being created
even when nodatasum is set.

It has been one year but we still haven't reached a conclusion on how
force_compress should behave.

But at least to me, NODATASUM strongly excludes compression: no matter
what option we use, we should NEVER compress data without
datasum/datacow.

So I still want to push this patch as is.

Thanks,
Qu


On 2018/5/15 下午3:36, Qu Wenruo wrote:
> As btrfs(5) specified:
> 
> 	Note
> 	If nodatacow or nodatasum are enabled, compression is disabled.
> 
> If NODATASUM or NODATACOW set, we should not compress the extent.
> 
> Normally NODATACOW is detected properly in run_delalloc_range() so
> compression won't happen for NODATACOW.
> 
> However for NODATASUM we don't have any check, and it can cause
> compressed extent without csum pretty easily, just by:
> ------
> mkfs.btrfs -f $dev
> mount $dev $mnt -o nodatasum
> touch $mnt/foobar
> mount -o remount,datasum,compress $mnt
> xfs_io -f -c "pwrite 0 128K" $mnt/foobar
> ------
> 
> And in fact, we have bug report about corrupted compressed extent
> without proper data checksum so even RAID1 can't recover the corruption.
> (https://bugzilla.kernel.org/show_bug.cgi?id=199707)
> 
> Running compression without proper checksum could cause more damage when
> corruption happens, so there is no need to allow compression for
> NODATASUM.
> 
> Reported-by: James Harvey <jamespharvey20@gmail.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/inode.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index d241285a0d2a..dbef3f404559 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -396,6 +396,14 @@ static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
>  {
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  
> +	/*
> +	 * Btrfs doesn't support compression without csum or CoW.
> +	 * This should have the highest priority.
> +	 */
> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
> +	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
> +		return 0;
> +
>  	/* force compress */
>  	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
>  		return 1;
>
David Sterba June 27, 2019, 2:58 p.m. UTC | #8
On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
> Ping?
> 
> This patch should fix the problem of compressed extent even when
> nodatasum is set.
> 
> It has been one year but we still didn't get a conclusion on where
> force_compress should behave.

Note that pings to patches sent a year ago will get lost; I noticed only
because you resent it and I remembered that we had some discussions,
without conclusions.

> But at least to me, NODATASUM is a strong exclusion for compress, no
> matter whatever option we use, we should NEVER compress data without
> datasum/datacow.

That's correct, but the way you fix it is IMO not right. This was also
noticed by Nikolay: there are 2 locations that call inode_need_compress,
but with different semantics.

One is the decision whether compression applies at all; the second,
once it's certain we're compressing, decides whether to actually do it
based on e.g. the heuristics.
Qu Wenruo June 28, 2019, 1:26 a.m. UTC | #9
On 2019/6/27 下午10:58, David Sterba wrote:
> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>> Ping?
>>
>> This patch should fix the problem of compressed extent even when
>> nodatasum is set.
>>
>> It has been one year but we still didn't get a conclusion on where
>> force_compress should behave.
> 
> Note that pings to patches sent year ago will get lost, I noticed only
> because you resent it and I remembered that we had some discussions,
> without conclusions.
> 
>> But at least to me, NODATASUM is a strong exclusion for compress, no
>> matter whatever option we use, we should NEVER compress data without
>> datasum/datacow.
> 
> That's correct, but the way you fix it is IMO not right. This was also
> noticed by Nikolay, that there are 2 locations that call
> inode_need_compress but with different semantics.
> 
> One is the decision if compression applies at all,

> and the second one
> when that's certain it's compression, to do it or not based on the
> status decision of eg. heuristics.

The second call is in compress_file_range(). With inode_need_compress()
returning 0 for NODATACOW/NODATASUM inodes, we will not go into the
cow_file_range_async() branch at all.

So would you please explain how this could cause a problem?
To me, inode_need_compress() is the safest place to prevent the problem.

Thanks,
Qu
Anand Jain June 28, 2019, 2:47 a.m. UTC | #10
On 27/6/19 10:58 PM, David Sterba wrote:
> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>> Ping?
>>
>> This patch should fix the problem of compressed extent even when
>> nodatasum is set.
>>
>> It has been one year but we still didn't get a conclusion on where
>> force_compress should behave.
> 
> Note that pings to patches sent year ago will get lost, I noticed only
> because you resent it and I remembered that we had some discussions,
> without conclusions.
> 
>> But at least to me, NODATASUM is a strong exclusion for compress, no
>> matter whatever option we use, we should NEVER compress data without
>> datasum/datacow.
> 
> That's correct, 

 But I wonder, what's the reason that datasum/datacow is a prerequisite
for compression?

Thanks, Anand

>but the way you fix it is IMO not right. This was also
> noticed by Nikolay, that there are 2 locations that call
> inode_need_compress but with different semantics.
> 
> One is the decision if compression applies at all, and the second one
> when that's certain it's compression, to do it or not based on the
> status decision of eg. heuristics.
>
Qu Wenruo June 28, 2019, 5:58 a.m. UTC | #11
On 2019/6/28 上午10:47, Anand Jain wrote:
> On 27/6/19 10:58 PM, David Sterba wrote:
>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>>> Ping?
>>>
>>> This patch should fix the problem of compressed extent even when
>>> nodatasum is set.
>>>
>>> It has been one year but we still didn't get a conclusion on where
>>> force_compress should behave.
>>
>> Note that pings to patches sent year ago will get lost, I noticed only
>> because you resent it and I remembered that we had some discussions,
>> without conclusions.
>>
>>> But at least to me, NODATASUM is a strong exclusion for compress, no
>>> matter whatever option we use, we should NEVER compress data without
>>> datasum/datacow.
>>
>> That's correct, 
> 
>  But I wonder what's the reason that datasum/datacow is prerequisite for
> the compression ?

It's easy to understand the hard requirement for data COW.

If you overwrite a compressed extent, a power loss before transaction
commit could easily screw up the data.

Furthermore, what will happen if you're overwriting a 16K data extent
whose original compressed size is only 4K, while the new data after
compression is 8K?

For checksum, I'd say it's not as hard a requirement as data CoW, but
still a strongly preferred one.

Since compressed data corruption could cause more problems, e.g. one-bit
corruption can corrupt the whole extent, it's highly recommended to have
checksums to protect it.

Thanks,
Qu
> 
> Thanks, Anand
> 
>> but the way you fix it is IMO not right. This was also
>> noticed by Nikolay, that there are 2 locations that call
>> inode_need_compress but with different semantics.
>>
>> One is the decision if compression applies at all, and the second one
>> when that's certain it's compression, to do it or not based on the
>> status decision of eg. heuristics.
>>
>
Anand Jain June 28, 2019, 6:56 a.m. UTC | #12
> On 28 Jun 2019, at 1:58 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 
> 
> 
> On 2019/6/28 上午10:47, Anand Jain wrote:
>> On 27/6/19 10:58 PM, David Sterba wrote:
>>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>>>> Ping?
>>>> 
>>>> This patch should fix the problem of compressed extent even when
>>>> nodatasum is set.
>>>> 
>>>> It has been one year but we still didn't get a conclusion on where
>>>> force_compress should behave.
>>> 
>>> Note that pings to patches sent year ago will get lost, I noticed only
>>> because you resent it and I remembered that we had some discussions,
>>> without conclusions.
>>> 
>>>> But at least to me, NODATASUM is a strong exclusion for compress, no
>>>> matter whatever option we use, we should NEVER compress data without
>>>> datasum/datacow.
>>> 
>>> That's correct, 
>> 
>>  But I wonder what's the reason that datasum/datacow is prerequisite for
>> the compression ?
> 
> It's easy to understand the hard requirement for data COW.
> 
> If you overwrite compressed extent, a powerloss before transaction
> commit could easily screw up the data.

 This scenario is true even for non-compressed data, right?

> Furthermore, what will happen if you're overwriting a 16K data extent
> while its original compressed size is only 4K, while the new data after
> compression is 8K?

 Sorry, I can't think of any limitation. Any idea?

> For checksum, I'd say it's not as a hard requirement as data cow, but
> still a very preferred one.
> 
> Since compressed data corruption could cause more problem, e.g. one bit
> corruption can cause the whole extent to be corrupted, it's highly
> recommended to have checksum to protect them.

 Isn't it true that the compression boundary/granularity is an extent,
 so the corrupted extent remains corrupted the same way after
 decompression, and the corruption won't propagate to the other extents
 in the file? But can a non-impending corruption due to external
 factors be the reason for the datasum prerequisite? It can't be, IMO.

Thanks, Anand

> Thanks,
> Qu
>> 
>> Thanks, Anand
>> 
>>> but the way you fix it is IMO not right. This was also
>>> noticed by Nikolay, that there are 2 locations that call
>>> inode_need_compress but with different semantics.
>>> 
>>> One is the decision if compression applies at all, and the second one
>>> when that's certain it's compression, to do it or not based on the
>>> status decision of eg. heuristics.
>>> 
>> 
>
Qu Wenruo June 28, 2019, 7:09 a.m. UTC | #13
On 2019/6/28 下午2:56, Anand Jain wrote:
> 
> 
>> On 28 Jun 2019, at 1:58 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2019/6/28 上午10:47, Anand Jain wrote:
>>> On 27/6/19 10:58 PM, David Sterba wrote:
>>>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>>>>> Ping?
>>>>>
>>>>> This patch should fix the problem of compressed extent even when
>>>>> nodatasum is set.
>>>>>
>>>>> It has been one year but we still didn't get a conclusion on where
>>>>> force_compress should behave.
>>>>
>>>> Note that pings to patches sent year ago will get lost, I noticed only
>>>> because you resent it and I remembered that we had some discussions,
>>>> without conclusions.
>>>>
>>>>> But at least to me, NODATASUM is a strong exclusion for compress, no
>>>>> matter whatever option we use, we should NEVER compress data without
>>>>> datasum/datacow.
>>>>
>>>> That's correct, 
>>>
>>>  But I wonder what's the reason that datasum/datacow is prerequisite for
>>> the compression ?
>>
>> It's easy to understand the hard requirement for data COW.
>>
>> If you overwrite compressed extent, a powerloss before transaction
>> commit could easily screw up the data.
> 
>  This scenario is even true for non compressed data, right?

Not exactly.

For non-compressed data, it's either:
1) NODATACOW (implies NODATASUM)
   Then we don't know if the new data is correct or not.
   But we still can READ it out, and let user space determine.

2) DATACOW (no matter if it has data sum or not)
   The old data is not touched at all. So nothing but the uncommitted
   data is lost.

But for compressed NODATACOW data, it's more serious:
we CANNOT read the data out, as the on-disk data must follow the
compression format, thus even if we only overwrite the beginning 4K of a
16K compressed (128K uncompressed) extent, the whole 128K extent can't
be read out.

So it's way more serious than for a non-compressed extent. The sketch
below makes this concrete.
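
A userspace illustration of the point (plain zlib, not btrfs code; the
sizes are arbitrary -- build with: cc demo.c -lz):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
	static unsigned char in[128 << 10], comp[256 << 10], out[128 << 10];
	uLongf clen = sizeof(comp), olen = sizeof(out);

	memset(in, 'a', sizeof(in));
	if (compress(comp, &clen, in, sizeof(in)) != Z_OK)
		return 1;
	printf("128K compressed to %lu bytes\n", (unsigned long)clen);

	/* Simulate an in-place (NODATACOW) overwrite of the front of
	 * the on-disk compressed extent. */
	memset(comp, 'b', clen < 4096 ? clen : 4096);

	if (uncompress(out, &olen, comp, clen) != Z_OK)
		printf("decompression failed: the WHOLE 128K is unreadable\n");
	return 0;
}
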

> 
>> Furthermore, what will happen if you're overwriting a 16K data extent
>> while its original compressed size is only 4K, while the new data after
>> compression is 8K?
> 
>  Sorry I can’t think of any limitation, any idea?

We don't know what the compressed extent size is until we compress it.

So how do you know whether we should CoW or not at delalloc time, if we
allow overwriting compressed extents?

> 
>> For checksum, I'd say it's not as a hard requirement as data cow, but
>> still a very preferred one.
>>
>> Since compressed data corruption could cause more problem, e.g. one bit
>> corruption can cause the whole extent to be corrupted, it's highly
>> recommended to have checksum to protect them.
> 
>  Isn’t it true that compression boundary/granularity is an extent,
>  so the corrupted extent remains corrupted the same way after the
>  decompress,

A corrupted compressed extent may not even pass decompression.

> and corruption won’t perpetuate to the other extents
>  in the file. But can a non-impending corruption due to external
>  factors be the reason for datasum perrequisite? it can’t be IMO.

The corruption boundary is the WHOLE extent, not just where the
corruption is.
That's the point, and that's why it's more serious than plaintext extent
corruption, and why checksum is highly recommended.

Thanks,
Qu

> 
> Thanks, Anand
> 
>> Thanks,
>> Qu
>>>
>>> Thanks, Anand
>>>
>>>> but the way you fix it is IMO not right. This was also
>>>> noticed by Nikolay, that there are 2 locations that call
>>>> inode_need_compress but with different semantics.
>>>>
>>>> One is the decision if compression applies at all, and the second one
>>>> when that's certain it's compression, to do it or not based on the
>>>> status decision of eg. heuristics.
>>>>
>>>
>>
>
David Sterba June 28, 2019, 11:34 a.m. UTC | #14
On Fri, Jun 28, 2019 at 09:26:53AM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/6/27 下午10:58, David Sterba wrote:
> > On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
> >> Ping?
> >>
> >> This patch should fix the problem of compressed extent even when
> >> nodatasum is set.
> >>
> >> It has been one year but we still didn't get a conclusion on where
> >> force_compress should behave.
> > 
> > Note that pings to patches sent year ago will get lost, I noticed only
> > because you resent it and I remembered that we had some discussions,
> > without conclusions.
> > 
> >> But at least to me, NODATASUM is a strong exclusion for compress, no
> >> matter whatever option we use, we should NEVER compress data without
> >> datasum/datacow.
> > 
> > That's correct, but the way you fix it is IMO not right. This was also
> > noticed by Nikolay, that there are 2 locations that call
> > inode_need_compress but with different semantics.
> > 
> > One is the decision if compression applies at all,
> 
> > and the second one
> > when that's certain it's compression, to do it or not based on the
> > status decision of eg. heuristics.
> 
> The second call is in compress_file_extent(), with inode_need_compress()
> return 0 for NODATACOW/NODATASUM inodes, we will not go into
> cow_file_range_async() branch at all.
> 
> So would you please explain how this could cause problem?
> To me, prevent the problem in inode_need_compress() is the safest location.

Let me repeat: two places with different semantics. So this means that
we need two functions that reflect the differences. That it's one
function that works in both contexts is OK from a functionality point of
view, but if we care about clarity of design and code, we want two
functions.
Qu Wenruo June 28, 2019, 12:09 p.m. UTC | #15
On 2019/6/28 下午7:34, David Sterba wrote:
> On Fri, Jun 28, 2019 at 09:26:53AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2019/6/27 下午10:58, David Sterba wrote:
>>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
>>>> Ping?
>>>>
>>>> This patch should fix the problem of compressed extent even when
>>>> nodatasum is set.
>>>>
>>>> It has been one year but we still didn't get a conclusion on where
>>>> force_compress should behave.
>>>
>>> Note that pings to patches sent year ago will get lost, I noticed only
>>> because you resent it and I remembered that we had some discussions,
>>> without conclusions.
>>>
>>>> But at least to me, NODATASUM is a strong exclusion for compress, no
>>>> matter whatever option we use, we should NEVER compress data without
>>>> datasum/datacow.
>>>
>>> That's correct, but the way you fix it is IMO not right. This was also
>>> noticed by Nikolay, that there are 2 locations that call
>>> inode_need_compress but with different semantics.
>>>
>>> One is the decision if compression applies at all,
>>
>>> and the second one
>>> when that's certain it's compression, to do it or not based on the
>>> status decision of eg. heuristics.
>>
>> The second call is in compress_file_extent(), with inode_need_compress()
>> return 0 for NODATACOW/NODATASUM inodes, we will not go into
>> cow_file_range_async() branch at all.
>>
>> So would you please explain how this could cause problem?
>> To me, prevent the problem in inode_need_compress() is the safest location.
> 
> Let me repeat: two places with different semantics. So this means that
> we need two functions that reflect the differences. That it's in one
> function that works both contexts is ok from functionality point of
> view, but if we care about clarity of design and code we want two
> functions.
>

OK, so in the next version I'll split inode_need_compress() into two
functions with different semantics:
- inode_can_compress()
  The hard requirements for the compress code, e.g. the CoW and checksum
  checks.
- inode_need_compress()
  The soft requirements, for things like the ratio and force_compress
  checks.

Something like the sketch below. Will this modification be fine?
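
A minimal sketch of the split (assuming the checks keep their current
order; only the caller-facing shape is the point here):

static inline bool inode_can_compress(struct inode *inode)
{
	/* Hard requirements: compression needs CoW and checksums. */
	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
		return false;
	return true;
}

/* Soft requirements; callers must check inode_can_compress() first. */
static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
{
	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);

	/* force compress */
	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
		return 1;
	/* defrag ioctl */
	if (BTRFS_I(inode)->defrag_compress)
		return 1;
	/* bad compression ratios */
	if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS)
		return 0;
	if (btrfs_test_opt(fs_info, COMPRESS) ||
	    BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS ||
	    BTRFS_I(inode)->prop_compress)
		return btrfs_compress_heuristic(inode, start, end);
	return 0;
}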

Thanks,
Qu
David Sterba June 28, 2019, 4:38 p.m. UTC | #16
On Fri, Jun 28, 2019 at 08:09:46PM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/6/28 下午7:34, David Sterba wrote:
> > On Fri, Jun 28, 2019 at 09:26:53AM +0800, Qu Wenruo wrote:
> >>
> >>
> >> On 2019/6/27 下午10:58, David Sterba wrote:
> >>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote:
> >>>> Ping?
> >>>>
> >>>> This patch should fix the problem of compressed extent even when
> >>>> nodatasum is set.
> >>>>
> >>>> It has been one year but we still didn't get a conclusion on where
> >>>> force_compress should behave.
> >>>
> >>> Note that pings to patches sent year ago will get lost, I noticed only
> >>> because you resent it and I remembered that we had some discussions,
> >>> without conclusions.
> >>>
> >>>> But at least to me, NODATASUM is a strong exclusion for compress, no
> >>>> matter whatever option we use, we should NEVER compress data without
> >>>> datasum/datacow.
> >>>
> >>> That's correct, but the way you fix it is IMO not right. This was also
> >>> noticed by Nikolay, that there are 2 locations that call
> >>> inode_need_compress but with different semantics.
> >>>
> >>> One is the decision if compression applies at all,
> >>
> >>> and the second one
> >>> when that's certain it's compression, to do it or not based on the
> >>> status decision of eg. heuristics.
> >>
> >> The second call is in compress_file_extent(), with inode_need_compress()
> >> return 0 for NODATACOW/NODATASUM inodes, we will not go into
> >> cow_file_range_async() branch at all.
> >>
> >> So would you please explain how this could cause problem?
> >> To me, prevent the problem in inode_need_compress() is the safest location.
> > 
> > Let me repeat: two places with different semantics. So this means that
> > we need two functions that reflect the differences. That it's in one
> > function that works both contexts is ok from functionality point of
> > view, but if we care about clarity of design and code we want two
> > functions.
> >
> 
> OK, so in next version I'll split the inode_need_compress() into two
> functions for different semantics:
> - inode_can_compress()
>   The hard requirement for compress code. E.g. COW and checksum checks.
> - inode_need_compress()
>   The soft requirement, for things like ratio, force_compress checks.
> 
> Will this modification be fine?

Yes.

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d241285a0d2a..dbef3f404559 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -396,6 +396,14 @@  static inline int inode_need_compress(struct inode *inode, u64 start, u64 end)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 
+	/*
+	 * Btrfs doesn't support compression without csum or CoW.
+	 * This should have the highest priority.
+	 */
+	if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW ||
+	    BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)
+		return 0;
+
 	/* force compress */
 	if (btrfs_test_opt(fs_info, FORCE_COMPRESS))
 		return 1;