diff mbox series

btrfs: fix build warning due to u64 devided by u32 for 32bit arch

Message ID 20201102073114.66750-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix build warning due to u64 devided by u32 for 32bit arch | expand

Commit Message

Qu Wenruo Nov. 2, 2020, 7:31 a.m. UTC
[BUG]
When building the kernel with subpage preparation patches, 32bit arches
will complain about the following linking error:

   ld: fs/btrfs/extent_io.o: in function `release_extent_buffer':
   fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3'

[CAUSE]
For 32bits, dividing u64 with u32 need to call div_u64(), not directly
call u64 / u32.

[FIX]
Instead of calling the div_u64() macros, here we introduce a helper,
btrfs_sector_shift(), to calculate the sector shift, and we just do bit
shift to avoid executing the expensive division instruction.

The sector_shift may be better cached in btrfs_fs_info, but so far there
are only very limited callers for that, thus the fs_info::sector_shift
can be there for further cleanup.

David, can this patch be folded into the offending commit?
The patch is small enough, and doesn't change btrfs_fs_info.
Thus should be OK to fold.

Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h     |  5 +++++
 fs/btrfs/extent_io.c | 14 +++++++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

Comments

David Sterba Nov. 2, 2020, 1:54 p.m. UTC | #1
On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote:
> [BUG]
> When building the kernel with subpage preparation patches, 32bit arches
> will complain about the following linking error:
> 
>    ld: fs/btrfs/extent_io.o: in function `release_extent_buffer':
>    fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3'
> 
> [CAUSE]
> For 32bits, dividing u64 with u32 need to call div_u64(), not directly
> call u64 / u32.
> 
> [FIX]
> Instead of calling the div_u64() macros, here we introduce a helper,
> btrfs_sector_shift(), to calculate the sector shift, and we just do bit
> shift to avoid executing the expensive division instruction.

Division is expensive but ilog2 does not come without a cost either.
It's implemented as bsrl+cmov, which can be also considered expensive
for frequent use.

> The sector_shift may be better cached in btrfs_fs_info, but so far there
> are only very limited callers for that, thus the fs_info::sector_shift
> can be there for further cleanup.
> 
> David, can this patch be folded into the offending commit?
> The patch is small enough, and doesn't change btrfs_fs_info.
> Thus should be OK to fold.

I have sent my series cleaning up the simple shifts, for the sectorsize
shift in particular see

https://lore.kernel.org/linux-btrfs/b38721840b8d703a29807b71460464134b9ca7e1.1603981453.git.dsterba@suse.com/

> Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ctree.h     |  5 +++++
>  fs/btrfs/extent_io.c | 14 +++++++++-----
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 8a83bce3225c..eb282af985f5 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3489,6 +3489,11 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
>  	return !!(btrfs_super_compat_ro_flags(disk_super) & flag);
>  }
>  
> +static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info)
> +{
> +	return ilog2(fs_info->sectorsize);

This has a runtime cost of calculating the the ilog2 each time we use
it.

> +}
> +
>  /* acl.c */
>  #ifdef CONFIG_BTRFS_FS_POSIX_ACL
>  struct posix_acl *btrfs_get_acl(struct inode *inode, int type);
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 80b35885004a..3452019aef79 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -5129,10 +5129,10 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info,
>  					 u64 start)
>  {
>  	struct extent_buffer *eb;
> +	u8 sector_shift = btrfs_sector_shift(fs_info);

And each use needs a temporary variable, where u8 generates worse
assembly and also potentially needs stack space.
Qu Wenruo Nov. 2, 2020, 2:14 p.m. UTC | #2
On 2020/11/2 下午9:54, David Sterba wrote:
> On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote:
>> [BUG]
>> When building the kernel with subpage preparation patches, 32bit arches
>> will complain about the following linking error:
>>
>>    ld: fs/btrfs/extent_io.o: in function `release_extent_buffer':
>>    fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3'
>>
>> [CAUSE]
>> For 32bits, dividing u64 with u32 need to call div_u64(), not directly
>> call u64 / u32.
>>
>> [FIX]
>> Instead of calling the div_u64() macros, here we introduce a helper,
>> btrfs_sector_shift(), to calculate the sector shift, and we just do bit
>> shift to avoid executing the expensive division instruction.
> 
> Division is expensive but ilog2 does not come without a cost either.
> It's implemented as bsrl+cmov, which can be also considered expensive
> for frequent use.

You're right, ilog2() is also expensive.
For our usage, what we really want is ffs(), which can be done with
hardware support to reduce the cost.

Thanks,
Qu
> 
>> The sector_shift may be better cached in btrfs_fs_info, but so far there
>> are only very limited callers for that, thus the fs_info::sector_shift
>> can be there for further cleanup.
>>
>> David, can this patch be folded into the offending commit?
>> The patch is small enough, and doesn't change btrfs_fs_info.
>> Thus should be OK to fold.
> 
> I have sent my series cleaning up the simple shifts, for the sectorsize
> shift in particular see
> 
> https://lore.kernel.org/linux-btrfs/b38721840b8d703a29807b71460464134b9ca7e1.1603981453.git.dsterba@suse.com/
> 
>> Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/ctree.h     |  5 +++++
>>  fs/btrfs/extent_io.c | 14 +++++++++-----
>>  2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 8a83bce3225c..eb282af985f5 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -3489,6 +3489,11 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
>>  	return !!(btrfs_super_compat_ro_flags(disk_super) & flag);
>>  }
>>  
>> +static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info)
>> +{
>> +	return ilog2(fs_info->sectorsize);
> 
> This has a runtime cost of calculating the the ilog2 each time we use
> it.
> 
>> +}
>> +
>>  /* acl.c */
>>  #ifdef CONFIG_BTRFS_FS_POSIX_ACL
>>  struct posix_acl *btrfs_get_acl(struct inode *inode, int type);
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 80b35885004a..3452019aef79 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -5129,10 +5129,10 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info,
>>  					 u64 start)
>>  {
>>  	struct extent_buffer *eb;
>> +	u8 sector_shift = btrfs_sector_shift(fs_info);
> 
> And each use needs a temporary variable, where u8 generates worse
> assembly and also potentially needs stack space.
>
David Sterba Nov. 2, 2020, 3:03 p.m. UTC | #3
On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote:
> [BUG]
> When building the kernel with subpage preparation patches, 32bit arches
> will complain about the following linking error:
> 
>    ld: fs/btrfs/extent_io.o: in function `release_extent_buffer':
>    fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3'
> 
> [CAUSE]
> For 32bits, dividing u64 with u32 need to call div_u64(), not directly
> call u64 / u32.
> 
> [FIX]
> Instead of calling the div_u64() macros, here we introduce a helper,
> btrfs_sector_shift(), to calculate the sector shift, and we just do bit
> shift to avoid executing the expensive division instruction.

I've refreshed and pushed the series adding fs_info::sectorsize_bits so
you can use it for the subpage patches.
diff mbox series

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8a83bce3225c..eb282af985f5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3489,6 +3489,11 @@  static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
 	return !!(btrfs_super_compat_ro_flags(disk_super) & flag);
 }
 
+static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info)
+{
+	return ilog2(fs_info->sectorsize);
+}
+
 /* acl.c */
 #ifdef CONFIG_BTRFS_FS_POSIX_ACL
 struct posix_acl *btrfs_get_acl(struct inode *inode, int type);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 80b35885004a..3452019aef79 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5129,10 +5129,10 @@  struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info,
 					 u64 start)
 {
 	struct extent_buffer *eb;
+	u8 sector_shift = btrfs_sector_shift(fs_info);
 
 	rcu_read_lock();
-	eb = radix_tree_lookup(&fs_info->buffer_radix,
-			       start / fs_info->sectorsize);
+	eb = radix_tree_lookup(&fs_info->buffer_radix, start >> sector_shift);
 	if (eb && atomic_inc_not_zero(&eb->refs)) {
 		rcu_read_unlock();
 		/*
@@ -5167,6 +5167,7 @@  struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 					u64 start)
 {
 	struct extent_buffer *eb, *exists = NULL;
+	u8 sector_shift = btrfs_sector_shift(fs_info);
 	int ret;
 
 	eb = find_extent_buffer(fs_info, start);
@@ -5184,7 +5185,7 @@  struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 	}
 	spin_lock(&fs_info->buffer_lock);
 	ret = radix_tree_insert(&fs_info->buffer_radix,
-				start / fs_info->sectorsize, eb);
+				start >> sector_shift, eb);
 	spin_unlock(&fs_info->buffer_lock);
 	radix_tree_preload_end();
 	if (ret == -EEXIST) {
@@ -5215,6 +5216,7 @@  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	struct extent_buffer *exists = NULL;
 	struct page *p;
 	struct address_space *mapping = fs_info->btree_inode->i_mapping;
+	u8 sector_shift = btrfs_sector_shift(fs_info);
 	int uptodate = 1;
 	int ret;
 
@@ -5292,7 +5294,7 @@  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 
 	spin_lock(&fs_info->buffer_lock);
 	ret = radix_tree_insert(&fs_info->buffer_radix,
-				start / fs_info->sectorsize, eb);
+				start >> sector_shift, eb);
 	spin_unlock(&fs_info->buffer_lock);
 	radix_tree_preload_end();
 	if (ret == -EEXIST) {
@@ -5337,6 +5339,8 @@  static inline void btrfs_release_extent_buffer_rcu(struct rcu_head *head)
 static int release_extent_buffer(struct extent_buffer *eb)
 	__releases(&eb->refs_lock)
 {
+	u8 sector_shift = btrfs_sector_shift(eb->fs_info);
+
 	lockdep_assert_held(&eb->refs_lock);
 
 	WARN_ON(atomic_read(&eb->refs) == 0);
@@ -5348,7 +5352,7 @@  static int release_extent_buffer(struct extent_buffer *eb)
 
 			spin_lock(&fs_info->buffer_lock);
 			radix_tree_delete(&fs_info->buffer_radix,
-					  eb->start / fs_info->sectorsize);
+					  eb->start >> sector_shift);
 			spin_unlock(&fs_info->buffer_lock);
 		} else {
 			spin_unlock(&eb->refs_lock);